To compute consensus sequences, we used the
UniProtKB reviewed (Swiss-Prot)
FASTA file.
We filtered the protein sequences by organism to compute the relative abundance of
each amino acid within a given organism. The corresponding (observed) abundance
in O-GlcNAcylated segments was computed for each position from -5 to +5,
and the ratio of the two abundances was log2-transformed. At position 0
(S, T, or ST residues), the output was normalized to the plot ceiling. We used the
logomaker
Python library to generate the sequence logos.
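
The per-position enrichment described above can be sketched as follows. This is a minimal illustration, not the code used in the study: the function names, the toy sequences, the direction of the ratio (observed over organism-wide abundance), and the small pseudocount that avoids taking the log of zero are all assumptions. The normalization applied at position 0 is not reproduced here.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues

def relative_abundance(sequences):
    """Fraction of each standard amino acid across all given sequences."""
    counts = Counter(aa for seq in sequences for aa in seq if aa in AMINO_ACIDS)
    total = sum(counts.values())
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}

def log2_enrichment(segments, background, pseudocount=1e-6):
    """Per-position log2(observed / background) for equal-length segments.

    Returns {offset: {amino_acid: score}}, with offset 0 at the central residue.
    The pseudocount is an assumption added to keep the logarithm finite.
    """
    width = len(segments[0])
    half = width // 2
    scores = {}
    for i in range(width):
        column = [seg[i] for seg in segments if seg[i] in AMINO_ACIDS]
        counts = Counter(column)
        n = max(len(column), 1)  # guard against an empty column
        scores[i - half] = {
            aa: math.log2((counts[aa] / n + pseudocount)
                          / (background[aa] + pseudocount))
            for aa in AMINO_ACIDS
        }
    return scores

# Hypothetical toy data: one "proteome" sequence and two 11-residue segments
# centered on the modified S or T (positions -5 to +5).
toy_proteome = ["AAST"]
toy_segments = ["AAAAASAAAAA", "AAAAATAAAAA"]

background = relative_abundance(toy_proteome)
scores = log2_enrichment(toy_segments, background)
```

The resulting scores could then be arranged as a table with positions as rows and amino acids as columns, which is the layout that the `Logo` class of logomaker accepts as a pandas DataFrame.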