Research Article

Prediction of protein Post-Translational Modification sites: An overview

Md. Mehedi Hasan* and Mst. Shamima Khatun

Published: 03/02/2018 | Volume 2 - Issue 1 | Pages: 049-057


Post-translational modification (PTM) refers to the covalent, generally enzymatic, modification of proteins during or after protein biosynthesis. During protein biosynthesis, ribosomes translate mRNA into polypeptide chains, which may then undergo PTM to form the mature protein product [1].

DOI: 10.29328/journal.apb.1001005


  1. Knorre DG, Kudryashova NV, Godovikova TS. Chemical and functional aspects of posttranslational modification of proteins. Acta Naturae. 2009; 1: 29-51.
  2. Xie L, Liu W, Li Q, Chen S, Xu M, et al. First succinyl-proteome profiling of extensively drug-resistant Mycobacterium tuberculosis revealed involvement of succinylation in cellular physiology. J Proteome Res. 2015; 14: 107-119.
  3. Yang M, Yang J, Zhang Y, Zhang W. Influence of succinylation on physicochemical property of yak casein micelles. Food Chem. 2016; 190: 836-842.
  4. Rohira AD, Chen CY, Allen JR, Johnson DL. Covalent small ubiquitin-like modifier (SUMO) modification of Maf1 protein controls RNA polymerase III-dependent transcription repression. J Biol Chem. 2013; 288: 19288-19295.
  5. Medzihradszky KF. Peptide sequence analysis. Methods Enzymol. 2005; 402: 209-244.
  6. Agarwal KL, Kenner GW, Sheppard RC. Feline gastrin. An example of peptide sequence analysis by mass spectrometry. J Am Chem Soc. 1969; 91: 3096-3097.
  7. Welsch DJ, Nelsestuen GL. Amino-terminal alanine functions in a calcium-specific process essential for membrane binding by prothrombin fragment 1. Biochemistry. 1988; 27: 4939-4945.
  8. Slade DJ, Subramanian V, Fuhrmann J, Thompson PR. Chemical and biological methods to detect post-translational modifications of arginine. Biopolymers. 2014; 101: 133-143.
  9. Umlauf D, Goto Y, Feil R. Site-specific analysis of histone methylation and acetylation. Methods Mol Biol. 2004; 287: 99-120.
  10. Jaffrey SR, Erdjument-Bromage H, Ferris CD, Tempst P, Snyder SH. Protein S-nitrosylation: a physiological signal for neuronal nitric oxide. Nat Cell Biol. 2001; 3: 193-197.
  11. Doll S, Burlingame AL. Mass spectrometry-based detection and assignment of protein posttranslational modifications. ACS Chem Biol. 2015; 10: 63-71.
  12. Richards AL, Hebert AS, Ulbrich A, Bailey DJ, Coughlin EE, et al. One-hour proteome analysis in yeast. Nat Protoc. 2015; 10: 701-714.
  13. Hebert AS, Richards AL, Bailey DJ, Ulbrich A, Coughlin EE, et al. The one hour yeast proteome. Mol Cell Proteomics. 2014; 13: 339-347.
  14. Imamura H, Sugiyama N, Wakabayashi M, Ishihama Y. Large-scale identification of phosphorylation sites for profiling protein kinase selectivity. J Proteome Res. 2014; 13: 3410-3419.
  15. Masuda T, Sugiyama N, Tomita M, Ishihama Y. Microscale phosphoproteome analysis of 10,000 cells from human cancer cell lines. Anal Chem. 2011; 83: 7698-7703.
  16. Trinidad JC, Barkan DT, Gulledge BF, Thalhammer A, Sali A, et al. Global identification and characterization of both O-GlcNAcylation and phosphorylation at the murine synapse. Mol Cell Proteomics. 2012; 11: 215-229.
  17. Olsen JV, Vermeulen M, Santamaria A, Kumar C, Miller ML, et al. Quantitative phosphoproteomics reveals widespread full phosphorylation site occupancy during mitosis. Sci Signal. 2010; 3: ra3.
  18. Choudhary C, Kumar C, Gnad F, Nielsen ML, Rehman M, et al. Lysine acetylation targets protein complexes and co-regulates major cellular functions. Science. 2009; 325: 834-840.
  19. Kim W, Bennett EJ, Huttlin EL, Guo A, Li J, et al. Systematic and quantitative assessment of the ubiquitin-modified proteome. Mol Cell. 2011; 44: 325-340.
  20. Hendriks IA, D’Souza RC, Yang B, Verlaan-de Vries M, Mann M, et al. Uncovering global SUMOylation signaling networks in a site-specific manner. Nat Struct Mol Biol. 2014; 21: 927-936.
  21. Syka JE, Coon JJ, Schroeder MJ, Shabanowitz J, Hunt DF. Peptide and protein sequence analysis by electron transfer dissociation mass spectrometry. Proc Natl Acad Sci U S A. 2004; 101: 9528-9533.
  22. Myers SA, Daou S, Affar el B, Burlingame A. Electron transfer dissociation (ETD): the mass spectrometric breakthrough essential for O-GlcNAc protein site assignments-a study of the O-GlcNAcylated protein host cell factor C1. Proteomics. 2013; 13: 982-991.
  23. Ramstrom M, Sandberg H. Characterization of gamma-carboxylated tryptic peptides by collision-induced dissociation and electron transfer dissociation mass spectrometry. Eur J Mass Spectrom (Chichester, Eng). 2011; 17: 497-506.
  24. Moremen KW, Tiemeyer M, Nairn AV. Vertebrate protein glycosylation: diversity, synthesis and function. Nat Rev Mol Cell Biol. 2012; 13: 448-462.
  25. Han X, Yang K, Gross RW. Multi-dimensional mass spectrometry-based shotgun lipidomics and novel strategies for lipidomic analyses. Mass Spectrom Rev. 2012; 31: 134-178.
  26. Tan M, Peng C, Anderson KA, Chhoy P, Xie Z, et al. Lysine glutarylation is a protein posttranslational modification regulated by SIRT5. Cell Metab. 2014; 19: 605-617.
  27. Basu A, Rose KL, Zhang J, Beavis RC, Ueberheide B, et al. Proteome-wide prediction of acetylation substrates. Proc Natl Acad Sci U S A. 2009; 106: 13785-13790.
  28. Striebel F, Imkamp F, Sutter M, Steiner M, Mamedov A, et al. Bacterial ubiquitin-like modifier Pup is deamidated and conjugated to substrates by distinct but homologous enzymes. Nat Struct Mol Biol. 2009; 16: 647-651.
  29. DeMartino GN. PUPylation: something old, something new, something borrowed, something Glu. Trends Biochem Sci. 2009; 34: 155-158.
  30. Passerini A, Punta M, Ceroni A, Rost B, Frasconi P. Identifying cysteines and histidines in transition-metal-binding sites using support vector machines and neural networks. Proteins. 2006; 65: 305-316.
  31. Youn E, Peters B, Radivojac P, Mooney SD. Evaluation of features for catalytic residue prediction in novel folds. Protein Sci. 2007; 16: 216-226.
  32. Sharma A, Rastogi T, Bhartiya M, Shasany AK, Khanuja SP. Type 2 diabetes mellitus: phylogenetic motifs for predicting protein functional sites. J Biosci. 2007; 32: 999-1004.
  33. Vandermarliere E, Martens L. Protein structure as a means to triage proposed PTM sites. Proteomics. 2013; 13: 1028-1035.
  34. Ren J, Wen L, Gao X, Jin C, Xue Y, et al. CSS-Palm 2.0: an updated software for palmitoylation sites prediction. Protein Eng Des Sel. 2008; 21: 639-644.
  35. Liu Z, Cao J, Ma Q, Gao X, Ren J, et al. GPS-YNO2: computational prediction of tyrosine nitration sites in proteins. Mol Biosyst. 2011; 7: 1197-1204.
  36. Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, et al. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 1997; 25: 3389-3402.
  37. Hasan MM, Khatun MS. Recent progress and challenges for protein pupylation sites prediction. EC Proteomics and Bioinformatics. 2017; 2.1: 36-45.
  38. Hasan MM, Zhou Y, Lu X, Li J, Song J, et al. Computational Identification of Protein Pupylation Sites by Using Profile-Based Composition of k-Spaced Amino Acid Pairs. PLoS One. 2015; 10: e0129635.
  39. Gobel U, Sander C, Schneider R, Valencia A. Correlated mutations and residue contacts in proteins. Proteins. 1994; 18: 309-317.
  40. Lockless SW, Ranganathan R. Evolutionarily conserved pathways of energetic connectivity in protein families. Science. 1999; 286: 295-299.
  41. Dekker JP, Fodor A, Aldrich RW, Yellen G. A perturbation-based method for calculating explicit likelihood of evolutionary co-variance in multiple sequence alignments. Bioinformatics. 2004; 20: 1565-1572.
  42. Hasan MM, Khatun MS, Mollah MNH, Yong C, Guo D. A systematic identification of species-specific protein succinylation sites using joint element features information. Int J Nanomedicine. 2017; 12: 6303-6315.
  43. Halperin I, Glazer DS, Wu S, Altman RB. The FEATURE framework for protein function annotation: modeling new functions, improving performance, and extending to novel applications. BMC Genomics. 2008; 9 Suppl 2: S2.
  44. Mooney SD, Liang MH, DeConde R, Altman RB. Structural characterization of proteins using residue environments. Proteins. 2005; 61: 741-747.
  45. Amitai G, Shemesh A, Sitbon E, Shklar M, Netanely D, et al. Network analysis of protein structures identifies functional residues. J Mol Biol. 2004; 344: 1135-1146.
  46. Rani P, Pudi V. RBNBC: Repeat Based Naive Bayes Classifier for Biological Sequences. ICDM 2008: Eighth IEEE International Conference on Data Mining. 2008; Proceedings: 989-994.
  47. Hand DJ, Yu K. Idiot’s Bayes: Not So Stupid after All? International Statistical Review / Revue Internationale de Statistique. 2001; 69: 385-398.
  48. Shao J, Xu D, Tsai SN, Wang Y, Ngai SM. Computational identification of protein methylation sites through bi-profile Bayes feature extraction. PLoS One. 2009; 4: e4920.
  49. Zhang SW, Pan Q, Zhang HC, Shao ZC, Shi JY. Prediction of protein homo-oligomer types by pseudo amino acid composition: Approached with an improved feature extraction and Naive Bayes Feature Fusion. Amino Acids. 2006; 30: 461-468.
  50. Sheppard S, Lawson ND, Zhu LJ. Accurate identification of polyadenylation sites from 3’ end deep sequencing using a naive Bayes classifier. Bioinformatics. 2013; 29: 2564-2571.
  51. Yang P, Humphrey SJ, Fazakerley DJ, Prior MJ, Yang G, et al. Re-fraction: a machine learning approach for deterministic identification of protein homologues and splice variants in large-scale MS-based proteomics. J Proteome Res. 2012; 11: 3035-3045.
  52. Simon P. Too Big to Ignore: The Business Case for Big Data. Wiley, 2013; 89.
  53. Breiman L. Random Forests. Machine Learning. 2001; 45: 5-32.
  54. Maclin R, Opitz D. Popular ensemble methods: An empirical study. Journal of Artificial Intelligence Research. 1999; 11: 169-198.
  55. Polikar R. Ensemble based systems in decision making. IEEE Circuits and Systems Magazine. 2006; 6: 21-45.
  56. Rokach L. Ensemble-based classifiers. Artificial Intelligence Review. 2010; 33: 1-39.
  57. Brown G, Wyatt J, Harris R, Yao X. Diversity creation methods: a survey and categorisation. Information Fusion. 2005; 6: 5-20.
  58. Adeva JJG, Beresi U, Calvo R. Accuracy and diversity in ensembles of text categorisers. CLEI Electronic Journal. 2005; 9: 1-12.
  59. Liu ZP, Wu LY, Wang Y, Zhang XS, Chen L. Prediction of protein-RNA binding sites by a random forest method with combined features. Bioinformatics. 2010; 26: 1616-1622.
  60. Kumar KK, Pugalenthi G, Suganthan PN. DNA-Prot: identification of DNA binding proteins from protein sequence information using random forest. J Biomol Struct Dyn. 2009; 26: 679-686.
  61. Qi Y, Klein-Seetharaman J, Bar-Joseph Z. Random forest similarity for protein-protein interaction prediction from multiple sources. Pac Symp Biocomput. 2005; 531-542.
  62. Hasan MM, Guo D, Kurata H. Computational identification of protein S-sulfenylation sites by incorporating the multiple sequence features information. Mol Biosyst. 2017; 13: 2545-2550.
  63. Hasan MM, Yang S, Zhou Y, Mollah MN. SuccinSite: a computational tool for the prediction of protein succinylation sites by exploiting the amino acid patterns and properties. Mol Biosyst. 2016; 12: 786-795.
  64. Cortes C, Vapnik V. Support-vector networks. Machine Learning. 1995; 20: 273-297.
  65. Chang CC, Lin CJ. LIBSVM: A Library for Support Vector Machines. ACM Transactions on Intelligent Systems and Technology. 2011; 2.
  66. Pavlidis P, Wapinski I, Noble WS. Support vector machine classification on the web. Bioinformatics. 2004; 20: 586-587.
  67. Frank E, Hall M, Trigg L, Holmes G, Witten IH. Data mining in bioinformatics using Weka. Bioinformatics. 2004; 20: 2479-2481.
  68. Chen X, Qiu JD, Shi SP, Suo SB, Liang RP. Systematic analysis and prediction of pupylation sites in prokaryotic proteins. PLoS One. 2013; 8: e74002.
  69. Tung CW. Prediction of pupylation sites using the composition of k-spaced amino acid pairs. J Theor Biol. 2013; 336: 11-17.
  70. Wu S, Zhang Y. A comprehensive assessment of sequence-based and template-based methods for protein contact prediction. Bioinformatics. 2008; 24: 924-931.
  71. Yan RX, Si JN, Wang C, Zhang Z. DescFold: a web server for protein fold recognition. BMC Bioinformatics. 2009; 10: 416.
  72. Guo J, Chen H, Sun Z, Lin Y. A novel method for protein secondary structure prediction using dual-layer SVM and profiles. Proteins. 2004; 54: 738-743.
  73. Minsky M, Papert S. Perceptrons: An Introduction to Computational Geometry. 1969; ISBN 0-262-63022-2.
  74. Fukushima K. Cognitron: a self-organizing multilayered neural network. Biol Cybern. 1975; 20: 121-136.
  75. Tang YR, Chen YZ, Canchaya CA, Zhang Z. GANNPhos: a new phosphorylation site predictor based on a genetic algorithm integrated neural network. Protein Eng Des Sel. 2007; 20: 405-412.
  76. Blom N, Sicheritz-Ponten T, Gupta R, Gammeltoft S, Brunak S. Prediction of post-translational glycosylation and phosphorylation of proteins from the amino acid sequence. Proteomics. 2004; 4: 1633-1649.
  77. Dehouck Y, Grosfils A, Folch B, Gilis D, Bogaerts P, et al. Fast and accurate predictions of protein stability changes upon mutations using statistical potentials and neural networks: PoPMuSiC-2.0. Bioinformatics. 2009; 25: 2537-2543.
  78. Jones DT. Protein secondary structure prediction based on position-specific scoring matrices. J Mol Biol. 1999; 292: 195-202.
  79. McGuffin LJ, Bryson K, Jones DT. The PSIPRED protein structure prediction server. Bioinformatics. 2000; 16: 404-405.
  80. Bienkowska JR, Dalgin GS, Batliwalla F, Allaire N, Roubenoff R, et al. Convergent Random Forest predictor: methodology for predicting drug response from genome-scale data applied to anti-TNF response. Genomics. 2009; 94: 423-432.