Statistical Extremes of Amino Acid Residue Composition of the Proteome Proteins Can Explain the Origin of the Universality of the Genetic Code
DOI:
https://doi.org/10.51094/jxiv.575
Keywords:
amino acid composition, transmembrane domain, intrinsically disordered region, genetic code, optimized translation theory, Chargaff’s second parity rule, GC content, TA skew, GC skew
Abstract
Organisms have evolved and diverged from a common ancestor, and today there are many different species in many different environments. Because these organisms share a nearly identical genetic code, it is believed that the genetic code has changed little from that of the ancestor over the course of evolution. However, the reasons for this universality, that is, why almost no organism has changed its genetic code, are not well understood.
In the present study, principal component analyses of the amino acid residue composition of proteome proteins from different species revealed that proteins with a high proportion of transmembrane domains (TMDs) and proteins with a high proportion of intrinsically disordered regions (IDRs) almost universally occupy the two extremes of each proteome plot of the first and second principal components. The TMD and IDR content of these proteins correlated not only with their amino acid composition, but also with the nucleic acid composition of their corresponding genes.
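The kind of analysis described above can be illustrated with a minimal sketch. It is not the author's pipeline: the toy sequences, the function names `composition` and `pca_scores`, and the choice of an SVD-based PCA on unscaled residue fractions are all assumptions made for illustration.

```python
import numpy as np

# The 20 standard amino acids, in a fixed column order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(seq):
    """Fraction of each standard residue in seq; other characters are ignored."""
    counts = np.array([seq.count(a) for a in AMINO_ACIDS], dtype=float)
    total = counts.sum()
    return counts / total if total > 0 else counts


def pca_scores(X, n_components=2):
    """Project the rows of X onto the leading principal components via SVD."""
    centered = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


# Hypothetical toy sequences standing in for proteome entries (assumption).
proteins = {
    "tmd_like": "LLIVFALLVIAGLLFVILLAVG",
    "idr_like": "SEEPKKSDGEEPRKSSEDGKQP",
    "mixed_1": "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ",
    "mixed_2": "MGSSHHHHHHSSGLVPRGSHMASMTGGQQMGR",
}

X = np.array([composition(s) for s in proteins.values()])
scores = pca_scores(X)

for name, (pc1, pc2) in zip(proteins, scores):
    print(f"{name}: PC1={pc1:+.3f} PC2={pc2:+.3f}")
```

With a real proteome, each row of `X` would be one protein, and the resulting scores could be plotted and compared against per-protein TMD and IDR annotations.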
In my previous report, I showed that the genetic code itself has a structure that can assist the generation of TMDs and IDRs by exploiting the partial biases of nucleic acid composition in gene sequences. With the current statistical analyses, I also showed that TMD- and IDR-rich proteins always occupy the statistical extremes of amino acid composition in the proteomes of different organisms. If TMDs and IDRs are always the two largest domains/regions with extreme amino acid composition in the proteome, and if the genetic code has a structure that helps synthesize TMDs and IDRs, then I can conclude that the structure of the current genetic code may have been chosen to meet the requirements of the typical amino acid composition of these functional domains. If this assumption is true, it would be reasonable to assume that such a genetic code has become universal.
This is a new explanation for the universality of the genetic code, and I call it "The Optimized Translation Theory". This theory should provide a partial explanation for the origin of the standard genetic code in terms of its functions.
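One widely noted property of the standard genetic code is consistent with the idea that nucleotide bias in a gene maps onto residue type: every codon with T in the second position encodes a hydrophobic residue. The sketch below checks this from the standard code table. It is an illustration only, not the author's analysis, and the `ta_skew` definition, (T - A) / (T + A), is the conventional form and an assumption here; the cited reports may define the index differently.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AA_STRING = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AA_STRING)
}


def residues_by_second_base(base):
    """Set of residues (or '*' for stop) encoded by codons with this second base."""
    return {aa for codon, aa in CODON_TABLE.items() if codon[1] == base}


def ta_skew(seq):
    """Assumed definition: (T - A) / (T + A) over the coding strand."""
    t, a = seq.count("T"), seq.count("A")
    return (t - a) / (t + a) if (t + a) else 0.0


for base in BASES:
    print(base, "".join(sorted(residues_by_second_base(base))))

print(ta_skew("TTTA"))
```

Under this table, second-position T yields only F, I, L, M, and V, while second-position A yields no hydrophobic aliphatic residues, so a T-rich coding sequence tends toward TMD-like composition.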
Conflicts of Interest Disclosure
The author declares no conflicts of interest associated with this manuscript.
References
Crick, F. H. C. (1968). The origin of the genetic code. In Journal of Molecular Biology (Vol. 38, Issue 3, pp. 367–379). Elsevier BV. https://doi.org/10.1016/0022-2836(68)90392-6
Hamashima, K., & Kanai, A. (2013). Alternative genetic code for amino acids and transfer RNA revisited. In BioMolecular Concepts (Vol. 4, Issue 3, pp. 309–318). Walter de Gruyter GmbH. https://doi.org/10.1515/bmc-2013-0002
Kun, Á., & Radványi, Á. (2018). The evolution of the genetic code: Impasses and challenges. In Biosystems (Vol. 164, pp. 217–225). Elsevier BV. https://doi.org/10.1016/j.biosystems.2017.10.006
Seki, M. (2023). On the origin of the genetic code. In Genes & Genetic Systems (Vol. 98, Issue 1, pp. 9–24). Genetics Society of Japan. https://doi.org/10.1266/ggs.22-00085
Esumi, G. (2023). The standard genetic code is designed to generate transmembrane domains and intrinsically disordered regions as projections of the thymine density on the gene. Jxiv. https://doi.org/10.51094/jxiv.533
Esumi, G. (2023). The TA Skew of a Gene Primarily Determines the Type of Protein, Such as Membrane Protein or Intrinsically Disordered Protein. Jxiv. https://doi.org/10.51094/jxiv.446
"Quest for Orthologs" group. (2023). Reference proteomes - Primary proteome sets for the Quest For Orthologs, RELEASE 2023_03. https://www.ebi.ac.uk/reference_proteomes/ Accessed 1 Sep 2023
Bateman, A., Martin, M.-J., Orchard, S., Magrane, M., Ahmad, S., Alpi, E., Bowler-Barnett, E. H., Britto, R., Bye-A-Jee, H., Cukura, A., Denny, P., Dogan, T., Ebenezer, T., Fan, J., Garmiri, P., da Costa Gonzales, L. J., Hatton-Ellis, E., Hussein, A., … Zhang, J. (2022). UniProt: the Universal Protein Knowledgebase in 2023. In Nucleic Acids Research (Vol. 51, Issue D1, pp. D523–D531). Oxford University Press (OUP). https://doi.org/10.1093/nar/gkac1052 Accessed 1 Sep 2023
Esumi, G. (2023). The Synonymous Codon Usage of a Protein Gene Is Primarily Determined by the Guanine + Cytosine Content of the Individual Gene Rather Than the Species to Which It Belongs To Synthesize Proteins With a Balanced Amino Acid Composition. Jxiv. https://doi.org/10.51094/jxiv.561
Nirenberg, M. W., & Matthaei, J. H. (1961). The dependence of cell-free protein synthesis in E. coli upon naturally occurring or synthetic polyribonucleotides. In Proceedings of the National Academy of Sciences (Vol. 47, Issue 10, pp. 1588–1602). Proceedings of the National Academy of Sciences. https://doi.org/10.1073/pnas.47.10.1588
Fariselli, P., Taccioli, C., Pagani, L., & Maritan, A. (2020). DNA sequence symmetries from randomness: the origin of the Chargaff’s second parity rule. In Briefings in Bioinformatics (Vol. 22, Issue 2, pp. 2172–2181). Oxford University Press (OUP). https://doi.org/10.1093/bib/bbaa041
Esumi, G. (2023). The Nucleic Acid Sequences of the Genome Are Highly Structured on a Genome-Wide Scale in Terms of Nucleic Acid Composition Indices Such as TA Skew and GC Skew. Jxiv. https://doi.org/10.51094/jxiv.436
Osawa, S., Ohama, T., Jukes, T. H., & Watanabe, K. (1989). Evolution of the mitochondrial genetic code I. Origin of AGR serine and stop codons in metazoan mitochondria. In Journal of Molecular Evolution (Vol. 29, Issue 3, pp. 202–207). Springer Science and Business Media LLC. https://doi.org/10.1007/bf02100203
Nikolaou, C., & Almirantis, Y. (2006). Deviations from Chargaff’s second parity rule in organellar DNA. In Gene (Vol. 381, pp. 34–41). Elsevier BV. https://doi.org/10.1016/j.gene.2006.06.010
Posted
Submitted: 2023-12-17 20:56:15 UTC
Published: 2023-12-22 04:20:16 UTC
License
Copyright (c) 2023
Genshiro Esumi
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.