Preprint / Version 1

The standard genetic code is designed to generate transmembrane domains and intrinsically disordered regions as projections of the thymine density on the gene

DOI:

https://doi.org/10.51094/jxiv.533

Keywords:

standard genetic code, thymine composition, transmembrane domains, intrinsically disordered regions, optimized translation

Abstract

The codon-amino acid correspondence in the genetic code is known to be non-random. However, there has been no established theory as to whether this correspondence was designed for any purpose or function. In a previous report, I showed that proteins rich in transmembrane domains and proteins rich in intrinsically disordered regions correspond to high and low TA (thymine-adenine) skew of their genes, respectively, and I speculated that this reflects the purpose behind the design of the genetic code. However, since most protein-coding genes use synonymous codon selection to balance their GC (guanine-cytosine) content, and hence their TA content, I hypothesized that the amount of only one of these two nucleotides, thymine or adenine, might actually give rise to the characteristic amino acid compositions of these two functional domains/regions.
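As a concrete reference point for the TA skew mentioned above, the following is a minimal sketch that assumes the common skew form (T − A) / (T + A) computed on the coding strand. The abstract does not state the formula, so both that definition and the function name `ta_skew` are assumptions made for illustration.

```python
from collections import Counter


def ta_skew(cds: str) -> float:
    """Return (T - A) / (T + A) for a coding-strand DNA sequence.

    Assumed definition; returns 0.0 when the sequence has no T or A.
    """
    counts = Counter(cds.upper())
    t, a = counts["T"], counts["A"]
    total = t + a
    return (t - a) / total if total else 0.0
```

Under this definition a positive value means thymine outnumbers adenine on the coding strand, and a negative value means the reverse.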

In this study, I examined the correspondence between these two functional domains/regions and the estimated composition of each nucleotide in protein-coding genes from the proteomes of various organisms. The possible nucleotide compositions of each gene were back-calculated from the amino acid composition of the protein it encodes.
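One plausible reading of this back-calculation can be sketched as follows: for every residue, take the smallest and largest number of thymines found among its synonymous codons in the standard genetic code, then sum over the protein. This is an illustrative sketch and not necessarily the author's exact procedure; the names `T_RANGE` and `thymine_bounds` are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# For each amino acid: (fewest, most) thymines over its synonymous codons.
T_RANGE: dict[str, tuple[int, int]] = {}
for codon, aa in CODON_TABLE.items():
    if aa == "*":  # stop codons encode no residue
        continue
    t = codon.count("T")
    lo, hi = T_RANGE.get(aa, (t, t))
    T_RANGE[aa] = (min(lo, t), max(hi, t))


def thymine_bounds(protein: str) -> tuple[float, float]:
    """Return (min, max) thymine fraction of any gene encoding `protein`.

    Residues outside the 20 standard amino acids (e.g. X, U) are skipped.
    """
    residues = [aa for aa in protein.upper() if aa in T_RANGE]
    if not residues:
        raise ValueError("no standard amino acid residues found")
    n_bases = 3 * len(residues)
    t_min = sum(T_RANGE[aa][0] for aa in residues)
    t_max = sum(T_RANGE[aa][1] for aa in residues)
    return t_min / n_bases, t_max / n_bases
```

For example, a run of phenylalanine (codons TTT and TTC) is bounded between 2/3 and 1, while a run of lysine (AAA and AAG) is fixed at 0, which shows how strongly amino acid composition alone constrains the thymine content of a gene.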

The results showed that proteins rich in transmembrane domains and proteins rich in intrinsically disordered regions were indeed correlated with higher and lower estimated thymine compositions of their genes, respectively. On detailed analysis, the transmembrane domains correlated more strongly with the maximum estimated thymine composition, and the intrinsically disordered regions correlated more strongly with the minimum estimated thymine composition.

The amino acid compositions of membrane proteins, whose genes have higher thymine compositions, correspond to the maximum estimated thymine compositions, and the amino acid compositions of intrinsically disordered proteins, whose genes have lower thymine compositions, correspond to the minimum estimated thymine compositions. It is therefore more reasonable to assume that the characteristic amino acid compositions of the two domains/regions are both formed by the thymine densities of the genes, rather than that these thymine density structures are formed by selective pressure on amino acid compositions. Thus, the functions of these two functional domains/regions are thought to arise as projections of the thymine densities of their properly preformed gene sequences.

The results shown in this study suggest that the standard genetic code has an optimized structure that allows for optimized translation and synthesis of the functional domains of proteins. I conclude that the current genetic code must have been selected for this functional advantage, and I propose this as the "optimized translation" theory that explains the origin of the genetic code.

Conflict of Interest Disclosure

The author declares no conflicts of interest associated with this manuscript.

References

Esumi, G. (2023). The TA Skew of a Gene Primarily Determines the Type of Protein, Such as Membrane Protein or Intrinsically Disordered Protein. Jxiv. https://doi.org/10.51094/jxiv.446

Esumi, G. (2022). Synonymous codon usage and its bias in the bacterial proteomes primarily offset guanine and cytosine content variation to maintain optimal amino acid compositions. Jxiv. https://doi.org/10.51094/jxiv.99

"Quest for Orthologs" group. (2023) Reference proteomes - Primary proteome sets for the Quest For Orthologs, RELEASE 2023_03. https://www.ebi.ac.uk/reference_proteomes/ Accessed 1 Sep 2023

The UniProt Consortium. (2023) UniProt: the Universal Protein Knowledgebase in 2023. Nucleic Acids Res. 51:D523–D531. https://doi.org/10.1093/nar/gkac1052

Esumi, G. (2023). The Distributions of Amino Acid Compositions of Proteins in an Organism’s Proteome Uniformly Approximate Binomial Distributions. Jxiv. https://doi.org/10.51094/jxiv.408

Vakirlis, N., Acar, O., Hsu, B., Castilho Coelho, N., Van Oss, S. B., Wacholder, A., Medetgul-Ernar, K., Bowman, R. W., II, Hines, C. P., Iannotta, J., Parikh, S. B., McLysaght, A., Camacho, C. J., O’Donnell, A. F., Ideker, T., & Carvunis, A.-R. (2020). De novo emergence of adaptive membrane proteins from thymine-rich genomic sequences. Nature Communications, 11(1). https://doi.org/10.1038/s41467-020-14500-z

Efimov, V. M., Efimov, K. V., Kovaleva, V. Yu., & Matushkin, Yu. G. (2021). Principal Components of Genetic Sequences: Correlations and Significance. Mathematical Biology and Bioinformatics, 16(2), 299–316. https://doi.org/10.17537/2021.16.299

Esumi, G. (2023). The Nucleic Acid Sequences of the Genome Are Highly Structured on a Genome-Wide Scale in Terms of Nucleic Acid Composition Indices Such as TA Skew and GC Skew. Jxiv. https://doi.org/10.51094/jxiv.436

Crick, F. H. C. (1968). The origin of the genetic code. Journal of Molecular Biology, 38(3), 367–379. https://doi.org/10.1016/0022-2836(68)90392-6

Haig, D., & Hurst, L. D. (1991). A quantitative measure of error minimization in the genetic code. Journal of Molecular Evolution, 33(5), 412–417. https://doi.org/10.1007/BF02103132

Seki, M. (2023). On the origin of the genetic code. Genes & Genetic Systems, 98(1), 9–24. https://doi.org/10.1266/ggs.22-00085

Tourancheau, A. B., Tsao, N., Klobutcher, L. A., Pearlman, R. E., & Adoutte, A. (1995). Genetic code deviations in the ciliates: evidence for multiple and independent events. The EMBO Journal, 14(13), 3262–3267. https://doi.org/10.1002/j.1460-2075.1995.tb07329.x

Published


Submitted: 2023-10-20 05:35:01 UTC

Published: 2023-10-25 23:27:30 UTC
Research field
Biology, Life Sciences, and Basic Medicine