Preprint / Version 1

Not Only Residue Amino Acid Composition but Also Gene Thymine–Adenine Balance Reflect Protein Hydropathy

DOI:

https://doi.org/10.51094/jxiv.1158

Keywords:

Hydropathy, TA skew, Nucleotide composition, Genetic code, Chargaff’s second parity rule, Optimized translation hypothesis

Abstract

Kyte and Doolittle’s landmark study established that a protein’s hydropathy governs its conformation and its membrane-spanning regions, and demonstrated that this hydropathy can be estimated by applying coefficients to the amino acid residue composition of the protein sequence. By contrast, the possibility of estimating protein hydropathy from the nucleotide composition of the gene sequence has rarely been explored. In a previous study, I showed that the balance of thymine and adenine in protein-coding genes, termed “TA skew,” correlates positively with the proportion of hydrophobic transmembrane domains (TMDs) and negatively with that of hydrophilic intrinsically disordered regions (IDRs). I therefore hypothesized that a gene’s TA skew correlates with the hydropathy of its encoded protein sequence.
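
As a rough illustration of the two quantities compared here, the following Python sketch computes a TA skew and a GRAVY score. The abstract does not state the TA skew formula, so the conventional skew form (T − A) / (T + A) over the coding strand is an assumption, and the function names `ta_skew` and `gravy` are hypothetical. The coefficient table holds the published Kyte and Doolittle (1982) hydropathy values.

```python
# Kyte-Doolittle hydropathy coefficients (Kyte & Doolittle, 1982).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def ta_skew(cds: str) -> float:
    """TA skew of a coding sequence.

    ASSUMPTION: defined as (T - A) / (T + A); the abstract only
    describes it as the "balance of thymine and adenine".
    """
    seq = cds.upper().replace("U", "T")  # accept RNA input as well
    t = seq.count("T")
    a = seq.count("A")
    if t + a == 0:
        raise ValueError("sequence contains neither T nor A")
    return (t - a) / (t + a)


def gravy(protein: str) -> float:
    """Average Kyte-Doolittle hydropathy over the whole protein.

    Characters outside the 20 standard residues (e.g. 'X', '*')
    are skipped rather than scored.
    """
    scores = [KD[r] for r in protein.upper() if r in KD]
    if not scores:
        raise ValueError("no scorable residues in sequence")
    return sum(scores) / len(scores)


# Toy input, not data from the study: codons for Phe-Leu-Ile
# versus codons for Lys-Glu-Asp.
hydrophobic = (ta_skew("TTTCTGATT"), gravy("FLI"))
hydrophilic = (ta_skew("AAAGAAGAT"), gravy("KED"))
```

In this toy case the thymine-rich codons encode hydrophobic residues and give a positive skew with a positive GRAVY score, while the adenine-rich codons give negative values for both. Real genes would of course need full coding sequences rather than three codons.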

To test this hypothesis, I revisited the six example proteins examined in Kyte and Doolittle’s original study to determine whether the TA skew of their gene sequences corresponds to their hydropathic indices and the documented structural features of their corresponding residue sequences. Furthermore, using sufficiently large protein datasets, I analyzed whether each gene’s TA skew correlates with the GRAVY score (the average hydropathy of each entire protein) and with the proportions of two distinct protein domains (TMDs and IDRs).
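
The dataset-level comparison described above could be sketched as a correlation between two per-gene vectors. The abstract does not say which correlation statistic was used, so Pearson's r is an assumption here, and `pearson_r` is a hypothetical helper written out by hand so the arithmetic is visible.

```python
import math
from typing import Sequence


def pearson_r(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length samples.

    ASSUMPTION: Pearson's r stands in for whatever statistic the
    study actually used; a rank correlation may be equally suitable.
    """
    if len(xs) != len(ys):
        raise ValueError("samples must have the same length")
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / math.sqrt(sxx * syy)
```

Here `xs` would hold one TA skew value per gene and `ys` the matching GRAVY score, TMD proportion, or IDR proportion of the encoded protein. A value near +1 or −1 would indicate a strong linear relationship.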

Analysis of the proteins from that landmark study revealed strong correlations among TA skew, hydropathic indices, and structural features. Clear correlations among TA skew, the GRAVY score, and these representative protein domains were also observed in the larger protein datasets. These findings reveal a previously unrecognized dimension of the correspondence between nucleotide composition and protein structure, suggesting an intricate function embedded in the genetic code’s codon–amino acid correspondence.

Conflict of Interest Disclosure

The author declares no conflicts of interest associated with this manuscript.

References

Anfinsen, C. B. (1973). Principles that Govern the Folding of Protein Chains. Science, 181(4096), 223–230. https://doi.org/10.1126/science.181.4096.223

Kyte, J., & Doolittle, R. F. (1982). A simple method for displaying the hydropathic character of a protein. Journal of Molecular Biology, 157(1), 105–132. https://doi.org/10.1016/0022-2836(82)90515-0

Esumi, G. (2023). The TA Skew of a Gene Primarily Determines the Type of Protein, Such as Membrane Protein or Intrinsically Disordered Protein [Preprint]. Jxiv. https://doi.org/10.51094/jxiv.446

Dyson, H. J., & Wright, P. E. (2005). Intrinsically unstructured proteins and their functions. Nature Reviews Molecular Cell Biology, 6(3), 197–208. https://doi.org/10.1038/nrm1589

EMBL-EBI. (2023). Reference Proteomes (Release 2023_03) [Database]. Retrieved September 1, 2023, from https://www.ebi.ac.uk/reference_proteomes/

Hartley, B. S. (1964). Amino-Acid Sequence of Bovine Chymotrypsinogen-A. Nature, 201(4926), 1284–1287. https://doi.org/10.1038/2011284a0

Eventoff, W., Rossmann, M. G., Taylor, S. S., Torff, H. J., Meyer, H., Keil, W., & Kiltz, H. H. (1977). Structural adaptations of lactate dehydrogenase isozymes. Proceedings of the National Academy of Sciences, 74(7), 2677–2681. https://doi.org/10.1073/pnas.74.7.2677

Tomita, M., & Marchesi, V. T. (1975). Amino-acid sequence and oligosaccharide attachment sites of human erythrocyte glycophorin. Proceedings of the National Academy of Sciences, 72(8), 2964–2968. https://doi.org/10.1073/pnas.72.8.2964

Strittmatter, P., Rogers, M. J., & Spatz, L. (1972). The Binding of Cytochrome b5 to Liver Microsomes. Journal of Biological Chemistry, 247(22), 7188–7194. https://doi.org/10.1016/S0021-9258(19)44612-7

Rose, J. K., Welch, W. J., Sefton, B. M., Esch, F. S., & Ling, N. C. (1980). Vesicular stomatitis virus glycoprotein is anchored in the viral membrane by a hydrophobic domain near the COOH terminus. Proceedings of the National Academy of Sciences, 77(7), 3884–3888. https://doi.org/10.1073/pnas.77.7.3884

Khorana, H. G., Gerber, G. E., Herlihy, W. C., Gray, C. P., Anderegg, R. J., Nihei, K., & Biemann, K. (1979). Amino acid sequence of bacteriorhodopsin. Proceedings of the National Academy of Sciences, 76(10), 5046–5050. https://doi.org/10.1073/pnas.76.10.5046

Hamashima, K., & Kanai, A. (2014). Unexpected tRNAs that do not consistently obey the universal genetic code (in Japanese). Seikagaku. The Journal of Japanese Biochemical Society, 86(4), 483–488. https://www.jbsoc.or.jp/seika/wp-content/uploads/2015/03/86-04-10.pdf

National Center for Biotechnology Information. (n.d.). chymotrypsinogen A [Bos taurus] (NCBI Reference Sequence: XP_003587247.4). Retrieved March 7, 2025, from https://www.ncbi.nlm.nih.gov/protein/XP_003587247.4/

UniProt Consortium. (n.d.). L-lactate dehydrogenase A chain (Squalus acanthias) (UniProt accession No. P00341). Retrieved March 7, 2025, from https://www.uniprot.org/uniprotkb/P00341/entry

UniProt Consortium. (n.d.). Glycophorin-A (Homo sapiens) (UniProt accession No. P02724). Retrieved March 7, 2025, from https://www.uniprot.org/uniprotkb/P02724/entry

UniProt Consortium. (n.d.). Cytochrome b5 (Oryctolagus cuniculus) (UniProt accession No. P00169). Retrieved March 7, 2025, from https://www.uniprot.org/uniprotkb/P00169/entry

UniProt Consortium. (n.d.). Glycoprotein (Vesicular stomatitis Indiana virus) (UniProt accession No. P04884). Retrieved March 7, 2025, from https://www.uniprot.org/uniprotkb/P04884/entry

National Center for Biotechnology Information. (n.d.). bacteriorhodopsin [Halobacterium salinarum] (NCBI Reference Sequence: WP_136361479.1). Retrieved March 7, 2025, from https://www.ncbi.nlm.nih.gov/protein/WP_136361479.1

Vakirlis, N., Acar, O., Hsu, B., Castilho Coelho, N., van Oss, S. B., Wacholder, A., Medetgul-Ernar, K., Bowman, R. W., Hines, C. P., Iannotta, J., Parikh, S. B., McLysaght, A., Camacho, C. J., O’Donnell, A. F., Ideker, T., & Carvunis, A.-R. (2020). De novo emergence of adaptive membrane proteins from thymine-rich genomic sequences. Nature Communications, 11(1), 781. https://doi.org/10.1038/s41467-020-14500-z

Freeland, S. J., & Hurst, L. D. (1998). The Genetic Code Is One in a Million. Journal of Molecular Evolution, 47(3), 238–248. https://doi.org/10.1007/PL00006381

Esumi, G. (2024). The Standard Genetic Code Predominantly Assigns Uracil-Containing Codons to Amino Acids Enriched in Transmembrane Domains and Uracil-Free Codons to Amino Acids Enriched in Intrinsically Disordered Regions [Preprint]. Jxiv. https://doi.org/10.51094/jxiv.592

Esumi, G. (2023). The Synonymous Codon Usage of a Protein Gene Is Primarily Determined by the Guanine + Cytosine Content of the Individual Gene Rather Than the Species to Which It Belongs To Synthesize Proteins With a Balanced Amino Acid Composition [Preprint]. Jxiv. https://doi.org/10.51094/jxiv.561

Rudner, R., Karkas, J. D., & Chargaff, E. (1968). Separation of B. subtilis DNA into complementary strands. 3. Direct analysis. Proceedings of the National Academy of Sciences, 60(3), 921–922. https://doi.org/10.1073/pnas.60.3.921

Forsdyke, D. R., & Mortimer, J. R. (2000). Chargaff’s legacy. Gene, 261(1), 127–137. https://doi.org/10.1016/S0378-1119(00)00472-8

Published


Submitted: 2025-03-18 05:24:54 UTC

Published: 2025-03-24 06:17:03 UTC
Research field
Biology, Life Sciences, and Basic Medicine