Preprint / Version 1

The TA Skew of a Gene Primarily Determines the Type of Protein, Such as Membrane Protein or Intrinsically Disordered Protein

DOI:

https://doi.org/10.51094/jxiv.446

Keywords:

TA skew, nucleic acid composition, transmembrane domain, intrinsically disordered region, genetic code

Abstract

Proteins differ in function and cellular localization according to their amino acid composition; however, there are few reports on how these characteristic compositions are organized.

Principal component analysis was applied to the amino acid compositions of all proteins in the human proteome, and each protein was plotted by its first and second principal components. Proteins with high fractions of α-helical transmembrane domains and those with high fractions of intrinsically disordered regions lay at opposite extremes of the plot. In addition, the fraction of each functional domain corresponded primarily to high or low TA skew in the nucleic acid composition of the gene.
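As a rough illustration of this kind of analysis, the sketch below computes 20-dimensional amino acid composition vectors and projects them onto their leading principal axis. It is a minimal example under stated assumptions, not the author's pipeline: the four peptide strings are invented toy sequences, the function names are hypothetical, and only the first principal component is computed (by power iteration), whereas the study plots the first and second.

```python
# Toy sketch: amino acid composition vectors and their first principal component.
# The sequences and function names are illustrative assumptions only.

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def composition(protein):
    """Return the fraction of each standard amino acid in a protein string."""
    residues = [aa for aa in protein.upper() if aa in AMINO_ACIDS]
    if not residues:
        raise ValueError("sequence contains no standard amino acids")
    n = len(residues)
    return [residues.count(aa) / n for aa in AMINO_ACIDS]


def first_principal_component(rows, iterations=100):
    """Leading principal axis and per-row scores, via power iteration."""
    n, d = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in rows]
    # Start from the centered row with the largest norm. A constant vector
    # would fail here, because composition rows sum to 1 and the centered
    # rows are therefore orthogonal to it.
    vec = list(max(centered, key=lambda r: sum(x * x for x in r)))
    for _ in range(iterations):
        proj = [sum(r[j] * vec[j] for j in range(d)) for r in centered]
        vec = [sum(proj[i] * centered[i][j] for i in range(n)) for j in range(d)]
        norm = sum(x * x for x in vec) ** 0.5
        if norm == 0:
            raise ValueError("data have no variance")
        vec = [x / norm for x in vec]
    scores = [sum(r[j] * vec[j] for j in range(d)) for r in centered]
    return vec, scores


# Two hydrophobic-rich and two charged/polar-rich toy peptides (invented).
proteins = ["LLLIVFFALLVIL", "FFLLIIVVAALLW", "KKEEDDSSPPEEK", "EEKKSSPPDDQQE"]
axis, scores = first_principal_component([composition(p) for p in proteins])
for seq, score in zip(proteins, scores):
    print(seq, round(score, 3))
```

In this toy input the two hydrophobic-rich peptides receive scores of one sign and the two charged peptides scores of the other; the sign itself is arbitrary, as with any principal axis.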

Each codon in the genetic code is a triplet drawn from four nucleotides, but the nucleotide composition of the codons assigned to each amino acid is inherently skewed. The amino acid composition of a protein is therefore inevitably affected by the nucleic acid composition of its gene, yet there are few reports on the consequences of this effect. The present study shows that the largest source of diversity in the amino acid composition of proteins is the diversity in the nucleic acid composition of their genes, such as TA skew and GC content. It further shows that TA skew plays an important role in determining protein properties. I conclude that the TA skew of the gene and the skewed assignment of the genetic code work together to maintain the correct properties of proteins in the proteome.
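The skew built into the codon assignments can be made concrete with a short sketch. The abstract does not define TA skew, so the code assumes the formula (T - A) / (T + A), by analogy with the common definition of GC skew; the function names are hypothetical. The codons listed are a subset of the standard genetic code, written as coding-strand DNA (T rather than U).

```python
from collections import Counter


def ta_skew(seq):
    """TA skew of a DNA string, ASSUMED here to be (T - A) / (T + A)."""
    counts = Counter(seq.upper())
    t, a = counts["T"], counts["A"]
    if t + a == 0:
        raise ValueError("sequence contains no T or A")
    return (t - a) / (t + a)


def gc_content(seq):
    """Fraction of G and C among the A, C, G, T bases of a DNA string."""
    counts = Counter(seq.upper())
    total = sum(counts[base] for base in "ACGT")
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return (counts["G"] + counts["C"]) / total


# Subset of the standard genetic code: three hydrophobic and three charged
# amino acids with their codons.
CODONS = {
    "Phe": ["TTT", "TTC"],
    "Leu": ["TTA", "TTG", "CTT", "CTC", "CTA", "CTG"],
    "Val": ["GTT", "GTC", "GTA", "GTG"],
    "Lys": ["AAA", "AAG"],
    "Glu": ["GAA", "GAG"],
    "Asp": ["GAT", "GAC"],
}

# TA skew over all codons of each amino acid, pooled into one string.
codon_skew = {aa: ta_skew("".join(codons)) for aa, codons in CODONS.items()}

for aa, skew in codon_skew.items():
    print(f"{aa}: {skew:+.2f}")
```

Under the assumed definition, the hydrophobic residues in this subset (Phe, Leu, Val) have positive pooled TA skew and the charged residues (Lys, Glu, Asp) have negative pooled TA skew, which is the kind of codon-level asymmetry the abstract refers to.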

Conflicts of Interest Disclosure

The author declares no conflicts of interest associated with this manuscript.

References

Esumi, G. (2023). The α-helical transmembrane domains and intrinsically disordered regions on the human proteins are coded for by the skews of their genes' nucleic acid composition with the "universal" assignment of the genetic code table. Jxiv. https://doi.org/10.51094/jxiv.247

Genome dataset, “Homo sapiens (human) / Genome assembly T2T-CHM13v2.0”. NCBI website. https://www.ncbi.nlm.nih.gov/datasets/genome/GCF_009914755.1/

Reference proteome, “Proteomes · Homo sapiens (Human)”. UniProt website. https://www.uniprot.org/proteomes/UP000005640

Esumi, G. (2022). Synonymous codon usage and its bias in the bacterial proteomes primarily offset guanine and cytosine content variation to maintain optimal amino acid compositions. Jxiv. https://doi.org/10.51094/jxiv.99

Esumi, G. (2022). Synthesis assistance of transmembrane domains is a fundamental function of the genetic code table assignment. Jxiv. https://doi.org/10.51094/jxiv.139

Posted


Submitted: 2023-07-08 03:42:56 UTC

Published: 2023-07-13 06:08:28 UTC

Section

Biology, Life Sciences & Basic Medicine