Synonymous codon usage and its bias in the bacterial proteomes primarily offset guanine and cytosine content variation to maintain optimal amino acid compositions
DOI: https://doi.org/10.51094/jxiv.99
Keywords: synonymous codon, codon usage bias, amino acid composition, GC content
Abstract
Codon usage bias is the preferential, non-random use of synonymous codons, which varies among species. A recent review concluded that this bias is a complex phenomenon influenced by numerous factors, including genome composition, guanine and cytosine (GC) content, expression level, gene length, and recombination rates. In this paper, I present a new plot and offer a more straightforward explanation of the primary function of synonymous codon usage and its bias.
First, I calculated the amino acid composition of each protein and the nucleotide composition of its gene from the publicly available proteome coding sequence datasets of 23 different bacteria. Next, for each protein, I calculated the maximum and minimum GC contents over all possible genes that could encode its amino acid composition. Finally, I plotted these maximum and minimum values against the actual GC content of each gene on a scatter plot.
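The calculation described above can be sketched as follows. This is a minimal illustration and not the author's actual code: it assumes the standard genetic code for sense codons, ignores start and stop codons, and expects upper-case sequences containing only the 20 standard amino acids. The names `gc_bounds` and `actual_gc` are hypothetical.

```python
from itertools import product

# Standard genetic code in the conventional T, C, A, G ordering (assumption:
# the sense-codon assignments used here match those of the bacteria studied).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# For each amino acid, record the fewest and the most G/C bases that any of
# its synonymous codons contains.
GC_RANGE = {}
for codon, aa in CODON_TABLE.items():
    if aa == "*":  # stop codons carry no amino acid
        continue
    n = sum(base in "GC" for base in codon)
    lo, hi = GC_RANGE.get(aa, (n, n))
    GC_RANGE[aa] = (min(lo, n), max(hi, n))


def gc_bounds(protein):
    """Return (minimum, maximum) GC fraction over all genes encoding `protein`."""
    total = 3 * len(protein)
    lowest = sum(GC_RANGE[aa][0] for aa in protein)
    highest = sum(GC_RANGE[aa][1] for aa in protein)
    return lowest / total, highest / total


def actual_gc(cds):
    """Return the GC fraction of a coding sequence."""
    return sum(base in "GC" for base in cds) / len(cds)


# Hypothetical two-residue example: Met-Lys encoded as ATG AAA.
gc_min, gc_max = gc_bounds("MK")
gc_obs = actual_gc("ATGAAA")
```

Because each amino acid's codon can be chosen independently, the extreme GC contents for a whole protein are simply the sums of the per-residue extremes, so no enumeration of candidate genes is needed.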
The plot showed a clear tendency. Genes with lower actual GC content lie closer to the minimum possible GC content for the proteins they encode, whereas genes with higher actual GC content lie closer to the maximum possible GC content. This tendency indicates that synonymous codon usage bias works uniformly toward offsetting the variation in GC content. Meanwhile, the plotted maximum values and the plotted minimum values each fell along a line within a narrow band. Therefore, I considered that the optimal range of amino acid composition of the proteome is relatively limited, and that organisms use this GC offset function to stay within that range.
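One way to put a number on "closer to the minimum" or "closer to the maximum" is to rescale the actual GC content onto the interval between the two bounds. The index below is an illustrative assumption, not a quantity defined in the paper, and the function name `relative_gc_position` and the example values are hypothetical.

```python
def relative_gc_position(gc_min, gc_actual, gc_max):
    """Rescale the actual GC fraction to 0 (at the minimum) .. 1 (at the maximum)."""
    span = gc_max - gc_min
    if span <= 0:
        # Occurs only when no synonymous choice exists, e.g. a protein made
        # solely of Met and Trp residues.
        raise ValueError("maximum GC content must exceed minimum GC content")
    return (gc_actual - gc_min) / span


# Made-up values for illustration: a gene at 30% GC whose protein allows
# anything from 20% to 60% GC sits a quarter of the way up from the minimum.
position = relative_gc_position(0.20, 0.30, 0.60)
```

Under the tendency described above, this index would be expected to be low for genes from low-GC genomes and high for genes from high-GC genomes.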
Synonymous codons are part of the genetic code table. Therefore, if synonymous codons and their usage bias have a GC offset function that maintains the optimal amino acid composition, this function must be considered a fundamental function of the genetic code table assignment.
References
Parvathy, S. T., Udayasuriyan, V., & Bhadana, V. (2022). Codon usage bias. Molecular Biology Reports, 49(1), 539–565. https://doi.org/10.1007/s11033-021-06749-4
Genome dataset, "Fusobacterium nucleatum subsp. nucleatum", NCBI website, https://www.ncbi.nlm.nih.gov/data-hub/genome/GCF_003019295.1/
Genome dataset, "Mycoplasma genitalium G37", NCBI website, https://www.ncbi.nlm.nih.gov/data-hub/genome/GCF_000027325.1/
Genome dataset, "Dictyoglomus turgidum DSM 6724", NCBI website, https://www.ncbi.nlm.nih.gov/data-hub/genome/GCF_000021645.1/
Genome dataset, "Thermodesulfovibrio yellowstonii DSM 11347", NCBI website, https://www.ncbi.nlm.nih.gov/data-hub/genome/GCF_000020985.1/
Genome dataset, "Leptospira interrogans", NCBI website, https://www.ncbi.nlm.nih.gov/data-hub/genome/GCF_001569005.1/
Genome dataset, "Helicobacter pylori 26695", NCBI website, https://www.ncbi.nlm.nih.gov/data-hub/genome/GCF_000307795.1/
Genome dataset, "Chlamydia trachomatis D/UW-3/CX", NCBI website, https://www.ncbi.nlm.nih.gov/data-hub/genome/GCF_000008725.1/
Genome dataset, "Bacteroides thetaiotaomicron", NCBI website, https://www.ncbi.nlm.nih.gov/data-hub/genome/GCF_014131755.1/
Genome dataset, "Aquifex aeolicus VF5", NCBI website, https://www.ncbi.nlm.nih.gov/data-hub/genome/GCF_000008625.1/
Genome dataset, "Bacillus subtilis subsp. subtilis str. 168", NCBI website, https://www.ncbi.nlm.nih.gov/data-hub/genome/GCF_000009045.1/
Genome dataset, "Thermotoga maritima MSB8", NCBI website, https://www.ncbi.nlm.nih.gov/data-hub/genome/GCF_000230655.2/
Genome dataset, "Synechocystis sp. PCC 6803", NCBI website, https://www.ncbi.nlm.nih.gov/data-hub/genome/GCF_000009725.1/
Genome dataset, "Escherichia coli str. K-12 substr. MG1655", NCBI website, https://www.ncbi.nlm.nih.gov/data-hub/genome/GCF_000005845.2/
Genome dataset, "Neisseria meningitidis MC58", NCBI website, https://www.ncbi.nlm.nih.gov/data-hub/genome/GCF_000008805.1/
Genome dataset, "Chloroflexus aurantiacus J-10-fl", NCBI website, https://www.ncbi.nlm.nih.gov/data-hub/genome/GCF_000018865.1/
Genome dataset, "Rhodopirellula baltica SH 1", NCBI website, https://www.ncbi.nlm.nih.gov/data-hub/genome/GCF_000196115.1/
Genome dataset, "Geobacter sulfurreducens PCA", NCBI website, https://www.ncbi.nlm.nih.gov/data-hub/genome/GCF_000007985.2/
Genome dataset, "Gloeobacter violaceus PCC 7421", NCBI website, https://www.ncbi.nlm.nih.gov/data-hub/genome/GCF_000011385.1/
Genome dataset, "Bradyrhizobium diazoefficiens USDA 110", NCBI website, https://www.ncbi.nlm.nih.gov/data-hub/genome/GCF_000011365.1/
Genome dataset, "Mycobacterium tuberculosis H37Rv", NCBI website, https://www.ncbi.nlm.nih.gov/data-hub/genome/GCF_000195955.2/
Genome dataset, "Pseudomonas aeruginosa PAO1", NCBI website, https://www.ncbi.nlm.nih.gov/data-hub/genome/GCF_000006765.1/
Genome dataset, "Deinococcus radiodurans ATCC 13939", NCBI website, https://www.ncbi.nlm.nih.gov/data-hub/genome/GCF_020546685.1/
Genome dataset, "Streptomyces coelicolor A3(2)", NCBI website, https://www.ncbi.nlm.nih.gov/data-hub/genome/GCF_008931305.1/
"Reference proteomes - Primary proteome sets for the Quest For Orthologs", EMBL-EBI website. https://www.ebi.ac.uk/reference_proteomes/
Wan, X.-F., Xu, D., Kleinhofs, A., & Zhou, J. (2004). Quantitative relationship between synonymous codon usage bias and GC composition across unicellular genomes. BMC Evolutionary Biology, 4, 19. https://doi.org/10.1186/1471-2148-4-19
Published
Submitted: 2022-06-26 21:06:34 UTC
Published: 2022-06-28 09:29:54 UTC; updated 2022-07-04 04:13:56 UTC
Versions
- 2022-07-04 04:13:56 UTC (2)
- 2022-06-28 09:29:54 UTC (1)
Reason for revision
This manuscript has been revised after English proofreading.
License
Copyright (c) 2022
Esumi, Genshiro
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.