Preprint / Version 1

Construction of a Phylogenetic Tree Based on the Average Amino Acid Composition of Exomes


DOI:

https://doi.org/10.51094/jxiv.1044

Keywords:

Phylogenetic tree, Amino acid composition, Exome, Distance function, Evolution

Abstract

All extant cellular organisms are believed to have descended from a common ancestor, and numerous phylogenetic trees have been constructed using various genetic and molecular data. In this study, I hypothesized that the average amino acid composition of an exome, computed across all exons, could serve as an index of an organism’s characteristic amino acid usage and could be used to construct a phylogenetic tree. To test this hypothesis, I analyzed publicly available exome data from 81 species. For each species, I computed the fractional composition of each amino acid in each exon and then averaged these compositions across all exons. I measured pairwise distances between species as the angular distance between their average composition vectors, and applied hierarchical clustering with Ward’s method to construct a phylogenetic tree. The resulting tree showed a reasonable degree of similarity to previously established phylogenies, suggesting that this exome-based amino acid composition approach may offer some utility for inferring evolutionary relationships. To my knowledge, this is the first demonstration of constructing a phylogenetic tree solely from average exome-wide amino acid composition.
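
The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the author's code. The function names, the toy sequences, and the species labels are hypothetical placeholders. The sketch also assumes that each sequence is weighted equally when averaging, and that "angular distance" means the angle (in radians) between two composition vectors; the abstract does not specify either detail.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def composition(seq):
    """Fractional composition of one sequence over the 20 standard amino acids."""
    counts = Counter(ch for ch in seq.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


def average_composition(seqs):
    """Unweighted mean of per-sequence compositions (assumption: equal weights)."""
    comps = [composition(s) for s in seqs]
    return [sum(col) / len(comps) for col in zip(*comps)]


def angular_distance(u, v):
    """Angle in radians between two vectors (assumed definition, unnormalized)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    cos = max(-1.0, min(1.0, dot / norm))  # clamp against rounding error
    return math.acos(cos)


def distance_matrix(profiles):
    """Pairwise angular distances for a dict of {label: average composition}."""
    names = list(profiles)
    matrix = [[angular_distance(profiles[a], profiles[b]) for b in names]
              for a in names]
    return names, matrix


# Toy input: made-up sequences, not real exome data.
toy = {
    "species_A": ["MKVLAAGL", "MSTNPKPQ"],
    "species_B": ["MKVLSAGL", "MSTNPRPQ"],
    "species_C": ["GGGGPPPP", "WWYYFFHH"],
}
profiles = {name: average_composition(seqs) for name, seqs in toy.items()}
names, matrix = distance_matrix(profiles)
```

The resulting matrix could then be passed, in condensed form, to a hierarchical clustering routine such as `scipy.cluster.hierarchy.linkage` with `method="ward"` to obtain a tree. Whether the author used that library is not stated in the abstract.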

Conflicts of Interest Disclosure

The author declares no conflicts of interest associated with this manuscript.


References

Kapli, P., Yang, Z., & Telford, M. J. (2020). Phylogenetic tree building in the genomic age. Nature Reviews Genetics, 21(7), 428–444. https://doi.org/10.1038/s41576-020-0233-0

EMBL-EBI. (2024). Reference Proteomes (Release 2024_02). Retrieved January 7, 2025, from https://www.ebi.ac.uk/reference_proteomes/

Siepel, A. (2009). Phylogenomics of primates and their ancestral populations. Genome Research, 19(11), 1929–1941. https://doi.org/10.1101/gr.084228.108

Esumi, G. (2023). The Distributions of Amino Acid Compositions of Proteins in an Organism’s Proteome Uniformly Approximate Binomial Distributions. Jxiv. https://doi.org/10.51094/jxiv.408

Du, M.-Z., Liu, S., Zeng, Z., Alemayehu, L. A., Wei, W., & Guo, F.-B. (2018). Amino acid compositions contribute to the proteins’ evolution under the influence of their abundances and genomic GC content. Scientific Reports, 8(1). https://doi.org/10.1038/s41598-018-25364-1

Posted


Submitted: 2025-01-14 02:12:55 UTC

Published: 2025-01-15 07:27:22 UTC
Section
Biology, Life Sciences & Basic Medicine