GWASLab: a Python package for processing and visualizing GWAS summary statistics
Keywords: GWAS, Python, summary statistics, visualization, QC
GWASLab is a comprehensive Python toolkit for processing and visualizing summary statistics (SumStats) derived from genome-wide association studies (GWAS). GWASLab provides functions for quality control (QC) of statistics, standardization of chromosome and allele notation, variant normalization, harmonization for meta-analysis, and data visualization. Its modular design lets users assemble these functions into custom pipelines for working with SumStats. An extensible formatting library and standalone utilities provide compatibility with many post-GWAS tools.
Availability and implementation: GWASLab is implemented in Python; the source code is publicly and freely available at https://github.com/Cloufield/gwaslab, and the documentation is available at https://cloufield.github.io/gwaslab/.
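To make the kinds of steps listed above concrete, the following is a minimal plain-Python sketch of chromosome notation standardization, allele trimming, and a basic QC filter chained into a small pipeline. It is not GWASLab's API: the function names and the column names (CHR, POS, REF, ALT, P, SE) are assumptions chosen for illustration, and the trimming step does not left-align variants because it has no reference genome to consult.

```python
# Illustrative sketch only: every helper here is hypothetical, not part of GWASLab.

def standardize_chrom(chrom):
    """Map notations such as 'chr1', 'Chr01', or 'x' to '1' or 'X'."""
    value = str(chrom).strip()
    if value.lower().startswith("chr"):
        value = value[3:]
    value = value.upper()
    if value.isdigit():
        value = str(int(value))  # drop leading zeros, e.g. '01' -> '1'
    return value


def trim_alleles(pos, ref, alt):
    """Remove bases shared by both alleles, keeping at least one base in each."""
    ref, alt = ref.upper(), alt.upper()
    # Trim shared trailing bases first.
    while len(ref) > 1 and len(alt) > 1 and ref[-1] == alt[-1]:
        ref, alt = ref[:-1], alt[:-1]
    # Then trim shared leading bases, shifting the position with each base.
    while len(ref) > 1 and len(alt) > 1 and ref[0] == alt[0]:
        ref, alt = ref[1:], alt[1:]
        pos += 1
    return pos, ref, alt


def passes_basic_qc(record):
    """Keep records with a p-value in (0, 1] and a positive standard error."""
    return 0 < record["P"] <= 1 and record["SE"] > 0


def process(records):
    """Apply the QC filter, then standardize notation for each kept record."""
    cleaned = []
    for rec in records:
        if not passes_basic_qc(rec):
            continue
        pos, ref, alt = trim_alleles(rec["POS"], rec["REF"], rec["ALT"])
        cleaned.append({**rec, "CHR": standardize_chrom(rec["CHR"]),
                        "POS": pos, "REF": ref, "ALT": alt})
    return cleaned


example = [
    {"CHR": "chr01", "POS": 100, "REF": "GCA", "ALT": "GTA", "P": 0.02, "SE": 0.1},
    {"CHR": "chrX", "POS": 500, "REF": "A", "ALT": "G", "P": 1.5, "SE": 0.1},
]
result = process(example)
```

In this sketch the second record is dropped because its p-value lies outside (0, 1], and the first is rewritten to chromosome 1, position 101, with alleles C and T. Keeping each step as a separate function mirrors the modular idea described in the abstract: steps can be reordered or omitted without touching the others.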
Conflict of interest disclosure: The authors have declared no conflicts of interest.
Submitted: 2023-04-28 08:36:45 UTC
Published: 2023-05-01 09:59:49 UTC
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.