Background Given its high mortality and broad societal impacts, the COVID-19 pandemic is a particularly notable global outbreak of a respiratory illness in the 21st century. Although previous studies have identified several genes associated with COVID-19 susceptibility, relatively little is known about the genes contributing to severe COVID-19, including their evolutionary histories. In the current study, we analyzed IL-4, TLR2, CCL2, and SLC11A1—four immunity genes that have been implicated in severe COVID-19 and other immune-related diseases—in globally diverse populations from the 1000 Genomes Project. We also tested for associations between genetic variation in these genes and clinical COVID-19 phenotypes in more than 4,000 laboratory-confirmed COVID-19–positive individuals from Italy.
Results We identified 72 single nucleotide polymorphisms (SNPs) across these genes as targets of positive selection, including several derived alleles shared with archaic Neanderthal and/or Denisovan genomes, a finding not previously reported in the literature. Furthermore, we found that common SNPs implicated in respiratory diseases such as tuberculosis and chronic obstructive pulmonary disease were also under selection. Functional predictions based on in silico analyses revealed that a subset of selected alleles map to transcription factor binding sites and are predicted to affect binding affinity. In addition, our genetic association analyses uncovered significant correlations between derived alleles in the coding region of TLR2 and COVID-19 severity. Interestingly, these candidate alleles occurred at relatively low frequency in western European and East Asian populations but were absent in populations of African and South Asian descent.
Conclusions Overall, our study provides new insights into the evolution of biologically relevant immunity genes in the modern human lineage and highlights genetic variants that may underlie differential risk for severe COVID-19.
Competing Interest Statement The authors have declared no competing interest.
Funding Statement This work was supported by funds from the National Geographic Society [grant HJ-116ER-17] to C.N.C., the U.S. National Science Foundation (NSF) grant IOS-1355034, Howard University College of Medicine, and the District of Columbia Center for AIDS Research, an NIH funded program [P30AI117970] to T.H., as well as U.S. NSF grant BCS-2221924 and U.S. NSF grant BCS-2221920 to M.C.C.
Author Declarations I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
The genetic diversity data underlying this article are available in the 1000 Genomes Consortium Project at https://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/. The clinical datasets analyzed for the current study are available from GEN-COVID upon reasonable request. The GEN-COVID data were individual-level data that had been de-identified prior to use in the study.
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.
Yes
Abbreviations
ACB: African Caribbeans in Barbados
ACE2: Angiotensin-Converting Enzyme 2
ANPEP: Alanyl Aminopeptidase
APC: Antigen-Presenting Cell
ARG: Ancestral Recombination Graph
ASW: African Ancestry in Southwest US
BEB: Bengali in Bangladesh
bp: Base Pair
CCL2: C-C Motif Chemokine Ligand 2
CCL3: C-C Motif Chemokine Ligand 3
CCBB: Center for Computational Biology and Bioinformatics
CDX: Chinese Dai in Xishuangbanna, China
CHB: Han Chinese in Beijing, China
CHS: Southern Han Chinese
CLUES2: Composite Likelihood for Estimating Selection, version 2
COVID-19: Coronavirus Disease 2019
CPAP: Continuous Positive Airway Pressure
CTSB: Cathepsin B
CTSL: Cathepsin L
CXCL10: C-X-C Motif Chemokine Ligand 10
DPP4: Dipeptidyl Peptidase-4
DNA: Deoxyribonucleic Acid
DT: Tajima’s D Statistic
EHH: Extended Haplotype Homozygosity
ER: Endoplasmic Reticulum
ESN: Esan in Nigeria
FIN: Finnish in Finland
FST: Fixation Index
FURIN: Furin Protease
GBR: British in England and Scotland
GEN-COVID: Genetic and Clinical Data Collection for COVID-19
GIH: Gujarati Indian in Houston, Texas
GRCh37: Genome Reference Consortium Human Build 37
GRM: Genetic Relationship Matrix
GWD: Gambian in Western Division, The Gambia
H: Fay and Wu’s H Statistic
HBV: Hepatitis B Virus
HCV: Hepatitis C Virus
HIV: Human Immunodeficiency Virus
HLA: Human Leukocyte Antigen
IBD: Inflammatory Bowel Disease
IBS: Iberian population in Spain
ICU: Intensive Care Unit
IFN-α: Interferon-alpha
IFN-γ: Interferon-gamma
iHS: Integrated Haplotype Score
IL: Interleukin
IL-4: Interleukin-4
IL-6: Interleukin-6
IL-10: Interleukin-10
ITU: Indian Telugu in the UK
JPT: Japanese in Tokyo, Japan
KHV: Kinh in Ho Chi Minh City, Vietnam
LD: Linkage Disequilibrium
LWK: Luhya in Webuye, Kenya
MAF: Minor Allele Frequency
MSL: Mende in Sierra Leone
mRNA: Messenger RNA
nSL: Number of Segregating Sites by Length (haplotype-based test)
ORF: Open Reading Frame
PCA: Principal Component Analysis
PJL: Punjabi in Lahore, Pakistan
RNA: Ribonucleic Acid
SARS-CoV-2: Severe Acute Respiratory Syndrome Coronavirus 2
SLC11A1: Solute Carrier Family 11 Member 1
SNP: Single Nucleotide Polymorphism
SPA: Saddle Point Approximation
STU: Sri Lankan Tamil in the UK
TB: Tuberculosis
TF: Transcription Factor
TFBS: Transcription Factor Binding Site
TLR: Toll-like Receptor
TLR2: Toll-like Receptor 2
TMPRSS2: Transmembrane Protease, Serine 2
TMPRSS11D: Transmembrane Protease, Serine 11D
TSI: Toscani in Italy
UTR: Untranslated Region
VCF: Variant Call Format
YRI: Yoruba in Ibadan, Nigeria