Shorter androgen receptor polyQ alleles protect against life-threatening COVID-19 disease in males

Author(s):  
Margherita Baldassarri ◽  
Nicola Picchiotti ◽  
Francesca Fava ◽  
Chiara Fallerini ◽  
Elisa Benetti ◽  
...  

ABSTRACT

Background: COVID-19 presentation ranges from asymptomatic to fatal. The variability in severity may be due in part to an impaired type I interferon response caused by specific mutations in the host genome or by autoantibodies, which together explain about 15% of cases. Exploring the host genome is thus warranted to further elucidate disease variability. Methods: We developed a synthetic approach to genetic data representation using machine learning methods to investigate complementary genetic variability, due to poly-amino acid repeat polymorphisms, that may explain disease severity in COVID-19 patients. Using host whole-exome sequencing data, we compared extreme phenotypic presentations (338 severe versus 300 asymptomatic cases) of the entire (men and women) Italian GEN-COVID cohort of 1178 subjects infected with SARS-CoV-2. We then applied a LASSO logistic regression model to a Boolean gene-based representation of the poly-amino acid variability. Findings: Shorter polyQ alleles (≤22) in the androgen receptor (AR) conferred protection against a more severe outcome in COVID-19 infection. In the subgroup of males aged <60 years, testosterone was higher in subjects with AR long-polyQ (≥23), possibly indicating receptor resistance (p=0.004, Mann-Whitney U test). Inappropriately low testosterone levels for the long-polyQ alleles predicted the need for intensive care in COVID-19 infected men. In agreement with the known anti-inflammatory action of testosterone, patients with long-polyQ (≥23) and age >60 years had increased levels of C-reactive protein (p=0.018). Interpretation: Our results may contribute to the design of reliable clinical and public health measures and provide a rationale to test testosterone treatment as adjuvant therapy in symptomatic COVID-19 men expressing AR polyQ longer than 23 repeats. Funding: MIUR project “Dipartimenti di Eccellenza 2018-2020” to Department of Medical Biotechnologies University of Siena, Italy (Italian D.L. n.18 March 17, 2020). Private donors for COVID research and charity funds from Intesa San Paolo.

Evidence before this study: We searched Medline, EMBASE, and PubMed for articles published from January 2020 to August 2020 using various combinations of the search terms “sex-difference”, “gender” AND SARS-CoV-2, or COVID. Epidemiological studies indicate that men and women are similarly infected by COVID-19, but the outcome is less favorable in men, independently of age. Several studies also showed that patients with hypogonadism tend to be more severely affected. A prompt intervention directed toward the most fragile subjects with SARS-CoV-2 infection is currently the only strategy to reduce mortality. Glucocorticoid treatment has been found cost-effective in improving the outcome of severe cases. Clinical algorithms have been proposed, but little is known about the ability of genetic profiling to predict outcome and disclose novel therapeutic strategies.

Added value of this study: In a cohort of 1178 men and women with COVID-19, we used a supervised machine learning approach on a synthetic representation of the uncovered variability of the human genome due to poly-amino acid repeats. Comparing the genotype of patients with extreme manifestations (severe vs. asymptomatic), we found that the poly-glutamine repeat of the androgen receptor (AR) gene is relevant for COVID-19 disease and that defective AR signaling identifies an association between male sex, testosterone exposure, and COVID-19 outcome. When endocrine feedback fails to overcome the AR signaling defect by increasing testosterone levels during the infection, polyQ length becomes dominant over testosterone (T) levels in determining the clinical outcome.

Implications of all the available evidence: We identify the first genetic polymorphism predisposing some men to develop a more severe disease irrespective of age. Based on this, we suggest that sizing the AR poly-glutamine repeat has important implications in the diagnostic pipeline of patients affected by life-threatening COVID-19 infection. Most importantly, our findings open up the possibility of using testosterone as adjuvant therapy for severe COVID-19 patients having defective androgen signaling, defined by this study as ≥23 polyQ repeats and inappropriate levels of circulating androgens.
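The modelling step named in the abstract, LASSO logistic regression on a Boolean gene-based representation, can be illustrated with a minimal sketch. This is not the authors' pipeline: the feature names, the eight toy samples, the penalty strength, and the choice of proximal gradient descent as the solver are all assumptions made for illustration.

```python
import math

def soft_threshold(value, threshold):
    """Proximal operator of the L1 penalty: shrink a value toward zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_lasso_logistic(X, y, lam=0.05, step=0.5, n_iter=2000):
    """L1-penalised logistic regression fitted by proximal gradient descent.

    X is a list of rows of 0/1 features, y a list of 0/1 labels.
    The intercept is left unpenalised.
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    b = 0.0
    for _ in range(n_iter):
        grad_w = [0.0] * p
        grad_b = 0.0
        for row, label in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, row))) - label
            grad_b += err / n
            for j, xj in enumerate(row):
                grad_w[j] += err * xj / n
        b -= step * grad_b
        # Gradient step on the logistic loss, then L1 shrinkage.
        w = [soft_threshold(wj - step * gj, step * lam)
             for wj, gj in zip(w, grad_w)]
    return w, b

# Hypothetical Boolean features (1 = long repeat allele present); toy data only.
feature_names = ["AR_long_polyQ", "GENE_B_repeat", "GENE_C_repeat"]
X = [
    [1, 0, 1], [1, 1, 0], [1, 0, 0], [0, 1, 1],  # severe cases
    [0, 0, 1], [0, 1, 0], [0, 0, 0], [1, 1, 1],  # asymptomatic cases
]
y = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = severe, 0 = asymptomatic

weights, bias = fit_lasso_logistic(X, y)
for name, weight in zip(feature_names, weights):
    print(f"{name}: {weight:+.3f}")
```

In this toy data only the first feature is associated with the outcome, so the L1 penalty keeps its weight positive and drives the other two to exactly zero, which is the feature-selection behaviour that motivates LASSO for this kind of screen.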

Sepsis is a life-threatening disease that causes tissue damage and organ failure and results in the death of millions of people. It is one of the highest-risk diseases identified globally. A large proportion of these deaths occur in developing countries due to inaccessibility of hospitals or lack of resources. Sepsis is confirmed by blood tests, but these require a laboratory and are time-consuming. The aim of this study is to develop a practical, non-invasive sepsis prediction model that can be used to detect sepsis using supervised machine learning algorithms. For this retrospective analysis, we used the data available from the PhysioNet database.


2020 ◽  
Vol 14 (2) ◽  
pp. 140-159
Author(s):  
Anthony-Paul Cooper ◽  
Emmanuel Awuni Kolog ◽  
Erkki Sutinen

This article builds on previous research exploring the content of church-related tweets. It does so by exploring whether the qualitative thematic coding of such tweets can, in part, be automated by the use of machine learning. It compares three supervised machine learning algorithms to understand how useful each algorithm is at a classification task, based on a dataset of human-coded church-related tweets. The study finds that one such algorithm, Naïve Bayes, performs better than the other algorithms considered, returning Precision, Recall and F-measure values which each exceed an acceptable threshold of 70%. This has far-reaching consequences at a time when the high volume of social media data, in this case Twitter data, means that the resource-intensity of manual coding approaches can act as a barrier to understanding how the online community interacts with, and talks about, church. The findings presented in this article offer a way forward for scholars of digital theology to better understand the content of online church discourse.
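For readers unfamiliar with the three reported metrics, the following sketch shows how Precision, Recall and F-measure are computed for one theme by comparing classifier output with human coding. The theme name and the ten labels are invented for illustration and are not taken from the study's dataset.

```python
def precision_recall_f1(true_labels, predicted_labels, positive):
    """Precision, recall and F-measure for one class, treated as 'positive'."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1

# Hypothetical theme "event": human codes versus classifier predictions.
human_codes = ["event"] * 4 + ["other"] * 6
predictions = ["event", "event", "event", "other",
               "event", "other", "other", "other", "other", "other"]

precision, recall, f1 = precision_recall_f1(human_codes, predictions, "event")
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

With three true positives, one false positive and one false negative, all three values come out at 0.75, which would clear the 70% threshold mentioned in the abstract.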


2020 ◽  
Author(s):  
Azhagiya Singam Ettayapuram Ramaprasad ◽  
Phum Tachachartvanich ◽  
Denis Fourches ◽  
Anatoly Soshilov ◽  
Jennifer C.Y. Hsieh ◽  
...  

Perfluoroalkyl and Polyfluoroalkyl Substances (PFASs) pose a substantial threat as endocrine disruptors, and thus early identification of those that may interact with steroid hormone receptors, such as the androgen receptor (AR), is critical. In this study we screened 5,206 PFASs from the CompTox database against the different binding sites on the AR using both molecular docking and machine learning techniques. We developed support vector machine models trained on Tox21 data to classify the active and inactive PFASs for AR using different chemical fingerprints as features. The best model, based on MACCS fingerprints (MACCSFP), reached a maximum accuracy of 95.01% and a Matthews correlation coefficient (MCC) of 0.76. The combination of docking-based screening and machine learning models identified 29 PFASs that have strong potential for activity against the AR and should be considered priority chemicals for biological toxicity testing.
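The abstract reports both accuracy and MCC, and the gap between the two is typical when active compounds are rare. The sketch below computes both from a confusion matrix; the counts are invented for illustration and are not the study's results.

```python
import math

def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)

def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0  # conventional value when a row or column is empty
    return (tp * tn - fp * fn) / denominator

# Hypothetical imbalanced screen: 70 actives among 1000 compounds.
tp, tn, fp, fn = 40, 900, 30, 30

acc = accuracy(tp, tn, fp, fn)
mcc = matthews_corrcoef(tp, tn, fp, fn)
print(f"accuracy={acc:.3f} mcc={mcc:.3f}")
```

Because the many true negatives dominate accuracy, MCC gives a more conservative picture of how well the minority (active) class is recovered.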


2017 ◽  
Author(s):  
Sabrina Jaeger ◽  
Simone Fulle ◽  
Samo Turk

Inspired by natural language processing techniques, we here introduce Mol2vec, an unsupervised machine learning approach to learn vector representations of molecular substructures. Similarly to the Word2vec models, where vectors of closely related words are in close proximity in the vector space, Mol2vec learns vector representations of molecular substructures that point in similar directions for chemically related substructures. Finally, compounds can be encoded as vectors by summing the vectors of their individual substructures and, for instance, fed into supervised machine learning approaches to predict compound properties. The underlying substructure vector embeddings are obtained by training an unsupervised machine learning approach on a so-called corpus of compounds that consists of all available chemical matter. The resulting Mol2vec model is pre-trained once, yields dense vector representations and overcomes drawbacks of common compound feature representations such as sparseness and bit collisions. The prediction capabilities are demonstrated on several compound property and bioactivity data sets and compared with results obtained for Morgan fingerprints as reference compound representation. Mol2vec can be easily combined with ProtVec, which employs the same Word2vec concept on protein sequences, resulting in a proteochemometric approach that is alignment independent and can thus also be easily used for proteins with low sequence similarities.
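The summation step described above can be sketched in a few lines. This is not the Mol2vec library's API: the substructure identifiers, the three-dimensional embedding values, and the zero-vector fallback for unknown substructures are all assumptions made for illustration.

```python
def embed_compound(substructure_ids, embeddings, dim):
    """Encode a compound as the sum of its substructure vectors."""
    zero = [0.0] * dim  # assumed fallback for substructures without a vector
    compound_vector = [0.0] * dim
    for substructure_id in substructure_ids:
        vector = embeddings.get(substructure_id, zero)
        compound_vector = [c + v for c, v in zip(compound_vector, vector)]
    return compound_vector

# Hypothetical pre-trained substructure embeddings (toy values, 3 dimensions).
toy_embeddings = {
    "sub_a": [0.5, 0.25, 0.0],
    "sub_b": [0.25, 0.25, 1.0],
}

# A compound is represented by the identifiers of the substructures it contains.
compound = ["sub_a", "sub_b", "sub_a"]
compound_vector = embed_compound(compound, toy_embeddings, dim=3)
print(compound_vector)
```

The resulting fixed-length vector can then be passed as a feature row to any supervised model, regardless of how many substructures the compound contains.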


2020 ◽  
Vol 15 (2) ◽  
pp. 121-134 ◽  
Author(s):  
Eunmi Kwon ◽  
Myeongji Cho ◽  
Hayeon Kim ◽  
Hyeon S. Son

Background: The host tropism determinants of influenza virus, which cause changes in the host range and increase the likelihood of interaction with specific hosts, are critical for understanding the infection and propagation of the virus in diverse host species. Methods: Six types of protein sequences of influenza viral strains isolated from three classes of hosts (avian, human, and swine) were obtained. Random forest, naïve Bayes classification, and k-nearest neighbor algorithms were used for host classification. The Java language was used for sequence analysis programming and identifying host-specific position markers. Results: A machine learning technique was explored to derive the physicochemical properties of amino acids used in host classification and prediction. HA protein was found to play the most important role in determining host tropism of the influenza virus, and the random forest method yielded the highest accuracy in host prediction. Conserved amino acids that exhibited host-specific differences were also selected and verified, and they were found to be useful position markers for host classification. Finally, ANOVA and post-hoc testing revealed that the physicochemical properties of amino acids, comprising protein sequences combined with position markers, differed significantly among hosts. Conclusion: The host tropism determinants and position markers described in this study can be used in related research to classify, identify, and predict the hosts that are currently susceptible to influenza viruses or likely to be infected in the future.
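Of the three classifiers named in the Methods, k-nearest neighbor is the simplest to illustrate. The sketch below is written in Python rather than the Java used in the study, and the two-dimensional feature vectors are invented stand-ins for physicochemical properties, not values derived from real sequences.

```python
import math
from collections import Counter

def knn_predict(train_features, train_labels, query, k=3):
    """Predict a label by majority vote among the k nearest training points."""
    distances = sorted(
        (math.dist(features, query), label)
        for features, label in zip(train_features, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]

# Hypothetical per-sequence feature vectors (toy values) with host labels.
train_features = [
    (0.1, 0.2), (0.2, 0.1), (0.0, 0.0),   # avian
    (1.0, 0.9), (0.9, 1.1), (1.1, 1.0),   # human
    (0.1, 1.0), (0.0, 0.9), (0.2, 1.1),   # swine
]
train_labels = ["avian"] * 3 + ["human"] * 3 + ["swine"] * 3

host_a = knn_predict(train_features, train_labels, (0.95, 1.0))
host_b = knn_predict(train_features, train_labels, (0.1, 0.95))
print(host_a, host_b)
```

Each query is assigned the host class of the training sequences closest to it in feature space, so the quality of the prediction depends directly on how well the chosen properties separate the hosts.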


2020 ◽  
Vol 28 (2) ◽  
pp. 253-265 ◽  
Author(s):  
Gabriela Bitencourt-Ferreira ◽  
Amauri Duarte da Silva ◽  
Walter Filgueira de Azevedo

Background: The elucidation of the structure of cyclin-dependent kinase 2 (CDK2) made it possible to develop targeted scoring functions for virtual screening aimed at identifying new inhibitors for this enzyme. CDK2 is a protein target for the development of drugs intended to modulate cell-cycle progression and control. Such drugs have potential anticancer activities. Objective: Our goal here is to review recent applications of machine learning methods to predict ligand-binding affinity for protein targets. To assess the predictive performance of classical scoring functions and targeted scoring functions, we focused our analysis on CDK2 structures. Methods: We have experimental structural data for hundreds of binary complexes of CDK2 with different ligands, many of them with inhibition constant information. We investigate here computational methods to calculate the binding affinity of CDK2 through classical scoring functions and machine-learning models. Results: Analysis of the predictive performance of classical scoring functions available in docking programs such as Molegro Virtual Docker, AutoDock4, and AutoDock Vina indicated that these methods failed to predict binding affinity with significant correlation with experimental data. Targeted scoring functions developed through supervised machine learning techniques showed a significant correlation with experimental data. Conclusion: Here, we described the application of supervised machine learning techniques to generate a scoring function to predict binding affinity. Machine learning models showed superior predictive performance when compared with classical scoring functions. Analysis of the computational models obtained through machine learning could capture essential structural features responsible for binding affinity against CDK2.
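The comparison in the Results rests on correlating predicted scores with experimental binding data. A minimal sketch of that evaluation, using the Pearson coefficient as one possible correlation measure, follows; the affinity values and both sets of scores are invented toy numbers, and the names "classical" and "targeted" are only labels for the two hypothetical score lists.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / math.sqrt(var_x * var_y)

# Toy experimental affinities (for example on a log scale) for five complexes.
experimental = [5.0, 6.0, 7.0, 8.0, 9.0]
# Hypothetical predictions from two scoring functions.
classical_scores = [7.0, 5.0, 8.0, 6.0, 7.0]
targeted_scores = [5.5, 6.0, 7.5, 7.5, 9.0]

r_classical = pearson_r(experimental, classical_scores)
r_targeted = pearson_r(experimental, targeted_scores)
print(f"classical r={r_classical:.2f}, targeted r={r_targeted:.2f}")
```

In practice a rank-based measure such as Spearman's coefficient is often reported alongside Pearson's, since scoring functions are frequently used to rank ligands rather than to predict absolute affinities.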

