Mapping biology from mouse to man using transfer learning

Author(s):  
Patrick S. Stumpf ◽  
Doris Du ◽  
Haruka Imanishi ◽  
Yuya Kunisaki ◽  
Yuichiro Semba ◽  
...  

Biomedical research often involves conducting experiments on model organisms in the anticipation that the biology learnt from these experiments will transfer to the human. Yet, it is commonly the case that biology does not transfer effectively, often for unknown reasons. Despite its importance to translational research, this transfer process is not currently rigorously quantified. Here, we show that transfer learning – the branch of machine learning that concerns passing information from one domain to another – can be used to efficiently map biology from mouse to man, using the bone marrow (BM) as a representative example of a complex tissue. We first trained an artificial neural network (ANN) to accurately recognize different cell types in mouse BM using data obtained from single-cell RNA-sequencing (scRNA-Seq) experiments. We found that this ANN, trained exclusively on mouse data, was able to identify individual human cells obtained from comparable scRNA-Seq experiments of human BM with 83% overall accuracy. However, while some human cell types were easily identified, others were not, indicating important differences in biology. To obtain a more accurate map of the human BM, we then retrained the mouse ANN using scRNA-Seq data from a limited sample of human BM cells. Typically, fewer than 10 human cells of a given type were needed to accurately learn its representation in the updated model. In some cases, human cell identities could be inferred directly from the mouse ANN without retraining, via a process of biologically-guided zero-shot learning. These results show how machine learning can be used to reconstruct complex biology from limited data and have broad implications for biomedical research.

2020 ◽  
Vol 3 (1) ◽  
Author(s):  
Patrick S. Stumpf ◽  
Xin Du ◽  
Haruka Imanishi ◽  
Yuya Kunisaki ◽  
Yuichiro Semba ◽  
...  

Abstract: Biomedical research often involves conducting experiments on model organisms in the anticipation that the biology learnt will transfer to humans. Previous comparative studies of mouse and human tissues were limited by the use of bulk-cell material. Here we show that transfer learning, the branch of machine learning that concerns passing information from one domain to another, can be used to efficiently map bone marrow biology between species, using data obtained from single-cell RNA sequencing. We first trained a multiclass logistic regression model to recognize different cell types in mouse bone marrow, achieving equivalent performance to more complex artificial neural networks. Furthermore, it was able to identify individual human bone marrow cells with 83% overall accuracy. However, some human cell types were not easily identified, indicating important differences in biology. When re-training the mouse classifier using human data, fewer than 10 human cells of a given type were needed to accurately learn its representation. In some cases, human cell identities could be inferred directly from the mouse classifier via zero-shot learning. These results show how simple machine learning models can be used to reconstruct complex biology from limited data, with broad implications for biomedical research.
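
The contrast in the abstract above, between a high overall accuracy and cell types that are poorly identified, can be illustrated with a small sketch. The labels are invented toy values, not data from the study, and the function names `overall_accuracy` and `per_class_recall` are hypothetical. The sketch only shows how a single overall figure can hide a weak class.

```python
from collections import Counter

def overall_accuracy(true_labels, predicted_labels):
    """Fraction of cells whose predicted type matches the true type."""
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)

def per_class_recall(true_labels, predicted_labels):
    """For each true cell type, the fraction of its cells predicted correctly."""
    totals = Counter(true_labels)
    hits = Counter(t for t, p in zip(true_labels, predicted_labels) if t == p)
    return {label: hits[label] / totals[label] for label in totals}

# Toy labels (assumed for illustration only, not taken from the paper).
true_labels = ["B cell"] * 8 + ["monocyte"] * 8 + ["erythroblast"] * 4
predicted_labels = (
    ["B cell"] * 8
    + ["monocyte"] * 8
    + ["erythroblast"] * 1 + ["monocyte"] * 3  # most erythroblasts are missed
)

accuracy = overall_accuracy(true_labels, predicted_labels)
recall = per_class_recall(true_labels, predicted_labels)
```

In this toy case the overall accuracy is high while one class is recognized only a quarter of the time, which is why per-type figures are worth reporting alongside the overall one.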


2004 ◽  
Vol 6 (14) ◽  
pp. 1-14 ◽  
Author(s):  
Anne Corbett ◽  
Rachel Exley ◽  
Sandrine Bourdoulous ◽  
Christoph M. Tang

Neisseria meningitidis is the leading cause of bacterial meningitis, a potentially fatal condition that particularly affects children. Multiple steps are involved during the pathogenesis of infection, including the colonisation of healthy individuals and invasion of the bacterium into the cerebrospinal fluid. The bacterium is capable of adhering to, and entering into, a range of human cell types, which facilitates its ability to cause disease. This article summarises the molecular basis of host–pathogen interactions at the cellular level during meningococcal carriage and disease.


1974 ◽  
Vol 144 (1) ◽  
pp. 161-164 ◽  
Author(s):  
Alec Jeffreys ◽  
Ian Craig

The proteins synthesized in the mitochondria of mouse and human cells grown in tissue culture were examined by electrophoresis in polyacrylamide gels. The proteins were labelled by incubating the cells in the presence of [35S]methionine and an inhibitor of cytoplasmic protein synthesis (emetine or cycloheximide). A detailed comparison between the labelled products of mouse and human mitochondrial protein synthesis was made possible by developing radioautograms after exposure to slab-electrophoresis gels. Patterns obtained for different cell types of the same species were extremely similar, whereas reproducible differences were observed on comparison of the profiles obtained for mouse and human cells. Four human–mouse somatic cell hybrids were examined, and in each one only components corresponding to mouse mitochondrially synthesized proteins were detected.


2021 ◽  
Author(s):  
Thomas Beder ◽  
Olufemi Aromolaran ◽  
Juergen Doenitz ◽  
Sofia Tapanelli ◽  
Eunice Oluwatobiloba Adedeji ◽  
...  

Identifying essential genes on a genome scale is resource intensive and has been performed for only a few eukaryotes. For less studied organisms essentiality might be predicted by gene homology. However, this approach cannot be applied to non-conserved genes. Additionally, divergent essentiality information is obtained from studying single cells or whole, multi-cellular organisms, and particularly when derived from human cell line screens and human population studies. We employed machine learning across six model eukaryotes and 60,381 genes, using 41,635 features derived from sequence, gene functions and network topology. Within a leave-one-organism-out cross-validation, the classifiers showed a high generalizability with an average accuracy close to 80% in the left-out species. As a case study, we applied the method to Tribolium castaneum and validated predictions experimentally yielding similar performance. Finally, using the classifier based on the studied model organisms enabled linking the essentiality information of human cell line screens and population studies.


2020 ◽  
Author(s):  
Yan Gao ◽  
Yan Cui

Abstract: As artificial intelligence (AI) is increasingly applied to biomedical research and clinical decisions, developing unbiased AI models that work equally well for all racial and ethnic groups is of crucial importance to health disparity prevention and reduction. However, the biomedical data inequality between different racial and ethnic groups is set to generate new health care disparities through data-driven, algorithm-based biomedical research and clinical decisions. Using an extensive set of machine learning experiments on cancer omics data, we found that current prevalent schemes of multiethnic machine learning are prone to generating significant model performance disparities between racial groups. We showed that these performance disparities are caused by data inequality and data distribution discrepancies between racial groups. We also found that transfer learning can improve machine learning model performance for data-disadvantaged racial groups, and thus provides a novel approach to reduce health care disparities arising from data inequality among racial groups.


2021 ◽  
Vol 22 (2) ◽  
pp. 840
Author(s):  
Monika Richter-Laskowska ◽  
Paulina Trybek ◽  
Piotr Bednarczyk ◽  
Agata Wawrzkiewicz-Jałowiecka

(1) Background: In this work, we focus on the activity of large-conductance voltage- and Ca²⁺-activated potassium channels (BK) from the inner mitochondrial membrane (mitoBK). The characteristic electrophysiological features of the mitoBK channels are relatively high single-channel conductance (ca. 300 pS) and types of activating and deactivating stimuli. Nevertheless, depending on the isoform composition of mitoBK channels in a given membrane patch and the type of auxiliary regulatory subunits (which can be co-assembled with the mitoBK channel protein), the characteristics of the conformational dynamics of the channel protein can be altered. Consequently, the individual features of experimental series describing single-channel activity obtained by the patch-clamp method can also vary. (2) Methods: Artificial intelligence approaches (deep learning) were used to classify the patch-clamp outputs of mitoBK activity from different cell types. (3) Results: Application of the K-nearest neighbors algorithm (KNN) and the autoencoder neural network enabled classification of the electrophysiological signals with very good accuracy, which indicates that the conformational dynamics of the analyzed mitoBK channels differ significantly between cell types. (4) Conclusion: We demonstrated the utility of machine-learning methodology in research on ion channel gating, even in cases when the behavior of very similar microbiosystems is analyzed. A short excerpt from the patch-clamp recording can serve as a “fingerprint” used to recognize the mitoBK gating dynamics in patches of membrane from different cell types.


2022 ◽  
Vol 23 (2) ◽  
pp. 855
Author(s):  
Dinko Mitrečić ◽  
Valentina Hribljan ◽  
Denis Jagečić ◽  
Jasmina Isaković ◽  
Federica Lamberto ◽  
...  

From the first success in the cultivation of cells in vitro, it became clear that developing cell- and/or tissue-specific cultures would open a myriad of new opportunities for medical research. Expertise in various in vitro models has been developing over decades, so nowadays we benefit from highly specific in vitro systems imitating every organ of the human body. Moreover, obtaining a sufficient number of standardized cells allows for a cell transplantation approach with the goal of improving the regeneration of injured or disease-affected tissue. However, different cell types bring different needs and place various types of hurdles on the path of regenerative neurology and regenerative cardiology. In this review, written by European experts gathered in the COST European Action dedicated to neurology and cardiology, Bioneca, we present the experience acquired by working on two rather different organs: the brain and the heart. Given that diseases of these two organs, mostly ischemic in nature (stroke and heart infarction), impose by far the largest burden on medical systems across Europe, it is not surprising that in vitro models of nervous and heart muscle tissue have been the focus of biomedical research in recent decades. In this review we describe and discuss hurdles that still impair further progress of regenerative neurology and cardiology, and we identify those that are common to both fields and some that are field-specific. With the goal of elucidating strategies that might be shared between regenerative neurology and cardiology, we discuss methodological solutions that can help each of the fields to accelerate its development.


2021 ◽  
Vol 12 ◽  
Author(s):  
Amira Elbakry ◽  
Markus Löbrich

Homologous recombination (HR) is an essential pathway for DNA double-strand break (DSB) repair, which can proceed through various subpathways that have distinct elements and genetic outcomes. In this mini-review, we highlight the main features known about HR subpathways operating at DSBs in human cells and the factors regulating subpathway choice. We examine new developments that provide alternative models of subpathway usage in different cell types, revise the nature of HR intermediates involved, and reassess the frequency of repair outcomes. We discuss the impact of expanding our understanding of HR subpathways and how it can be clinically exploited.


2021 ◽  
Vol 3 (4) ◽  
Author(s):  
Thomas Beder ◽  
Olufemi Aromolaran ◽  
Jürgen Dönitz ◽  
Sofia Tapanelli ◽  
Eunice O Adedeji ◽  
...  

Abstract Identifying essential genes on a genome scale is resource intensive and has been performed for only a few eukaryotes. For less studied organisms essentiality might be predicted by gene homology. However, this approach cannot be applied to non-conserved genes. Additionally, divergent essentiality information is obtained from studying single cells or whole, multi-cellular organisms, and particularly when derived from human cell line screens and human population studies. We employed machine learning across six model eukaryotes and 60 381 genes, using 41 635 features derived from the sequence, gene function information and network topology. Within a leave-one-organism-out cross-validation, the classifiers showed high generalizability with an average accuracy close to 80% in the left-out species. As a case study, we applied the method to Tribolium castaneum and Bombyx mori and validated predictions experimentally yielding similar performances. Finally, using the classifier based on the studied model organisms enabled linking the essentiality information of human cell line screens and population studies.
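
The leave-one-organism-out cross-validation named in the abstract above can be sketched as a grouped split: each organism in turn is held out for testing while the classifier is trained on the remaining organisms. This is a minimal standard-library illustration, not the authors' pipeline. The function name `leave_one_group_out` and the organism labels are hypothetical toy values.

```python
def leave_one_group_out(groups):
    """Yield (held_out, train_idx, test_idx) once per distinct group label."""
    for held_out in sorted(set(groups)):
        # Train on every sample that does not belong to the held-out group.
        train_idx = [i for i, g in enumerate(groups) if g != held_out]
        # Test only on samples from the held-out group.
        test_idx = [i for i, g in enumerate(groups) if g == held_out]
        yield held_out, train_idx, test_idx

# Hypothetical organism label for each gene (toy values, not from the study).
organisms = ["yeast", "yeast", "fly", "worm", "fly", "mouse"]

splits = list(leave_one_group_out(organisms))
```

In practice a classifier would be fitted on the `train_idx` rows and scored on the `test_idx` rows for each split, and the per-organism scores would then be averaged to estimate accuracy on an unseen species.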


2020 ◽  
Author(s):  
Gabriel Wardi ◽  
Morgan Carlile ◽  
Andre Holder ◽  
Supreeth Shashikumar ◽  
Stephen R Hayden ◽  
...  

Abstract. Objective: Machine-learning (ML) algorithms allow for improved prediction of sepsis syndromes in the ED using data from electronic medical records. Transfer learning, a new subfield of ML, allows for generalizability of an algorithm across clinical sites. We aimed to validate the Artificial Intelligence Sepsis Expert (AISE) for the prediction of delayed septic shock in a cohort of patients treated in the ED and demonstrate the feasibility of transfer learning to improve external validity at a second site. Methods: Observational cohort study utilizing data from over 180,000 patients from two academic medical centers between 2014 and 2019 using multiple definitions of sepsis. The AISE algorithm was trained using 40 input variables at the development site to predict delayed septic shock (occurring greater than 4 hours after ED triage) at varying prediction windows. We then validated the AISE algorithm at a second site using transfer learning to demonstrate generalizability of the algorithm. Results: We identified 9354 patients with severe sepsis, of which 723 developed septic shock at least 4 hours after triage. The AISE algorithm demonstrated excellent area under the receiver operating characteristic curve (>0.8) at 8 and 12 hours for the prediction of delayed septic shock. Transfer learning significantly improved the test characteristics of the AISE algorithm and yielded comparable performance at the validation site. Conclusions: The AISE algorithm accurately predicted the development of delayed septic shock. The use of transfer learning allowed for significantly improved external validity and generalizability at a second site. Future prospective studies are indicated to evaluate the clinical utility of this model.

