Mining Early Life Risk and Resiliency Factors and Their Influences in Human Populations from PubMed: A Machine Learning Approach to Discover DOHaD Evidence

The Developmental Origins of Health and Disease (DOHaD) framework aims to understand how early life exposures shape lifecycle health. To date, no comprehensive list of these exposures and their interactions has been developed, which limits our ability to predict trajectories of risk and resiliency in humans. To address this gap, we developed a model that uses text-mining, machine learning, and natural language processing approaches to automate search, data extraction, and content analysis from DOHaD-related research articles available in PubMed. Our first model captured 2469 articles, which were subsequently categorised into topics based on word frequencies within the titles and abstracts. A manual screening validated 848 of these as relevant, which were used to develop a revised model that finally captured 2098 articles that largely fell under the most prominently researched domains related to our specific DOHaD focus. The articles were clustered according to latent topic extraction, and 23 experts in the field independently labelled the perceived topics. Consensus analysis on this labelling yielded mostly fair to substantial agreement, which demonstrates that automated models can be developed to successfully retrieve and classify research literature, as a first step to gather evidence related to DOHaD risk and resilience factors that influence later life human health.
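
The word-frequency step mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the stopword list, the `term_frequencies` helper, and the two toy records are all hypothetical, and a real system would work on records retrieved from PubMed.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list (an assumption of this sketch).
STOPWORDS = {"the", "of", "and", "in", "a", "to", "on", "with"}


def term_frequencies(text):
    """Count lowercase alphabetic tokens in text, ignoring stopwords."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(t for t in tokens if t not in STOPWORDS)


# Toy records standing in for article titles and abstracts (made up for illustration).
records = [
    {"title": "Maternal nutrition and child growth",
     "abstract": "Maternal diet in pregnancy shapes child growth."},
    {"title": "Air pollution exposure in early life",
     "abstract": "Early life exposure to air pollution and later asthma."},
]

# One frequency table per article, built from title plus abstract.
freqs = [term_frequencies(r["title"] + " " + r["abstract"]) for r in records]

for record, table in zip(records, freqs):
    print(record["title"], "->", table.most_common(3))
```

Frequency tables of this kind could then feed a topic-assignment or clustering step; which method the authors used for that is not stated beyond "latent topic extraction".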


Early life risk and resiliency factors and their influences on developmental outcomes and disease pathways: a rapid evidence review of systematic reviews and meta-analyses

Journal of Developmental Origins of Health and Disease ◽

10.1017/s2040174420000689 ◽

2020 ◽

pp. 1-16 ◽

Cited By ~ 1

Author(s):

Ayah Abdul-Hussein ◽

Ayesha Kareem ◽

Shrankhala Tewari ◽

Julie Bergeron ◽

Laurent Briollais ◽

...

Keyword(s):

Health Outcomes ◽

Systematic Reviews ◽

Social Determinants ◽

Early Life ◽

Later Life ◽

Risk And Resilience ◽

Screening Process ◽

Evidence Review ◽

Resiliency Factors ◽

Meta Analyses

The Developmental Origins of Health and Disease (DOHaD) framework aims to understand how environmental exposures in early life shape lifecycle health. Our understanding and the ability to prevent poor health outcomes and enrich for resiliency remain limited, in part, because exposure–outcome relationships are complex and poorly defined. We, therefore, aimed to determine the major DOHaD risk and resilience factors. A systematic approach with a 3-level screening process was used to conduct our Rapid Evidence Review following the established guidelines. Scientific databases using DOHaD-related keywords were searched to capture articles between January 1, 2009 and April 19, 2019. A final total of 56 systematic reviews/meta-analyses were obtained. Studies were categorized into domains based on primary exposures and outcomes investigated. Primary summary statistics and extracted data from the studies are presented in Graphical Overview for Evidence Reviews diagrams. There was substantial heterogeneity within and between studies. While global trends showed an increase in DOHaD publications over the last decade, the majority of data reported were from high-income countries. Articles were categorized under six exposure domains: Early Life Nutrition, Maternal/Paternal Health, Maternal/Paternal Psychological Exposure, Toxicants/Environment, Social Determinants, and Others. Studies examining social determinants of health and paternal influences were underrepresented. Only 23% of the articles explored resiliency factors. We synthesized major evidence on relationships between early life exposures and developmental and health outcomes, identifying risk and resiliency factors that influence later life health. Our findings provide insight into important trends and gaps in knowledge within many exposures and outcome domains.


A Systematic Review of Deep Learning Approaches to Educational Data Mining

Complexity ◽

10.1155/2019/1306039 ◽

2019 ◽

Vol 2019 ◽

pp. 1-22 ◽

Cited By ~ 15

Author(s):

Antonio Hernández-Blanco ◽

Boris Herrera-Flores ◽

David Tomás ◽

Borja Navarro-Colorado

Keyword(s):

Machine Learning ◽

Data Mining ◽

Deep Learning ◽

Language Processing ◽

Educational Data Mining ◽

Research Field ◽

Machine Learning Techniques ◽

Mining Machine ◽

Learning Approaches ◽

Learning Techniques

Educational Data Mining (EDM) is a research field that focuses on the application of data mining, machine learning, and statistical methods to detect patterns in large collections of educational data. Different machine learning techniques have been applied in this field over the years, but only recently has Deep Learning gained increasing attention in the educational domain. Deep Learning is a machine learning method based on neural network architectures with multiple layers of processing units, which has been successfully applied to a broad set of problems in the areas of image recognition and natural language processing. This paper surveys the research carried out in Deep Learning techniques applied to EDM, from its origins to the present day. The main goals of this study are to identify the EDM tasks that have benefited from Deep Learning and those that remain to be explored, to describe the main datasets used, to provide an overview of the key concepts, main architectures, and configurations of Deep Learning and its applications to EDM, and to discuss the current state of the art and future directions in this area of research.


Framework for Infectious Disease Analysis: A comprehensive and integrative multi-modeling approach to disease prediction and management

Health Informatics Journal ◽

10.1177/1460458217747112 ◽

2017 ◽

Vol 25 (4) ◽

pp. 1170-1187 ◽

Cited By ~ 11

Author(s):

Madhav Erraguntla ◽

Josef Zapletal ◽

Mark Lawley

Keyword(s):

Infectious Disease ◽

Machine Learning ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Support Vector ◽

Human Populations ◽

Illustrative Case ◽

The Impact ◽

Disease Analysis

The impact of infectious disease on human populations is a function of many factors including environmental conditions, vector dynamics, transmission mechanics, social and cultural behaviors, and public policy. A comprehensive framework for disease management must fully connect the complete disease lifecycle, including emergence from reservoir populations, zoonotic vector transmission, and impact on human societies. The Framework for Infectious Disease Analysis is a software environment and conceptual architecture for data integration, situational awareness, visualization, prediction, and intervention assessment. Framework for Infectious Disease Analysis automatically collects biosurveillance data using natural language processing, integrates structured and unstructured data from multiple sources, applies advanced machine learning, and uses multi-modeling for analyzing disease dynamics and testing interventions in complex, heterogeneous populations. In the illustrative case studies, natural language processing from social media, news feeds, and websites was used for information extraction, biosurveillance, and situation awareness. Classification machine learning algorithms (support vector machines, random forests, and boosting) were used for disease predictions.


Dopamine Genotype Interacts with Inter-Individual Licking Received on Later-Life Licking Provisioning in Female Rat Offspring

10.1101/2019.12.29.890467 ◽

2019 ◽

Author(s):

Samantha C. Lauby ◽

David G. Ashbrook ◽

Hannan R. Malik ◽

Diptendu Chatterjee ◽

Pauline Pan ◽

...

Keyword(s):

Genetic Variation ◽

Maternal Care ◽

Early Life ◽

Later Life ◽

Human Populations ◽

Female Rat ◽

Systematic Analysis ◽

Dopaminergic Activity ◽

Rat Offspring ◽

The Relationship

In most mammals, mothers exhibit natural variations in care that propagate between generations of female offspring. However, there is limited information on genetic variation that influences this propagation. We assessed early-life maternal care received by individual female rat offspring in relation to genetic polymorphisms linked to dopaminergic activity, maternal care provisioning, and dopaminergic activity in the maternal brain. We also conducted a systematic analysis of other genetic variants potentially related to maternal behavior in our Long-Evans rat population. We found that dopamine receptor 2 (rs107017253) variation interacted with the relationship between early-life maternal care received and dopamine levels in the nucleus accumbens which, in turn, were associated with later-life maternal care provisioning. We also discovered and validated new variants that were predicted by our systematic analysis. Our findings suggest that genetic variation influences the relationship between maternal care received and maternal care provisioning, similar to findings in human populations.


Data Extraction and Sentimental Analysis from “Twitter” using Web Scrapping.

International Journal of Engineering and Advanced Technology - Regular Issue ◽

10.35940/ijeat.a2226.109119 ◽

2019 ◽

Vol 9 (1) ◽

pp. 6451-6455

Keyword(s):

Machine Learning ◽

Natural Language Processing ◽

Data Collection ◽

Language Processing ◽

Presidential Elections ◽

Data Extraction ◽

Textual Data ◽

Spell Check ◽

By Products

In this paper, we attempt a sentiment analysis of the 2016 US presidential elections. Sentiment analysis requires data to be extracted from websites or other sources where people present their opinions, views, and complaints about the subjects that need to be analyzed. Furthermore, it is necessary to ensure that the sample size of the data is large enough to get conclusive results. It is also necessary to ensure that the data is cleaned before it is used to make predictions. Cleaning is done using common techniques like tokenization, spell check, etc. Sentiment analysis is one of the by-products of Natural Language Processing. This paper includes data collection as well as classification of textual data based on machine learning.


Deep Learning Based High-Resolution Remote Sensing Image classification

International Journal of Advanced Research in Computer Science and Software Engineering ◽

10.23956/ijarcsse.v7i10.384 ◽

2017 ◽

Vol 7 (10) ◽

pp. 22

Author(s):

Sumit Kaur

Keyword(s):

Machine Learning ◽

Remote Sensing ◽

Deep Learning ◽

Image Classification ◽

Language Processing ◽

Object Perception ◽

Remote Sensing Image ◽

Research Area ◽

Remote Sensing Image Classification ◽

Unsupervised Algorithms

Deep learning is an emerging research area in the field of machine learning and pattern recognition, presented with the goal of drawing machine learning nearer to one of its unique objectives, artificial intelligence. It tries to mimic the human brain, which is capable of processing and learning from complex input data and solving different kinds of complicated tasks well. Deep learning (DL) is based on a set of supervised and unsupervised algorithms that attempt to model higher-level abstractions in data and learn hierarchical representations for classification. In recent years, it has attracted much attention due to its state-of-the-art performance in diverse areas like object perception, speech recognition, computer vision, collaborative filtering and natural language processing. This paper presents a survey of different deep learning techniques for remote sensing image classification.


Mol2vec: Unsupervised Machine Learning Approach with Chemical Intuition

10.26434/chemrxiv.5513581.v1 ◽

2017 ◽

Author(s):

Sabrina Jaeger ◽

Simone Fulle ◽

Samo Turk

Keyword(s):

Machine Learning ◽

Language Processing ◽

Supervised Machine Learning ◽

Learning Approach ◽

Learning Approaches ◽

Unsupervised Machine Learning ◽

Feature Representations ◽

Machine Learning Approach ◽

The Individual ◽

Vector Representations

Inspired by natural language processing techniques, we here introduce Mol2vec, which is an unsupervised machine learning approach to learn vector representations of molecular substructures. Similarly to the Word2vec models, where vectors of closely related words are in close proximity in the vector space, Mol2vec learns vector representations of molecular substructures that are pointing in similar directions for chemically related substructures. Compounds can finally be encoded as vectors by summing up vectors of the individual substructures and, for instance, fed into supervised machine learning approaches to predict compound properties. The underlying substructure vector embeddings are obtained by training an unsupervised machine learning approach on a so-called corpus of compounds that consists of all available chemical matter. The resulting Mol2vec model is pre-trained once, yields dense vector representations and overcomes drawbacks of common compound feature representations such as sparseness and bit collisions. The prediction capabilities are demonstrated on several compound property and bioactivity data sets and compared with results obtained for Morgan fingerprints as reference compound representation. Mol2vec can be easily combined with ProtVec, which employs the same Word2vec concept on protein sequences, resulting in a proteochemometric approach that is alignment independent and can thus also be easily used for proteins with low sequence similarities.
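
The summation step described in this abstract can be sketched in a few lines. This is an illustration only, not the Mol2vec library's API: the `encode_compound` helper, the substructure identifiers, and the three-dimensional toy embeddings are hypothetical, and silently skipping substructures without an embedding is an assumption of the sketch.

```python
from typing import Dict, List, Sequence


def encode_compound(substructures: Sequence[str],
                    embeddings: Dict[str, List[float]],
                    dim: int) -> List[float]:
    """Encode a compound as the element-wise sum of its substructure vectors."""
    total = [0.0] * dim
    for sub in substructures:
        vec = embeddings.get(sub)
        if vec is None:
            # Assumption of this sketch: substructures with no embedding are skipped.
            continue
        for i, value in enumerate(vec):
            total[i] += value
    return total


# Toy embeddings with made-up identifiers and values; real embeddings would
# come from a model pre-trained on a corpus of compounds.
toy_embeddings = {
    "sub_a": [1.0, 0.0, 2.0],
    "sub_b": [0.5, 1.5, -1.0],
}

# A compound is treated as a sequence of substructure identifiers,
# and a repeated substructure contributes once per occurrence.
compound_vector = encode_compound(["sub_a", "sub_b", "sub_a"], toy_embeddings, dim=3)
print(compound_vector)
```

The resulting fixed-length vector is the kind of input that could be passed to a supervised model for property prediction, as the abstract describes.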


Proceedings of the ACL Workshop on Feature Engineering for Machine Learning in Natural Language Processing - FeatureEng '05

10.3115/1610230 ◽

2005 ◽

Keyword(s):

Machine Learning ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Feature Engineering


A Machine Learning Application for Raising WASH Awareness in the Times of COVID-19 Pandemic (Preprint)

10.2196/preprints.25320 ◽

2020 ◽

Cited By ~ 1

Author(s):

Rohan Pandey ◽

Vaibhav Gautam ◽

Ridam Pal ◽

Harsh Bandhey ◽

Lovedeep Singh Dhingra ◽

...

Keyword(s):

Machine Learning ◽

Natural Language Processing ◽

Natural Language ◽

Machine Translation ◽

Language Processing ◽

User Feedback ◽

Who Guidelines ◽

The Times ◽

The Right ◽

Local Languages

BACKGROUND The COVID-19 pandemic has uncovered the potential of digital misinformation in shaping the health of nations. The deluge of unverified information that spreads faster than the epidemic itself is an unprecedented phenomenon that has put millions of lives in danger. Mitigating this ‘Infodemic’ requires strong health messaging systems that are engaging, vernacular, scalable, effective and continuously learn the new patterns of misinformation. OBJECTIVE We created WashKaro, a multi-pronged intervention for mitigating misinformation through conversational AI, machine translation and natural language processing. WashKaro provides the right information matched against WHO guidelines through AI, and delivers it in the right format in local languages. METHODS We theorize (i) an NLP based AI engine that could continuously incorporate user feedback to improve relevance of information, (ii) bite sized audio in the local language to improve penetrance in a country with skewed gender literacy ratios, and (iii) conversational but interactive AI engagement with users towards an increased health awareness in the community. RESULTS A total of 5026 people downloaded the app during the study window, of whom 1545 were active users. Our study shows that 3.4 times more females than males engaged with the app in Hindi, the relevance of AI-filtered news content doubled within 45 days of continuous machine learning, and the prudence of the integrated AI chatbot “Satya” increased, thus proving the usefulness of an mHealth platform to mitigate health misinformation. CONCLUSIONS We conclude that a multi-pronged machine learning application delivering vernacular bite-sized audios and conversational AI is an effective approach to mitigate health misinformation. CLINICALTRIAL Not Applicable


Race and Gender

The Oxford Handbook of Ethics of AI ◽

10.1093/oxfordhb/9780190067397.013.16 ◽

2020 ◽

pp. 251-269 ◽

Cited By ~ 2

Author(s):

Timnit Gebru

Keyword(s):

Machine Learning ◽

Language Processing ◽

The United States ◽

Error Rates ◽

Political Factors ◽

Recidivism Rates ◽

Race And Gender ◽

Decision Tools ◽

And Gender ◽

Technical Solutions

This chapter discusses the role of race and gender in artificial intelligence (AI). The rapid permeation of AI into society has not been accompanied by a thorough investigation of the sociopolitical issues that cause certain groups of people to be harmed rather than advantaged by it. For instance, recent studies have shown that commercial automated facial analysis systems have much higher error rates for dark-skinned women, while having minimal errors on light-skinned men. Moreover, a 2016 ProPublica investigation uncovered that machine learning–based tools that assess crime recidivism rates in the United States are biased against African Americans. Other studies show that natural language–processing tools trained on news articles exhibit societal biases. While many technical solutions have been proposed to alleviate bias in machine learning systems, a holistic and multifaceted approach must be taken. This includes standardization bodies determining what types of systems can be used in which scenarios, making sure that automated decision tools are created by people from diverse backgrounds, and understanding the historical and political factors that disadvantage certain groups who are subjected to these tools.
