Human–Machine Scientific Discovery

2021 ◽  
pp. 297-315
Author(s):  
Alireza Tamaddoni-Nezhad ◽  
David Bohan ◽  
Ghazal Afroozi Milani ◽  
Alan Raybould ◽  
Stephen Muggleton

Humanity is facing existential, societal challenges related to food security, ecosystem conservation, antimicrobial resistance, etc., and Artificial Intelligence (AI) is already playing an important role in tackling these new challenges. Most current AI approaches are limited when it comes to ‘knowledge transfer’ with humans, i.e. it is difficult to incorporate existing human knowledge, and the output knowledge is not human-comprehensible. In this chapter we demonstrate how a combination of comprehensible machine learning, text-mining and domain knowledge could enhance human-machine collaboration for the purpose of automated scientific discovery, where humans and computers jointly develop and evaluate scientific theories. As a case study, we describe a combination of logic-based machine learning (which included human-encoded ecological background knowledge) and text-mining from scientific publications (to verify machine-learned hypotheses) for the purpose of automated discovery of ecological interaction networks (food-webs) to detect change in agricultural ecosystems using the Farm Scale Evaluations (FSEs) of genetically modified herbicide-tolerant (GMHT) crops dataset. The results included novel food-web hypotheses, some confirmed by subsequent experimental studies (e.g. DNA analysis) and published in scientific journals. These machine-learned food-webs were also used as the basis of a recent study revealing resilience of agro-ecosystems to changes in farming management using GMHT crops.

2021 ◽  
Vol 11 ◽  
Author(s):  
Zeyu Zhang ◽  
Lei Yao ◽  
Wenlong Wang ◽  
Bo Jiang ◽  
Fada Xia ◽  
...  

Introduction: Thyroid cancer (TC) is the most common neck malignancy. However, the large body of publications on TC has not been well summarized or discussed using comprehensive methods. The purpose of this bibliometric study is to summarize scientific publications during the past three decades in the field of TC using a machine learning method.

Material and Methods: Scientific publications focusing on TC from 1990 to 2020 were searched in PubMed using the MeSH term “thyroid neoplasms”. The full associated records were downloaded in PubMed format and extracted using R. Latent Dirichlet allocation (LDA) was adopted to identify the research topics from the abstract of each publication using Python.

Results: A total of 34,692 publications related to TC from the last three decades were found and included in this study, with an average of 1,119.1 publications per year. Clinical studies and experimental studies accounted for the largest share of publications, while the proportion of clinical trials remained relatively small (peaking at 5.87% in 2004). Thyroidectomy was the leading MeSH term, followed by prognosis, differential diagnosis, and fine-needle biopsy. The LDA analyses showed that the study topics fell into four clusters: treatment management, basic research, diagnosis research, and epidemiology and cancer risk. However, a relatively weak connection was shown between treatment management and basic research. The 10 most-cited publications of recent years particularly highlighted the application of active surveillance in TC.

Conclusion: Thyroidectomy, differential diagnosis, genomic analysis, and active surveillance are the topics of greatest concern in TC research. Although BRAF-targeted therapy is under development with promising results, there is still an urgent need to translate basic studies into clinical practice.
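The reported average of 1,119.1 publications per year can be checked with a short calculation. The sketch below assumes the search window of 1990 to 2020 is counted inclusively (31 calendar years); that counting convention is an assumption, as the abstract does not state it.

```python
# Figures taken from the abstract; the inclusive year count is an assumption.
first_year, last_year = 1990, 2020
total_publications = 34_692

# Count both endpoint years, giving 31 calendar years.
n_years = last_year - first_year + 1
mean_per_year = total_publications / n_years

print(n_years, round(mean_per_year, 1))  # prints: 31 1119.1
```

Counting the window as 31 years reproduces the reported figure, whereas dividing by 30 would not.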


PLoS ONE ◽  
2011 ◽  
Vol 6 (12) ◽  
pp. e29028 ◽  
Author(s):  
David A. Bohan ◽  
Geoffrey Caron-Lormier ◽  
Stephen Muggleton ◽  
Alan Raybould ◽  
Alireza Tamaddoni-Nezhad

Author(s):  
Alireza Tamaddoni-Nezhad ◽  
Ghazal Afroozi Milani ◽  
Alan Raybould ◽  
Stephen Muggleton ◽  
David A. Bohan

Author(s):  
Aleksey Klokov ◽  
Evgenii Slobodyuk ◽  
Michael Charnine

The object of this research was the corpus of text data collected together with the scientific advisor, along with the natural language processing algorithms used to analyze it. A series of hypotheses was tested against scientific publications in computer science through the simulation experiments described in this dissertation. The subject of the research is the algorithms, and their results, aimed at predicting promising topics and terms that emerge over time in the scientific community. The result of this work is a set of machine learning models, which were used in experiments to identify promising terms and semantic relationships in the text corpus. The resulting models can be used for semantic processing and analysis of other subject areas.


2021 ◽  
Vol 22 (10) ◽  
pp. 5056
Author(s):  
Tulio L. Campos ◽  
Pasi K. Korhonen ◽  
Neil D. Young

Experimental studies of Caenorhabditis elegans and Drosophila melanogaster have contributed substantially to our understanding of molecular and cellular processes in metazoans at large. Since the publication of their genomes, functional genomic investigations have identified genes that are essential or non-essential for survival in each species. Recently, a range of features linked to gene essentiality have been inferred using a machine learning (ML)-based approach, allowing essentiality predictions within a species. Nevertheless, predictions between species are still elusive. Here, we undertake a comprehensive study using ML to discover and validate features of essential genes common to both C. elegans and D. melanogaster. We demonstrate that the cross-species prediction of gene essentiality is possible using a subset of features linked to nucleotide/protein sequences, protein orthology and subcellular localisation, single-cell RNA-seq, and histone methylation markers. Complementary analyses showed that essential genes are enriched for transcription and translation functions and are preferentially located away from heterochromatin regions of C. elegans and D. melanogaster chromosomes. The present work should enable the cross-prediction of essential genes between model and non-model metazoans.


2021 ◽  
pp. 002224292199708
Author(s):  
Raji Srinivasan ◽  
Gülen Sarial-Abi

Algorithms increasingly used by brands sometimes fail to perform as expected or, even worse, cause harm, resulting in brand harm crises. Unfortunately, algorithm failures are increasing in frequency. Yet, we know little about consumers’ responses to brands following such brand harm crises. Extending developments in the theory of mind perception, we hypothesize that following a brand harm crisis caused by an algorithm error (vs. human error), consumers will respond less negatively to the brand. We further hypothesize that consumers’ lower mind perception of agency of the algorithm (vs. human) for the error, which lowers their perceptions of the algorithm’s responsibility for the harm caused by the error, will mediate this relationship. We also hypothesize four moderators of this relationship: two algorithm characteristics (anthropomorphized algorithm and machine learning algorithm) and two characteristics of the task where the algorithm is deployed (subjective vs. objective task and interactive vs. non-interactive task). We find support for the hypotheses in eight experimental studies, including two incentive-compatible studies. We examine the effects of two managerial interventions to manage the aftermath of brand harm crises caused by algorithm errors. The research’s findings advance the literature on brand harm crises, algorithm usage, and algorithmic marketing and generate managerial guidelines to address the aftermath of such brand harm crises.


2021 ◽  
pp. 000370282110345
Author(s):  
Tatu Rojalin ◽  
Dexter Antonio ◽  
Ambarish Kulkarni ◽  
Randy P. Carney

Surface-enhanced Raman scattering (SERS) is a powerful technique for sensitive label-free analysis of chemical and biological samples. While much recent work has established sophisticated automation routines using machine learning and related artificial intelligence methods, these efforts have largely focused on downstream processing (e.g., classification tasks) of previously collected data. While fully automated analysis pipelines are desirable, current progress is limited by cumbersome and manually intensive sample preparation and data collection steps. Specifically, a typical lab-scale SERS experiment requires the user to evaluate the quality and reliability of the measurement (i.e., the spectra) as the data are being collected. This need for expert user intuition is a major bottleneck that limits applicability of SERS-based diagnostics for point-of-care clinical applications, where trained spectroscopists are likely unavailable. While application-agnostic numerical approaches (e.g., signal-to-noise thresholding) are useful, there is an urgent need to develop algorithms that leverage expert user intuition and domain knowledge to simplify and accelerate data collection steps. To address this challenge, in this work, we introduce a machine learning-assisted method at the acquisition stage. We tested six common algorithms to identify the best performer in the context of spectral quality judgment. For adoption into future automation platforms, we developed an open-source Python package tailored for rapid expert user annotation to train machine learning algorithms. We expect that this new approach of using machine learning to assist in data acquisition can serve as a useful building block for point-of-care SERS diagnostic platforms.
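For orientation, the application-agnostic baseline mentioned above (signal-to-noise thresholding) might look like the following minimal sketch. It is not the authors' method or their package: the function names, the SNR definition (peak height above the mean of a signal-free region, divided by that region's standard deviation), the `min_snr` value, and the synthetic spectra are all illustrative assumptions.

```python
import statistics

def estimate_snr(spectrum, noise_region):
    """Peak height above the noise-region mean, divided by the noise-region
    standard deviation. `noise_region` is a (start, stop) index pair assumed
    to contain no Raman peaks (a hypothetical convention)."""
    start, stop = noise_region
    noise = spectrum[start:stop]
    baseline = statistics.mean(noise)
    noise_sd = statistics.stdev(noise)
    if noise_sd == 0:
        return float("inf")
    return (max(spectrum) - baseline) / noise_sd

def passes_quality_check(spectrum, noise_region, min_snr=10.0):
    """Accept a spectrum only if its estimated SNR reaches the threshold.
    The default threshold is an arbitrary illustrative value."""
    return estimate_snr(spectrum, noise_region) >= min_snr

# Synthetic intensity values for illustration only (not measured data).
good = [1.0, 1.2, 0.8, 1.1, 0.9, 1.0, 25.0, 60.0, 24.0, 1.0]
poor = [1.0, 1.2, 0.8, 1.1, 0.9, 1.0, 1.4, 1.6, 1.3, 1.0]

print(passes_quality_check(good, (0, 6)))  # strong peak: accepted
print(passes_quality_check(poor, (0, 6)))  # weak peak: rejected
```

A fixed threshold like this encodes no domain knowledge about which spectral features matter, which is the gap the abstract says expert-annotated machine learning is meant to fill.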

