Recent Progress in Machine Learning-based Prediction of Peptide Activity for Drug Discovery

2019 ◽  
Vol 19 (1) ◽  
pp. 4-16 ◽  
Author(s):  
Qihui Wu ◽  
Hanzhong Ke ◽  
Dongli Li ◽  
Qi Wang ◽  
Jiansong Fang ◽  
...  

Over the past decades, peptides have received increasing attention as therapeutic candidates in drug discovery, especially antimicrobial peptides (AMPs), anticancer peptides (ACPs) and anti-inflammatory peptides (AIPs). Peptides are thought to be able to regulate various complex diseases that were previously intractable. In recent years, the critical problem of antimicrobial resistance has driven the pharmaceutical industry to look for new therapeutic agents. Compared with small-molecule organic drugs, peptide-based therapy exhibits high specificity and minimal toxicity. Thus, peptides are widely used in the design and discovery of new potent drugs. Currently, large-scale screening of peptide activity with traditional approaches is costly, time-consuming and labor-intensive. Hence, in silico methods, mainly machine learning approaches, have been introduced to predict peptide activity because of their accuracy and effectiveness. In this review, we document recent progress in machine learning-based prediction of peptide activity, which will be of great benefit to the discovery of potentially active AMPs, ACPs and AIPs.

2019 ◽  
Vol 73 (12) ◽  
pp. 983-989 ◽  
Author(s):  
Alberto Fabrizio ◽  
Benjamin Meyer ◽  
Raimon Fabregat ◽  
Clemence Corminboeuf

In this account, we demonstrate how statistical learning approaches can be leveraged across a range of different quantum chemical areas to transform the scaling, nature, and complexity of the problems that we are tackling. Selected examples illustrate the power brought by kernel-based approaches in the large-scale screening of homogeneous catalysis, the prediction of fundamental quantum chemical properties and the free-energy landscapes of flexible organic molecules. While certainly non-exhaustive, these examples provide an intriguing glimpse into our own research efforts.


2019 ◽  
Vol 78 (5) ◽  
pp. 617-628 ◽  
Author(s):  
Erika Van Nieuwenhove ◽  
Vasiliki Lagou ◽  
Lien Van Eyck ◽  
James Dooley ◽  
Ulrich Bodenhofer ◽  
...  

Objectives: Juvenile idiopathic arthritis (JIA) is the most common class of childhood rheumatic diseases, with distinct disease subsets that may have diverging pathophysiological origins. Both adaptive and innate immune processes have been proposed as primary drivers, which may account for the observed clinical heterogeneity, but few high-depth studies have been performed. Methods: Here we profiled the adaptive immune system of 85 patients with JIA and 43 age-matched controls with in-depth flow cytometry and machine learning approaches. Results: Immune profiling identified immunological changes in patients with JIA. This immune signature was shared across a broad spectrum of childhood inflammatory diseases. The immune signature was identified in clinically distinct subsets of JIA, but was accentuated in patients with systemic JIA and those patients with active disease. Despite the extensive overlap in the immunological spectrum exhibited by healthy children and patients with JIA, machine learning analysis of the data set proved capable of discriminating patients with JIA from healthy controls with ~90% accuracy. Conclusions: These results pave the way for large-scale longitudinal immune phenotyping studies of JIA. The ability to discriminate between patients with JIA and healthy individuals provides proof of principle for the use of machine learning to identify immune signatures that are predictive of treatment response group.


2018 ◽  
Vol 14 (4) ◽  
pp. 734-747 ◽  
Author(s):  
Constance de Saint Laurent

There has been much hype over the past few years about the recent progress of artificial intelligence (AI), especially through machine learning. If one is to believe many of the headlines that have proliferated in the media, as well as in an increasing number of scientific publications, it would seem that AI is now capable of creating and learning in ways that are starting to resemble what humans can do, and that we should therefore start to hope, or fear, that the creation of a fully cognisant machine might be something we will witness in our lifetime. However, many of these beliefs are based on deep misconceptions about what AI can do, and how. In this paper, I start with a brief introduction to the principles of AI, machine learning, and neural networks, primarily intended for psychologists and social scientists, who often have much to contribute to the debates surrounding AI but lack a clear understanding of what it can currently do and how it works. I then debunk four common myths associated with AI: 1) it can create, 2) it can learn, 3) it is neutral and objective, and 4) it can solve ethically and/or culturally sensitive problems. In a third and last section, I argue that these misconceptions represent four main dangers: 1) avoiding debate, 2) naturalising our biases, 3) deresponsibilising creators and users, and 4) missing out on some of the potential uses of machine learning. I finally conclude on the potential benefits of using machine learning in research, and thus on the need to defend machine learning without romanticising what it can actually do.


Author(s):  
Ying Qin

This study extracts the comments from a large-scale translation corpus of Chinese EFL learners to study the taxonomy of translation errors. Two unsupervised machine learning approaches are used to obtain computational evidence for the translation error taxonomy. After manual revision, ten types of English-to-Chinese (E2C) and eight types of Chinese-to-English (C2E) translation errors are finally confirmed. According to the hierarchical clustering results, there are probably three categories of top-level errors. In addition, three supervised learning methods are applied to automatically recognize the types of errors, among which the highest performance reaches F1 = 0.85 on E2C and F1 = 0.90 on C2E translation. Further comparison with intuitive or theoretical studies on translation taxonomy reveals some phenomena that accompany the improvement of Chinese learners' language skills. Analysis of translation problems based on machine learning provides objective insight into, and understanding of, the students' translations.
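
For readers unfamiliar with the F1 measure reported above, the following is a minimal sketch of how a per-class F1 score can be computed from true and predicted error types. It is not the study's method or data: the function name `f1_for_class`, the error-type labels, and the five-item sample are all hypothetical and chosen only for illustration.

```python
def f1_for_class(true_labels, predicted_labels, positive):
    """Return the F1 score for one class, treating `positive` as the target label."""
    pairs = list(zip(true_labels, predicted_labels))
    # True positives: items of the target class that were predicted as that class.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    # False positives: items of another class that were predicted as the target class.
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    # False negatives: items of the target class that were predicted as something else.
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # Precision and recall are both zero (or undefined).
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # F1 is the harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Hypothetical error-type labels, not taken from the study.
true_types = ["omission", "mistranslation", "omission", "grammar", "omission"]
predicted_types = ["omission", "omission", "omission", "grammar", "grammar"]

score = f1_for_class(true_types, predicted_types, positive="omission")
print(f"F1 for 'omission': {score:.2f}")
```

In a multi-class setting such as error-type recognition, per-class scores like this one are usually averaged into a single figure; the abstract does not state which averaging scheme was used.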


Author(s):  
Bradford William Hesse

The presence of large-scale data systems can be felt, consciously or not, in almost every facet of modern life, whether through the simple act of selecting travel options online, purchasing products from online retailers, or navigating through the streets of an unfamiliar neighborhood using global positioning system (GPS) mapping. These systems operate through the momentum of big data, a term introduced by data scientists to describe a data-rich environment enabled by a superconvergence of advanced computer-processing speeds and storage capacities; advanced connectivity between people and devices through the Internet; the ubiquity of smart, mobile devices and wireless sensors; and the creation of accelerated data flows among systems in the global economy. Some researchers have suggested that big data represents the so-called fourth paradigm in science, wherein the first paradigm was marked by the evolution of the experimental method, the second was brought about by the maturation of theory, the third was marked by an evolution of statistical methodology as enabled by computational technology, while the fourth extended the benefits of the first three, but also enabled the application of novel machine-learning approaches to an evidence stream that exists in high volume, high velocity, high variety, and differing levels of veracity. In public health and medicine, the emergence of big data capabilities has followed naturally from the expansion of data streams from genome sequencing, protein identification, environmental surveillance, and passive patient sensing. In 2001, the National Committee on Vital and Health Statistics published a road map for connecting these evidence streams to each other through a national health information infrastructure. Since then, the road map has spurred national investments in electronic health records (EHRs) and motivated the integration of public surveillance data into analytic platforms for health situational awareness. 
More recently, the boom in consumer-oriented mobile applications and wireless medical sensing devices has opened up the possibility for mining new data flows directly from altruistic patients. In the broader public communication sphere, the ability to mine the digital traces of conversation on social media presents an opportunity to apply advanced machine learning algorithms as a way of tracking the diffusion of risk communication messages. In addition to utilizing big data for improving the scientific knowledge base in risk communication, there will be a need for health communication scientists and practitioners to work as part of interdisciplinary teams to improve the interfaces to these data for professionals and the public. Too much data, presented in disorganized ways, can lead to what some have referred to as “data smog.” Much work will be needed for understanding how to turn big data into knowledge, and just as important, how to turn data-informed knowledge into action.


2020 ◽  
Vol 142 (8) ◽  
pp. 3814-3822 ◽  
Author(s):  
George S. Fanourgakis ◽  
Konstantinos Gkagkas ◽  
Emmanuel Tylianakis ◽  
George E. Froudakis

2019 ◽  
Vol 20 (3) ◽  
pp. 185-193 ◽  
Author(s):  
Natalie Stephenson ◽  
Emily Shane ◽  
Jessica Chase ◽  
Jason Rowland ◽  
David Ries ◽  
...  

Background: Drug discovery, the process of discovering new candidate medications, is very important for the pharmaceutical industry. At present, discovering new drugs is still a very expensive and time-consuming process, requiring Phase I, II and III clinical trials. Recently, machine learning techniques in Artificial Intelligence (AI), especially deep learning techniques, which allow a computational model to generate multiple layers, have been widely applied and have achieved state-of-the-art performance in different fields, such as speech recognition, image classification and bioinformatics. One very important application of these AI techniques is in the field of drug discovery. Methods: We conducted a large-scale literature search on existing scientific websites (e.g., ScienceDirect, arXiv) and startup companies to understand the current status of machine learning techniques in drug discovery. Results: Our experiments demonstrated that there are different patterns in machine learning fields and drug discovery fields. For example, keywords like prediction, brain, discovery, and treatment usually appear in drug discovery fields. Also, the total number of papers published in drug discovery fields with machine learning techniques is increasing every year. Conclusion: The main focus of this survey is to understand the current status of machine learning techniques in the drug discovery field within both academic and industrial settings, and to discuss their potential future applications. Several interesting patterns for machine learning techniques in drug discovery fields are discussed in this survey.


2017 ◽  
Vol 3 (1) ◽  
Author(s):  
Giorgos Borboudakis ◽  
Taxiarchis Stergiannakos ◽  
Maria Frysali ◽  
Emmanuel Klontzas ◽  
Ioannis Tsamardinos ◽  
...  

2021 ◽  
Author(s):  
Gareth Jones

Two of the most common forms of arterial disease are stenosis and aneurysm, estimated to affect between 1% and 20% of the population. Ruptured abdominal aortic aneurysms alone are estimated to be the cause of between 6,000 and 8,000 deaths a year within the United Kingdom. Patients with stenosis have been shown to have a mortality hazard ratio of 1.42 compared to a control population [2], and an unadjusted death rate of 3.35 per 100 person-years compared to 1.23 per 100 person-years in a control population [97]. Current methods for the detection of arterial disease are generally impractical for large-scale screening, expensive, or both. If an inexpensive method for the detection of both stenosis and aneurysm that minimises the need for invasive measurements can be created, the cost-effectiveness of large-scale screening could be improved, making both continuous monitoring and screening feasible. One such method is to use easily acquirable haemodynamic measurements at accessible peripheral locations within the circulatory system for diagnosis. Within this thesis, an initial exploratory study into the potential of using machine learning classification algorithms to detect arterial disease from such measurements is presented. It is likely that the indicative biomarkers of arterial disease held within pressure and flow-rate profiles consist of micro inter- and intra-measurement details. To facilitate a data-driven approach to the discovery of any biomarkers, a framework for the creation of virtual patients, through the employment of a mathematical model of blood flow, is presented. This framework is utilised to create a series of virtual patient databases, whose balance between simplicity and realism shifts as the thesis progresses. The most realistic of these databases is made publicly available (https://doi.org/10.5281/zenodo.4549764).
The aforementioned framework for the creation of virtual patients is a major contribution of this thesis, and can be applied to a wide range of biological systems given a mathematical description. The synthetic data sets are used to train and subsequently test a series of machine learning classifiers to predict the presence of both stenosis and aneurysm, using various combinations of pressure and flow-rate measurements. It is shown that the inclusion of a diseased vessel (either stenosis or aneurysm) produces consistent and significant biomarkers in haemodynamic profiles, irrespective of a patient's unique underlying arterial network. These biomarkers are found to be differentiable from the natural variability present across a large cohort of patients, showing that arterial disease has a clear and unique effect on pressure and flow-rate profiles. This suggests strong potential in the use of haemodynamic measurements to detect arterial disease.

