Recent advances in the application of data science and machine/deep learning methods to research in chemical and biological sciences: a review

2020 ◽  
Author(s):  
Ben Geoffrey

The rise in the application of data science and machine/deep learning methods in the chemical and biological sciences must be discussed in light of the fore-running disciplines of bio/chem-informatics and computational chemistry and biology, which helped accumulate the enormous research data that has now made the successful application of data-driven approaches possible. Many of the tasks and goals of ab initio methods in computational chemistry, such as determining optimized structures and other molecular properties of atoms, molecules, and compounds, are being carried out at much lower computational cost with data-driven machine/deep learning-based predictions. One observes a similar trend in computational biology, wherein data-driven machine/deep learning methods are being proposed to predict the structure and dynamics of interactions of biological macromolecules such as proteins and DNA, in place of computationally expensive molecular dynamics-based methods. In the cheminformatics space, one sees the rise of deep neural network-based methods that have scaled traditional structure-property/structure-activity modelling to handle big data, designing new materials with desired properties and drugs with required activity through deep learning-based de novo molecular design methods. In the bioinformatics space, data-driven machine/deep learning approaches to genomic and proteomic data have led to interesting applications in fields such as precision medicine, prognosis prediction, and more. Thus, the success story of the application of data science, machine/deep learning, and artificial intelligence to the disciplines of chem/bio-informatics and computational chemistry and biology has been told in light of how these fore-running disciplines created the huge repositories of data that allow data-driven approaches to succeed.
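The surrogate-modelling idea described above — replacing an expensive ab initio calculation with a model fitted to accumulated data — can be sketched minimally as follows. Everything here (the synthetic descriptors, the linear form, the weights) is an illustrative assumption, not taken from the review:

```python
import numpy as np

# Hypothetical illustration: fit a cheap surrogate that maps molecular
# descriptors to a property that would otherwise need an ab initio run.
rng = np.random.default_rng(0)

# synthetic "descriptors" (e.g. counts, charges) and a synthetic target property
X = rng.normal(size=(200, 4))
true_w = np.array([0.5, -1.2, 0.3, 2.0])
y = X @ true_w + 0.01 * rng.normal(size=200)

# one least-squares solve stands in for many expensive calculations
w, *_ = np.linalg.lstsq(X, y, rcond=None)

def predict_property(descriptors):
    # near-instant prediction once the surrogate is fitted
    return descriptors @ w
```

In practice the descriptor set and the model class (often a deep network rather than a linear map) are where the real modelling work lies.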

2020 ◽  
Author(s):  
Saeed Nosratabadi ◽  
Amir Mosavi ◽  
Puhong Duan ◽  
Pedram Ghamisi ◽  
Filip Ferdinand ◽  
...  

Cancers ◽  
2021 ◽  
Vol 13 (11) ◽  
pp. 2764
Author(s):  
Xin Yu Liew ◽  
Nazia Hameed ◽  
Jeremie Clos

A computer-aided diagnosis (CAD) expert system is a powerful tool to efficiently assist a pathologist in achieving an early diagnosis of breast cancer. This process identifies the presence of cancer in breast tissue samples and the distinct cancer stages. In a standard CAD system, the main process involves image pre-processing, segmentation, feature extraction, feature selection, classification, and performance evaluation. In this review paper, we survey the existing state-of-the-art machine learning approaches applied at each stage, covering both conventional and deep learning methods, compare the methods, and provide technical details with their advantages and disadvantages. The aims are to investigate the impact of CAD systems using histopathology images, to examine deep learning methods that outperform conventional methods, and to provide a summary for future researchers to analyse and improve the existing techniques. Lastly, we discuss the research gaps of existing machine learning approaches for implementation and propose future direction guidelines for upcoming researchers.
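The standard CAD stages listed above (pre-processing, segmentation, feature extraction, feature selection, classification) can be chained into a minimal pipeline. Every function below is a deliberately simplified stand-in for illustration, not one of the methods surveyed:

```python
import numpy as np

def preprocess(img):
    # normalize intensities to [0, 1]
    img = img.astype(float)
    return (img - img.min()) / (img.max() - img.min() + 1e-9)

def segment(img, thresh=0.5):
    # crude global threshold as a stand-in for tissue segmentation
    return img > thresh

def extract_features(img, mask):
    # simple hand-crafted features over the segmented region
    region = img[mask]
    if region.size == 0:
        return np.zeros(3)
    return np.array([region.mean(), region.std(), mask.mean()])

def select_features(feats, keep=(0, 2)):
    # feature selection: keep an illustrative subset
    return feats[list(keep)]

def classify(feats, w, b):
    # linear classifier stand-in for the final decision stage
    return int(feats @ w + b > 0)

def cad_pipeline(img, w, b):
    p = preprocess(img)
    m = segment(p)
    f = select_features(extract_features(p, m))
    return classify(f, w, b)
```

A deep learning CAD system typically fuses the middle stages (feature extraction, selection, classification) into one trained network, which is the comparison the review examines.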


Entropy ◽  
2021 ◽  
Vol 23 (6) ◽  
pp. 667
Author(s):  
Wei Chen ◽  
Qiang Sun ◽  
Xiaomin Chen ◽  
Gangcai Xie ◽  
Huiqun Wu ◽  
...  

The automated classification of heart sounds plays a significant role in the diagnosis of cardiovascular diseases (CVDs). With the recent introduction of medical big data and artificial intelligence technology, there has been an increased focus on the development of deep learning approaches for heart sound classification. However, despite significant achievements in this field, there are still limitations due to insufficient data, inefficient training, and the unavailability of effective models. With the aim of improving the accuracy of heart sound classification, an in-depth systematic review and an analysis of existing deep learning methods were performed in the present study, with an emphasis on the convolutional neural network (CNN) and recurrent neural network (RNN) methods developed over the last five years. This paper also discusses the challenges and expected future trends in the application of deep learning to heart sound classification, with the objective of providing an essential reference for further study.
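As a rough illustration of the CNN family the review covers, a one-layer 1-D convolutional classifier over a sound signal might look like the following sketch. The kernels, pooling, and logistic head are generic assumptions, not a model from the surveyed literature:

```python
import numpy as np

def conv1d(x, k):
    # valid 1-D convolution (cross-correlation) over a mono signal
    n = len(x) - len(k) + 1
    return np.array([x[i:i + len(k)] @ k for i in range(n)])

def relu(x):
    return np.maximum(x, 0)

def global_max_pool(x):
    # collapse the time axis so variable-length recordings map to one feature
    return x.max()

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def tiny_cnn(signal, kernels, w, b):
    # one conv layer with several kernels, global max pooling, logistic head
    feats = np.array([global_max_pool(relu(conv1d(signal, k))) for k in kernels])
    return sigmoid(feats @ w + b)
```

Real heart-sound CNNs usually operate on spectrogram or MFCC representations with many stacked layers; the sketch only shows the structural idea.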


2018 ◽  
Vol 1 (1) ◽  
pp. 181-205 ◽  
Author(s):  
Pierre Baldi

Since the 1980s, deep learning and biomedical data have been coevolving and feeding each other. The breadth, complexity, and rapidly expanding size of biomedical data have stimulated the development of novel deep learning methods, and application of these methods to biomedical data have led to scientific discoveries and practical solutions. This overview provides technical and historical pointers to the field, and surveys current applications of deep learning to biomedical data organized around five subareas, roughly of increasing spatial scale: chemoinformatics, proteomics, genomics and transcriptomics, biomedical imaging, and health care. The black box problem of deep learning methods is also briefly discussed.


Energies ◽  
2019 ◽  
Vol 12 (14) ◽  
pp. 2692 ◽  
Author(s):  
Juncheng Zhu ◽  
Zhile Yang ◽  
Monjur Mourshed ◽  
Yuanjun Guo ◽  
Yimin Zhou ◽  
...  

Load forecasting is one of the major challenges of power system operation and is crucial to effective scheduling for economic dispatch at multiple time scales. Numerous load forecasting methods have been proposed for household and commercial demand, as well as for loads at various nodes in a power grid. However, compared with conventional loads, the uncoordinated charging of a large penetration of plug-in electric vehicles is different in terms of periodicity and fluctuation, which renders current load forecasting techniques ineffective. Deep learning methods, empowered by unprecedented learning ability from extensive data, provide novel approaches for solving challenging forecasting tasks. This research presents a comparative study of deep learning approaches to forecast the super-short-term stochastic charging load of plug-in electric vehicles. Several popular and novel deep-learning-based methods have been utilized to establish forecasting models using minute-level real-world data from a plug-in electric vehicle charging station and to compare forecasting performance. Numerical results of twelve cases on various time steps show that deep learning methods obtain high accuracy in super-short-term plug-in electric load forecasting. Among the various deep learning approaches, the long short-term memory (LSTM) method performs the best, reducing forecasting error by over 30% compared with the conventional artificial neural network model.
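Before any of the compared deep models can be trained, the minute-level charging series must be framed as supervised input/target windows. A minimal sketch of that framing follows; the window length and horizon are illustrative choices, not the paper's settings:

```python
import numpy as np

def make_windows(series, lookback, horizon=1):
    # frame a univariate load series as supervised (X, y) pairs:
    # X holds `lookback` past readings, y the value `horizon` steps ahead
    X, y = [], []
    for i in range(len(series) - lookback - horizon + 1):
        X.append(series[i:i + lookback])
        y.append(series[i + lookback + horizon - 1])
    return np.array(X), np.array(y)

def mape(y_true, y_pred):
    # a common accuracy metric for load-forecasting comparisons
    return float(np.mean(np.abs((y_true - y_pred) / y_true)) * 100)
```

The resulting (X, y) pairs feed equally well into an LSTM, a feed-forward network, or a conventional baseline, which is what makes a like-for-like comparison possible.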


2019 ◽  
Vol 21 (5) ◽  
pp. 1609-1627 ◽  
Author(s):  
Tianlin Zhang ◽  
Jiaxu Leng ◽  
Ying Liu

Drug–drug interactions (DDIs) are crucial for drug research and pharmacovigilance. These interactions may cause adverse drug effects that threaten public health and patient safety. Therefore, the extraction of DDIs from biomedical literature has been widely studied and emphasized in modern biomedical research. Previous rule-based and machine learning approaches rely on tedious feature engineering, which is laborious, time-consuming and unsatisfactory. With the development of deep learning technologies, this problem is alleviated by learning feature representations automatically. Here, we review the recent deep learning methods that have been applied to the extraction of DDIs from biomedical literature. We describe each method briefly and systematically compare their performance on the DDI corpus. Next, we summarize the advantages and disadvantages of these deep learning models for this task. Furthermore, we discuss some challenges and future perspectives of DDI extraction via deep learning methods. This review aims to serve as a useful guide for interested researchers to further advance bioinformatics algorithms for DDI extraction from the literature.
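A common preprocessing step for DDI extraction from text — generating entity-blinded candidate drug pairs that a deep model then classifies — can be sketched as follows. The blinding tokens shown are a generic convention, not those of any specific surveyed system:

```python
from itertools import combinations

def ddi_candidates(tokens, drug_idx):
    # for each unordered pair of drug mentions, emit one candidate instance
    # with the pair marked DRUG1/DRUG2 and all other drugs blinded, so the
    # classifier sees the pair's context rather than the drug names
    out = []
    for i, j in combinations(drug_idx, 2):
        seq = list(tokens)
        for k in drug_idx:
            seq[k] = "DRUG_OTHER"
        seq[i], seq[j] = "DRUG1", "DRUG2"
        out.append(seq)
    return out
```

Each blinded sequence then becomes one classification example (interaction type or no interaction), which is the formulation the DDI corpus evaluations use.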


Geophysics ◽  
2020 ◽  
pp. 1-61
Author(s):  
Janaki Vamaraju ◽  
Jeremy Vila ◽  
Mauricio Araya-Polo ◽  
Debanjan Datta ◽  
Mohamed Sidahmed ◽  
...  

Migration techniques are an integral part of seismic imaging workflows. Least-squares reverse time migration (LSRTM) overcomes some of the shortcomings of conventional migration algorithms by compensating for illumination and removing sampling artifacts to increase spatial resolution. However, the computational cost associated with iterative LSRTM is high, and convergence can be slow in complex media. We implement pre-stack LSRTM in a deep learning framework and adopt strategies from the data science domain to accelerate convergence. The proposed hybrid framework leverages existing physics-based models and machine learning optimizers to achieve better and cheaper solutions. Using a time-domain formulation, we show that mini-batch gradients can reduce the computation cost by using a subset of the total shots for each iteration. The mini-batch approach not only reduces source cross-talk but is also less memory intensive. Combining mini-batch gradients with deep learning optimizers and loss functions can improve the efficiency of LSRTM. Deep learning optimizers such as adaptive moment estimation are generally well suited for noisy and sparse data. We compare different optimizers and demonstrate their efficacy in mitigating migration artifacts. To further accelerate the inversion, we adopt the regularised Huber loss function in conjunction with these optimizers. We apply these techniques to the 2D Marmousi and 3D SEG/EAGE salt models and show improvements over conventional LSRTM baselines. The proposed approach achieves higher spatial resolution in less computation time, as measured by various qualitative and quantitative evaluation metrics.
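The two ingredients combined above — robust Huber-loss gradients and the adaptive moment estimation (Adam) optimizer applied to noisy mini-batch gradients — can be sketched on a toy inversion problem. The residual model below is purely illustrative and vastly simpler than LSRTM:

```python
import numpy as np

def huber_grad(r, delta=1.0):
    # gradient of the Huber loss w.r.t. the residual r: quadratic near zero,
    # constant-magnitude (robust) for large residuals
    return np.clip(r, -delta, delta)

def adam_step(g, m, v, t, lr=0.1, b1=0.9, b2=0.999, eps=1e-8):
    # one adaptive-moment-estimation update; returns the parameter increment
    # plus the updated first- and second-moment estimates
    m = b1 * m + (1 - b1) * g
    v = b2 * v + (1 - b2) * g * g
    m_hat = m / (1 - b1 ** t)
    v_hat = v / (1 - b2 ** t)
    return -lr * m_hat / (np.sqrt(v_hat) + eps), m, v

# toy "inversion": recover a model vector from noisy mini-batch residuals,
# where the added noise stands in for shot-subsampling cross-talk
rng = np.random.default_rng(1)
target = np.array([3.0, -2.0, 1.0])
x = np.zeros(3)
m, v = np.zeros(3), np.zeros(3)
for t in range(1, 401):
    batch_noise = 0.1 * rng.normal(size=3)
    g = huber_grad(x - target + batch_noise)
    step, m, v = adam_step(g, m, v, t)
    x = x + step
```

The point of the pairing is visible even in the toy: Adam's moment averaging smooths the mini-batch noise, while the clipped Huber gradient keeps occasional large residuals from dominating an update.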


2018 ◽  
Author(s):  
Bhavna J. Antony ◽  
Stefan Maetschke ◽  
Rahil Garnavi

Spectral-domain optical coherence tomography (SDOCT) is a non-invasive imaging modality that generates high-resolution volumetric images. This modality finds widespread usage in ophthalmology for the diagnosis and management of various ocular conditions. The volumes generated can contain 200 or more B-scans. Manual inspection of such a large quantity of scans is time-consuming and error-prone in most clinical settings. Here, we present a method for the generation of visual summaries of SDOCT volumes, wherein a small set of B-scans that highlight the most clinically relevant features in a volume is extracted. The method was trained and evaluated on data acquired from age-related macular degeneration patients, and “relevance” was defined as the presence of visibly discernible structural abnormalities. The summarisation system consists of a detection module, where relevant B-scans are extracted from the volume, and a set of rules that determines which B-scans are included in the visual summary. Two deep learning approaches are presented and compared for the classification of B-scans: transfer learning and de novo learning. Both approaches performed comparably, with AUCs of 0.97 and 0.96, respectively, obtained on an independent test set. The de novo network, however, was 98% smaller than the transfer learning approach and had a significantly shorter run-time.
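The contrast drawn above between transfer learning and de novo learning can be illustrated with a minimal sketch of the transfer-learning side: a frozen feature extractor (here just a random projection standing in for pretrained layers) with only a small logistic head trained. All shapes and data below are synthetic assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# a frozen random projection stands in for a pretrained feature extractor;
# in real transfer learning these weights come from a network trained on
# another task and are left unchanged here
W_frozen = rng.normal(size=(64, 32)) / 8.0

def features(x):
    return np.tanh(x @ W_frozen)  # frozen: never updated during training

def train_head(X, y, epochs=500, lr=1.0):
    # train only the small logistic-regression head on top of frozen features
    F = features(X)
    w, b = np.zeros(F.shape[1]), 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(F @ w + b)))
        g = p - y                       # gradient of binary cross-entropy
        w -= lr * F.T @ g / len(y)
        b -= lr * g.mean()
    return w, b

# synthetic stand-ins for B-scan feature vectors, with a simple separable label
X = rng.normal(size=(100, 64))
y = (X[:, 0] > 0).astype(float)
w, b = train_head(X, y)
pred = (features(X) @ w + b > 0).astype(float)
acc = (pred == y).mean()  # training-set accuracy of the head alone
```

De novo learning, by contrast, trains all weights from scratch, which is how the paper's de novo network could end up 98% smaller than the transferred one.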


AI Magazine ◽  
2022 ◽  
Vol 42 (3) ◽  
pp. 7-18
Author(s):  
Harald Steck ◽  
Linas Baltrunas ◽  
Ehtsham Elahi ◽  
Dawen Liang ◽  
Yves Raimond ◽  
...  

Deep learning has profoundly impacted many areas of machine learning. However, it took a while for its impact to be felt in the field of recommender systems. In this article, we outline some of the challenges encountered and lessons learned in using deep learning for recommender systems at Netflix. We first provide an overview of the various recommendation tasks on the Netflix service. We found that different model architectures excel at different tasks. Even though many deep-learning models can be understood as extensions of existing (simple) recommendation algorithms, we initially did not observe significant improvements in performance over well-tuned non-deep-learning approaches. Only when we added numerous features of heterogeneous types to the input data did deep-learning models start to shine in our setting. We also observed that deep-learning methods can exacerbate the problem of offline–online metric (mis-)alignment. After addressing these challenges, deep learning has ultimately resulted in large improvements to our recommendations as measured by both offline and online metrics. On the practical side, integrating deep-learning toolboxes in our system has made it faster and easier to implement and experiment with both deep-learning and non-deep-learning approaches for various recommendation tasks. We conclude this article by summarizing our take-aways that may generalize to other applications beyond Netflix.


Forecasting ◽  
2021 ◽  
Vol 4 (1) ◽  
pp. 1-25
Author(s):  
Thabang Mathonsi ◽  
Terence L. van Zyl

Hybrid methods have been shown to outperform pure statistical and pure deep learning methods at forecasting tasks and at quantifying the uncertainty associated with those forecasts (prediction intervals). One example is the Exponential Smoothing Recurrent Neural Network (ES-RNN), a hybrid between a statistical forecasting model and a recurrent neural network variant. ES-RNN achieves a 9.4% improvement in absolute error in the Makridakis-4 Forecasting Competition. This improvement, and similar outperformance from other hybrid models, has primarily been demonstrated only on univariate datasets. Difficulties with applying hybrid forecast methods to multivariate data include (i) the high computational cost involved in hyperparameter tuning for models that are not parsimonious, (ii) challenges associated with auto-correlation inherent in the data, as well as (iii) complex dependency (cross-correlation) between the covariates that may be hard to capture. This paper presents Multivariate Exponential Smoothing Long Short-Term Memory (MES-LSTM), a generalized multivariate extension to ES-RNN that overcomes these challenges. MES-LSTM utilizes a vectorized implementation. We test MES-LSTM on several aggregated coronavirus disease 2019 (COVID-19) morbidity datasets and find that our hybrid approach shows consistent, significant improvement over pure statistical and deep learning methods in forecast accuracy and prediction interval construction.
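The hybrid decomposition underlying ES-RNN-style models — a statistical exponential-smoothing component whose residuals are handed to a neural learner — can be sketched as follows. The smoothing constant and the single-level formulation are illustrative simplifications of what MES-LSTM actually does:

```python
import numpy as np

def exp_smooth(y, alpha=0.3):
    # statistical component: simple exponential smoothing level
    level = np.empty(len(y), dtype=float)
    level[0] = y[0]
    for t in range(1, len(y)):
        level[t] = alpha * y[t] + (1 - alpha) * level[t - 1]
    return level

def hybrid_features(y, alpha=0.3):
    # the neural component (an LSTM in ES-RNN/MES-LSTM) would be trained on
    # these residuals, then its predictions recombined with the level
    level = exp_smooth(y, alpha)
    return y - level, level
```

The division of labour is the design point: the smoothing component absorbs level (and, in full models, seasonality), leaving the network a smaller, more stationary signal to learn.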

