CSAO-1. Interrogative Biology: Unraveling insights into causal disease drivers by use of a dynamic systems biology and Bayesian AI to identify the intersect of disease and healthy signatures

2021 ◽  
Vol 3 (Supplement_2) ◽  
pp. ii1-ii1
Author(s):  
Niven Narain ◽  
Michael Kiebish ◽  
Vivek Vishnudas ◽  
Vladimir Tolstikov ◽  
Gregory Miller ◽  
...  

Abstract The past decade has witnessed an explosive proliferation of data analytics modalities, all seeking to extract insight from large-scale data sets. Machine learning and AI methodologies now occupy a central role in analyses of data sets that range from genomic and other “omics” data to clinical, real-world evidence, and demographic data. Despite advances in data analytics and machine learning, and access to complex population-level clinical and related data sets, translating information into actionable guidance in human health and disease remains a challenge. Interrogative Biology, a systems biology/AI platform, generates an unbiased, data-informed network for identifying targets (disease drivers) and biomarkers for disease interception at the point of transition to dysregulation, preceding clinical phenotype. The data topology is enabled by systematic acquisition and interrogation of longitudinal bio-samples of clinically annotated human matrices (e.g. blood, urine, saliva, tissues) subjected to comprehensive multi-omic (genomics, proteomics, lipidomics, and metabolomics) profiling over time. The molecular profiles are integrated with clinical health information using Bayesian artificial intelligence analytics, bAIcis, to generate causal network maps of overall health. Differentials between “health” and “disease” network maps identify drivers (targets and biomarkers) of disease, which are rapidly validated in orthogonal wet-lab, disease-specific perturbed model systems. Target information imputed into the bAIcis framework can define therapeutic strategies, including identification of existing drugs and bio-actives for corrective response. Combining clinic-based sampling with dried blood spot analysis for longitudinal dynamic monitoring of markers of health-disease status provides an opportunity for proactive clinical management and intervention for corrective response in advance of major deterioration of health status.
Taken together, the approach herein allows for health surveillance based on in-depth biological profiling of alterations in the patient narrative to guide treatment modalities and strategies in a longitudinal and dynamic manner to identify, track, intercept, and arrest human disease.

2021 ◽  
Vol 16 ◽  
Author(s):  
Yuqing Qian ◽  
Hao Meng ◽  
Weizhong Lu ◽  
Zhijun Liao ◽  
Yijie Ding ◽  
...  

Background: The identification of DNA binding proteins (DBP) is an important research field. Experiment-based methods for detecting DBP are time-consuming and labor-intensive. Objective: To solve the problem of large-scale DBP identification, some machine learning methods have been proposed. However, these methods have insufficient predictive accuracy. Our aim is to develop a sequence-based machine learning model to predict DBP. Methods: In our study, we extract six types of features (NMBAC, GE, MCD, PSSM-AB, PSSM-DWT, and PsePSSM) from protein sequences. We use Multiple Kernel Learning based on the Hilbert-Schmidt Independence Criterion (MKL-HSIC) to estimate the optimal kernel. Then, we construct a hypergraph model to describe the relationship between labeled and unlabeled samples. Finally, a Laplacian Support Vector Machine (LapSVM) is employed to train the predictive model. Our method is tested on the PDB186, PDB1075, PDB2272 and PDB14189 data sets. Result: Compared with other methods, our model achieves the best results on the benchmark data sets. Conclusion: Accuracies of 87.1% and 74.2% are achieved on PDB186 (independent test of PDB1075) and PDB2272 (independent test of PDB14189), respectively.
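As an illustration of the criterion named in the Methods, the following is a minimal sketch of the standard empirical Hilbert-Schmidt Independence Criterion between a feature kernel and a label kernel. It is not the authors' MKL-HSIC optimization; the RBF kernel, the one-dimensional toy features, and all function names are assumptions made for this example only.

```python
import math

def rbf_kernel(xs, gamma=1.0):
    """RBF kernel matrix for 1-D toy features (hypothetical feature choice)."""
    return [[math.exp(-gamma * (a - b) ** 2) for b in xs] for a in xs]

def label_kernel(ys):
    """Ideal kernel y * y^T built from +1/-1 class labels."""
    return [[a * b for b in ys] for a in ys]

def center(K):
    """Double-center a square kernel matrix, i.e. compute H K H with H = I - (1/n) 11^T."""
    n = len(K)
    row = [sum(r) / n for r in K]
    col = [sum(K[i][j] for i in range(n)) / n for j in range(n)]
    total = sum(row) / n
    return [[K[i][j] - row[i] - col[j] + total for j in range(n)] for i in range(n)]

def hsic(K, L):
    """Empirical HSIC: tr(K H L H) / (n - 1)^2, computed from the centered matrices."""
    n = len(K)
    Kc, Lc = center(K), center(L)
    trace = sum(Kc[i][j] * Lc[j][i] for i in range(n) for j in range(n))
    return trace / (n - 1) ** 2

# Toy data (made up): one feature that separates the classes, one that does not.
labels = [1, 1, -1, -1]
informative = [0.0, 0.1, 1.0, 1.1]
uninformative = [0.0, 1.0, 0.1, 1.1]
L = label_kernel(labels)
score_good = hsic(rbf_kernel(informative), L)
score_bad = hsic(rbf_kernel(uninformative), L)
```

A kernel whose similarity structure agrees with the labels receives the higher score, which is the intuition behind using HSIC to weight candidate kernels in a multiple kernel learning setting.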


2017 ◽  
pp. 83-99
Author(s):  
Sivamathi Chokkalingam ◽  
Vijayarani S.

The term Big Data refers to large-scale information management and analysis technologies that exceed the capability of traditional data processing technologies. Big Data is differentiated from traditional technologies in three ways: volume, velocity and variety of data. Big Data analytics is the process of analyzing large data sets that contain a variety of data types to uncover hidden patterns, unknown correlations, market trends, customer preferences and other useful business information. Since Big Data is a newly emerging field, there is a need to develop new technologies and algorithms for handling it. The main objective of this paper is to provide knowledge about various research challenges of Big Data analytics. A brief overview of various types of Big Data analytics is given. For each type of analytics, the paper describes the process steps and tools, and gives a banking application. Some research challenges of Big Data analytics and possible solutions to them are also discussed.


2019 ◽  
Vol 31 (2) ◽  
pp. 329-338 ◽  
Author(s):  
Jian Hu ◽  
Haiwan Zhu ◽  
Yimin Mao ◽  
Canlong Zhang ◽  
Tian Liang ◽  
...  

Landslide hazard prediction is a difficult, time-consuming process when traditional methods are used. This paper presents a method that uses machine learning to predict landslide hazard levels automatically. Due to difficulties in obtaining and effectively processing rainfall in landslide hazard prediction, and to the existing limitation in dealing with large-scale data sets in the M-chameleon algorithm, a new method based on an uncertain DM-chameleon algorithm (developed M-chameleon) is proposed to assess the landslide susceptibility model. First, this method designs a new two-phase clustering algorithm based on M-chameleon, which effectively processes large-scale data sets. Second, the new E-H distance formula is designed by combining the Euclidean and Hausdorff distances, and this enables the new method to manage uncertain data effectively. The uncertain data model is presented at the same time to effectively quantify triggering factors. Finally, the model for predicting landslide hazards is constructed and verified using the data from the Baota district of the city of Yan’an, China. The experimental results show that the uncertain DM-chameleon algorithm of machine learning can effectively improve the accuracy of landslide prediction and has high feasibility. Furthermore, the relationships between hazard factors and landslide hazard levels can be extracted based on clustering results.
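The abstract does not state the E-H distance formula, so the following is only a hedged sketch of one way Euclidean and Hausdorff distances could be combined for uncertain data. It assumes each uncertain object is represented by a small set of sample points, and it assumes a convex combination with weight `alpha`; both the representation and the weighting are hypothetical and are not taken from the paper.

```python
import math

def hausdorff(a, b):
    """Symmetric Hausdorff distance between two non-empty point sets."""
    def directed(src, dst):
        # Largest distance from a point in src to its nearest neighbor in dst.
        return max(min(math.dist(p, q) for q in dst) for p in src)
    return max(directed(a, b), directed(b, a))

def centroid(points):
    """Coordinate-wise mean of a non-empty point set."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))

def eh_distance(a, b, alpha=0.5):
    """ASSUMED combination: alpha * Euclidean(centroids) + (1 - alpha) * Hausdorff(sets)."""
    euclidean = math.dist(centroid(a), centroid(b))
    return alpha * euclidean + (1 - alpha) * hausdorff(a, b)

# Two uncertain objects, each given as sample points (made-up values).
obj_a = [(0.0, 0.0), (1.0, 0.0)]
obj_b = [(0.0, 3.0), (1.0, 3.0)]
d = eh_distance(obj_a, obj_b)
```

In this sketch the Euclidean term compares the typical locations of the two objects, while the Hausdorff term penalizes differences in their spread, which is one plausible reason to combine the two for uncertain inputs.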


2017 ◽  
Author(s):  
Christoph Sommer ◽  
Rudolf Hoefler ◽  
Matthias Samwer ◽  
Daniel W. Gerlich

Abstract Supervised machine learning is a powerful and widely used method to analyze high-content screening data. Despite its accuracy, efficiency, and versatility, supervised machine learning has drawbacks, most notably its dependence on a priori knowledge of expected phenotypes and time-consuming classifier training. We provide a solution to these limitations with CellCognition Explorer, a generic novelty detection and deep learning framework. Application to several large-scale screening data sets on nuclear and mitotic cell morphologies demonstrates that CellCognition Explorer enables discovery of rare phenotypes without user training, which has broad implications for improved assay development in high-content screening.


Author(s):  
Balasree K ◽  
Dharmarajan K

With the rapid development of Big Data technology in recent years, this paper discusses the role that Machine Learning (ML) methods and algorithms play in Big Data processing and Big Data analytics. The two fields are developing together and complement each other. Because such data sets are very high in velocity and variety, solutions must be studied and provided to handle their rapid growth, gain knowledge from them, and extract value. Big Data analytics involves appropriate data storage and a computational framework enhanced by scalable Machine Learning algorithms, and it reveals massive amounts of hidden data and hidden correlations. This kind of analytic information helps organizations and companies gain deeper knowledge, develop, and obtain an advantage over the competition. Using these analytics, accurate predictions can be made from the data. This paper presents a detailed review of state-of-the-art developments and an overview of the advantages and challenges of Machine Learning algorithms in Big Data analytics.


2020 ◽  
Vol 10 (4) ◽  
pp. 180
Author(s):  
Gizem Damla Yalcin ◽  
Nurseda Danisik ◽  
Rana Can Baygin ◽  
Ahmet Acar

Over the past decade, we have witnessed an increasing number of large-scale studies that have provided multi-omics data by high-throughput sequencing approaches. This has particularly helped with identifying key (epi)genetic alterations in cancers. Importantly, aberrations that lead to the activation of signaling networks through the disruption of normal cellular homeostasis are seen both in cancer cells and in the neighboring tumor microenvironment. Cancer systems biology approaches have enabled the efficient integration of experimental data with computational algorithms and the implementation of actionable targeted therapies, as the exceptions, for the treatment of cancer. Comprehensive multi-omics data obtained through the sequencing of tumor samples and experimental model systems will be important in implementing novel cancer systems biology approaches and increasing their efficacy for tailoring novel personalized treatment modalities in cancer. In this review, we discuss emerging cancer systems biology approaches based on multi-omics data derived from bulk and single-cell genomics studies in addition to existing experimental model systems that play a critical role in understanding (epi)genetic heterogeneity and therapy resistance in cancer.


2022 ◽  
pp. 59-79
Author(s):  
Dragorad A. Milovanovic ◽  
Vladan Pantovic

Multimedia-related things are a new class of connected objects that can be searched, discovered, and composited on the internet of media things (IoMT). A huge number of data sets come from audio-visual sources or have a multimedia nature. However, multimedia data is currently not incorporated in big data (BD) frameworks. The research projects, standardization initiatives, and industrial activities for integration are outlined in this chapter. The MPEG IoMT interoperability and network-based media processing (NBMP) framework is explored as an instance of the big media (BM) reference model. A conceptual model of IoT and big data integration for analytics is proposed. Big data analytics is rapidly evolving both in terms of functionality and the underlying model. The authors point out that IoMT analytics is closely related to big data analytics, which facilitates the integration of multimedia objects in big media applications in large-scale systems. These two technologies are mutually dependent and should be researched and developed jointly.


2019 ◽  
Vol 32 (1) ◽  
pp. 45-55 ◽  
Author(s):  
Bharat Mishra ◽  
Nilesh Kumar ◽  
M. Shahid Mukhtar

Systems biology is an inclusive approach to study the static and dynamic emergent properties on a global scale by integrating multiomics datasets to establish qualitative and quantitative associations among multiple biological components. With an abundance of improved high-throughput -omics datasets, network-based analyses and machine learning technologies are playing a pivotal role in the comprehensive understanding of biological systems. Network topological features reveal the most important nodes within a network and prioritize significant molecular components for diverse biological networks, including coexpression, protein–protein interaction, and gene regulatory networks. Machine learning techniques provide enormous predictive power through specific feature extraction from biological data. Deep learning, a subtype of machine learning, has plausible future applications because it does not need a domain expert for feature extraction. Inspired by diverse domains of biology, we here review classic systems biology techniques applied in plant immunity thus far. We also discuss additional advanced approaches in both graph theory and machine learning, which may provide new insights for understanding plant–microbe interactions. Finally, we propose a hybrid approach in plant immune systems that harnesses the power of both network biology and machine learning, with the potential to be applicable to both model systems and agronomically important crop plants.


2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Ali Lashkaripour ◽  
Christopher Rodriguez ◽  
Noushin Mehdipour ◽  
Rizki Mardian ◽  
David McIntyre ◽  
...  

Abstract Droplet-based microfluidic devices hold immense potential to become inexpensive alternatives to existing screening platforms across life science applications, such as enzyme discovery and early cancer detection. However, the lack of a predictive understanding of droplet generation makes engineering a droplet-based platform an iterative and resource-intensive process. We present a web-based tool, DAFD, that predicts the performance and enables design automation of flow-focusing droplet generators. We capitalize on machine learning algorithms to predict the droplet diameter and rate with a mean absolute error of less than 10 μm and 20 Hz, respectively. This tool delivers a user-specified performance within 4.2% and 11.5% of the desired diameter and rate, respectively. We demonstrate that DAFD can be extended by the community to support additional fluid combinations, without requiring extensive machine learning knowledge or large-scale data sets. This tool will reduce the need for microfluidic expertise and design iterations and facilitate adoption of microfluidics in life sciences.
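For readers unfamiliar with the two error measures quoted above, the following minimal sketch shows how a mean absolute error and a percent deviation from a desired value are typically computed. The function names and the numbers are hypothetical illustrations; they are not DAFD's API and not data from the paper.

```python
def mean_absolute_error(predicted, observed):
    """Average of |predicted - observed| over paired measurements."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(predicted)

def percent_deviation(achieved, desired):
    """Absolute deviation of the achieved value from the desired one, in percent."""
    return abs(achieved - desired) / abs(desired) * 100.0

# Made-up droplet diameters in micrometers, for illustration only.
predicted_um = [50.0, 80.0, 120.0]
observed_um = [55.0, 78.0, 111.0]
mae_um = mean_absolute_error(predicted_um, observed_um)

# A design that asked for 100 um and produced 96 um.
deviation_pct = percent_deviation(96.0, 100.0)
```

The first measure summarizes prediction quality in the units of the quantity itself, while the second expresses how far a delivered design falls from the user's target.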

