Multiomics Data Collection, Visualization, and Utilization for Guiding Metabolic Engineering

Author(s):  
Somtirtha Roy ◽  
Tijana Radivojevic ◽  
Mark Forrer ◽  
Jose Manuel Marti ◽  
Vamshi Jonnalagadda ◽  
...  

Biology has changed radically in the past two decades, growing from a purely descriptive science into also a design science. The availability of tools that enable the precise modification of cells, as well as the ability to collect large amounts of multimodal data, open the possibility of sophisticated bioengineering to produce fuels, specialty and commodity chemicals, materials, and other renewable bioproducts. However, despite new tools and exponentially increasing data volumes, synthetic biology cannot yet fulfill its true potential due to our inability to predict the behavior of biological systems. Here, we showcase a set of computational tools that, combined, provide the ability to store, visualize, and leverage multiomics data to predict the outcome of bioengineering efforts. We show how to upload, visualize, and output multiomics data, as well as strain information, into online repositories for several isoprenol-producing strain designs. We then use these data to train machine learning algorithms that recommend new strain designs that are correctly predicted to improve isoprenol production by 23%. This demonstration is done by using synthetic data, as provided by a novel library, that can produce credible multiomics data for testing algorithms and computational tools. In short, this paper provides a step-by-step tutorial to leverage these computational tools to improve production in bioengineered strains.

2020 ◽  
Author(s):  
Somtirtha Roy ◽  
Tijana Radivojevic ◽  
Mark Forrer ◽  
Jose Manuel Marti ◽  
Vamshi Jonnalagadda ◽  
...  

Abstract: Biology has changed radically in the past two decades, growing from a purely descriptive science into also a design science. The availability of tools that enable the precise modification of cells, as well as the ability to collect large amounts of multimodal data, open the possibility of sophisticated bioengineering to produce fuels, specialty and commodity chemicals, materials, and other renewable bioproducts. However, despite new tools and exponentially increasing data volumes, synthetic biology cannot yet fulfill its true potential due to our inability to predict the behavior of biological systems. Here, we present a set of tools that, combined, provide the ability to store, visualize and leverage these data to predict the outcome of bioengineering efforts.


Risks ◽  
2020 ◽  
Vol 9 (1) ◽  
pp. 4 ◽  
Author(s):  
Christopher Blier-Wong ◽  
Hélène Cossette ◽  
Luc Lamontagne ◽  
Etienne Marceau

In the past 25 years, computer scientists and statisticians developed machine learning algorithms capable of modeling highly nonlinear transformations and interactions of input features. While actuaries use GLMs frequently in practice, only in the past few years have they begun studying these newer algorithms to tackle insurance-related tasks. In this work, we aim to review the applications of machine learning to the actuarial science field and present the current state of the art in ratemaking and reserving. We first give an overview of neural networks, then briefly outline applications of machine learning algorithms in actuarial science tasks. Finally, we summarize the future trends of machine learning for the insurance industry.


2008 ◽  
Vol 47 (01) ◽  
pp. 70-75 ◽  
Author(s):  
V. Jakkula ◽  
D. J. Cook

Summary Objectives: To many people, home is a sanctuary. With the maturing of smart home technologies, many people with cognitive and physical disabilities can lead independent lives in their own homes for extended periods of time. In this paper, we investigate the design of machine learning algorithms that support this goal. We hypothesize that machine learning algorithms can be designed to automatically learn models of resident behavior in a smart home, and that the results can be used to perform automated health monitoring and to detect anomalies. Methods: Specifically, our algorithms draw upon the temporal nature of sensor data collected in a smart home to build a model of expected activities and to detect unexpected, and possibly health-critical, events in the home. Results: We validate our algorithms using synthetic data and real activity data collected from volunteers in an automated smart environment. Conclusions: The results from our experiments support our hypothesis that a model can be learned from observed smart home data and used to report anomalies, as they occur, in a smart home.


Author(s):  
Andreas Tsamados ◽  
Nikita Aggarwal ◽  
Josh Cowls ◽  
Jessica Morley ◽  
Huw Roberts ◽  
...  

Abstract: Research on the ethics of algorithms has grown substantially over the past decade. Alongside the exponential development and application of machine learning algorithms, new ethical problems and solutions relating to their ubiquitous use in society have been proposed. This article builds on a review of the ethics of algorithms published in 2016 (Mittelstadt et al. Big Data Soc 3(2), 2016). The goals are to contribute to the debate on the identification and analysis of the ethical implications of algorithms, to provide an updated analysis of epistemic and normative concerns, and to offer actionable guidance for the governance of the design, development and deployment of algorithms.


2019 ◽  
Vol 44 (3) ◽  
pp. 348-361 ◽  
Author(s):  
Jiangang Hao ◽  
Tin Kam Ho

Machine learning is a popular topic in data analysis and modeling. Many different machine learning algorithms have been developed and implemented in a variety of programming languages over the past 20 years. In this article, we first provide an overview of machine learning and clarify its difference from statistical inference. Then, we review Scikit-learn, a machine learning package in the Python programming language that is widely used in data science. The Scikit-learn package includes implementations of a comprehensive list of machine learning methods under unified data and modeling procedure conventions, making it a convenient toolkit for educational and behavior statisticians.


2021 ◽  
Vol 2021 ◽  
pp. 1-10
Author(s):  
Qiang Zhao

Archeological sites are a heritage that we have inherited from our ancestors. These sites are crucial for understanding the past and the way of life of people during those times. The monuments and immovable relics of ancient times are a gateway to the past. Over the years, however, critical cultural relics have borne the brunt of nature. Environmental conditions have degraded many important immovable relics, since these cannot simply be moved elsewhere. People moving around ancient cultural relics may also damage them. In this study, machine learning algorithms were used to identify the location of the relics: data from satellite images were processed with machine learning algorithms to maintain and monitor the relics. This research study delves into the importance of the area from a research point of view and utilizes machine learning techniques, namely CaffeNet and a deep convolutional neural network. The results showed 96% accuracy in predicting the image, which can be used for tracking human activity and thereby protecting heritage sites in a unique way.


BMJ Open ◽  
2021 ◽  
Vol 11 (12) ◽  
pp. e053603
Author(s):  
Lotus McDougal ◽  
Nabamallika Dehingia ◽  
Nandita Bhan ◽  
Abhishek Singh ◽  
Julian McAuley ◽  
...  

Objectives: Sexual violence against women is pervasive in India. Most of this violence is experienced in the context of marriage, and rates of marital sexual violence (MSV) have been relatively stagnant over the past decade. This paper uses machine learning algorithms paired with qualitative thematic analysis to identify new and potentially modifiable factors influencing MSV in India. Design, setting and participants: This cross-sectional analysis of secondary data used data from in-person interviews with ever-married women aged 15–49 who responded to gender-based violence questions in the nationally representative 2015–2016 National Family Health Survey (N=66 013), collected between 20 January 2015 and 4 December 2016. Analyses included iterative thematic analysis (L-1 regularised regression followed by iterative qualitative thematic coding of L-2 regularised regression results) and neural network modelling. Outcome measure: Participants reported their experiences of sexual violence perpetrated by their current (or most recent) husband in the previous 12 months. These responses were aggregated into any vs no recent MSV. Results: Nearly 7% of women experienced MSV in the past 12 months. Major themes associated with MSV through iterative thematic analysis included experiences of/exposure to violence, sexual behaviour, decision making and freedom of movement, sociodemographics, access to media, health knowledge, health system interaction, partner control, economic agency, reproductive and maternal history, and health status. A neural network model identified variables that largely corresponded to these themes. Conclusions: This analysis identified several themes that may be promising avenues to identify and support women experiencing MSV, and to mitigate these traumatic experiences. In particular, amplifying screening activities at health encounters, especially among women who appear to have compromised health or restricted agency, may enable a greater number of women access to essential physical and emotional support services, and merits further consideration.


2021 ◽  
Author(s):  
Omar Alfarisi ◽  
Zeyar Aung ◽  
Mohamed Sassi

Choosing the optimal machine learning algorithm is not an easy decision. To help future researchers, we identify in this paper the optimal one among the best algorithms. We built a synthetic data set and performed supervised machine learning runs for five different algorithms. For heterogeneity, we identified Random Forest, among others, to be the best algorithm.


Author(s):  
Norah AL-Harbi ◽  
Amirrudin Bin Kamsin

Terrorist groups in the Arab world have been using social networking sites like Twitter and Facebook to rapidly spread terror for the past few years. Detection and suspension of such accounts is a way to control the menace to some extent. This research is aimed at building an effective text classifier, using machine learning to identify the polarity of the tweets automatically. Five classifiers were chosen: AdB_SAMME, AdB_SAMME.R, Linear SVM, NB, and LR. These classifiers were applied to three feature sets, namely S1 (one word, unigram), S2 (word pair, bigram), and S3 (word triplet, trigram). All five classifiers evaluated samples S1, S2, and S3 in 346 preprocessed tweets. The feature extraction process utilized one of the most widely applied weighting schemes, tf-idf (term frequency-inverse document frequency). The results were validated by four experts in the Arabic language (three teachers and an educational supervisor in Saudi Arabia) through a questionnaire. The study found that the Linear SVM classifier yielded the best results, with 99.7% classification accuracy on S3, among all the classifiers used. When both classification accuracy and time were considered, the NB classifier achieved 99.4% accuracy on S1, which was comparable with Linear SVM. The Arab world has faced massive terrorist attacks in the past, and therefore the research is highly significant and relevant due to its specific focus on detecting terrorism messages in Arabic. The state-of-the-art methods developed so far for tweet classification are mostly focused on analyzing English text, and hence there was a dire need for devising machine learning algorithms for detecting Arabic terrorism messages. The innovative aspect of the model presented in the current study is that the five best classifiers were selected and applied to three language models, S1, S2, and S3. The comparative analysis based on classification accuracy and time constraints proposed the best classifiers for sentiment analysis in the Arabic language.
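
The abstract above combines n-gram features (unigram, bigram, trigram) with tf-idf weighting. The following is a minimal sketch of that idea in plain Python, not the authors' implementation: the helper names `ngrams` and `tfidf`, the toy English corpus, and the particular tf-idf variant (relative term frequency times log(N/df), without smoothing) are all assumptions made for illustration, since the abstract does not specify them.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) in a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def tfidf(docs, n):
    """Compute tf-idf weights for the n-grams of each tokenized document.

    tf  = count of the n-gram in the document / number of n-grams in it
    idf = log(N / df), where df is the number of documents containing
          the n-gram (an assumed, unsmoothed variant)
    """
    grams_per_doc = [ngrams(doc, n) for doc in docs]

    # Document frequency: in how many documents does each n-gram occur?
    df = Counter()
    for grams in grams_per_doc:
        df.update(set(grams))

    n_docs = len(docs)
    weights = []
    for grams in grams_per_doc:
        counts = Counter(grams)
        total = len(grams)
        weights.append(
            {g: (c / total) * math.log(n_docs / df[g]) for g, c in counts.items()}
        )
    return weights


# Hypothetical toy corpus (not data from the study).
docs = [
    "the match was great".split(),
    "the match was boring".split(),
    "the weather was great".split(),
]

s1 = tfidf(docs, 1)  # unigram features, analogous to S1
s2 = tfidf(docs, 2)  # bigram features, analogous to S2
s3 = tfidf(docs, 3)  # trigram features, analogous to S3
```

In this sketch, an n-gram that occurs in every document (such as the unigram "the") receives weight zero, while an n-gram unique to one document receives the highest weight for its frequency. Each resulting dictionary could then serve as a sparse feature vector for a classifier of the kind the abstract lists.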


2019 ◽  
Author(s):  
Georg Kustatscher ◽  
Piotr Grabowski ◽  
Juri Rappsilber

Gene co-expression analysis is a widespread method to identify the potential biological function of uncharacterised genes. Recent evidence suggests that proteome profiling may provide more accurate results than transcriptome profiling. However, it is unclear which statistical measure is best suited to detect proteins that are co-regulated. We have previously shown that expression similarities calculated using treeClust, an unsupervised machine-learning algorithm, outperformed correlation-based analysis of a large proteomics dataset. The reason for this improvement is unknown. Here we systematically explore the characteristics of treeClust similarities. Leveraging synthetic data, we find that tree-based similarities are exceptionally robust against outliers and detect only close-fitting, linear protein – protein associations. We then use proteomics data to demonstrate that both of these features contribute to the improved performance of treeClust relative to Pearson, Spearman and robust correlation. Our results suggest that, for large proteomics datasets, unsupervised machine-learning algorithms such as treeClust may significantly improve the detection of biologically relevant protein – protein associations relative to correlation metrics.

