ChemML: A Machine Learning and Informatics Program Package for the Analysis, Mining, and Modeling of Chemical and Materials Data

2019 ◽  
Author(s):  
Mojtaba Haghighatlari ◽  
Gaurav Vishwakarma ◽  
Doaa Altarawy ◽  
Ramachandran Subramanian ◽  
Bhargava Urala Kota ◽  
...  

ChemML is an open machine learning and informatics program suite that is designed to support and advance the data-driven research paradigm that is currently emerging in the chemical and materials domain. ChemML allows its users to perform various data science tasks and execute machine learning workflows that are adapted specifically for the chemical and materials context. Key features are automation, general-purpose utility, versatility, and user-friendliness in order to make the application of modern data science a viable and widely accessible proposition in the broader chemistry and materials community. ChemML is also designed to facilitate methodological innovation, and it is one of the cornerstones of the software ecosystem for data-driven in silico research outlined in our recent publication [1].


Information ◽  
2020 ◽  
Vol 11 (4) ◽  
pp. 193 ◽  
Author(s):  
Sebastian Raschka ◽  
Joshua Patterson ◽  
Corey Nolet

Smarter applications are making better use of the insights gleaned from data, having an impact on every industry and research discipline. At the core of this revolution lie the tools and the methods that are driving it, from processing the massive piles of data generated each day to learning from and taking useful action. Deep neural networks, along with advancements in classical machine learning and scalable general-purpose graphics processing unit (GPU) computing, have become critical components of artificial intelligence, enabling many of these astounding breakthroughs and lowering the barrier to adoption. Python continues to be the most preferred language for scientific computing, data science, and machine learning, boosting both performance and productivity by enabling the use of low-level libraries and clean high-level APIs. This survey offers insight into the field of machine learning with Python, taking a tour through important topics to identify some of the core hardware and software paradigms that have enabled it. We cover widely used libraries and concepts, collected together for holistic comparison, with the goal of educating the reader and driving the field of Python machine learning forward.


10.2196/16607 ◽  
2019 ◽  
Vol 21 (11) ◽  
pp. e16607 ◽  
Author(s):  
Christian Lovis

Data-driven science and its corollaries in machine learning and the wider field of artificial intelligence have the potential to drive important changes in medicine. However, medicine is not a science like any other: It is deeply and tightly bound with a large and wide network of legal, ethical, regulatory, economic, and societal dependencies. As a consequence, scientific and technological progress in handling information and its further processing and cross-linking for decision support and predictive systems must be accompanied by parallel changes in the global environment, with numerous stakeholders, including citizens and society. What can be seen at first glance as a barrier and a mechanism slowing down the progression of data science must, however, be considered an important asset. Only global adoption can transform the potential of big data and artificial intelligence into effective breakthroughs in handling health and medicine. This requires science and society, scientists and citizens, to progress together.


2020 ◽  
Vol 73 (4) ◽  
pp. 285-295 ◽  
Author(s):  
Dongwoo Chae

Machine learning (ML) is revolutionizing anesthesiology research. Unlike classical research methods that are largely inference-based, ML is geared more towards making accurate predictions. ML is a field of artificial intelligence concerned with developing algorithms and models to perform prediction tasks in the absence of explicit instructions. Most ML applications, despite being highly variable in the topics that they deal with, generally follow a common workflow. For classification tasks, a researcher typically tests various ML models and compares the predictive performance with the reference logistic regression model. The main advantage of ML lies in its ability to deal with many features with complex interactions and its specific focus on maximizing predictive performance. However, emphasis on data-driven prediction can sometimes neglect mechanistic understanding. This article mainly focuses on the application of supervised ML to electronic health record (EHR) data. The main limitation of EHR-based studies is in the difficulty of establishing causal relationships. However, the associated low cost and rich information content provide great potential to uncover hitherto unknown correlations. In this review, the basic concepts of ML are introduced along with important terms that any ML researcher should know. Practical tips regarding the choice of software and computing devices are also provided. Towards the end, several examples of successful ML applications in anesthesiology are discussed. The goal of this article is to provide a basic roadmap to novice ML researchers working in the field of anesthesiology.


2019 ◽  
Author(s):  
Christian Lovis

Data-driven science and its corollaries in machine learning and the wider field of artificial intelligence have the potential to drive important changes in medicine. However, medicine is not a science like any other: It is deeply and tightly bound with a large and wide network of legal, ethical, regulatory, economic, and societal dependencies. As a consequence, scientific and technological progress in handling information and its further processing and cross-linking for decision support and predictive systems must be accompanied by parallel changes in the global environment, with numerous stakeholders, including citizens and society. What can be seen at first glance as a barrier and a mechanism slowing down the progression of data science must, however, be considered an important asset. Only global adoption can transform the potential of big data and artificial intelligence into effective breakthroughs in handling health and medicine. This requires science and society, scientists and citizens, to progress together.


2020 ◽  
Vol 480 ◽  
pp. 229103
Author(s):  
Marc Duquesnoy ◽  
Teo Lombardo ◽  
Mehdi Chouchane ◽  
Emiliano N. Primo ◽  
Alejandro A. Franco

Data ◽  
2021 ◽  
Vol 6 (7) ◽  
pp. 77
Author(s):  
Kassim S. Mwitondi ◽  
Raed A. Said

Data-driven solutions to societal challenges continue to bring new dimensions to our daily lives. For example, while good-quality education is a well-acknowledged foundation of sustainable development, innovation and creativity, variations in student attainment and general performance remain commonplace. Developing data-driven solutions hinges on two fronts: technical and application. The former relates to the modelling perspective, where two of the major challenges are the impact of data randomness and general variations in definitions, typically referred to as concept drift in machine learning. The latter relates to devising data-driven solutions to address real-life challenges such as identifying potential triggers of pedagogical performance, which aligns with the Sustainable Development Goal (SDG) #4, Quality Education. A total of 3145 pedagogical data points were obtained from the central data collection platform for the United Arab Emirates (UAE) Ministry of Education (MoE). Using simple data visualisation and machine learning techniques via a generic algorithm for sampling, measuring and assessing, the paper highlights research pathways for educationists and data scientists to attain unified goals in an interdisciplinary context. Its novelty derives from its embedded capacity to address data randomness and concept drift by minimising modelling variations and yielding consistent results across samples. Results show that intricate relationships among data attributes describe the invariant conditions that practitioners in the two overlapping fields of data science and education must identify.
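The sample, measure, assess idea mentioned in this abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the function name `sample_measure_assess`, the choice of the mean as the measured quantity, and the synthetic scores are all assumptions made for illustration. Only the dataset size of 3145 comes from the abstract.

```python
import random
import statistics


def sample_measure_assess(values, sample_size, n_repeats, seed=0):
    """Draw repeated random samples, measure each, and assess the spread.

    Hypothetical helper for illustration only; not taken from the paper.
    """
    rng = random.Random(seed)
    # Sample and measure: mean of each random subsample (without replacement).
    means = [
        statistics.fmean(rng.sample(values, sample_size))
        for _ in range(n_repeats)
    ]
    # Assess: a small spread across samples suggests a consistent estimate.
    return statistics.fmean(means), statistics.stdev(means)


# Synthetic scores standing in for real attainment data (assumption).
data_rng = random.Random(42)
scores = [data_rng.gauss(70, 10) for _ in range(3145)]

centre, spread = sample_measure_assess(scores, sample_size=500, n_repeats=30)
print(f"mean of sample means: {centre:.2f}, spread across samples: {spread:.2f}")
```

A small spread relative to the centre indicates that the measured quantity is stable under resampling, which is one simple way to check for sensitivity to data randomness.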


2020 ◽  
Author(s):  
Monik Raj Behera ◽  
Sudhir Upadhyay ◽  
Robert Otter ◽  
Suresh Shetty

Federated learning has become one of the most recent and widely researched areas of machine learning. Several machine-learning frameworks, such as TensorFlow Federated and PySyft, have gained momentum in the recent past and continue to evolve. Some of the frameworks involve techniques such as differential privacy, secure multi-party computation, and gradient descent calculation over the network to achieve privacy of the underlying data in federated learning. While these frameworks serve the need for a general-purpose federated learning model within a given framework, in this paper we present a solution based on distributed messaging with appropriate entitlements that enterprises can leverage in a managed and permissioned network. The solution implements access controls on message source and destination in a decentralized network, which can implement any given data science model in the federated network to facilitate secure federated learning.
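To make the idea of access controls on message source and destination concrete, here is a minimal sketch. It is not the paper's implementation: the `Message` and `EntitlementRouter` classes, the participant names, and the payload layout are all hypothetical, and a real deployment would sit on top of an actual messaging system.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    source: str
    destination: str
    payload: dict


@dataclass
class EntitlementRouter:
    """Hypothetical in-memory router that enforces (source, destination) entitlements."""

    # Set of (source, destination) pairs permitted to exchange messages.
    entitlements: set = field(default_factory=set)
    # Delivered messages, keyed by destination.
    inboxes: dict = field(default_factory=dict)

    def grant(self, source, destination):
        self.entitlements.add((source, destination))

    def send(self, message):
        """Deliver the message only if its source is entitled to reach its destination."""
        if (message.source, message.destination) not in self.entitlements:
            return False
        self.inboxes.setdefault(message.destination, []).append(message)
        return True


router = EntitlementRouter()
router.grant("client_a", "aggregator")

# client_a is entitled to send model updates; client_b is not.
ok = router.send(Message("client_a", "aggregator", {"weights": [0.1, 0.2]}))
denied = router.send(Message("client_b", "aggregator", {"weights": [0.3, 0.4]}))
print(ok, denied)  # True False
```

Keeping the entitlement check in the routing layer means the model code on each participant does not need to know who is permitted to talk to whom.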


2021 ◽  
Vol 8 ◽  
Author(s):  
Hans-Christoph Burmeister ◽  
Manfred Constapel

In this survey, results from an investigation on collision avoidance and path planning methods developed in recent research are provided. In particular, existing methods based on Artificial Intelligence, data-driven methods based on Machine Learning, and other Data Science approaches are investigated to provide a comprehensive overview of maritime collision avoidance techniques applicable to Maritime Autonomous Surface Ships. Relevant aspects of those methods and approaches are summarized and put into suitable perspectives. As autonomous systems are expected to operate alongside or in place of conventionally manned vessels, they must comply with the COLREGs for robust decision-support/-making. Thus, the survey specifically covers how COLREGs are addressed by the investigated methods and approaches. A conclusion regarding their utilization in industrial implementations is drawn.

