Galaxy-ML: An accessible, reproducible, and scalable machine learning toolkit for biomedicine

2021 ◽  
Vol 17 (6) ◽  
pp. e1009014
Author(s):  
Qiang Gu ◽  
Anup Kumar ◽  
Simon Bray ◽  
Allison Creason ◽  
Alireza Khanteymoori ◽  
...  

Supervised machine learning is an essential but difficult-to-use approach in biomedical data analysis. The Galaxy-ML toolkit (https://galaxyproject.org/community/machine-learning/) makes supervised machine learning more accessible to biomedical scientists by enabling them to perform end-to-end reproducible machine learning analyses at large scale using only a web browser. Galaxy-ML extends Galaxy (https://galaxyproject.org), a biomedical computational workbench used by tens of thousands of scientists across the world, with a suite of tools for all aspects of supervised machine learning.

2020 ◽  
Author(s):  
Qiang Gu ◽  
Anup Kumar ◽  
Simon Bray ◽  
Allison Creason ◽  
Alireza Khanteymoori ◽  
...  

Abstract Supervised machine learning, where the goal is to predict labels of new instances by training on labeled data, has become an essential tool in biomedical data analysis. To make supervised machine learning more accessible to biomedical scientists, we have developed Galaxy-ML, a platform that enables scientists to perform end-to-end reproducible machine learning analyses at large scale using only a web browser. Galaxy-ML extends Galaxy, a biomedical computational workbench used by tens of thousands of scientists across the world, with a machine learning tool suite that supports end-to-end analysis.


2021 ◽  
Vol 12 (2) ◽  
pp. 49-66
Author(s):  
Janmenjoy Nayak ◽  
Bighnaraj Naik ◽  
Pandit Byomakesha Dash ◽  
Danilo Pelusi

Biomedical data is often unstructured, and biomedical data processing tasks are becoming more complex day by day. Thus, biomedical informatics requires competent data analysis and data mining techniques for designing decision support system frameworks that address clinical and healthcare-related issues. Because of increasingly large and complex data sets and the demands of biomedical informatics research, researchers are drawn towards automated machine learning models. This paper proposes an efficient machine learning model based on fuzzy c-means with meta-heuristic optimizations for biomedical data analysis and clustering. The main contributions of this paper are 1) proposing an efficient machine learning model based on fuzzy c-means and meta-heuristic optimization for biomedical data classification, 2) employing benchmark validation techniques and critical hypothesis testing, and 3) providing a background for biomedical data processing with a view of data processing and mining.
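For readers unfamiliar with fuzzy c-means, the following is a minimal sketch of the plain algorithm only. It is not the model from the paper: the meta-heuristic optimization step is omitted, and the function name `fuzzy_c_means`, the fuzzifier default `m=2.0`, and the synthetic two-blob data are all assumptions made for illustration.

```python
import numpy as np


def fuzzy_c_means(X, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Plain fuzzy c-means (illustrative; no meta-heuristic optimization).

    Returns (centers, U), where U[k, i] is the membership of point k in cluster i.
    """
    rng = np.random.default_rng(seed)
    # Random initial memberships, normalized so each row sums to 1.
    U = rng.random((len(X), n_clusters))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        # Cluster centers are membership-weighted means (weights U^m).
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        # Euclidean distance from every point to every center.
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        dist = np.maximum(dist, 1e-12)  # guard against division by zero
        # Membership update: proportional to dist^(-2 / (m - 1)).
        inv = dist ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.max(np.abs(U_new - U)) < tol
        U = U_new
        if converged:
            break
    return centers, U


# Hypothetical demo data: two well-separated 2-D blobs of 50 points each.
_rng = np.random.default_rng(42)
X_demo = np.vstack([
    _rng.normal(0.0, 0.5, size=(50, 2)),
    _rng.normal(10.0, 0.5, size=(50, 2)),
])
centers_demo, U_demo = fuzzy_c_means(X_demo, n_clusters=2)
labels_demo = U_demo.argmax(axis=1)  # hard labels from soft memberships
```

Unlike k-means, each point keeps a degree of membership in every cluster; taking the `argmax` over memberships recovers hard labels when they are needed.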


2019 ◽  
Vol 9 (21) ◽  
pp. 4676
Author(s):  
Federico Divina ◽  
Francisco Gómez-Vela

In our world, increasing amounts of data are produced every day [...]


2020 ◽  
Author(s):  
R. Suganya ◽  
R.Arunadevi ◽  
Seyed M.Buhari

Abstract Coronavirus disease 2019 (COVID-19) is an infectious disease caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). It was first identified in December 2019 in Wuhan, the capital of China’s Hubei province. The objective of this research is to propose a forecasting model, built with machine learning algorithms, using the available COVID-19 dataset from the top affected regions across the world. Regression is one of the supervised machine learning techniques applicable to large-scale data. This research applies Multivariate Linear Regression to predict the number of confirmed COVID-19 cases and deaths over spans of one and two weeks. The experimental results explain 99% of the variability in prediction, with an R-squared score of 0.992. The algorithms are evaluated using error metrics such as Mean Absolute Error (MAE), Root Mean Squared Error (RMSE), and accuracy for the top affected regions across the world.
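The evaluation metrics named in this abstract can be illustrated with a small sketch. It fits an ordinary least squares model with several predictors and computes MAE, RMSE, and R-squared. The data here is synthetic and the helper names are hypothetical; nothing below reproduces the paper's dataset, features, or results.

```python
import numpy as np


def fit_linear_regression(X, y):
    """Ordinary least squares with an intercept; returns [intercept, slopes...]."""
    X1 = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return coef


def predict(coef, X):
    """Apply fitted coefficients to new predictor rows."""
    X1 = np.column_stack([np.ones(len(X)), X])
    return X1 @ coef


def mae(y, y_hat):
    """Mean Absolute Error."""
    return float(np.mean(np.abs(y - y_hat)))


def rmse(y, y_hat):
    """Root Mean Squared Error."""
    return float(np.sqrt(np.mean((y - y_hat) ** 2)))


def r_squared(y, y_hat):
    """Fraction of the variance in y explained by the predictions."""
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return float(1.0 - ss_res / ss_tot)


# Synthetic stand-in data (an assumption, not the COVID-19 dataset):
# three predictors and a linear target with a little Gaussian noise.
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 3))
true_coef = np.array([5.0, 2.0, -1.0, 0.5])  # intercept followed by slopes
y = true_coef[0] + X @ true_coef[1:] + rng.normal(0, 0.1, size=200)

coef = fit_linear_regression(X, y)
y_hat = predict(coef, X)
scores = {"MAE": mae(y, y_hat), "RMSE": rmse(y, y_hat), "R2": r_squared(y, y_hat)}
```

An R-squared close to 1 means the model accounts for nearly all of the variance in the target, which is the sense in which a score of 0.992 corresponds to explaining about 99% of the variability. These scores are computed on the training data; a forecasting study would normally also report them on held-out dates.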


Author(s):  
Joaquin Vanschoren ◽  
Ugo Vespier ◽  
Shengfa Miao ◽  
Marvin Meeng ◽  
Ricardo Cachucho ◽  
...  

Sensors are increasingly being used to monitor the world around us. They measure movements of structures such as bridges, windmills, and plane wings, human vital signs, atmospheric conditions, and fluctuations in power and water networks. In many cases, this results in large networks with different types of sensors, generating impressive amounts of data. As the volume and complexity of data increase, their effective use becomes more challenging, and novel solutions are needed on both a technical and a scientific level. Founded on several real-world applications, this chapter discusses the challenges involved in large-scale sensor data analysis and describes practical solutions to address them. Due to the sheer size of the data and the large amount of computation involved, these are clearly “Big Data” applications.

