Injury Prediction in Competitive Runners With Machine Learning

Author(s):  
S. Sofie Lövdal ◽  
Ruud J.R. Den Hartigh ◽  
George Azzopardi

Purpose: Staying injury free is a major factor for success in sports. Although injuries are difficult to forecast, novel technologies and data-science applications could provide important insights. Our purpose was to use machine learning for the prediction of injuries in runners, based on detailed training logs. Methods: Prediction of injuries was evaluated on a new data set of 74 high-level middle- and long-distance runners, over a period of 7 years. Two analytic approaches were applied. First, the training load from the previous 7 days was expressed as a time series, with each day’s training being described by 10 features. These features were a combination of objective data from a global positioning system watch (eg, duration, distance), together with subjective data about the exertion and success of the training. Second, a training week was summarized by 22 aggregate features, and a time window of 3 weeks before the injury was considered. Results: A predictive system based on bagged XGBoost machine-learning models resulted in receiver operating characteristic curves with average areas under the curves of 0.724 and 0.678 for the day and week approaches, respectively. The results of the day approach especially reflect a reasonably high probability that our system makes correct injury predictions. Conclusions: Our machine-learning-based approach predicts a sizable portion of the injuries, in particular when the model is based on training-load data in the days preceding an injury. Overall, these results demonstrate the possible merits of using machine learning to predict injuries and tailor training programs for athletes.
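A minimal sketch of the modelling step described above, assuming scikit-learn and xgboost are available; the synthetic data, feature counts, and hyperparameters are placeholders, not the authors' configuration:

```python
# Sketch of bagged XGBoost classifiers scored with ROC AUC, as the abstract
# describes. Data are synthetic stand-ins for the 7-day x 10-feature logs.
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score
from xgboost import XGBClassifier

# Injuries are rare events, so the placeholder data set is imbalanced.
X, y = make_classification(n_samples=2000, n_features=70, weights=[0.95],
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0)

# Bag several XGBoost models; each sees a bootstrap sample of the data.
# (scikit-learn >= 1.2 uses the `estimator` keyword.)
model = BaggingClassifier(
    estimator=XGBClassifier(n_estimators=100, eval_metric="logloss"),
    n_estimators=10, random_state=0)
model.fit(X_train, y_train)

probs = model.predict_proba(X_test)[:, 1]
print("ROC AUC:", roc_auc_score(y_test, probs))
```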

Author(s):  
Ritu Khandelwal ◽  
Hemlata Goyal ◽  
Rajveer Singh Shekhawat

Introduction: Machine learning is an intelligent technology that acts as a bridge between business and data science: with data science involved, the business goal shifts to extracting valuable insights from the available data. Bollywood, which makes up a large part of Indian cinema, is a multi-million-dollar industry. This paper attempts to predict whether an upcoming Bollywood movie will be a Blockbuster, Superhit, Hit, Average, or Flop, using machine-learning classification and prediction techniques. The first step in building a classifier or prediction model is the learning stage, in which a training data set is supplied to train the model with a chosen technique or algorithm; the rules generated in this stage form the model and allow future trends to be predicted in different types of organizations. Methods: Classification and prediction techniques, namely Support Vector Machine (SVM), Random Forest, Decision Tree, Naïve Bayes, Logistic Regression, AdaBoost, and KNN, are applied in order to identify the most efficient and effective model. All of these functionalities can be applied through GUI-based workflows organized into categories such as Data, Visualize, Model, and Evaluate. Result: The rules generated during the learning stage are used to build the model, which is then used to predict movie success. Conclusion: This paper presents a comparative analysis based on parameters such as accuracy and the confusion matrix to identify the best possible model for predicting movie success. Using this prediction, production houses can plan advertisement propaganda and choose the best time to release a movie according to its predicted success rate, gaining higher benefits. Discussion: Data mining is the process of discovering patterns in large data sets, and the relationships among them, in order to solve business problems and predict forthcoming trends. Such predictions can help production houses plan their advertisement propaganda and their costs, and by attending to these factors they can make a movie more profitable.
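The comparative workflow the paper describes could look like the following scikit-learn sketch; the data are synthetic placeholders for movie attributes, and the five classes stand in for Blockbuster/Superhit/Hit/Average/Flop:

```python
# Sketch: train the named classifiers and compare accuracy and confusion
# matrices. Features and labels are synthetic, not real movie data.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier, AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=1000, n_features=12, n_informative=8,
                           n_classes=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

models = {
    "SVM": SVC(),
    "Random Forest": RandomForestClassifier(),
    "Decision Tree": DecisionTreeClassifier(),
    "Naive Bayes": GaussianNB(),
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "AdaBoost": AdaBoostClassifier(),
    "KNN": KNeighborsClassifier(),
}
for name, clf in models.items():
    clf.fit(X_train, y_train)
    pred = clf.predict(X_test)
    print(name, accuracy_score(y_test, pred))
    print(confusion_matrix(y_test, pred))
```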


2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Ann-Marie Mallon ◽  
Dieter A. Häring ◽  
Frank Dahlke ◽  
Piet Aarden ◽  
Soroosh Afyouni ◽  
...  

Abstract Background Novartis and the University of Oxford’s Big Data Institute (BDI) have established a research alliance with the aim of improving health care and drug development by making them more efficient and targeted. By combining the latest statistical machine-learning technology with an innovative IT platform, developed to manage large volumes of anonymised data from numerous data sources and types, we plan to identify novel, clinically relevant patterns that cannot be detected by humans alone, yielding phenotypes and early predictors of patient disease activity and progression. Method The collaboration focuses on highly complex autoimmune diseases and is developing a computational framework to assemble a research-ready dataset across numerous modalities. For the Multiple Sclerosis (MS) project, the collaboration has anonymised and integrated phase II to phase IV clinical and imaging trial data from ≈35,000 patients across all clinical phenotypes, collected in more than 2200 centres worldwide. For the “IL-17” project, the collaboration has anonymised and integrated clinical and imaging data from over 30 phase II and III Cosentyx clinical trials, covering more than 15,000 patients suffering from four autoimmune disorders (psoriasis, axial spondyloarthritis, psoriatic arthritis (PsA), and rheumatoid arthritis (RA)). Results A fundamental component of successful data analysis, and of the collaborative development of novel machine-learning methods on these rich data sets, has been the construction of a research informatics framework that captures the data at regular intervals, anonymises the images, integrates them with the de-identified clinical data, and compiles the result, after quality control, into a research-ready relational database available to multidisciplinary analysts. Collaborative development by software developers, data wranglers, statisticians, clinicians, and domain scientists across both organisations has been key. The framework is innovative in that it facilitates collaborative data management and makes a complicated clinical trial data set from a pharmaceutical company available to academic researchers who become associated with the project. Conclusions An informatics framework has been developed to capture clinical trial data into a pipeline of anonymisation, quality control, data exploration, and subsequent integration into a database. Establishing this framework has been integral to the development of analytical tools.
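As an illustration only, the capture, anonymise, quality-control, and load pipeline described above might be sketched as below; every function, column, and database name here is hypothetical, not part of the Novartis/BDI framework:

```python
# Schematic pipeline sketch: anonymise -> quality control -> relational DB.
# All names (columns, functions, database) are invented for illustration.
import hashlib
import sqlite3
import pandas as pd

def anonymise(df: pd.DataFrame) -> pd.DataFrame:
    # De-identify: drop direct identifiers and one-way-hash the patient key.
    df = df.drop(columns=["patient_name"])
    df["patient_id"] = df["patient_id"].apply(
        lambda s: hashlib.sha256(str(s).encode()).hexdigest()[:16])
    return df

def quality_control(df: pd.DataFrame) -> pd.DataFrame:
    # Keep only records passing basic range and completeness checks.
    return df[df["age"].between(0, 120) & df["visit_date"].notna()]

def load(df: pd.DataFrame, db: str = "research.db") -> None:
    # Compile into a research-ready relational table.
    with sqlite3.connect(db) as con:
        df.to_sql("clinical_visits", con, if_exists="append", index=False)

raw = pd.DataFrame({"patient_id": [101], "patient_name": ["A. Smith"],
                    "age": [54], "visit_date": ["2020-03-01"]})
load(quality_control(anonymise(raw)))
```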


Information ◽  
2020 ◽  
Vol 11 (4) ◽  
pp. 193 ◽  
Author(s):  
Sebastian Raschka ◽  
Joshua Patterson ◽  
Corey Nolet

Smarter applications are making better use of the insights gleaned from data, having an impact on every industry and research discipline. At the core of this revolution lie the tools and methods that are driving it, from processing the massive piles of data generated each day to learning from them and taking useful action. Deep neural networks, along with advancements in classical machine learning and scalable general-purpose graphics processing unit (GPU) computing, have become critical components of artificial intelligence, enabling many of these astounding breakthroughs and lowering the barrier to adoption. Python continues to be the most preferred language for scientific computing, data science, and machine learning, boosting both performance and productivity by enabling the use of low-level libraries and clean high-level APIs. This survey offers insight into the field of machine learning with Python, taking a tour through important topics to identify some of the core hardware and software paradigms that have enabled it. We cover widely used libraries and concepts, collected together for holistic comparison, with the goal of educating the reader and driving the field of Python machine learning forward.
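A small example of the pattern the survey highlights, where a clean high-level API (here scikit-learn) delegates the numerical heavy lifting to low-level compiled libraries via NumPy:

```python
# High-level Python API on top of fast low-level numerics.
import numpy as np                      # BLAS-backed array computing
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(10_000, 20))       # heavy lifting happens in C/Fortran
y = (X @ rng.normal(size=20) > 0).astype(int)

clf = LogisticRegression().fit(X, y)    # one line of high-level API
print("training accuracy:", clf.score(X, y))
```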


2017 ◽  
Vol 01 (01) ◽  
pp. 1630014 ◽  
Author(s):  
Ron S. Kenett

This chapter is about an important tool in the data science workbench, Bayesian networks (BNs). Data science is about generating information from a given data set using applications of statistical methods. The quality of the information derived from data analysis depends on various dimensions, including the communication of results, the ability to translate results into actionable tasks, and the capability to integrate various data sources [R. S. Kenett and G. Shmueli, On information quality, J. R. Stat. Soc. A 177(1), 3 (2014)]. This chapter demonstrates, with three examples, how the application of BNs provides a high level of information quality. It expands the treatment of BNs as a statistical tool and provides a wider scope of statistical analysis that matches current trends in data science. For more examples of deriving high information quality with BNs, see [R. S. Kenett and G. Shmueli, Information Quality: The Potential of Data and Analytics to Generate Knowledge (John Wiley and Sons, 2016), www.wiley.com/go/information_quality]. The three examples used in the chapter are complementary in scope. The first is based on expert-opinion assessments of risks in the operation of health care monitoring systems in a hospital environment. The second comes from the monitoring of an open source community and is a data-rich application that combines expert opinion, social network analysis, and continuous operational variables. The third is totally data driven and is based on an extensive customer satisfaction survey of airline customers. Section 1 introduces BNs, and Sec. 2 provides theoretical background on them. Examples are provided in Sec. 3, Sec. 4 discusses sensitivity analysis of BNs, Sec. 5 lists a range of software applications implementing BNs, and Sec. 6 concludes the chapter.
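For readers who want to experiment, a minimal discrete BN can be built in Python with, for example, the pgmpy library (one of many BN tools of the kind surveyed in Sec. 5); the two-node risk model and its probabilities below are hypothetical, loosely echoing the hospital-monitoring example:

```python
# Hypothetical two-node Bayesian network: AlarmQuality -> MonitoringFailure.
from pgmpy.models import BayesianNetwork
from pgmpy.factors.discrete import TabularCPD
from pgmpy.inference import VariableElimination

model = BayesianNetwork([("AlarmQuality", "MonitoringFailure")])
cpd_alarm = TabularCPD("AlarmQuality", 2, [[0.8], [0.2]])  # P(good) = 0.8
cpd_fail = TabularCPD("MonitoringFailure", 2,
                      [[0.95, 0.60],   # P(no failure | good, poor alarm)
                       [0.05, 0.40]],  # P(failure    | good, poor alarm)
                      evidence=["AlarmQuality"], evidence_card=[2])
model.add_cpds(cpd_alarm, cpd_fail)

# Query the network: failure risk given that alarm quality is poor.
infer = VariableElimination(model)
print(infer.query(["MonitoringFailure"], evidence={"AlarmQuality": 1}))
```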


Author(s):  
Yang-Hui He

Calabi-Yau spaces, or Kähler spaces admitting zero Ricci curvature, have played a pivotal role in theoretical physics and pure mathematics for the last half century. In physics, they constituted the first and natural solution to the compactification of superstring theory to our 4-dimensional universe, primarily because one of their equivalent definitions is the admittance of covariantly constant spinors. Since the mid-1980s, physicists and mathematicians have joined forces in creating explicit examples of Calabi-Yau spaces, compiling databases of formidable size, including the complete intersection (CICY) data set, the weighted hypersurfaces data set, the elliptic-fibration data set, the Kreuzer-Skarke toric hypersurface data set, generalized CICYs, etc., totaling at least on the order of 10^10 manifolds. These all contribute to the vast string landscape, the multitude of possible vacuum solutions to string compactification. More recently, this collaboration has been enriched by computer science and data science, the former in benchmarking the complexity of the algorithms used to compute geometric quantities, and the latter in applying techniques such as machine learning to extract unexpected information. These endeavours, inspired by the physics of the string landscape, have rendered the investigation of Calabi-Yau spaces one of the most exciting and interdisciplinary fields.
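An illustrative sketch, not drawn from the paper, of the kind of machine-learning experiment described: flatten a manifold's configuration matrix and train a network to predict a topological quantity such as a Hodge number. The data below are random placeholders rather than a real Calabi-Yau data set:

```python
# Toy stand-in for learning a topological label from a configuration matrix.
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
# Flattened 12x15 integer matrices standing in for CICY-style configurations.
X = rng.integers(0, 6, size=(5000, 12 * 15)).astype(float)
y = X.sum(axis=1) % 20                 # synthetic stand-in for a Hodge number

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
net = MLPRegressor(hidden_layer_sizes=(100, 100), max_iter=300,
                   random_state=0).fit(X_tr, y_tr)
print("R^2 on held-out manifolds:", net.score(X_te, y_te))
```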


Machine learning is formulated by combining mathematical models, artificial-intelligence approaches, and past recorded data sets. It uses different learning algorithms for different types of data and is commonly classified into three types. One advantage of this kind of learning is that, when an artificial neural network is used, the weights are adjusted according to the error rate so that the model improves itself over further epochs. However, machine learning works well only when the features are defined accurately, and deciding which features to select requires good domain knowledge, which makes machine learning dependent on the developer; a lack of domain knowledge degrades performance. This dependency inspired the invention of deep learning. Deep learning can detect features through self-training models and is able to give better results than classical artificial-intelligence or machine-learning approaches. It uses components such as ReLU activations, gradient descent, and optimizers, as well as pooling layers to extract features; to apply such optimizers efficiently, one should understand the mathematical computations and convolutions running behind the layers. These modern approaches demand a high level of computation, requiring CPUs and GPUs; if such computational hardware is not available, one can use the Google Colaboratory framework. As demonstrated in this paper, the deep-learning approach improves skin cancer detection. The paper also aims to provide the reader with contextual knowledge of the practices mentioned above.
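The ingredients named above (ReLU activations, pooling layers, a gradient-descent-based optimizer) can be combined in a few lines of Keras; this is a generic sketch, with image size and class count as placeholders for a skin-lesion data set, not the paper's actual architecture:

```python
# Minimal CNN sketch: ReLU non-linearities, pooling, gradient-based optimizer.
import tensorflow as tf
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input(shape=(64, 64, 3)),           # placeholder image size
    layers.Conv2D(32, 3, activation="relu"),   # ReLU non-linearity
    layers.MaxPooling2D(),                     # pooling extracts features
    layers.Conv2D(64, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dense(2, activation="softmax"),     # e.g. benign vs malignant
])
# Adam is a gradient-descent-based optimizer.
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```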


Author(s):  
Jonathan M. Gumley ◽  
Hayden Marcollo ◽  
Stuart Wales ◽  
Andrew E. Potts ◽  
Christopher J. Carra

Abstract There is growing importance in the offshore floating production sector to develop reliable and robust means of continuously monitoring the integrity of mooring systems for FPSOs and FPUs, particularly in light of the upcoming introduction of API-RP-2MIM. Here, the limitations of the current range of monitoring techniques are discussed, including well-established technologies such as load cells, sonar, or visual inspection, within the context of the growing mainstream acceptance of data science and machine learning. Due to the large fleet of floating production platforms currently in service, there is a need for a readily deployable solution that can be retrofitted to existing platforms to passively monitor the performance of floating assets on their moorings, for which machine-learning-based systems have particular advantages. An earlier investigation, conducted in 2016 on a shallow-water, single-point-moored FPSO, employed host facility data from in-service field measurements before and after a single mooring line failure event. This paper presents how the same machine learning techniques were applied to a deep-water, semi-taut, spread-moored system where no host facility data were available, therefore requiring a calibrated hydrodynamic numerical model to be used as the basis for the training data set. The machine learning techniques applied to both real and synthetically generated data were successful in replicating the response of the original system, even with the latter subjected to different variations of artificial noise. Furthermore, utilizing a probability-based approach, it was demonstrated that replicating the response of the underlying system was a powerful technique for predicting changes in the mooring system.
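A schematic sketch, not the authors' system, of the approach described: train a classifier on synthetic response data standing in for a calibrated hydrodynamic model's output, then report class probabilities for intact versus line-failed conditions. All features and noise levels below are invented for illustration:

```python
# Train on synthetic vessel-response features; output failure probabilities.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
# Placeholder features, e.g. statistics of vessel offset/heading time series
# produced by a numerical model; the failed case has a shifted response.
X_intact = rng.normal(0.0, 1.0, size=(2000, 8))
X_failed = rng.normal(0.5, 1.2, size=(2000, 8))
X = np.vstack([X_intact, X_failed])
X += rng.normal(0, 0.1, X.shape)                  # artificial noise
y = np.array([0] * 2000 + [1] * 2000)             # 1 = mooring line failed

X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)
clf = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)
# Probability-based output: P(line failure) for each new observation.
print(clf.predict_proba(X_te[:5])[:, 1])
```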


Data science in healthcare is an innovative and promising area for industry to implement data-science applications. Data analytics is a recent science used to explore medical data sets and discover disease; this work is an initial attempt to identify disease with the help of a large medical data set. Using this data-science methodology, users can investigate their disease without the help of healthcare centres. Healthcare and data science are often linked through finances, as the industry attempts to reduce its expenses with the help of large amounts of data, and since data science and medicine are both developing rapidly, it is important that they advance together. Healthcare information is very valuable to society, and heart disease in particular has been increasing in day-to-day life. Based on the various factors monitored in the human body, heart disease can be analysed and prevented; classifying those factors with machine-learning algorithms and predicting the disease is the major task. A major part of this work involves supervised machine-learning algorithms such as SVM, Naïve Bayes, Decision Trees, and Random Forest.
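A minimal sketch of the supervised-learning comparison named above, using scikit-learn with synthetic placeholder data rather than a real clinical data set:

```python
# Compare the named supervised learners by cross-validated accuracy.
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in for clinical heart-disease features.
X, y = make_classification(n_samples=500, n_features=13, random_state=0)

for name, clf in [("SVM", SVC()), ("Naive Bayes", GaussianNB()),
                  ("Decision Tree", DecisionTreeClassifier()),
                  ("Random Forest", RandomForestClassifier())]:
    scores = cross_val_score(clf, X, y, cv=5)
    print(f"{name}: mean accuracy {scores.mean():.3f}")
```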


2020 ◽  
Vol 8 (6) ◽  
pp. 4684-4688

Per the statistics received from the BBC, the figures vary for every earthquake that has occurred to date: approximately up to thousands dead, about 50,000 injured, around 1-3 million displaced, a significant number missing or homeless, structural damage approaching 100%, and economic losses varying from 10 to 16 million dollars. A magnitude of 5 and above is classified as deadliest. The most life-threatening earthquake to date took place in Indonesia, where about 3 million were dead, 1-2 million were injured, and the structural damage amounted to 100%. Hence, the consequences of an earthquake are devastating and are not limited to the loss of and damage to the living and nonliving; they also cause a significant amount of change, from surroundings and lifestyle to the economy. Every such parameter motivates earthquake forecasting: with a couple of minutes' notice, individuals can act to shield themselves from harm and death, damage and monetary losses can be reduced, and property and natural assets can be protected. In this work, an accurate forecaster is designed and developed: a system that forecasts the catastrophe by detecting early signs of an earthquake using machine-learning algorithms. The system follows the basic steps of developing learning systems along with the data-science life cycle. Data sets for the Indian subcontinent and the rest of the world are collected from government sources. Pre-processing of the data is followed by the construction of a stacking model that combines the Random Forest and Support Vector Machine algorithms. The algorithms develop this mathematical model from a training data set; the model looks for patterns that lead to catastrophe and adapts to them, so as to make choices and forecasts without being explicitly programmed to perform the task. After a forecast, the message is broadcast to government officials and across various platforms. The information to obtain is keenly represented by three factors: time, locality, and magnitude.
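A stacking model combining Random Forest and SVM, as the abstract describes, might be sketched with scikit-learn as follows; the data are synthetic placeholders, not the collected seismic records:

```python
# Stacking sketch: Random Forest + SVM base learners, logistic meta-learner.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Placeholder features standing in for time/locality/magnitude-derived
# attributes; label 1 = damaging event.
X, y = make_classification(n_samples=2000, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

stack = StackingClassifier(
    estimators=[("rf", RandomForestClassifier(random_state=0)),
                ("svm", SVC(probability=True, random_state=0))],
    final_estimator=LogisticRegression())
stack.fit(X_tr, y_tr)
print("held-out accuracy:", stack.score(X_te, y_te))
```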


Author(s):  
Prithwish Parial

Abstract: Python, developed by Guido van Rossum and first released on February 20, 1991, is a fine and easily adoptable object-oriented programming language, and a powerful high-level language in the recent software world. In this paper, our discussion begins with an introduction to the various Python tools applicable to machine-learning techniques, data science, and IoT. We then describe the packages that are in demand in the data-science and machine-learning communities, for example Pandas, SciPy, TensorFlow, Theano, and Matplotlib. After that, we show the significance of Python for building IoT applications, sharing different code examples throughout. To aid the learning experience, execute the examples contained in this paper interactively using Jupyter notebooks. Keywords: Machine learning, Real-world programming, Data Science, IoT, Tools, Different packages, Languages: Python.
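In the spirit of the paper's package tour, a small self-contained example using Pandas for tabular data and Matplotlib for plotting (the sensor values are made up):

```python
# Pandas + Matplotlib: summarise and plot a tiny made-up IoT sensor log.
import pandas as pd
import matplotlib.pyplot as plt

readings = pd.DataFrame({
    "hour": [0, 1, 2, 3, 4],
    "temperature_c": [21.0, 20.6, 20.2, 20.9, 21.4],
})
print(readings.describe())             # quick statistical summary

readings.plot(x="hour", y="temperature_c", marker="o")
plt.title("Hourly sensor readings")
plt.show()
```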

