Classification Model Simulator: A simulator for different Machine Learning Classification Algorithms

Author(s):  
Abhinandan Singla ◽  
Unnati Chaturvedi ◽  
Preet Kanwal

The number of e-learning websites and the volume of e-content have grown exponentially over the years, and learners often find it cumbersome to locate suitable material because they are overwhelmed by the sheer amount of content available. The proposed work focuses on evaluating the efficiency of different classification algorithms for identifying e-learning content by difficulty level. The data are collected from many e-learning websites through web scraping: the web scraper downloads the web pages and parses them into text files. The text files were then run through many machine learning classification algorithms to find the classification model that achieves the highest score with minimum training and testing time. This method helps to understand the performance of different text classification algorithms on e-learning content and identifies the classifier with high accuracy for document classification.
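The comparison described in this abstract (score versus training and testing time) can be sketched as a small benchmarking loop. This is a minimal illustration in plain Python, not the authors' pipeline: the two toy classifiers, the `benchmark` helper, and the sample texts and difficulty labels are all hypothetical stand-ins for real models and scraped data.

```python
import time
from collections import Counter


class MajorityClassifier:
    """Toy baseline (hypothetical): always predicts the most common training label."""

    def fit(self, texts, labels):
        self.label_ = Counter(labels).most_common(1)[0][0]
        return self

    def predict(self, texts):
        return [self.label_ for _ in texts]


class KeywordClassifier:
    """Toy model (hypothetical): each known word votes for the labels it was seen with."""

    def fit(self, texts, labels):
        self.votes_ = {}
        for text, label in zip(texts, labels):
            for word in set(text.lower().split()):
                self.votes_.setdefault(word, Counter())[label] += 1
        self.default_ = Counter(labels).most_common(1)[0][0]
        return self

    def predict(self, texts):
        predictions = []
        for text in texts:
            tally = Counter()
            for word in set(text.lower().split()):
                tally.update(self.votes_.get(word, {}))
            # Fall back to the majority label when no word is known.
            predictions.append(tally.most_common(1)[0][0] if tally else self.default_)
        return predictions


def benchmark(models, train_texts, train_labels, test_texts, test_labels):
    """Fit and score each model, recording accuracy plus training and testing time."""
    results = []
    for name, model in models.items():
        start = time.perf_counter()
        model.fit(train_texts, train_labels)
        train_seconds = time.perf_counter() - start

        start = time.perf_counter()
        predicted = model.predict(test_texts)
        test_seconds = time.perf_counter() - start

        correct = sum(p == t for p, t in zip(predicted, test_labels))
        results.append({
            "model": name,
            "accuracy": correct / len(test_labels),
            "train_seconds": train_seconds,
            "test_seconds": test_seconds,
        })
    # Highest accuracy first; ties are broken by lower total time.
    return sorted(results, key=lambda r: (-r["accuracy"], r["train_seconds"] + r["test_seconds"]))


# Invented sample data standing in for scraped e-learning pages.
train_texts = [
    "introduction to variables and loops",
    "basic syntax for beginners",
    "advanced concurrency and memory models",
    "advanced compiler optimization internals",
]
train_labels = ["easy", "easy", "hard", "hard"]
test_texts = ["loops for beginners", "advanced memory internals"]
test_labels = ["easy", "hard"]

results = benchmark(
    {"majority": MajorityClassifier(), "keyword": KeywordClassifier()},
    train_texts, train_labels, test_texts, test_labels,
)
for row in results:
    print(row)
```

Any object exposing `fit` and `predict` can be dropped into the `models` dictionary, so the same loop would apply to real text classifiers once the documents are converted to features.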


Algorithms ◽  
2021 ◽  
Vol 14 (6) ◽  
pp. 187
Author(s):  
Aaron Barbosa ◽  
Elijah Pelofske ◽  
Georg Hahn ◽  
Hristo N. Djidjev

Quantum annealers, such as the device built by D-Wave Systems, Inc., offer a way to compute solutions of NP-hard problems that can be expressed in Ising or quadratic unconstrained binary optimization (QUBO) form. Although such solutions are typically of very high quality, problem instances are usually not solved to optimality due to imperfections of the current generation of quantum annealers. In this contribution, we aim to understand some of the factors contributing to the hardness of a problem instance, and to use machine learning models to predict the accuracy of the D-Wave 2000Q annealer for solving specific problems. We focus on the maximum clique problem, a classic NP-hard problem with important applications in network analysis, bioinformatics, and computational chemistry. By training a machine learning classification model on basic problem characteristics, such as the number of edges in the graph, or annealing parameters, such as the D-Wave’s chain strength, we are able to rank certain features in the order of their contribution to the solution hardness, and present a simple decision tree that allows us to predict whether a problem will be solvable to optimality with the D-Wave 2000Q. We extend these results by training a machine learning regression model that predicts the clique size found by D-Wave.
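To make the QUBO form mentioned above concrete, the sketch below encodes maximum clique using a common textbook formulation: minimize $-\sum_i x_i + P \sum_{(i,j) \notin E} x_i x_j$ with penalty $P > 1$. This is an assumption for illustration and not necessarily the encoding used by the authors. No annealer is involved; the function names, the example graph, and the brute-force solver are hypothetical and only practical for very small graphs.

```python
from itertools import combinations, product


def max_clique_qubo(num_nodes, edges, penalty=2.0):
    """Build QUBO coefficients {(i, j): weight} for maximum clique.

    Each selected node contributes -1; each selected pair of
    non-adjacent nodes contributes +penalty (penalty must exceed 1).
    """
    edge_set = {frozenset(edge) for edge in edges}
    qubo = {(i, i): -1.0 for i in range(num_nodes)}
    for i, j in combinations(range(num_nodes), 2):
        if frozenset((i, j)) not in edge_set:
            qubo[(i, j)] = penalty
    return qubo


def energy(qubo, assignment):
    """Evaluate the QUBO objective for a 0/1 assignment tuple."""
    return sum(w * assignment[i] * assignment[j] for (i, j), w in qubo.items())


def brute_force_minimum(qubo, num_nodes):
    """Enumerate all 2**num_nodes assignments and return the lowest-energy one."""
    best = min(product((0, 1), repeat=num_nodes), key=lambda x: energy(qubo, x))
    return best, energy(qubo, best)


# Example graph: a triangle (0, 1, 2) with a tail 2-3-4.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4)]
qubo = max_clique_qubo(5, edges)
best, best_energy = brute_force_minimum(qubo, 5)
print(best, best_energy)  # (1, 1, 1, 0, 0) -3.0
```

The minimum energy equals the negative of the clique size, so whether a heuristic solver reached optimality can be checked by comparing its returned energy with the known optimum.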


2021 ◽  
Vol 13 (11) ◽  
pp. 6376
Author(s):  
Junseo Bae ◽  
Sang-Guk Yum ◽  
Ji-Myong Kim

Given their highly visible nature, transportation infrastructure construction projects are often exposed to numerous unexpected events, compared to other types of construction projects. Despite the importance of predicting financial losses caused by risk, it is still difficult to determine which risk factors are generally critical and when these risks tend to occur, without benchmarkable references. Most existing methods are prediction-focused and project type-specific, and they ignore the timing aspect of risk. This study filled these knowledge gaps by developing a neural network-driven machine-learning classification model that can categorize causes of financial losses depending on insurance claim payout proportions and risk occurrence timing, drawing on 625 transportation infrastructure construction projects including bridges, roads, and tunnels. The developed network model showed acceptable classification accuracy of 74.1%, 69.4%, and 71.8% in training, cross-validation, and test sets, respectively. This study is the first of its kind in providing benchmarkable classification references of economic damage trends in transportation infrastructure projects. The proposed holistic approach will help construction practitioners proactively consider the uncertainty of project management and the potential impact of natural hazards, together with the risk occurrence timing trends. This study will also assist insurance companies with developing sustainable financial management plans for transportation infrastructure projects.


2020 ◽  
Author(s):  
Valerio Carruba

Asteroid families are groups of asteroids that are the product of collisions or of the rotational fission of a parent object. These groups are mainly identified in proper element or frequency domains. Because of robotic telescope surveys, the number of known asteroids has increased from about 10,000 in the early 1990s to more than 750,000 today. Traditional approaches for identifying new members of asteroid families, like the hierarchical clustering method (HCM), may struggle to keep up with the growing rate of new discoveries. Here we used machine learning classification algorithms to identify new family members based on the orbital distribution in proper (a, e, sin(i)) of previously known family constituents. We compared the outcome of nine classification algorithms from standalone and ensemble approaches. The Extremely Randomized Trees (ExtraTree) method had the highest precision, enabling the retrieval of up to 97% of family members identified with standard HCM.


BJGP Open ◽  
2018 ◽  
Vol 2 (2) ◽  
pp. bjgpopen18X101589 ◽  
Author(s):  
Emmanuel A Jammeh ◽  
Camille B Carroll ◽  
Stephen W Pearson ◽  
Javier Escudero ◽  
Athanasios Anastasiou ◽  
...  

Background: Up to half of patients with dementia may not receive a formal diagnosis, limiting access to appropriate services. It is hypothesised that it may be possible to identify undiagnosed dementia from a profile of symptoms recorded in routine clinical practice.

Aim: The aim of this study is to develop a machine learning-based model that could be used in general practice to detect dementia from routinely collected NHS data. The model would be a useful tool for identifying people who may be living with dementia but have not been formally diagnosed.

Design & setting: The study involved a case-control design and analysis of primary care data routinely collected over a 2-year period. Dementia diagnosed during the study period was compared to no diagnosis of dementia during the same period using pseudonymised routinely collected primary care clinical data.

Method: Routinely collected Read-encoded data were obtained from 18 consenting GP surgeries across Devon, for 26 483 patients aged >65 years. The authors determined Read codes assigned to patients that may contribute to dementia risk. These codes were used as features to train a machine-learning classification model to identify patients that may have underlying dementia.

Results: The model obtained sensitivity and specificity values of 84.47% and 86.67%, respectively.

Conclusion: The results show that routinely collected primary care data may be used to identify undiagnosed dementia. The methodology is promising and, if successfully developed and deployed, may help to increase dementia diagnosis in primary care.
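For readers unfamiliar with the two metrics reported in the results, sensitivity is the fraction of true cases the model flags, TP / (TP + FN), and specificity is the fraction of non-cases it correctly leaves unflagged, TN / (TN + FP). The sketch below computes both from paired labels; the function name and the ten example labels are invented for illustration and are not taken from the study's data.

```python
def sensitivity_specificity(true_labels, predicted_labels, positive=1):
    """Return (sensitivity, specificity) for binary labels.

    sensitivity = TP / (TP + FN), specificity = TN / (TN + FP).
    """
    tp = fn = tn = fp = 0
    for truth, guess in zip(true_labels, predicted_labels):
        if truth == positive:
            if guess == positive:
                tp += 1
            else:
                fn += 1
        else:
            if guess == positive:
                fp += 1
            else:
                tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Invented example: 4 patients with the condition (1), 6 without (0).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.2%} specificity={spec:.2%}")
```

In a screening setting like the one described, a missed case lowers sensitivity while a false alarm lowers specificity, which is why both figures are reported together.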


Symmetry ◽  
2021 ◽  
Vol 13 (8) ◽  
pp. 1518
Author(s):  
Lazar Z. Velimirović ◽  
Radmila Janković ◽  
Jelena D. Velimirović ◽  
Aleksandar Janjić

One way to optimize wastewater treatment system infrastructure, its operations, monitoring, maintenance and management is through the development of smart forecasting, monitoring and failure prediction systems using machine learning modeling. The aim of this paper was to develop a model able to predict a water pump failure based on the asymmetrical type of data obtained from sensors, such as water levels, capacity, current and flow values. Several machine learning classification algorithms were used for predicting water pump failure. Using the classification algorithms, it was possible to predict future values from a simple input of current values, as well as to predict the probability of each sample belonging to each class. In order to build a prediction model, an asymmetrical type dataset containing the aforementioned variables was used.

