Early Detection on Students' Failing Open-Source based Course Projects using Machine Learning Approaches

Diabetic retinopathy is a major cause of vision loss and blindness, affecting millions of people across the globe. Although established screening methods exist, such as fluorescein angiography and optical coherence tomography, in the majority of cases patients remain unaware of them and fail to undergo such tests at an appropriate time. Early detection of the disease plays an extremely important role in preventing vision loss, which is the consequence of diabetes mellitus remaining untreated for a prolonged period. Various machine learning and deep learning approaches have been applied to diabetic retinopathy datasets for classification and prediction of the disease, but the majority of them have neglected data pre-processing and dimensionality reduction, leading to biased results. The dataset used in the present study is a diabetic retinopathy dataset collected from the UCI Machine Learning Repository. First, the raw dataset is normalized using the StandardScaler technique, and then Principal Component Analysis (PCA) is used to extract the most significant features in the dataset. Further, the Firefly algorithm is implemented for dimensionality reduction. The reduced dataset is fed into a deep neural network model for classification. The results generated by the model are evaluated against prevalent machine learning models, and they justify the superiority of the proposed model in terms of accuracy, precision, recall, sensitivity and specificity.
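
The first two pre-processing steps named in this abstract (standard scaling, then PCA) can be illustrated with a minimal pure-Python sketch. It is not the authors' code: the toy rows, the function names, and the use of power iteration to find the first principal component are assumptions made for illustration, and the Firefly and deep neural network stages are omitted.

```python
import math

def standardize(rows):
    """Z-score each column: subtract the column mean, divide by the column std."""
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    # Population standard deviation (divide by n), the convention StandardScaler uses.
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / n) for c, m in zip(cols, means)]
    # Leave constant columns unscaled to avoid dividing by zero.
    stds = [s if s > 0 else 1.0 for s in stds]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]

def covariance(rows):
    """Covariance matrix of already-centred rows."""
    n, d = len(rows), len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) / n for j in range(d)] for i in range(d)]

def first_principal_component(rows, iterations=200):
    """Approximate the leading eigenvector of the covariance matrix by power iteration."""
    cov = covariance(rows)
    d = len(cov)
    vec = [1.0 / (i + 1) for i in range(d)]  # arbitrary non-zero starting vector
    for _ in range(iterations):
        nxt = [sum(cov[i][j] * vec[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(v * v for v in nxt))
        if norm == 0:
            break
        vec = [v / norm for v in nxt]
    return vec

# Hypothetical toy feature rows, not taken from the UCI dataset.
raw = [[1.0, 2.0, 5.0], [2.0, 4.1, 3.0], [3.0, 6.2, 4.0], [4.0, 7.9, 1.0]]
scaled = standardize(raw)
component = first_principal_component(scaled)
# Project each scaled row onto the first principal component.
projected = [sum(v * w for v, w in zip(row, component)) for row in scaled]
```

In practice a library implementation would normally be used for both steps; the sketch only makes the arithmetic behind them explicit.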

Ligo: An Open Source Application for the Management and Execution of Administrative Data Linkage

International Journal for Population Data Science ◽

10.23889/ijpds.v3i4.749 ◽

2018 ◽

Vol 3 (4) ◽

Author(s):

Greg Lawrance ◽

Raphael Parra Hernandez ◽

Khalegh Mamakani ◽

Suraiya Khan ◽

Brent Hills ◽

...

Keyword(s):

Machine Learning ◽

Open Source ◽

Administrative Data ◽

Data Science ◽

Population Data ◽

Probabilistic Methods ◽

Learning Approaches ◽

Web Interface ◽

Science Community ◽

Comparison Algorithms

Introduction: Ligo is an open source application that provides a framework for managing and executing administrative data linking projects. Ligo provides an easy-to-use web interface that lets analysts select among data linking methods, including deterministic, probabilistic and machine learning approaches, and use these in a documented, repeatable, tested, step-by-step process. Objectives and Approach: The linking application has two primary functions: identifying common entities in datasets [de-duplication] and identifying common entities between datasets [linking]. The application is being built from the ground up in a partnership between the Province of British Columbia’s Data Innovation (DI) Program and Population Data BC, and with input from data scientists. The simple web interface allows analysts to streamline the processing of multiple datasets in a straightforward and reproducible manner. Results: Built in Python and implemented as a desktop-capable and cloud-deployable containerized application, Ligo includes many of the latest data-linking comparison algorithms with a plugin architecture that supports the simple addition of new formulae. Currently, deterministic approaches to linking have been implemented and probabilistic methods are in alpha testing. A fully functional alpha, including deterministic and probabilistic methods, is expected to be ready in September, with a machine learning extension expected soon after. Conclusion/Implications: Ligo has been designed with enterprise users in mind. The application is intended to make the processes of data de-duplication and linking simple, fast and reproducible. By making the application open source, we encourage feedback and collaboration from across the population research and data science community.
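
The difference between deterministic and probabilistic linking can be sketched in a few lines of Python. This is not Ligo's API: the function names, the record fields, and the m/u probabilities below are hypothetical, and the scoring follows the general Fellegi-Sunter idea of summing log-likelihood-ratio weights per field.

```python
import math

def deterministic_link(left, right, keys):
    """Pair records whose values agree exactly on every key field."""
    index = {}
    for rec in right:
        index.setdefault(tuple(rec[k] for k in keys), []).append(rec)
    return [(l, r) for l in left for r in index.get(tuple(l[k] for k in keys), [])]

def match_weight(a, b, m_probs, u_probs):
    """Sum per-field log2 weights: agreement adds log2(m/u), disagreement adds log2((1-m)/(1-u))."""
    total = 0.0
    for field, m in m_probs.items():
        u = u_probs[field]
        if a[field] == b[field]:
            total += math.log2(m / u)
        else:
            total += math.log2((1 - m) / (1 - u))
    return total

# Hypothetical records and probabilities, chosen only for illustration.
left = [{"id": 1, "name": "ann lee", "dob": "1980-01-02", "postcode": "V8W"}]
right = [
    {"id": "a", "name": "ann lee", "dob": "1980-01-02", "postcode": "V9A"},
    {"id": "b", "name": "bob ray", "dob": "1975-05-05", "postcode": "V8W"},
]
m_probs = {"name": 0.9, "dob": 0.95, "postcode": 0.8}    # P(fields agree | true match)
u_probs = {"name": 0.01, "dob": 0.001, "postcode": 0.05}  # P(fields agree | non-match)

exact_pairs = deterministic_link(left, right, ["name", "dob"])
weight_a = match_weight(left[0], right[0], m_probs, u_probs)
weight_b = match_weight(left[0], right[1], m_probs, u_probs)
```

The deterministic rule accepts a pair only on exact agreement, while the probabilistic weight still ranks record "a" above record "b" even though its postcode differs.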

Machine Learning Approaches on Diabetic Retinopathy Prediction

International Journal of Scientific Research in Computer Science Engineering and Information Technology ◽

10.32628/cseit206377 ◽

2020 ◽

pp. 341-345

Author(s):

Gowri Prasad ◽

Vrinda Raveendran ◽

Vidya B M ◽

Tejavati Hedge

Keyword(s):

Machine Learning ◽

Diabetic Retinopathy ◽

Early Detection ◽

Comparative Study ◽

Blood Sugar ◽

Efficient Algorithm ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Early Age ◽

Learning Approaches

Diabetic retinopathy is an eye disorder that develops due to high blood sugar, which affects the neurons in the retina. A dangerous fact about this disease is that it can lead to blindness. The possible cure is through detection of the disease at an early stage. This can be done using different machine learning algorithms. This paper presents a comparative study of different machine learning algorithms that can be used for early detection of diabetic retinopathy. The study aims to find the most efficient algorithm for the task and to increase the efficiency of that algorithm.

Attempt of Early Stuck Detection Using Unsupervised Deep Learning With Probability Mixture Model

10.1115/omae2021-62739 ◽

2021 ◽

Author(s):

Tomoya Inoue ◽

Yujin Nakagawa ◽

Ryota Wada ◽

Keisuke Miyoshi ◽

Shungo Abe ◽

...

Keyword(s):

Machine Learning ◽

Early Detection ◽

Supervised Machine Learning ◽

Support Vector ◽

Learning Approaches ◽

Learning Models ◽

Vector Machines ◽

Unsupervised Deep Learning ◽

Drilling Operations ◽

Using Data

Abstract: The early detection of a stuck pipe during drilling operations is challenging and crucial. Some studies on stuck detection have adopted supervised machine learning approaches with ordinal support vector machines or neural networks, using datasets labeled “stuck” and “normal”. However, for early detection before sticking occurs, the application of ordinal supervised machine learning raises several concerns, such as limited stuck data, the lack of an exact “stuck sign” before sticking occurs, and the various mechanisms involved in pipe sticking. This study acquires surface drilling data from various wells belonging to several agencies, examines the effectiveness of multiple learning models, and discusses the possibility of detecting pipe sticking before it occurs. Unsupervised machine learning using data on normal activities is a possible advanced method for early stuck detection, and it is adopted in this study. In addition, as a countermeasure to another concern, namely that even normal activities involve various operations, we apply unsupervised learning with multiple learning models.
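
The idea of learning only from normal activity and flagging departures from it can be shown with a much simpler stand-in than the deep learning models the abstract describes. The sketch below fits a per-feature mean and standard deviation to normal samples and scores new samples by their largest z-score; the feature meanings, the sample values, and the threshold of 3 are all hypothetical.

```python
import statistics

def fit_normal_profile(samples):
    """Estimate a (mean, std) pair per feature from normal-only samples."""
    cols = list(zip(*samples))
    return [(statistics.fmean(c), statistics.pstdev(c)) for c in cols]

def anomaly_score(profile, sample):
    """Largest absolute z-score across features; higher means less like normal data."""
    return max(abs(v - mean) / (std or 1.0) for v, (mean, std) in zip(sample, profile))

# Hypothetical surface measurements (e.g. hook load, torque) recorded during normal drilling.
normal = [(100.0, 10.0), (102.0, 11.0), (98.0, 9.0), (101.0, 10.5), (99.0, 9.5)]
profile = fit_normal_profile(normal)

THRESHOLD = 3.0  # assumed cut-off for flagging a sample
typical_score = anomaly_score(profile, (100.5, 10.2))
unusual_score = anomaly_score(profile, (110.0, 14.0))
flagged = unusual_score > THRESHOLD
```

No "stuck" labels are needed to fit the profile, which is the property that motivates the unsupervised approach when stuck data are scarce.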

Early detection of Parkinson's disease through multimodal features using machine learning approaches

International Journal of Signal and Imaging Systems Engineering ◽

10.1504/ijsise.2018.10011741 ◽

2018 ◽

Vol 11 (1) ◽

pp. 31

Author(s):

Bhanu Prasad ◽

Ravi Pushkarna ◽

Gunjan Pahuja ◽

T.N. Nagabhushan

Keyword(s):

Machine Learning ◽

Parkinson's Disease ◽

Early Detection ◽

Learning Approaches ◽

Multimodal Features

Early detection of Parkinson's disease through multimodal features using machine learning approaches

International Journal of Signal and Imaging Systems Engineering ◽

10.1504/ijsise.2018.090605 ◽

2018 ◽

Vol 11 (1) ◽

pp. 31

Author(s):

Gunjan Pahuja ◽

T.N. Nagabhushan ◽

Bhanu Prasad ◽

Ravi Pushkarna

Keyword(s):

Machine Learning ◽

Parkinson's Disease ◽

Early Detection ◽

Learning Approaches ◽

Multimodal Features

A Deep Learning Approach to Nightfire Detection based on Low-Light Satellite

10.5121/csit.2021.110401 ◽

2021 ◽

Author(s):

Yue Wang ◽

Ye Ni ◽

Xutao Li ◽

Yunming Ye

Keyword(s):

Machine Learning ◽

Remote Sensing ◽

Deep Learning ◽

Early Detection ◽

Remote Sensing Image ◽

Experimental Results ◽

Learning Approach ◽

Learning Approaches ◽

Low Light ◽

Conventional Machine

Wildfires are serious disasters that often cause severe damage to forests and plants. Without early detection and suitable control action, a small wildfire can grow into a large and serious one. The problem is especially severe at night, as firefighters generally miss the chance to detect wildfires in the first few hours. Low-light satellites, which take pictures at night, offer an opportunity to detect night fires in a timely manner. However, previous studies identify night fires using threshold methods or conventional machine learning approaches, which are not sufficiently robust or accurate. In this paper, we develop a new deep learning approach that determines night fire locations by pixel-level classification of low-light remote sensing images. Experimental results on VIIRS data demonstrate the superiority and effectiveness of the proposed method, which outperforms conventional threshold and machine learning approaches.
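
Pixel-level classification means every pixel receives its own fire or no-fire label. The sketch below shows that structure with the threshold baseline the abstract mentions as the per-pixel rule; the radiance grid, the threshold value, and the function names are hypothetical, and a learned classifier would take the place of the threshold predicate.

```python
def classify_pixels(image, is_fire):
    """Return (row, col) for every pixel that the per-pixel predicate labels as fire."""
    return [
        (r, c)
        for r, row in enumerate(image)
        for c, value in enumerate(row)
        if is_fire(value)
    ]

def threshold_rule(threshold):
    """Baseline predicate: a pixel is fire when its radiance exceeds a fixed threshold."""
    return lambda value: value > threshold

# Hypothetical low-light radiance grid; values are illustrative only.
radiance = [
    [0.1, 0.2, 0.1],
    [0.2, 9.5, 0.3],
    [0.1, 0.2, 8.7],
]
fire_pixels = classify_pixels(radiance, threshold_rule(5.0))
```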

Open-source QSAR models for pKa prediction using multiple machine learning approaches

Journal of Cheminformatics ◽

10.1186/s13321-019-0384-1 ◽

2019 ◽

Vol 11 (1) ◽

Cited By ~ 10

Author(s):

Kamel Mansouri ◽

Neal F. Cariello ◽

Alexandru Korotcov ◽

Valery Tkachenko ◽

Chris M. Grulke ◽

...

Keyword(s):

Machine Learning ◽

Open Source ◽

Acid Dissociation ◽

Support Vector ◽

Learning Approaches ◽

Data Set ◽

Pka Prediction ◽

Chemical Structures ◽

Extreme Gradient Boosting ◽

Qsar Models

Abstract. Background: The logarithmic acid dissociation constant pKa reflects the ionization of a chemical, which affects lipophilicity, solubility, protein binding, and ability to pass through the plasma membrane. Thus, pKa affects chemical absorption, distribution, metabolism, excretion, and toxicity properties. Multiple proprietary software packages exist for the prediction of pKa, but to the best of our knowledge no free and open-source programs exist for this purpose. Using a freely available data set and three machine learning approaches, we developed open-source models for pKa prediction. Methods: The experimental strongest acidic and strongest basic pKa values in water for 7912 chemicals were obtained from DataWarrior, a freely available software package. Chemical structures were curated and standardized for quantitative structure–activity relationship (QSAR) modeling using KNIME, and a subset comprising 79% of the initial set was used for modeling. To evaluate different approaches to modeling, several datasets were constructed based on different processing of chemical structures with acidic and/or basic pKas. Continuous molecular descriptors, binary fingerprints, and fragment counts were generated using PaDEL, and pKa prediction models were created using three machine learning methods: (1) support vector machines (SVM) combined with k-nearest neighbors (kNN), (2) extreme gradient boosting (XGB) and (3) deep neural networks (DNN). Results: The three methods delivered comparable performances on the training and test sets with a root-mean-squared error (RMSE) around 1.5 and a coefficient of determination (R²) around 0.80. Two commercial pKa predictors from ACD/Labs and ChemAxon were used to benchmark the three best models developed in this work, and performance of our models compared favorably to the commercial products. Conclusions: This work provides multiple QSAR models to predict the strongest acidic and strongest basic pKas of chemicals, built using publicly available data, and provided as free and open-source software on GitHub.
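
The two metrics reported in the results, RMSE and the coefficient of determination, are computed as in the sketch below. The pKa values are hypothetical and are not taken from the paper's data set.

```python
import math

def rmse(y_true, y_pred):
    """Root-mean-squared error between observed and predicted values."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - (residual sum of squares / total sum of squares)."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

# Hypothetical experimental and predicted pKa values.
observed = [4.8, 9.2, 3.1, 10.5]
predicted = [5.0, 9.0, 3.5, 10.0]

error = rmse(observed, predicted)
fit = r_squared(observed, predicted)
```

RMSE is expressed in the same units as pKa, so an RMSE around 1.5 means predictions are typically off by roughly one and a half pKa units.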