Earthquake Prediction using Machine Learning Algorithm

According to statistics from the BBC, the toll varies with every earthquake recorded to date. Approximately, up to thousands are killed, about 50,000 are injured, around 1-3 million are displaced, and a significant number go missing or are left homeless. Structural damage approaches 100%. Economic losses vary from 10 to 16 million dollars. Earthquakes of magnitude 5 and above are classified as the deadliest. The most life-threatening earthquake to date took place in Indonesia, where about 3 million died, 1-2 million were injured and structural damage amounted to 100%. Hence, the consequences of earthquakes are devastating: they are not limited to loss of and damage to the living and the nonliving, but also cause significant change, from surroundings and lifestyle to the economy. Every such parameter calls for earthquake forecasting. With a couple of minutes' notice, individuals can act to shield themselves from injury and death, damage and monetary losses can be reduced, and property and natural resources can be secured. In the current work, an accurate forecaster is designed and developed: a system that forecasts the catastrophe. It focuses on detecting early signs of an earthquake using machine learning algorithms. The system follows the basic steps of developing learning systems along with the data science life cycle. Data sets for the Indian subcontinent and the rest of the world are collected from government sources. Pre-processing of the data is followed by construction of a stacking model that combines the Random Forest and Support Vector Machine algorithms. The algorithms build this mathematical model from a training data set. The model looks for patterns that lead to the catastrophe and adapts to them, so as to make decisions and forecasts without being explicitly programmed to perform the task. After a forecast, the message is broadcast to government officials and across various platforms.
The information of interest is represented by three factors: time, locality and magnitude.

Machine Learning Algorithms in Fraud Detection: Case Study on Retail Consumer Financing Company

Asia Pacific Fraud Journal ◽

10.21532/apfjournal.v6i2.216 ◽

2021 ◽

Vol 6 (2) ◽

pp. 213

Author(s):

Nadya Intan Mustika ◽

Bagus Nenda ◽

Dona Ramadhan

Keyword(s):

Machine Learning ◽

Random Forest ◽

Historical Data ◽

Learning Algorithm ◽

Fraud Detection ◽

Machine Learning Algorithms ◽

Training Data ◽

Support Vector ◽

Random Forest Algorithm ◽

Data Set

This study aims to implement a machine learning algorithm for detecting fraud based on a historical data set from a retail consumer financing company. The output of the machine learning model is used as samples for the fraud detection team. Data analysis is performed through data processing, feature selection, the hold-out method, and accuracy testing. Five machine learning methods are applied in this study: Logistic Regression, K-Nearest Neighbor (KNN), Decision Tree, Random Forest, and Support Vector Machine (SVM). Historical data are divided into two groups: training data and test data. The results show that the Random Forest algorithm has the highest accuracy, with a training score of 0.994999 and a test score of 0.745437. This means that Random Forest is the most accurate of the methods tested for detecting fraud. Further research is suggested to add more predictor variables to increase the accuracy and to apply this method to different financial institutions and different industries.
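
A minimal sketch of the train/test division and accuracy check described in this abstract, assuming a conventional shuffled hold-out split. The function names, the 30% test fraction, and the toy rows are illustrative assumptions, not details from the paper; only the two Random Forest scores come from the abstract.

```python
import random

def holdout_split(rows, test_fraction=0.3, seed=0):
    """Shuffle rows and split them into (train, test) lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)

# Scores reported in the abstract for Random Forest.
train_score = 0.994999
test_score = 0.745437
gap = train_score - test_score  # difference between training and test accuracy

rows = list(range(10))  # stand-in for historical records (assumption)
train, test = holdout_split(rows, test_fraction=0.3, seed=42)
print(len(train), len(test))  # 7 3
print(round(gap, 6))          # 0.249562
```

The gap of roughly 0.25 between the training and test scores is worth noting when reading the result: a gap of this size is commonly read as a sign that the model fits the training data much more closely than unseen data.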

Identification of Leukemia Subtypes from Microscopic Images Using Convolutional Neural Network

Diagnostics ◽

10.3390/diagnostics9030104 ◽

2019 ◽

Vol 9 (3) ◽

pp. 104 ◽

Cited By ~ 11

Author(s):

Ahmed ◽

Yigit ◽

Isik ◽

Alpkocak

Keyword(s):

Machine Learning ◽

Data Augmentation ◽

Nearest Neighbor ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Training Data ◽

Support Vector ◽

K Nearest Neighbor ◽

Data Set ◽

Leukemia Data

Leukemia is a fatal cancer and has two main types: acute and chronic. Each type has two subtypes: lymphoid and myeloid. Hence, in total, there are four subtypes of leukemia. This study proposes a new approach for diagnosing all subtypes of leukemia from microscopic blood cell images using convolutional neural networks (CNN), which require a large training data set. Therefore, we also investigated the effects of data augmentation for synthetically increasing the number of training samples. We used two publicly available leukemia data sources: ALL-IDB and ASH Image Bank. Next, we applied seven different image transformation techniques as data augmentation. We designed a CNN architecture capable of recognizing all subtypes of leukemia. In addition, we explored other well-known machine learning algorithms such as naive Bayes, support vector machine, k-nearest neighbor, and decision tree. To evaluate our approach, we set up a set of experiments and used 5-fold cross-validation. The results showed that our CNN model achieves 88.25% and 81.74% accuracy in leukemia versus healthy classification and in multiclass classification of all subtypes, respectively. Finally, we also showed that the CNN model performs better than the other well-known machine learning algorithms.
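
The 5-fold cross-validation mentioned in this abstract can be sketched as follows. This is an illustrative index generator only: the function name and the 12-sample example are assumptions, and the contiguous, unshuffled folds are a simplification (the abstract does not say how the folds were formed).

```python
def k_fold_indices(n_samples, k=5):
    """Yield (train_idx, test_idx) pairs for k contiguous folds."""
    # Spread any remainder over the first folds so fold sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size

folds = list(k_fold_indices(12, k=5))
for train_idx, test_idx in folds:
    # Each sample appears in exactly one test fold across the 5 rounds.
    print(len(train_idx), test_idx)
```

When augmentation is used, one design point to keep in mind is that augmented copies of an image are usually kept in the same fold as their source image, so that the test fold contains no near-duplicates of training samples.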

Rotor Unbalance Kind and Severity Identification by Current Signature Analysis with Adaptative Update to Multiclass Machine Learning Algorithms

Studies in Engineering and Technology ◽

10.11114/set.v8i1.5213 ◽

2021 ◽

Vol 8 (1) ◽

pp. 28

Author(s):

S. L. Ávila ◽

H. M. Schaberle ◽

S. Youssef ◽

F. S. Pacheco ◽

C. A. Penz

Keyword(s):

Machine Learning ◽

Machine Learning Algorithms ◽

Training Data ◽

Machine Learning Techniques ◽

Support Vector ◽

Signature Analysis ◽

Data Set ◽

Learning Techniques ◽

Environmental Variations ◽

Current Signature

The health of a rotating electric machine can be evaluated by monitoring electrical and mechanical parameters. The more information is available, the easier the diagnosis of the machine's operational condition becomes. We built a laboratory test bench to study rotor unbalance issues according to ISO standards. Using harmonic analysis of the electric stator current, this paper presents a comparison study among Support-Vector Machines, Decision Tree classifiers, and the One-vs-One strategy to identify the kind and severity of rotor unbalance, a nonlinear multiclass task. Moreover, we propose a methodology to update the classifier so that it deals better with changes produced by environmental variations and natural machinery usage. The adaptative update consists of updating the training data set with an amount of recent data while keeping the entire original historical data. It is relevant for engineering maintenance. Our results show that current signature analysis is appropriate for identifying the type and severity of the rotor unbalance problem. Moreover, we show that machine learning techniques can be effective for an industrial application.
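
A rough sketch of the One-vs-One strategy named in this abstract: one binary model is fitted per pair of classes, and the class collecting the most pairwise votes is predicted. The nearest-mean binary model, the single 1-D feature, and the class labels below are hypothetical stand-ins chosen to keep the example short; the paper's own classifiers and features are not reproduced here.

```python
from collections import Counter
from itertools import combinations

def fit_one_vs_one(samples, labels):
    """Fit one nearest-mean binary model per pair of classes.

    Returns {(class_a, class_b): (mean_a, mean_b)} for a 1-D feature.
    """
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)
    means = {c: sum(xs) / len(xs) for c, xs in by_class.items()}
    return {(a, b): (means[a], means[b]) for a, b in combinations(sorted(means), 2)}

def predict_one_vs_one(models, x):
    """Each pairwise model casts one vote; the class with the most votes wins."""
    votes = Counter()
    for (a, b), (mean_a, mean_b) in models.items():
        votes[a if abs(x - mean_a) <= abs(x - mean_b) else b] += 1
    return votes.most_common(1)[0][0]

# Hypothetical 1-D feature (for example, a harmonic amplitude) and severity labels.
samples = [0.1, 0.2, 1.0, 1.1, 2.0, 2.2]
labels = ["none", "none", "mild", "mild", "severe", "severe"]
models = fit_one_vs_one(samples, labels)
print(len(models))                       # 3 pairwise models for 3 classes
print(predict_one_vs_one(models, 1.05))  # mild
```

For k classes this strategy trains k(k-1)/2 binary models, which is why it is often paired with classifiers that are inherently binary.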

Exploring the Efficiency of Various Supervised Machine Learning Techniques to Predict the Heart Disease using Risk Factors

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.a1063.1191s19 ◽

2019 ◽

Vol 9 (1S) ◽

pp. 309-312

Keyword(s):

Machine Learning ◽

Health Care ◽

Heart Disease ◽

Major Part ◽

Data Science ◽

Learning Algorithm ◽

Machine Learning Algorithms ◽

Supervised Machine Learning ◽

Machine Learning Techniques ◽

Data Set

Data science in healthcare is an innovative and capable field, with the industry implementing data science applications. Data analytics is a recent science used to explore medical data sets and discover disease. This is an initial attempt to identify disease with the help of a large amount of medical data. Using this data science methodology, users can identify their disease without the help of health care centres. Healthcare and data science are often linked through finances, as the industry attempts to reduce its expenses with the help of large amounts of data. Data science and medicine are rapidly developing, and it is important that they advance together. Health care information is very valuable to society. Heart disease is increasing in day-to-day human life. Different factors in the human body are monitored in order to analyse and prevent heart disease. Classifying these factors using machine learning algorithms and predicting the disease is the major part of the work. It involves supervised machine learning algorithms such as SVM, Naive Bayes, Decision Trees and Random Forest.

Amino Acid Composition and Charge Based Prediction of Antisepsis Peptides by Random Forest Machine Learning Algorithm

10.1101/2021.09.26.461860 ◽

2021 ◽

Author(s):

Aayushi Rathore ◽

Anu Saini ◽

Navjot Kaur ◽

Aparna Singh ◽

Ojasvi Dutta ◽

...

Keyword(s):

Machine Learning ◽

Random Forest ◽

Learning Algorithm ◽

Learning Algorithms ◽

Multiple Organ ◽

The Body ◽

Machine Learning Algorithms ◽

Support Vector ◽

Data Set ◽

Organ Systems

Sepsis is a severe infectious disease with high mortality. It occurs when chemicals released into the bloodstream to fight an infection trigger inflammation throughout the body, which can cause a cascade of changes that damage multiple organ systems, leading them to fail and even resulting in death. To reduce the possibility of sepsis or infection, antiseptics are used, and the process is known as antisepsis. Antiseptic peptides (ASPs) show properties similar to antigram-negative peptides, antigram-positive peptides and many more. Machine learning algorithms are useful in screening and identifying therapeutic peptides and thus provide initial filters or build confidence before time-consuming and laborious experimental approaches are used. In this study, various machine learning algorithms, namely Support Vector Machine (SVM), Random Forest (RF), K-Nearest Neighbour (KNN) and Logistic Regression (LR), were evaluated for prediction of ASPs. Moreover, the characteristic physicochemical features of ASPs were explored for use in machine learning. Both manual and automatic feature selection methodologies were employed to achieve the best performance of the machine learning algorithms. A 5-fold cross-validation and an independent data set validation showed RF to be the best model for prediction of ASPs. Our RF model showed an accuracy of 97% and a Matthews Correlation Coefficient (MCC) of 0.93, which indicate a robust and good model. To our knowledge, this is the first attempt to build a machine learning classifier for prediction of ASPs.
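
As an illustration of the kind of sequence features and metric this abstract refers to, the sketch below computes amino acid composition, a crude net charge, and the Matthews Correlation Coefficient. The charge rule (+1 for K/R, -1 for D/E, ignoring histidine, termini and pH) and the example peptide are simplifying assumptions, not the feature definitions used in the study.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues

def amino_acid_composition(sequence):
    """Return the fraction of each standard residue in the peptide."""
    sequence = sequence.upper()
    return {aa: sequence.count(aa) / len(sequence) for aa in AMINO_ACIDS}

def approximate_net_charge(sequence):
    """Crude net charge: +1 per K/R, -1 per D/E (an assumed simplification)."""
    sequence = sequence.upper()
    positive = sum(sequence.count(aa) for aa in "KR")
    negative = sum(sequence.count(aa) for aa in "DE")
    return positive - negative

def matthews_corrcoef(tp, tn, fp, fn):
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denominator if denominator else 0.0

peptide = "KKLLRDE"  # hypothetical sequence, for illustration only
composition = amino_acid_composition(peptide)
print(round(composition["K"], 3))       # 0.286
print(approximate_net_charge(peptide))  # 1
print(matthews_corrcoef(50, 50, 0, 0))  # 1.0
```

MCC ranges from -1 to 1 and uses all four cells of the confusion matrix, which is why it is often reported alongside accuracy.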

A Mathematical Study of Glaucoma using Machine Learning Algorithms for Retina

International Journal of Advanced Research in Science, Communication and Technology ◽

10.48175/ijarsct-v2-i3-305 ◽

2021 ◽

pp. 31-33

Author(s):

K. Prakash ◽

M. Sudharsan

Keyword(s):

Machine Learning ◽

Optic Nerve ◽

Learning Algorithm ◽

Machine Learning Algorithms ◽

Support Vector ◽

Maximum Output ◽

Data Set ◽

Correlation Clustering ◽

Achievement Rate ◽

Sight Loss

Glaucoma is a category of visual disorders characterized by optic nerve neuropathy, a gradual degeneration of the optic nerve that results in sight loss. In this article, a novel retinal therapeutic support vector machine for glaucoma using machine learning algorithms is considered. The algorithm has sufficient pragmatism; the correlation clustering mode is subsequently retained. The estimated preparation deterrent has a 91 percent achievement rate on a data set consolidating 500 realistic resolute and glaucoma retina images; hence, depending on the cluster, the computational advantage of In glaucoma therapy, the overlapping device pedestal on the machine learning algorithm has maximum output.

Hoax News Classification using Machine Learning Algorithms

International Journal of Engineering and Advanced Technology - Regular Issue ◽

10.35940/ijeat.b3753.129219 ◽

2019 ◽

Vol 9 (2) ◽

pp. 3938-3944

Keyword(s):

Machine Learning ◽

Social Media ◽

Learning Algorithm ◽

Detection System ◽

Machine Learning Algorithms ◽

Training Data ◽

Stochastic Gradient Descent ◽

Support Vector ◽

The Impact ◽

F Measure

Hoax news on social media has had a dramatic effect on our society in recent years. Its impact is felt by many people as anxiety, financial loss, and loss of good name. Therefore, we need a detection system that can help reduce hoax news on social media. Hoax news classification is one of the stages in the construction of a hoax news detection system; it draws on an unsupervised learning algorithm for creating hoax news datasets, machine learning tools for data processing, and text processing for detecting data. The next stage produces a classification of hoax or not hoax based on the input text. Hoax news classification in this study uses six algorithms, namely Support Vector Machine, Naïve Bayes, Decision Tree, Logistic Regression, Stochastic Gradient Descent, and Neural Network (MLP). These algorithms are compared to find the best one for detecting hoax news, based on the highest accuracy, F-measure, precision, and recall. Testing of the classification algorithms shows that the NN-MLP algorithm averages 93% for accuracy, F-measure, and precision, the highest compared to the five other algorithms, while the highest recall, 94%, comes from SVM. The results of this experiment show different effects for different classifiers, and suggest that the more hoax data is used as training data, the more accurately the system performs.
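
The precision, recall and F-measure figures quoted in this abstract follow the standard definitions, which can be sketched as below. The label names and the six toy predictions are made up for illustration and are unrelated to the study's data.

```python
def precision_recall_f1(y_true, y_pred, positive="hoax"):
    """Compute precision, recall and F1 for the chosen positive label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Toy labels: 2 true positives, 1 false positive, 1 false negative.
y_true = ["hoax", "hoax", "hoax", "real", "real", "real"]
y_pred = ["hoax", "hoax", "real", "hoax", "real", "real"]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(round(precision, 3), round(recall, 3), round(f1, 3))  # 0.667 0.667 0.667
```

A model can lead on precision while another leads on recall, as reported here for NN-MLP and SVM, because the two metrics penalize different kinds of error (false positives versus false negatives).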

Using Machine Learning to Predict Heart Disease

WSEAS TRANSACTIONS ON BIOLOGY AND BIOMEDICINE ◽

10.37394/23208.2022.19.1 ◽

2022 ◽

Vol 19 ◽

pp. 1-9

Author(s):

Nikhil Bora ◽

Sreedevi Gutta ◽

Ahmad Hadaegh

Keyword(s):

Machine Learning ◽

Support Vector Machine ◽

Heart Disease ◽

Random Forest ◽

Data Science ◽

Learning Algorithm ◽

Machine Learning Algorithms ◽

Machine Learning Techniques ◽

Support Vector ◽

K Nearest Neighbor

Heart disease has become one of the leading causes of death on the planet and one of the most life-threatening diseases. Early prediction of heart disease will help reduce the death rate. Predicting heart disease has become one of the most difficult challenges in the medical sector in recent years. As per recent statistics, about one person dies from heart disease every minute. In healthcare, a massive amount of data is generated, and data science is critical for analyzing it. This paper proposes heart disease prediction using different machine learning algorithms such as logistic regression, naïve Bayes, support vector machine, k-nearest neighbor (KNN), random forest, extreme gradient boost, etc. We used these machine learning techniques to predict the likelihood of a person getting heart disease on the basis of features (such as cholesterol, blood pressure, age, sex, etc.) which were extracted from the datasets. In our research we used two separate datasets. The first heart disease dataset was collected from the well-known UCI Machine Learning Repository and has 303 record instances with 14 different attributes (13 features and one target); the second dataset was collected from the Kaggle website and contains 1190 patient record instances with 11 features and one target. This dataset is a combination of 5 popular datasets for heart disease. This study compares the accuracy of various machine learning techniques. For the first dataset, the highest accuracy, 92%, was obtained with Support Vector Machine (SVM). For the second dataset, Random Forest gave the highest accuracy, 94.12%. We then combined both datasets and obtained the highest accuracy, 93.31%, using Random Forest.

Development of a machine-learning-based decision support mechanism for predicting chemical tanker cleaning activity

Journal of Modelling in Management ◽

10.1108/jm2-12-2019-0284 ◽

2021 ◽

Vol ahead-of-print (ahead-of-print) ◽

Author(s):

Burak Cankaya ◽

Berna Eren Tokgoz ◽

Ali Dag ◽

K.C. Santosh

Keyword(s):

Machine Learning ◽

Decision Support ◽

Test Data ◽

Machine Learning Algorithms ◽

Training Data ◽

Comparative Approach ◽

Support Vector ◽

Data Set ◽

Content Type ◽

Vehicle Activity

Purpose: This paper aims to propose a machine learning-based automatic labeling methodology for chemical tanker activities that can be applied to any port with any number of active tankers and the identification of important predictors. The methodology can be applied to any type of activity tracking that is based on automatically generated geospatial data. Design/methodology/approach: The proposed methodology uses three machine learning algorithms (artificial neural networks, support vector machines (SVMs) and random forest) along with information fusion (IF)-based sensitivity analysis to classify chemical tanker activities. The data set is split into training and test data based on vessels, with two vessels in the training data and one in the test data set. Important predictors were identified using a receiver operating characteristic comparative approach, and overall variable importance was calculated using IF from the top models. Findings: Results show that an SVM model has the best balance between sensitivity and specificity, at 93.5% and 91.4%, respectively. Speed, acceleration and change in the course on the ground for the vessels are identified as the most important predictors for classifying vessel activity. Research limitations/implications: The study evaluates the vessel movements waiting between different terminals in the same port, but not their movements between different ports for their tank-cleaning activities. Practical implications: The findings in this study can be used by port authorities, shipping companies, vessel operators and other stakeholders for decision support, performance tracking, as well as for automated alerts.
Originality/value: This analysis makes original contributions to the existing literature by defining and demonstrating a methodology that can automatically label vehicle activity based on location data and identify certain characteristics of the activity by finding important location-based predictors that effectively classify the activity status.
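
The vessel-based split and the sensitivity/specificity figures described in this abstract can be illustrated with the sketch below. The record fields (`vessel`, `speed`, `cleaning`), the vessel names, and the speed threshold standing in for a trained classifier are all hypothetical; only the idea of holding out one whole vessel comes from the abstract.

```python
def split_by_vessel(records, test_vessels):
    """Hold out every record of the listed vessels so no vessel is in both sets."""
    train = [r for r in records if r["vessel"] not in test_vessels]
    test = [r for r in records if r["vessel"] in test_vessels]
    return train, test

def sensitivity_specificity(y_true, y_pred):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t and p)
    fn = sum(1 for t, p in pairs if t and not p)
    tn = sum(1 for t, p in pairs if not t and not p)
    fp = sum(1 for t, p in pairs if not t and p)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return sensitivity, specificity

records = [  # hypothetical geospatial records for three vessels
    {"vessel": "A", "speed": 0.2, "cleaning": True},
    {"vessel": "A", "speed": 9.5, "cleaning": False},
    {"vessel": "B", "speed": 0.4, "cleaning": True},
    {"vessel": "B", "speed": 8.1, "cleaning": False},
    {"vessel": "C", "speed": 0.3, "cleaning": True},
    {"vessel": "C", "speed": 7.7, "cleaning": False},
]
train, test = split_by_vessel(records, test_vessels={"C"})

# Toy rule standing in for a trained classifier: slow vessels are "cleaning".
y_true = [r["cleaning"] for r in test]
y_pred = [r["speed"] < 1.0 for r in test]
print(sensitivity_specificity(y_true, y_pred))  # (1.0, 1.0)
```

Splitting by vessel rather than by individual record keeps correlated observations from the same ship out of both sets at once, so the test score reflects performance on a vessel the model has not seen.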

Predicting Solid Particle Erosion and Uncertainty in Elbows by Artificial Intelligence Methods

Volume 2: Fluid Mechanics; Multiphase Flows ◽

10.1115/fedsm2020-20458 ◽

2020 ◽

Author(s):

Soroor Karimi ◽

Bohan Xu ◽

Alireza Asgharpour ◽

Siamack A. Shirazi ◽

Sandip Sen

Keyword(s):

Machine Learning ◽

Random Forest ◽

Elastic Net ◽

Machine Learning Algorithms ◽

Superficial Velocity ◽

Training Data ◽

Support Vector ◽

Good Prediction ◽

Data Set ◽

Erosion Prediction

AI approaches include machine learning algorithms in which models are trained from existing data to predict the behavior of the system for previously unseen cases. Recent studies at the Erosion/Corrosion Research Center (E/CRC) have shown that these methods can be quite effective in predicting erosion. However, these methods are not widely used in the engineering industries due to the lack of work and information in this area. Moreover, in most of the available literature, the reported models and results have not been rigorously tested. This suggests that these models cannot be fully trusted for the applications for which they are trained. Therefore, in this study three machine learning models, Elastic Net, Random Forest and Support Vector Machine (SVM), are utilized to increase the confidence in these tools. First, these models are trained with a training data set. Next, the model hyper-parameters are optimized by using nested cross-validation. Finally, the results are verified with a test data set. This process is repeated several times to assure the accuracy of the results. To be able to predict erosion under different conditions with these three models, six main variables are considered in the training data set: material hardness, pipe diameter, particle size, liquid viscosity, liquid superficial velocity, and gas superficial velocity. All three models show good prediction performance. The Random Forest and SVM approaches, however, show slightly better results than Elastic Net. The performance of these models is compared both to CFD erosion simulation results and to results from Sand Production Pipe Saver (SPPS), a mechanistic erosion prediction software developed at the E/CRC. The comparison shows that the SVM prediction is a better match with both CFD and SPPS. The application of the AI models to determine the uncertainty of calculated erosion is also discussed.
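
The nested cross-validation step mentioned in this abstract can be sketched as two loops: an inner loop that picks a hyper-parameter using only the outer training data, and an outer loop that scores the chosen setting on data never used for the choice. The 1-D k-nearest-neighbour regressor, the candidate values, the fold counts, and the synthetic data below are stand-ins chosen for brevity; they are not the Elastic Net, Random Forest or SVM models of the study.

```python
def k_fold(indices, k):
    """Split a list of indices into k (train, test) pairs using interleaved folds."""
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        train = [j for f_i, fold in enumerate(folds) if f_i != i for j in fold]
        yield train, folds[i]

def knn_predict(train_x, train_y, x, n_neighbors):
    """Average the targets of the n_neighbors closest training points (1-D)."""
    order = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
    nearest = order[:n_neighbors]
    return sum(train_y[i] for i in nearest) / len(nearest)

def mse(xs, ys, idx_train, idx_test, n_neighbors):
    """Mean squared error on idx_test for a model 'fitted' on idx_train."""
    tx = [xs[i] for i in idx_train]
    ty = [ys[i] for i in idx_train]
    errors = [(knn_predict(tx, ty, xs[i], n_neighbors) - ys[i]) ** 2 for i in idx_test]
    return sum(errors) / len(errors)

def nested_cv(xs, ys, candidates=(1, 3, 5), outer_k=4, inner_k=3):
    """Return one (chosen hyper-parameter, outer test MSE) pair per outer fold."""
    results = []
    for outer_train, outer_test in k_fold(list(range(len(xs))), outer_k):
        def inner_error(n_neighbors):
            # Inner loop: only outer-training indices are visible here.
            scores = [mse(xs, ys, tr, te, n_neighbors)
                      for tr, te in k_fold(outer_train, inner_k)]
            return sum(scores) / len(scores)
        best = min(candidates, key=inner_error)
        results.append((best, mse(xs, ys, outer_train, outer_test, best)))
    return results

xs = [i / 2 for i in range(24)]
ys = [2 * x + 1 for x in xs]  # synthetic, noise-free data (assumption)
results = nested_cv(xs, ys)
for best, score in results:
    print(best, round(score, 4))
```

Keeping hyper-parameter selection inside the inner loop means the outer scores are not biased by the tuning itself, which is the usual motivation for nesting.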
