Accurate Integrated System to detect Pulmonary and Extra Pulmonary Tuberculosis using Machine Learning Algorithms

2021 ◽  
Vol 24 (68) ◽  
pp. 104-122
Author(s):  
Rupinder Kaur ◽  
Anurag Sharma

Several studies have reported the use of machine learning algorithms in the detection of Tuberculosis, but studies that address the detection of both types of TB, i.e., Pulmonary (PTB) and Extra Pulmonary Tuberculosis (EPTB), using machine learning algorithms are lacking. Therefore, an integrated system based on machine learning models is proposed in this paper to assist doctors and radiologists in interpreting patients’ data to detect PTB and EPTB. Three basic machine learning algorithms (Decision Tree, Naïve Bayes, and SVM) have been used for prediction, and their performance has been compared. Clinical data and image data serve as input to the models; these datasets were collected from various hospitals in Jalandhar, Punjab, India. The training dataset comprises 200 patients: 90 PTB patients, 67 EPTB patients, and 43 patients with no TB. On the validation dataset of 49 patients, the Decision Tree achieved the best accuracy of 95% for classifying PTB and EPTB.
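A minimal sketch of the three-classifier comparison described above. The paper's actual clinical and image features are not public, so X and y here are hypothetical placeholders; only the 200/49 patient split and the three algorithms come from the abstract.

```python
# Compare Decision Tree, Naive Bayes, and SVM on a 3-class TB problem.
# Labels are assumed: 0 = no TB, 1 = PTB, 2 = EPTB. Data is synthetic.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 12))        # 200 patients, 12 assumed features
y = rng.integers(0, 3, size=200)      # hypothetical PTB/EPTB/no-TB labels

# Hold out 49 patients for validation, matching the abstract's split.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=49, random_state=0)

for name, clf in [("Decision Tree", DecisionTreeClassifier(random_state=0)),
                  ("Naive Bayes", GaussianNB()),
                  ("SVM", SVC(kernel="rbf"))]:
    clf.fit(X_tr, y_tr)
    print(name, accuracy_score(y_te, clf.predict(X_te)))
```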

2020 ◽  
pp. 1-11
Author(s):  
Jie Liu ◽  
Lin Lin ◽  
Xiufang Liang

An online English teaching system places certain requirements on its intelligent scoring component, and the most difficult stage of intelligent scoring in English tests is scoring the English composition with an intelligent model. To improve the intelligence of English composition scoring, this study builds on machine learning algorithms combined with intelligent image recognition technology and proposes an improved MSER-based character candidate region extraction algorithm and a convolutional neural network-based pseudo-character region filtering algorithm. In addition, to verify whether the proposed algorithm model meets the requirements of the task, that is, to verify its feasibility, the performance of the model is analyzed through designed experiments. Moreover, the basic conditions for composition scoring are fed into the model as constraints. The results show that the proposed algorithm has practical value and can be applied to English assessment systems and online homework evaluation systems.
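A minimal sketch of baseline MSER character-candidate extraction, assuming OpenCV; the paper's specific improvements and its CNN-based pseudo-character filter are not reproduced here, and the input image name is hypothetical.

```python
# Extract character candidate regions from a scanned composition via MSER,
# then apply a crude geometric filter (a CNN would score candidates instead).
import cv2

img = cv2.imread("composition.png")              # hypothetical input scan
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

mser = cv2.MSER_create()
regions, bboxes = mser.detectRegions(gray)       # candidate character regions

# Keep only boxes of plausible character size (assumed thresholds).
candidates = [(x, y, w, h) for (x, y, w, h) in bboxes
              if 5 < w < 100 and 5 < h < 100]
print(f"{len(candidates)} candidate regions kept")
```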


2021 ◽  
Vol 11 (15) ◽  
pp. 6728
Author(s):  
Muhammad Asfand Hafeez ◽  
Muhammad Rashid ◽  
Hassan Tariq ◽  
Zain Ul Abideen ◽  
Saud S. Alotaibi ◽  
...  

Classification and regression are the major applications of machine learning algorithms and are widely used to solve problems in numerous domains of engineering and computer science. Various classifiers based on decision-tree optimization have been proposed; however, they are still evolving. This paper presents a novel and robust classifier based on decision tree and tabu search algorithms. To improve performance, the proposed algorithm constructs multiple decision trees while employing a tabu search algorithm to consistently monitor the leaf and decision nodes in the corresponding decision trees. Additionally, the tabu search algorithm balances the entropy of the corresponding decision trees. For training the model, we used clinical data of COVID-19 patients to predict whether a patient is suffering from the disease. The experimental results were obtained with the proposed classifier implemented on top of the scikit-learn library in Python. An extensive performance comparison was presented using Big O and statistical analysis against conventional supervised machine learning algorithms, as well as against optimized state-of-the-art classifiers. The achieved accuracy of 98%, execution time of 55.6 ms, and area under the receiver operating characteristic curve (AUROC) of 0.95 indicate that the proposed classifier algorithm is suitable for large datasets.
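A minimal baseline sketch in scikit-learn, which the paper reports using; the tabu-search monitoring of leaf and decision nodes is the paper's own contribution and is not reproduced here. The data and metrics below are a synthetic stand-in.

```python
# Train an entropy-based decision tree and report the same metrics the
# abstract cites: accuracy, fit time, and AUROC.
import time
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score, roc_auc_score

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

clf = DecisionTreeClassifier(criterion="entropy", random_state=0)
t0 = time.perf_counter()
clf.fit(X_tr, y_tr)
elapsed_ms = (time.perf_counter() - t0) * 1000

proba = clf.predict_proba(X_te)[:, 1]
print(f"accuracy={accuracy_score(y_te, clf.predict(X_te)):.3f}, "
      f"AUROC={roc_auc_score(y_te, proba):.3f}, fit={elapsed_ms:.1f} ms")
```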


2020 ◽  
Vol 98 (Supplement_4) ◽  
pp. 126-127
Author(s):  
Lucas S Lopes ◽  
Christine F Baes ◽  
Dan Tulpan ◽  
Luis Artur Loyola Chardulo ◽  
Otavio Machado Neto ◽  
...  

Abstract The aim of this project is to compare some state-of-the-art machine learning algorithms on the classification of steers finished in feedlots, based on performance, carcass, and meat quality traits. Precise classification of animals allows fast, real-time decision making in the animal food industry, such as culling or retention of herd animals. Beef production presents high variability in its numerous carcass and beef quality traits, and machine learning algorithms and software provide an opportunity to evaluate the interactions between traits to better classify animals. Four different treatment levels of wet distiller’s grain were applied to 97 Angus-Nellore animals and used as classes for the classification problem. The C4.5 decision tree, Naïve Bayes (NB), Random Forest (RF), and Multilayer Perceptron (MLP) artificial neural network algorithms were used to predict and classify the animals based on recorded trait measurements, which include initial and final weights, shear force, and meat color. The top-performing classifier was the C4.5 decision tree, with a classification accuracy of 96.90%, while the RF, MLP, and NB classifiers had accuracies of 55.67%, 39.17%, and 29.89%, respectively. We observed that the final decision tree model constructed with C4.5 selected only the dry matter intake (DMI) feature as a differentiator. When DMI was removed, no other feature or combination of features was sufficiently strong to provide good prediction accuracies for any of the classifiers. In a follow-up study with a significantly larger sample size, we plan to investigate why DMI is a more relevant parameter than the other measurements.
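A hedged sketch of the DMI ablation described above: train an entropy-based tree (a C4.5-like stand-in; scikit-learn has no exact C4.5), inspect which feature it relies on, then retrain without it. The animal data is not public, so synthetic values stand in for DMI and the other traits.

```python
# Check whether a single feature (DMI) dominates the tree, then ablate it.
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
n = 97
dmi = rng.normal(size=n)                       # hypothetical DMI values
other = rng.normal(size=(n, 5))                # weights, shear force, color...
X = np.column_stack([dmi, other])
y = (dmi > 0).astype(int) * 2 + rng.integers(0, 2, n)  # 4 DMI-driven classes

tree = DecisionTreeClassifier(criterion="entropy", random_state=0).fit(X, y)
print("importances:", tree.feature_importances_.round(2))  # DMI dominates

X_no_dmi = X[:, 1:]                            # remove DMI, as in the study
score = cross_val_score(
    DecisionTreeClassifier(criterion="entropy", random_state=0),
    X_no_dmi, y, cv=5).mean()
print("CV accuracy without DMI:", round(score, 2))
```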


2021 ◽  
Author(s):  
Fang He ◽  
John H Page ◽  
Kerry R Weinberg ◽  
Anirban Mishra

BACKGROUND The current COVID-19 pandemic is unprecedented; in resource-constrained settings, predictive algorithms can help to stratify disease severity and alert physicians to high-risk patients. However, few risk scores have been derived from a substantially large EHR dataset using simplified predictors as input. OBJECTIVE To develop and validate simplified machine learning algorithms that predict COVID-19 adverse outcomes; to evaluate the area under the receiver operating characteristic curve (AUC), sensitivity, specificity, and calibration of the algorithms; and to derive clinically meaningful thresholds. METHODS We conducted machine learning model development and validation via a cohort study using multi-center, patient-level, longitudinal electronic health records (EHR) from the Optum® COVID-19 database, which provides anonymized longitudinal EHR from across the US. The models were developed from clinical characteristics to predict 28-day in-hospital mortality, ICU admission, respiratory failure, and mechanical ventilator usage in the inpatient setting. Data from patients admitted prior to Sep 7, 2020, were randomly sampled into development, test, and validation datasets; data collected from Sep 7, 2020 through Nov 15, 2020 were reserved as a prospective validation dataset. RESULTS Of 3.7M patients in the analysis, a total of 585,867 patients were diagnosed with or tested positive for SARS-CoV-2, and 50,703 adult patients were hospitalized with COVID-19 between Feb 1 and Nov 15, 2020. Among the study cohort (N=50,703), there were 6,204 deaths, 9,564 ICU admissions, 6,478 mechanically ventilated or ECMO patients, and 25,169 patients who developed ARDS or respiratory failure within 28 days of hospital admission. The algorithms demonstrated high accuracy (AUC = 0.89 (0.89 - 0.89) on the validation dataset (N=10,752)), consistent prediction through the second wave of the pandemic from September to November (AUC = 0.85 (0.85 - 0.86) on post-development validation (N=14,863)), and strong clinical relevance and utility. A comprehensive set of 386 input covariates from baseline and at admission was included in the analysis; the end-to-end pipeline automates the feature selection and model development process, producing 10 key predictors as input, such as age, blood urea nitrogen, and oxygen saturation, which are both commonly measured and concordant with recognized risk factors for COVID-19. CONCLUSIONS The systematic approach and rigorous validation demonstrate consistent model performance to predict even beyond the time period of data collection, with satisfactory discriminatory power and great clinical utility. Overall, the study offers an accurate, validated, and reliable prediction model based on only ten clinical features as a prognostic tool for stratifying COVID-19 patients into intermediate-, high-, and very-high-risk groups. This simple predictive tool could be shared with the wider healthcare community to serve as an early warning system alerting physicians to possible high-risk patients, or as a resource triaging tool to optimize healthcare resources. CLINICALTRIAL N/A
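A minimal sketch of the kind of pipeline the abstract describes: automated selection of ~10 predictors from several hundred covariates, then a classifier evaluated by AUC. The Optum EHR data is not public, and the paper's exact model is not specified here, so synthetic imbalanced data and a generic gradient-boosting classifier stand in.

```python
# Select 10 predictors from 386 covariates, fit a model, report test AUC.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score

# ~12% positive rate, loosely echoing the cohort's mortality proportion.
X, y = make_classification(n_samples=5000, n_features=386, n_informative=10,
                           weights=[0.88], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

selector = SelectKBest(f_classif, k=10).fit(X_tr, y_tr)  # keep 10 predictors
model = GradientBoostingClassifier(random_state=0)
model.fit(selector.transform(X_tr), y_tr)

proba = model.predict_proba(selector.transform(X_te))[:, 1]
print("AUC:", round(roc_auc_score(y_te, proba), 3))
```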


2021 ◽  
Author(s):  
Yingxian Liu ◽  
Cunliang Chen ◽  
Hanqing Zhao ◽  
Yu Wang ◽  
Xiaodong Han

Abstract Fluid properties are key factors for predicting single-well productivity, interpreting well tests, and forecasting oilfield recovery, all of which directly affect the success of ODP program design. The most accurate and direct acquisition method is underground sampling. However, during the exploration stage not every well has samples, owing to technical reasons such as excessive well deviation or high cost. Analogies or empirical formulas therefore have to be adopted in many cases, but experience from a large number of oilfield developments has shown that the errors caused by these methods are very large. Consequently, a fast and accurate way to obtain fluid physical properties is of great significance. In recent years, with the development and improvement of artificial intelligence and machine learning algorithms, their applications in oilfields have become more and more extensive. This paper proposes a method for predicting crude oil physical properties based on machine learning algorithms. The method uses PVT data from nearly 100 wells in the Bohai Oilfield; 75% of the data is used for training and learning to obtain the prediction model, and the remaining 25% is used for testing. Practice shows that the predictions of the machine learning algorithm are very close to the actual data, with very small error. Finally, the method was applied to the preliminary plan design of the newly discovered BZ29 oilfield; in particular, fluid physical properties were predicted for the unsampled sand bodies. The influence of the analogy method on the development scheme was also compared, providing potential and risk analysis for scheme design. This method will be applied to more oilfields in the Bohai Sea in the future and has important promotion value.
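A hedged sketch of the 75/25 train/test regression workflow described above. The Bohai PVT data is proprietary, so synthetic inputs stand in, and the abstract does not name the algorithm, so a random forest regressor is one plausible assumed choice.

```python
# Train on 75% of ~100 wells' PVT-style data, test on the remaining 25%.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))   # ~100 wells, 4 assumed inputs (e.g., depth,
                                # temperature, pressure, gas-oil ratio)
# Hypothetical target property (e.g., a viscosity-like quantity).
y = 0.8 * X[:, 0] - 0.3 * X[:, 2] + rng.normal(scale=0.05, size=100)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
model = RandomForestRegressor(random_state=0).fit(X_tr, y_tr)
print("test MAE:", round(mean_absolute_error(y_te, model.predict(X_te)), 3))
```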


The aim of this research is to perform risk modelling after analyzing Twitter posts with sentiment analysis. We analyze the posts of several users, or of a particular user, to check whether they could be a cause of concern to society. Each sentiment, such as happy, sad, or anger, contributes a severity scale to the final table on which the machine learning algorithm is applied. The data fed into the machine learning algorithms has been monitored over a period of time and relates to a particular topic in an area.
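A minimal sketch of the severity-scaling idea described above: map each post's emotion to a severity score, aggregate per user, and apply a simple classifier. The lexicon, scores, posts, and concern labels are all hypothetical.

```python
# Score posts by emotion severity, then classify users as cause for concern.
import numpy as np
from sklearn.linear_model import LogisticRegression

SEVERITY = {"happy": 0, "sad": 1, "anger": 3, "threat": 5}  # assumed scale

def post_severity(post: str) -> int:
    """Score a post by the strongest emotion word it contains."""
    return max((s for w, s in SEVERITY.items() if w in post.lower()), default=0)

posts_per_user = {
    "user_a": ["happy day", "sad news today"],
    "user_b": ["anger at everyone", "this is a threat"],
}
X = np.array([[np.mean([post_severity(p) for p in posts])]
              for posts in posts_per_user.values()])
y = np.array([0, 1])  # 0 = no concern, 1 = concern (hypothetical labels)

clf = LogisticRegression().fit(X, y)
print(dict(zip(posts_per_user, clf.predict(X))))
```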


InterConf ◽  
2021 ◽  
pp. 393-403
Author(s):  
Olexander Shmatko ◽  
Volodimir Fedorchenko ◽  
Dmytro Prochukhan

Today the banking sector offers its clients many different financial services, such as ATM cards, Internet banking, debit cards, and credit cards, which helps attract a large number of new customers. This article proposes an information system for detecting credit card fraud using a machine learning algorithm. Since customers typically use credit cards around the clock, the bank's server can track all transactions using machine learning algorithms and must detect, or predict, fraudulent ones. The dataset contains characteristics of each transaction, and fraudulent transactions need to be classified and detected. For these purposes, the work proposes the use of the Random Forest algorithm.
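A minimal sketch of Random Forest fraud classification, with a class-imbalanced synthetic dataset standing in for the real card-transaction data (fraud is rare, so the positive class is weighted).

```python
# Classify transactions as fraudulent/legitimate with a Random Forest.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report

# ~1% of transactions are fraudulent (class 1) in this synthetic stand-in.
X, y = make_classification(n_samples=10000, n_features=30, weights=[0.99],
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

clf = RandomForestClassifier(n_estimators=100, class_weight="balanced",
                             random_state=0).fit(X_tr, y_tr)
print(classification_report(y_te, clf.predict(X_te), digits=3))
```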


Author(s):  
Virendra Tiwari ◽  
Balendra Garg ◽  
Uday Prakash Sharma

Machine learning algorithms are capable of managing multi-dimensional data in dynamic environments. Despite their many vital features, there are challenges to overcome: machine learning algorithms still require additional mechanisms or procedures for predicting large numbers of new classes while managing privacy. These deficiencies mean that reliable use of a machine learning algorithm depends on human experts, because raw data may complicate the learning process and generate inaccurate results. Interpreting outcomes therefore requires expertise in machine learning mechanisms, which is a significant challenge. Machine learning techniques also suffer from issues of high dimensionality, adaptability, distributed computing, scalability, streaming data, and duplicity; their main weakness is vulnerability to errors, and they have furthermore been found to lack variability. This paper studies how the computational complexity of machine learning algorithms can be reduced by investigating how to make predictions with an improved algorithm.


Author(s):  
Namrata Dhanda ◽  
Stuti Shukla Datta ◽  
Mudrika Dhanda

Human intelligence is deeply involved in creating efficient and faster systems that can work independently. Creating such smart systems requires efficient training algorithms. The aim of this chapter is therefore to introduce readers to the concept of machine learning and the commonly employed learning algorithms for developing efficient and intelligent systems. The chapter draws a clear distinction between supervised and unsupervised learning methods, and each algorithm is explained with the help of a suitable example to give insight into the learning process.
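A small illustration of the supervised/unsupervised distinction the chapter draws, using scikit-learn's bundled iris data: a classifier learns from labels, while k-means must discover structure without them. The choice of these two algorithms is illustrative, not taken from the chapter.

```python
# Supervised vs. unsupervised learning on the same dataset.
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.cluster import KMeans

X, y = load_iris(return_X_y=True)

supervised = LogisticRegression(max_iter=1000).fit(X, y)  # uses labels y
print("supervised accuracy:", round(supervised.score(X, y), 2))

unsupervised = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)  # no y
print("cluster sizes:",
      [int((unsupervised.labels_ == k).sum()) for k in range(3)])
```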


2022 ◽  
pp. 34-46
Author(s):  
Amtul Waheed ◽  
Jana Shafi ◽  
Saritha V.

In today's world of advanced IoT and ITS technologies in smart city scenarios, there are many projections, such as improved data propagation on smart roads and cooperative transportation networks, autonomous and continuously connected vehicles, and low-latency applications in high-capacity environments with heterogeneous connectivity and speed. This chapter presents the performance of machine learning methods in predicting the speed of vehicles on roadways. The input variables for each learning algorithm are density, measured in vehicles per mile, and volume, measured in vehicles per hour; the output variable is speed, measured in miles per hour, which represents the performance of each algorithm. The performance of the machine learning algorithms is calculated by comparing their predictions with the true speed using histograms, which indicate how speed varies.
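A hedged sketch of the setup described: predict speed (mph) from density (vehicles/mile) and volume (vehicles/hour), then compare predicted and true speeds with histograms. The traffic data is synthetic, generated from the fundamental relation speed = volume / density, and the regressor choice is an assumption.

```python
# Predict speed from density and volume; compare histograms of true vs.
# predicted speed, mirroring the chapter's evaluation.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
density = rng.uniform(10, 80, 1000)            # vehicles per mile
volume = rng.uniform(300, 2000, 1000)          # vehicles per hour
speed = volume / density + rng.normal(scale=2, size=1000)  # miles per hour

X = np.column_stack([density, volume])
X_tr, X_te, y_tr, y_te = train_test_split(X, speed, random_state=0)
pred = RandomForestRegressor(random_state=0).fit(X_tr, y_tr).predict(X_te)

plt.hist(y_te, bins=30, alpha=0.5, label="true speed")
plt.hist(pred, bins=30, alpha=0.5, label="predicted speed")
plt.xlabel("speed (mph)")
plt.legend()
plt.savefig("speed_hist.png")
```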

