Deducing Optimal Classification Algorithm for Heterogeneous Fabric

Mapping Intimacies ◽

10.36227/techrxiv.17162147.v2 ◽

2022 ◽

Author(s):

Omar Alfarisi ◽

Zeyar Aung ◽

Mohamed Sassi

Keyword(s):

Machine Learning ◽

Random Forest ◽

Learning Algorithm ◽

Synthetic Data ◽

Supervised Machine Learning ◽

Machine Learning Algorithm ◽

Data Set ◽

Heterogeneous Rock ◽

Optimal Classification ◽

Optimal Machine

Choosing the optimal machine learning algorithm is not an easy decision. To help future researchers, we describe in this paper the optimal algorithm among the best candidates. We built a synthetic data set and performed supervised machine learning runs with five different algorithms. For heterogeneous rock fabric, we identified Random Forest, among others, as the appropriate algorithm.

Deducing of Optimal Machine Learning Algorithms for Heterogeneity

10.36227/techrxiv.17162147 ◽

2021 ◽

Author(s):

Omar Alfarisi ◽

Zeyar Aung ◽

Mohamed Sassi

Keyword(s):

Machine Learning ◽

Random Forest ◽

Learning Algorithm ◽

Learning Algorithms ◽

Synthetic Data ◽

Machine Learning Algorithms ◽

Supervised Machine Learning ◽

Machine Learning Algorithm ◽

Data Set ◽

Optimal Machine

Choosing the optimal machine learning algorithm is not an easy decision. To help future researchers, we describe in this paper the optimal algorithm among the best candidates. We built a synthetic data set and performed supervised machine learning runs with five different algorithms. For heterogeneity, we identified Random Forest, among others, as the best algorithm.

Deducing of Optimal Machine Learning Algorithms for Heterogeneity

10.36227/techrxiv.17162147.v1 ◽

2021 ◽

Author(s):

Omar Alfarisi ◽

Zeyar Aung ◽

Mohamed Sassi

Keyword(s):

Machine Learning ◽

Random Forest ◽

Learning Algorithm ◽

Learning Algorithms ◽

Synthetic Data ◽

Machine Learning Algorithms ◽

Supervised Machine Learning ◽

Machine Learning Algorithm ◽

Data Set ◽

Optimal Machine

Choosing the optimal machine learning algorithm is not an easy decision. To help future researchers, we describe in this paper the optimal algorithm among the best candidates. We built a synthetic data set and performed supervised machine learning runs with five different algorithms. For heterogeneity, we identified Random Forest, among others, as the best algorithm.

A SYNTHETIC DATA SET OF 3D OOCYTE IMAGES AND MACHINE LEARNING ALGORITHM AS A MODEL TO ASSESS THE REPRODUCTIVE POTENTIAL OF OOCYTES

Fertility and Sterility ◽

10.1016/j.fertnstert.2020.08.424 ◽

2020 ◽

Vol 114 (3) ◽

pp. e145

Author(s):

Gerard Letterie ◽

Nathan Kundtz

Keyword(s):

Machine Learning ◽

Learning Algorithm ◽

Reproductive Potential ◽

Synthetic Data ◽

Machine Learning Algorithm ◽

Data Set

Application of machine learning algorithm for predicting gestational diabetes mellitus in early pregnancy

Frontiers of Nursing ◽

10.2478/fon-2021-0022 ◽

2021 ◽

Vol 8 (3) ◽

pp. 209-221

Author(s):

Li-Li Wei ◽

Yue-Shuai Pan ◽

Yan Zhang ◽

Kai Chen ◽

Hao-Yu Wang ◽

...

Keyword(s):

Diabetes Mellitus ◽

Machine Learning ◽

Random Forest ◽

Prediction Model ◽

Predictive Accuracy ◽

Learning Algorithm ◽

Machine Learning Algorithm ◽

Random Forest Algorithm ◽

Random Forest Regression ◽

Data Set

Objective: To study the application of a machine learning algorithm for predicting gestational diabetes mellitus (GDM) in early pregnancy. Methods: This study identified indicators related to GDM through a literature review and expert discussion. Pregnant women who had attended medical institutions for an antenatal examination from November 2017 to August 2018 were selected for analysis, and the collected indicators were retrospectively analyzed. Based on Python, the indicators were classified and modeled using a random forest regression algorithm, and the performance of the prediction model was analyzed. Results: We obtained 4806 analyzable samples from 1625 pregnant women. Among these, 3265 samples with all 67 indicators were used to establish data set F1; 4806 samples with 38 identical indicators were used to establish data set F2. Each of F1 and F2 was used for training the random forest algorithm. The overall predictive accuracy of the F1 model was 93.10%, the area under the receiver operating characteristic curve (AUC) was 0.66, and the predictive accuracy of GDM-positive cases was 37.10%. The corresponding values for the F2 model were 88.70%, 0.87, and 79.44%. The results thus showed that the F2 prediction model performed better than the F1 model. To explore the impact of sacrificial indicators on GDM prediction, the F3 data set was established using 3265 samples (F1) with 38 indicators (F2). After training, the overall predictive accuracy of the F3 model was 91.60%, AUC was 0.58, and the predictive accuracy of positive cases was 15.85%. Conclusions: In this study, a model for predicting GDM with several input variables (e.g., physical examination, past history, personal history, family history, and laboratory indicators) was established using a random forest regression algorithm. The trained prediction model exhibited a good performance and is valuable as a reference for predicting GDM in women at an early stage of pregnancy. In addition, there are certain requirements for the proportions of negative and positive cases in sample data sets when the random forest algorithm is applied to the early prediction of GDM.
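The gap reported above between overall accuracy and accuracy on GDM-positive cases is typical of imbalanced data sets. The following minimal Python sketch illustrates the effect. The labels and predictions are hypothetical and are not the study's data. It also assumes that "predictive accuracy of positive cases" means the fraction of true positives that the model recovers; the abstract does not define the term.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels where 1 = positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def overall_accuracy(tp, tn, fp, fn):
    """Fraction of all samples that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def positive_case_accuracy(tp, fn):
    """Fraction of truly positive samples that were predicted positive."""
    return tp / (tp + fn) if (tp + fn) else 0.0


# Hypothetical imbalanced sample: 10 positives and 90 negatives.
y_true = [1] * 10 + [0] * 90
# Hypothetical model output: finds 3 of the 10 positives, raises 2 false alarms.
y_pred = [1] * 3 + [0] * 7 + [1] * 2 + [0] * 88

tp, tn, fp, fn = confusion_counts(y_true, y_pred)
acc = overall_accuracy(tp, tn, fp, fn)   # dominated by the 90 negatives
pos_acc = positive_case_accuracy(tp, fn)  # reflects the 10 positives only
```

In this toy sample the overall accuracy is high because negatives dominate, while most positives are missed, which is why the proportion of positive and negative cases matters when judging such a model.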

Can 1H-Nuclear magnetic resonance (NMR) be used for early detection of hepatocellular cancer (HCC)?

Journal of Clinical Oncology ◽

10.1200/jco.2007.25.18_suppl.15107 ◽

2007 ◽

Vol 25 (18_suppl) ◽

pp. 15107-15107

Author(s):

R. V. Iyer ◽

B. Tennant ◽

M. Ruiz ◽

T. Szyperski ◽

D. Trump ◽

...

Keyword(s):

Machine Learning ◽

Learning Algorithm ◽

Hepatocellular Cancer ◽

Screening Tools ◽

Supervised Machine Learning ◽

Support Vector ◽

Machine Learning Algorithm ◽

Data Set ◽

H Nmr ◽

Kappa Value

Background: HCC is a common and rapidly fatal cancer. Current screening tools are inadequate for identification of potentially curable cases. Our aim was to determine whether H-NMR can identify HCC compared to controls in the woodchuck (WC) model of hepatitis-related HCC. Methods: Eastern WCs were bred and inoculated at birth with dilute sera from WCs that are chronic carriers of Woodchuck Hepatitis B Virus (WHV). This resulted in chronic hepatitis in ∼60% of animals, and all carriers developed HCC by 24–36 months. Serum from 10 chronic WHV carriers with HCC (group 1), 5 WHV carriers with no HCC (group 2) and 15 matched non-infected controls (group 3) was obtained. 45 µL of serum was diluted with 5 µL of D2O containing 27 mM formic acid + 0.9% saline. Spectra were collected on a 600 MHz INOVA spectrometer using a CapNMR flow probe with a 10 µL flow cell at 298 K without knowledge of group assignments. The resulting 1D spectra were processed using Nuts from AcornNMR. Results: Principal component analysis and supervised PLS-DA were performed using Simca P+ from Umetrics. Despite general separation of groups, the Q2 value of this model was relatively low (0.20). We trained a Support Vector Machine (SVM) algorithm, a supervised machine-learning algorithm, to learn to identify the groups. Evaluation of the performance of the algorithm using 10-fold validation on the data set achieved a Kappa value of 0.43. This algorithm learnt to identify HCC [0.765 ROC, 0.8 sensitivity, and 0.727 positive predictive value (PPV)] and controls (0.75 ROC, 0.69 sensitivity and 0.73 PPV) but not the WHV carrier group, likely due to the small numbers. In a second analysis of 10 HCC and 15 controls, PLS-DA showed clear separation using three components (Q2 = 0.5). The corresponding SVM model showed a kappa value of 0.52 and ROC values of 0.767 for both classes. Conclusions: Our preliminary results indicate that H-NMR spectra alone can be used to distinguish HCC from healthy controls using the machine-learning algorithm for classification. Further validation in a larger cohort of woodchucks is ongoing, and confirmation of these preliminary findings would support investigation of this technique as a screening tool in patients at risk for developing HCC. No significant financial relationships to disclose.
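The Kappa, sensitivity and PPV figures quoted above can all be derived from a confusion matrix. The following is a minimal Python sketch of those standard definitions. The matrix is hypothetical and does not reproduce the study's results, and the function names are chosen for illustration only.

```python
def cohen_kappa(matrix):
    """Cohen's kappa for a square confusion matrix (rows = true, cols = predicted)."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    # Observed agreement: fraction of samples on the diagonal.
    observed = sum(matrix[i][i] for i in range(n)) / total
    # Chance agreement: product of row and column marginals for each class.
    expected = sum(
        sum(matrix[i]) * sum(row[i] for row in matrix) for i in range(n)
    ) / (total * total)
    if expected == 1.0:
        return 0.0  # degenerate case: all samples in a single class
    return (observed - expected) / (1.0 - expected)


def sensitivity(matrix, cls):
    """True positives for `cls` divided by all samples truly in `cls`."""
    return matrix[cls][cls] / sum(matrix[cls])


def positive_predictive_value(matrix, cls):
    """True positives for `cls` divided by all samples predicted as `cls`."""
    predicted = sum(row[cls] for row in matrix)
    return matrix[cls][cls] / predicted


# Hypothetical two-class matrix: class 0 = disease, class 1 = control.
example = [
    [7, 3],   # 10 true disease samples: 7 found, 3 missed
    [2, 13],  # 15 true controls: 2 false alarms, 13 correct
]

kappa = cohen_kappa(example)
sens = sensitivity(example, 0)
ppv = positive_predictive_value(example, 0)
```

Kappa corrects the raw agreement for the agreement expected by chance, which makes it more informative than plain accuracy when the groups differ in size, as they do here.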

A Reckoning Analysis and Assessment of Different Supervised Machine Learning Algorithm for Breast Cancer Prediction

International Journal of Computer Sciences and Engineering ◽

10.26438/ijcse/v7i3.8388 ◽

2019 ◽

Vol 7 (3) ◽

pp. 83-88

Author(s):

Pragati Prakash ◽

Nidhi Ekka ◽

Manjit Jaiswal

Keyword(s):

Breast Cancer ◽

Machine Learning ◽

Learning Algorithm ◽

Supervised Machine Learning ◽

Machine Learning Algorithm ◽

Cancer Prediction

Analysis of Gender Identification in Bahasa Indonesia using Supervised Machine Learning Algorithm

2020 3rd International Conference on Information and Communications Technology (ICOIACT) ◽

10.1109/icoiact50329.2020.9332145 ◽

2020 ◽

Author(s):

Evawaty Tanuar ◽

Edi Abdurachman ◽

Ford Lumban Gaol ◽

Lukas

Keyword(s):

Machine Learning ◽

Learning Algorithm ◽

Supervised Machine Learning ◽

Machine Learning Algorithm ◽

Gender Identification ◽

Bahasa Indonesia

Land subsidence susceptibility assessment using random forest machine learning algorithm

Environmental Earth Sciences ◽

10.1007/s12665-019-8518-3 ◽

2019 ◽

Vol 78 (16) ◽

Cited By ~ 12

Author(s):

Majid Mohammady ◽

Hamid Reza Pourghasemi ◽

Mojtaba Amiri

Keyword(s):

Machine Learning ◽

Random Forest ◽

Land Subsidence ◽

Learning Algorithm ◽

Machine Learning Algorithm ◽

Susceptibility Assessment

Big Data for Health Care Analytics using Extreme Machine Learning Based on Map Reduce

International Journal of Engineering and Advanced Technology - Regular Issue ◽

10.35940/ijeat.c5808.029320 ◽

2020 ◽

Vol 9 (3) ◽

pp. 2758-2762

Keyword(s):

Machine Learning ◽

Big Data ◽

Data Storage ◽

Clinical Data ◽

Disease Risk ◽

Learning Algorithm ◽

Information Storage ◽

Support Vector ◽

Machine Learning Algorithm ◽

Data Set

Large volumes of data are stored across various fields and are collectively called big data. In healthcare, big data comprises the clinical records of every patient in huge amounts, maintained as Electronic Health Records (EHR). More than 80% of clinical data is unstructured and stored in hundreds of forms. The challenge in data storage and analysis is handling large datasets efficiently and scalably. The Hadoop MapReduce framework stores and processes any kind of big data quickly. It is not solely a storage system but also a platform for both information storage and processing. It is scalable and fault-tolerant. Prediction on the data sets is handled by a machine learning algorithm. This work focuses on the Extreme Machine Learning algorithm (ELM), which is used for disease risk prediction by combining ELM with a Cuckoo Search optimization-based Support Vector Machine (CS-SVM). The proposed work also considers the scalability and accuracy of big data models; the proposed algorithm handles the computing work well and achieves good results in both veracity and efficiency.

Corporate distress prediction using random forest and tree net for India

Journal of Management and Science ◽

10.26524/jms.2020.1 ◽

2020 ◽

Vol 10 (1) ◽

pp. 1-11

Author(s):

Arvind Shrivastava ◽

Nitin Kumar ◽

Kuldeep Kumar ◽

Sanjeev Gupta

Keyword(s):

Random Forest ◽

Learning Algorithm ◽

Predictive Performance ◽

Machine Learning Algorithm ◽

Data Set ◽

Data Mining Tool ◽

Distress Prediction ◽

Out Of Sample ◽

Mining Tool ◽

Corporate Distress

The paper uses Random Forest, a popular machine learning classification algorithm, to predict bankruptcy (distress) for Indian firms. Random Forest orders firms according to their propensity to default or their likelihood of becoming distressed. This is also useful for explaining the association between the tendency of a firm to fail and its features. The results are analyzed vis-à-vis Tree Net. Both in-sample and out-of-sample estimations have been performed to compare Random Forest with Tree Net, which is a cutting-edge data mining tool known to provide satisfactory estimation results. An exhaustive data set comprising companies from varied sectors has been included in the analysis. It is found that the Tree Net procedure consistently provides improved classification and predictive performance vis-à-vis the Random Forest methodology, and it may be utilized further by industry analysts and researchers alike for predictive purposes.
