Research on the Controllable Confidence Machine

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.1079-1080.851 ◽

2014 ◽

Vol 1079-1080 ◽

pp. 851-855

Author(s):

Fang Chun Jiang ◽

Sheng Feng Tian

Keyword(s):

Machine Learning ◽

Experimental Data ◽

Classification Accuracy ◽

Research Result ◽

Data Sets ◽

Threshold Values

Controllable confidence machine learning is an important approach to applying confidence machines in practice. This paper builds on a two-class confidence classifier: it uses a two-class classifier as a tool to convert the learning results of classifiers, and it achieves confidence control by setting threshold values. The research achieves controllable overall classification accuracy as well as controllable positive and negative classification accuracy. The method was tested on 5 experimental data sets on heart disease and diabetes and achieved good results.
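The abstract does not say how the classifier outputs are converted, so the following is only a generic reject-option sketch of the thresholding idea, not the authors' method. It assumes each prediction comes with an estimated probability of the positive class; the function names, scores and labels are hypothetical.

```python
from typing import List, Optional, Tuple


def classify_with_reject(score: float, threshold: float) -> Optional[int]:
    """Return 1 or 0 when the prediction is confident enough, else None (reject).

    `score` is assumed to be an estimated probability of the positive class.
    """
    confidence = max(score, 1.0 - score)  # confidence in the predicted class
    if confidence < threshold:
        return None
    return 1 if score >= 0.5 else 0


def accuracy_and_coverage(
    scores: List[float], labels: List[int], threshold: float
) -> Tuple[float, float]:
    """Accuracy on the accepted samples and the fraction of samples accepted."""
    decisions = [classify_with_reject(s, threshold) for s in scores]
    accepted = [(d, y) for d, y in zip(decisions, labels) if d is not None]
    if not accepted:
        return 0.0, 0.0
    correct = sum(1 for d, y in accepted if d == y)
    return correct / len(accepted), len(accepted) / len(scores)


# Made-up scores and labels, for illustration only.
scores = [0.95, 0.80, 0.55, 0.45, 0.25, 0.05]
labels = [1, 1, 0, 1, 0, 0]

for t in (0.5, 0.7, 0.9):
    acc, cov = accuracy_and_coverage(scores, labels, t)
    print(f"threshold={t:.1f} accuracy={acc:.2f} coverage={cov:.2f}")
```

Raising the threshold trades coverage for accuracy on the accepted samples. Controlling positive and negative accuracy separately could plausibly be done with one threshold per predicted class, but that is an assumption rather than something the abstract states.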


Research on the Confidence Regression Based on KNN Algorithm

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.713-715.1877 ◽

2015 ◽

Vol 713-715 ◽

pp. 1877-1881

Author(s):

Fang Chun Jiang ◽

Sheng Feng Tian

Keyword(s):

Machine Learning ◽

Experimental Data ◽

Research Field ◽

Experimental Results ◽

Data Sets ◽

Error Evaluation ◽

Significant Research ◽

Specific Error

Confidence regression is an important research area within confidence machine learning. This paper uses the KNN algorithm as a tool and evaluates the error of the regression results in order to separate an acceptance region from a rejection region, thereby achieving confidence regression. By setting a specific error value, the approach achieves controllable confidence regression. It was tested on the bodyfat data set and other data sets, and the experimental results show that the approach is feasible.
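The abstract does not specify how the error is evaluated, so the sketch below swaps in one simple assumption: the standard deviation of the k nearest neighbours' targets serves as the error estimate, and a prediction is accepted only when that estimate stays within a chosen limit. All names and data are hypothetical.

```python
import math
from typing import List, Optional, Sequence, Tuple

Point = Sequence[float]


def knn_predict_with_error(
    train: List[Tuple[Point, float]], query: Point, k: int
) -> Tuple[float, float]:
    """Return (prediction, error estimate) from the k nearest training points.

    The prediction is the mean neighbour target. The error estimate is the
    standard deviation of the neighbour targets (an assumed proxy).
    """
    neighbours = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    targets = [y for _, y in neighbours]
    mean = sum(targets) / len(targets)
    spread = math.sqrt(sum((y - mean) ** 2 for y in targets) / len(targets))
    return mean, spread


def confident_regression(
    train: List[Tuple[Point, float]], query: Point, k: int, max_error: float
) -> Optional[float]:
    """Return the prediction if its error estimate is acceptable, else None."""
    prediction, error = knn_predict_with_error(train, query, k)
    return prediction if error <= max_error else None


# Made-up one-dimensional data: targets are smooth on the left, noisy on the right.
train = [
    ((1.0,), 10.0), ((2.0,), 11.0), ((3.0,), 12.0),
    ((10.0,), 30.0), ((11.0,), 50.0), ((12.0,), 70.0),
]

print(confident_regression(train, (2.0,), k=3, max_error=2.0))   # accepted region
print(confident_regression(train, (11.0,), k=3, max_error=2.0))  # rejected region
```

Lowering `max_error` shrinks the acceptance region, which is one way to read the "specific error value" that the abstract describes.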


Application of multi-sensor unmanned aerial system for identification of hydrothermal alteration zones

10.5194/egusphere-egu2020-12546 ◽

2020 ◽

Author(s):

Yosoon Choi ◽

Jieun Baek ◽

Jangwon Suh ◽

Sung-Min Kim

Keyword(s):

Machine Learning ◽

Classification Accuracy ◽

Training Data ◽

Sensor Data ◽

Machine Learning Techniques ◽

Integrated Analysis ◽

Unmanned Aerial System ◽

Data Sets ◽

Learning Techniques ◽

Hydrothermal Alteration Zones

In this study, we propose a method that uses a multi-sensor Unmanned Aerial System (UAS) to explore hydrothermal alteration zones. The selected study area (10 m × 20 m) is composed mainly of andesite and located on the coast, with wide outcrops and well-developed structural and mineralization elements. Multi-sensor (visible, multispectral, thermal, magnetic) data were acquired in the study area using the UAS and analyzed using machine learning techniques. To apply these techniques, we used stratified random sampling to draw 1000 training samples from the hydrothermal zone and 1000 from the non-hydrothermal zone identified through the field survey. The 2000 samples created for supervised learning were first split into 1500 for training and 500 for testing. The 1500 training samples were then split into 1200 for training and 300 for validation. The training and validation data were generated in five sets to enable cross-validation. Five machine learning techniques were applied to the training data sets: k-Nearest Neighbors (k-NN), Decision Tree (DT), Random Forest (RF), Support Vector Machine (SVM), and Deep Neural Network (DNN). In the integrated analysis of the multi-sensor data, the RF and SVM techniques showed high classification accuracy of about 90%. Moreover, integrated analysis of the multi-sensor data yielded higher classification accuracy for all five techniques than analysis of magnetic sensing data alone or of single optical sensing data alone.
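The split described above can be sketched in plain Python. This is one reading of the abstract: a class-balanced 1500/500 hold-out followed by five folds of 1200 training and 300 validation samples. The function names, the seed values, and the use of integer indices in place of real sensor samples are assumptions made for illustration.

```python
import random
from typing import Dict, List, Tuple


def stratified_split(
    indices_by_class: Dict[int, List[int]], test_fraction: float, seed: int
) -> Tuple[List[int], List[int]]:
    """Hold out the same fraction of every class for testing."""
    rng = random.Random(seed)
    train: List[int] = []
    test: List[int] = []
    for indices in indices_by_class.values():
        shuffled = indices[:]
        rng.shuffle(shuffled)
        n_test = round(len(shuffled) * test_fraction)
        test.extend(shuffled[:n_test])
        train.extend(shuffled[n_test:])
    return train, test


def cross_validation_folds(
    train: List[int], n_folds: int, seed: int
) -> List[Tuple[List[int], List[int]]]:
    """Return (training, validation) index lists, one pair per fold."""
    rng = random.Random(seed)
    shuffled = train[:]
    rng.shuffle(shuffled)
    folds = [shuffled[i::n_folds] for i in range(n_folds)]
    return [
        ([idx for j, fold in enumerate(folds) if j != i for idx in fold], folds[i])
        for i in range(n_folds)
    ]


# Class 1: hydrothermal zone, class 0: non-hydrothermal zone (1000 samples each).
indices_by_class = {1: list(range(0, 1000)), 0: list(range(1000, 2000))}

train, test = stratified_split(indices_by_class, test_fraction=0.25, seed=0)
folds = cross_validation_folds(train, n_folds=5, seed=0)

print(len(train), len(test))
print([(len(tr), len(va)) for tr, va in folds])
```

Only the first split is stratified here. The folds are drawn by plain shuffling, so their class balance is approximate. Whether the authors stratified the folds as well is not stated in the abstract.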


Accurate Liver Disease Prediction with Extreme Gradient Boosting

International Journal of Engineering and Advanced Technology - Regular Issue ◽

10.35940/ijeat.f8684.088619 ◽

2019 ◽

Vol 8 (6) ◽

pp. 2288-2295 ◽

Cited By ~ 1

Keyword(s):

Machine Learning ◽

Liver Disease ◽

Classification Accuracy ◽

Gradient Boosting ◽

Support Vector ◽

Disease Prediction ◽

Data Sets ◽

Extreme Gradient Boosting ◽

C4.5 Decision Tree ◽

The Times

Machine learning is used extensively in medical diagnosis to predict the existence of diseases. Existing classification algorithms are frequently used for automatic detection of diseases, but most of the time they do not give 100% accurate results. Boosting techniques are often used in machine learning to maximize classification accuracy. Although several boosting techniques exist, the XGBoost algorithm performs extremely well on some selected data sets. Building an XGBoost model is simple, but improving the model by tuning its parameters is a challenging task. The XGBoost algorithm has many parameters, and deciding which parameters to tune and what their ideal values are is cumbersome and time-consuming. In this paper, we tuned the XGBoost model for the first time for liver disease prediction and obtained 100% accuracy by tuning some of the hyperparameters. We observed that the proposed model exhibited the highest classification accuracy compared with all other models built so far by machine learning researchers and with some regularly used algorithms that we tested, such as Support Vector Machines (SVM), Naive Bayes (NB), C4.5 decision tree, Random Belief Networks, and Alternating Decision Trees (ADT).


From Data to Assessment Models, Demonstrated through a Digital Twin of Marine Risers

10.4043/30985-ms ◽

2021 ◽

Author(s):

Ehsan Kharazmi ◽

Zhicheng Wang ◽

Dixia Fan ◽

Samuel Rudy ◽

Themis Sapsis ◽

...

Keyword(s):

Machine Learning ◽

Experimental Data ◽

Complex Systems ◽

Fatigue Damage ◽

Complete Characterization ◽

Sensor Data ◽

Data Sets ◽

Multiple Sources ◽

Marine Risers ◽

Vortex Induced Vibrations

Assessing the fatigue damage in marine risers due to vortex-induced vibrations (VIV) serves as a comprehensive example of using machine learning methods to derive assessment models of complex systems. A complete characterization of response of such complex systems is usually unavailable despite massive experimental data and computation results. These algorithms can use multi-fidelity data sets from multiple sources, including real-time sensor data from the field, systematic experimental data, and simulation data. Here we develop a three-pronged approach to demonstrate how tools in machine learning are employed to develop data-driven models that can be used for accurate and efficient fatigue damage predictions for marine risers subject to VIV.


On the assessment of abdominal aortic aneurysm rupture risk in the Asian population based on geometric attributes

Proceedings of the Institution of Mechanical Engineers Part H Journal of Engineering in Medicine ◽

10.1177/0954411918794724 ◽

2018 ◽

Vol 232 (9) ◽

pp. 922-929 ◽

Cited By ~ 4

Author(s):

Tejas Canchi ◽

Eddie YK Ng ◽

Sriram Narayanan ◽

Ender A Finol

Keyword(s):

Machine Learning ◽

Abdominal Aortic Aneurysm ◽

Aortic Aneurysm ◽

Classification Accuracy ◽

Area Under The Curve ◽

Aneurysm Rupture ◽

Patient Data ◽

Data Sets ◽

Rupture Risk ◽

Abdominal Aortic

This study retrospectively reviews the records of Asian patients diagnosed with abdominal aortic aneurysm to investigate potential correlations between clinical and morphological parameters in the context of whether the aneurysms were ruptured or unruptured. A machine-learning-based approach is proposed to predict the rupture status of Asian abdominal aortic aneurysm by comparing four different classifiers trained with clinical and geometrical parameters obtained from computed tomography images. The classifiers were applied to 312 patient data sets obtained from a regulatory-approved database. The data sets included 17 attributes under three classes: unruptured abdominal aortic aneurysm, ruptured abdominal aortic aneurysm, and normal aorta without aneurysm. Four classification models, namely decision trees, Naïve Bayes, logistic regression, and support vector machines, were applied to the patient data set. The models were evaluated by 10-fold cross-validation, and classifier performance was assessed with classification accuracy, area under the receiver operating characteristic curve, and F-measure. Data analysis and evaluation were performed using the Weka machine learning application. The results indicated that Naïve Bayes achieved the best performance among the classifiers, with a classification accuracy of 95.2%, an area under the curve of 0.974, and an F-measure of 0.952. The clinical implications of this work can be addressed in two ways. The best classifier can be applied to prospectively acquired data to predict the likelihood of aneurysm rupture. Next, it would be necessary to estimate the attributes implicated in rupture risk beyond just maximum aneurysm diameter.
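For a three-class problem such as this one, accuracy and F-measure can be computed from true and predicted labels as sketched below. The abstract does not say how the per-class F-measures were combined, so the support-weighted average used here is an assumption. The labels are made up and do not reproduce the study's data.

```python
from collections import Counter
from typing import List, Tuple


def accuracy_and_weighted_f1(
    y_true: List[str], y_pred: List[str]
) -> Tuple[float, float]:
    """Accuracy and support-weighted average of the per-class F-measures."""
    n = len(y_true)
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / n
    support = Counter(y_true)
    weighted_f1 = 0.0
    for cls, count in support.items():
        tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
        fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))
        fn = count - tp
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        weighted_f1 += f1 * count / n  # weight each class by its share of samples
    return accuracy, weighted_f1


# Made-up labels for illustration only.
y_true = ["ruptured", "ruptured", "unruptured", "unruptured", "normal", "normal"]
y_pred = ["ruptured", "unruptured", "unruptured", "unruptured", "normal", "normal"]

acc, f1 = accuracy_and_weighted_f1(y_true, y_pred)
print(f"accuracy={acc:.3f} weighted F-measure={f1:.3f}")
```

In a 10-fold cross-validation these metrics would typically be computed on the held-out fold each time and then averaged or pooled across folds.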


Machine Learning Approaches for the Analysis of Non-Metallic Inclusion Data Sets

AISTech2019 Proceedings of the Iron and Steel Technology Conference ◽

10.33313/377/275 ◽

2019 ◽

Author(s):

M. Webler ◽

B. Abdulsalam

Keyword(s):

Machine Learning ◽

Data Sets ◽

Learning Approaches ◽

Metallic Inclusion


Trade-off Predictivity and Explainability for Machine-Learning Powered Predictive Toxicology: An in-Depth Investigation with Tox21 Data Sets

Chemical Research in Toxicology ◽

10.1021/acs.chemrestox.0c00373 ◽

2021 ◽

Vol 34 (2) ◽

pp. 541-549 ◽

Cited By ~ 1

Author(s):

Leihong Wu ◽

Ruili Huang ◽

Igor V. Tetko ◽

Zhonghua Xia ◽

Joshua Xu ◽

...

Keyword(s):

Machine Learning ◽

Data Sets ◽

Predictive Toxicology ◽

Trade Off


Using Machine Learning Methods to Identify Particle Types from Doppler Lidar Measurements in Iceland

Remote Sensing ◽

10.3390/rs13132433 ◽

2021 ◽

Vol 13 (13) ◽

pp. 2433

Author(s):

Shu Yang ◽

Fengchao Peng ◽

Sibylle von Löwis ◽

Guðrún Nína Petersen ◽

David Christian Finger

Keyword(s):

Machine Learning ◽

Weather Conditions ◽

Dust Storms ◽

Machine Learning Algorithms ◽

Lidar Data ◽

Data Sets ◽

Doppler Lidar ◽

Lidar Measurements ◽

Using Data ◽

Filter Noise

Doppler lidars are used worldwide for wind monitoring and recently also for the detection of aerosols. Automatic algorithms that classify the lidar signals retrieved from lidar measurements are very useful for the users. In this study, we explore the value of machine learning to classify backscattered signals from Doppler lidars using data from Iceland. We combined supervised and unsupervised machine learning algorithms with conventional lidar data processing methods and trained two models to filter noise signals and classify Doppler lidar observations into different classes, including clouds, aerosols and rain. The results reveal a high accuracy for noise identification and aerosols and clouds classification. However, precipitation detection is underestimated. The method was tested on data sets from two instruments during different weather conditions, including three dust storms during the summer of 2019. Our results reveal that this method can provide an efficient, accurate and real-time classification of lidar measurements. Accordingly, we conclude that machine learning can open new opportunities for lidar data end-users, such as aviation safety operators, to monitor dust in the vicinity of airports.


A top-level model of case-based argumentation for explanation: Formalisation and experiments

Argument & Computation ◽

10.3233/aac-210009 ◽

2021 ◽

pp. 1-36

Author(s):

Henry Prakken ◽

Rosa Ratsma

Keyword(s):

Machine Learning ◽

Decision Making ◽

Linear Models ◽

Evaluation Studies ◽

Data Sets ◽

Machine Learning Applications ◽

Level Model ◽

Similarities And Differences ◽

Further Development ◽

Case Based

This paper proposes a formal top-level model of explaining the outputs of machine-learning-based decision-making applications and evaluates it experimentally with three data sets. The model draws on AI & law research on argumentation with cases, which models how lawyers draw analogies to past cases and discuss their relevant similarities and differences in terms of relevant factors and dimensions in the problem domain. A case-based approach is natural since the input data of machine-learning applications can be seen as cases. While the approach is motivated by legal decision making, it also applies to other kinds of decision making, such as commercial decisions about loan applications or employee hiring, as long as the outcome is binary and the input conforms to this paper’s factor- or dimension format. The model is top-level in that it can be extended with more refined accounts of similarities and differences between cases. It is shown to overcome several limitations of similar argumentation-based explanation models, which only have binary features and do not represent the tendency of features towards particular outcomes. The results of the experimental evaluation studies indicate that the model may be feasible in practice, but that further development and experimentation are needed to confirm its usefulness as an explanation model. The main challenges here are selecting from a large number of possible explanations, reducing the number of features in the explanations and adding more meaningful information to them. It also remains to be investigated how suitable our approach is for explaining non-linear models.


Data Mining-based Financial Statement Fraud Detection: Systematic Literature Review and Meta-analysis to Estimate Data Sample Mapping of Fraudulent Companies Against Non-fraudulent Companies

Global Business Review ◽

10.1177/0972150920984857 ◽

2021 ◽

pp. 097215092098485

Author(s):

Sonika Gupta ◽

Sushil Kumar Mehta

Keyword(s):

Machine Learning ◽

Data Mining ◽

Literature Review ◽

Systematic Literature Review ◽

Classification Accuracy ◽

Meta Analysis ◽

Financial Statement ◽

Research Articles ◽

Financial Statement Fraud ◽

Data Mining Techniques

Data mining techniques have proven quite effective not only in detecting financial statement frauds but also in discovering other financial crimes, such as credit card frauds, loan and security frauds, corporate frauds, bank and insurance frauds, etc. Classification of data mining techniques, in recent years, has been accepted as one of the most credible methodologies for the detection of symptoms of financial statement frauds through scanning the published financial statements of companies. The retrieved literature that has used data mining classification techniques can be broadly categorized on the basis of the type of technique applied, as statistical techniques and machine learning techniques. The biggest challenge in executing the classification process using data mining techniques lies in collecting the data sample of fraudulent companies and mapping the sample of fraudulent companies against non-fraudulent companies. In this article, a systematic literature review (SLR) of studies from the area of financial statement fraud detection has been conducted. The review has considered research articles published between 1995 and 2020. Further, a meta-analysis has been performed to establish the effect of data sample mapping of fraudulent companies against non-fraudulent companies on the classification methods through comparing the overall classification accuracy reported in the literature. The retrieved literature indicates that a fraudulent sample can either be equally paired with non-fraudulent sample (1:1 data mapping) or be unequally mapped using 1:many ratio to increase the sample size proportionally. Based on the meta-analysis of the research articles, it can be concluded that machine learning approaches, in comparison to statistical approaches, can achieve better classification accuracy, particularly when the availability of sample data is low. High classification accuracy can be obtained with even a 1:1 mapping data set using machine learning classification approaches.
