Multi-Instance Learning with MultiObjective Genetic Programming

Author(s):  
Amelia Zafra

The multiple-instance problem is a difficult machine learning problem that appears in cases where knowledge about training examples is incomplete. In this problem, the teacher labels examples that are sets (also called bags) of instances. The teacher does not label whether an individual instance in a bag is positive or negative. The learning algorithm needs to generate a classifier that will correctly classify unseen examples (i.e., bags of instances). This learning framework is receiving growing attention in the machine learning community, and since it was introduced by Dietterich, Lathrop, and Lozano-Perez (1997), a wide range of tasks have been formulated as multi-instance problems. Among these tasks, we can cite content-based image retrieval (Chen, Bi, & Wang, 2006) and annotation (Qi & Han, 2007), text categorization (Andrews, Tsochantaridis, & Hofmann, 2002), web index page recommendation (Zhou, Jiang, & Li, 2005; Xue, Han, Jiang, & Zhou, 2007) and drug activity prediction (Dietterich et al., 1997; Zhou & Zhang, 2007). In this chapter we introduce MOG3P-MI, a multiobjective grammar-guided genetic programming algorithm for multi-instance problems. In this algorithm, which is based on SPEA2, individuals represent classification rules that determine whether a bag is positive or negative. The quality of each individual is evaluated according to two quality indexes: sensitivity and specificity. Both measures have been adapted to the MIL setting. Computational experiments show that MOG3P-MI is a robust classification algorithm across different domains: it achieves competitive results and produces classifiers built from simple rules, which adds comprehensibility and simplicity to the knowledge discovery process and makes it a suitable method for solving MIL problems (Zafra & Ventura, 2007).
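To make the bag-level evaluation concrete, the following minimal sketch computes sensitivity and specificity over bags. It assumes the standard multi-instance assumption (a bag is positive if at least one of its instances satisfies the rule); the abstract does not say how MOG3P-MI adapts the two measures, so this is an illustration rather than the chapter's definition. The rule `example_rule` and the toy bags are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

Instance = Sequence[float]
Bag = List[Instance]


def classify_bag(bag: Bag, rule: Callable[[Instance], bool]) -> bool:
    """Assumed standard MI assumption: positive if any instance satisfies the rule."""
    return any(rule(instance) for instance in bag)


def sensitivity_specificity(
    bags: List[Bag], labels: List[bool], rule: Callable[[Instance], bool]
) -> Tuple[float, float]:
    """Count bag-level outcomes and return (sensitivity, specificity)."""
    tp = fn = tn = fp = 0
    for bag, is_positive in zip(bags, labels):
        predicted = classify_bag(bag, rule)
        if is_positive and predicted:
            tp += 1
        elif is_positive:
            fn += 1
        elif predicted:
            fp += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return sensitivity, specificity


def example_rule(instance: Instance) -> bool:
    """Hypothetical classification rule over two numeric attributes."""
    return instance[0] > 0.5 and instance[1] < 2.0


# Toy data: each bag is a list of instances; only the bag carries a label.
bags = [
    [(0.1, 1.0), (0.9, 1.5)],  # positive bag, one instance satisfies the rule
    [(0.2, 3.0), (0.6, 2.5)],  # positive bag, no instance satisfies the rule
    [(0.3, 0.5)],              # negative bag, correctly rejected
    [(0.7, 1.0)],              # negative bag, wrongly accepted
    [(0.4, 1.0), (0.1, 0.2)],  # negative bag, correctly rejected
]
labels = [True, True, False, False, False]

sens, spec = sensitivity_specificity(bags, labels, example_rule)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```

In a multiobjective setting such as the one described, the two values would be kept as separate objectives rather than combined into a single score.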

Author(s):  
Jie Yuan ◽  
Yuan Ji ◽  
Zhou Zhu ◽  
Liya Huang ◽  
Junfeng Qian ◽  
...  

To address the large errors and low performance of traditional methods for checking progressive image model matching information, an automatic checking method based on machine learning is proposed. The generation method of progressive images is analyzed, and the target image sample is obtained. On this basis, a machine learning algorithm is used to segment the progressive image samples. Within each image segment, crawler technology is used to automatically collect the progressive image model matching information, and, under the constraint of the image model matching information checking standard, the information is checked automatically in terms of geometric structure, image content and other aspects. Experimental results show that the verification error of the proposed method is reduced by 0.687 Mb, and the quality of the progressive image is improved.


2021 ◽  
Vol 22 (Supplement_1) ◽  
Author(s):  
M Omer ◽  
A Amir-Khalili ◽  
A Sojoudi ◽  
T Thao Le ◽  
S A Cook ◽  
...  

Abstract

Funding Acknowledgements: Type of funding sources: Public grant(s) – National budget only. Main funding source(s): SmartHeart EPSRC programme grant (www.nihr.ac.uk), London Medical Imaging and AI Centre for Value-Based Healthcare.

Background: Quality measures for machine learning algorithms include clinical measures such as end-diastolic (ED) and end-systolic (ES) volume, volumetric overlaps such as the Dice similarity coefficient, and surface distances such as the Hausdorff distance. These measures capture differences between manually drawn and automated contours but fail to capture a clinician's trust in an automatically generated contour.

Purpose: We propose to directly capture clinicians’ trust in a systematic way. We display manual and automated contours sequentially in random order and ask the clinicians to score the contour quality. We then perform statistical analysis for both sources of contours and stratify results based on contour type.

Data: The data selected for this experiment came from the National Health Center Singapore. It comprises CMR scans from 313 patients with diverse pathologies including: healthy, dilated cardiomyopathy (DCM), hypertension (HTN), hypertrophic cardiomyopathy (HCM), ischemic heart disease (IHD), left ventricular non-compaction (LVNC), and myocarditis. Each study contains a short axis (SAX) stack, with ED and ES phases manually annotated. Automated contours are generated for each SAX image for which manual annotation is available. For this, a machine learning algorithm trained at Circle Cardiovascular Imaging Inc. is applied and the resulting predictions are saved to be displayed in the contour quality scoring (CQS) application.

Methods: The CQS application displays manual and automated contours in a random order and presents the user with an option to assign a contour quality score: 1: Unacceptable, 2: Bad, 3: Fair, 4: Good. The UK Biobank standard operating procedure is used for assessing the quality of the contoured images. Quality scores are assigned based on how the contour affects clinical outcomes. However, as images are presented independent of spatiotemporal context, contour quality is assessed based on how well the area of the delineated structure is approximated. Consequently, small contours and small deviations are rarely assigned a quality score of less than 2, as they are not clinically relevant. Special attention is given to the RV-endo contours because two separate contours often appear, mostly in basal images. In such cases, a score of 3 is given if the two disjoint contours sufficiently encompass the underlying anatomy; otherwise they are scored as 2 or 1.

Results: A total of 50991 quality scores (24208 manual and 26783 automated) are generated by five expert raters. The mean scores for all manual and automated contours are 3.77 ± 0.48 and 3.77 ± 0.52, respectively. The breakdown of mean quality scores by contour type is included in Fig. 1a, while the distribution of quality scores for the various raters is shown in Fig. 1b.

Conclusion: We proposed a method of comparing the quality of manual versus automated contouring methods. Results suggest similar statistics in quality scores for both sources of contours.

Abstract Figure 1
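The two geometric measures named in the background, the Dice similarity coefficient and the Hausdorff distance, can be illustrated with a short sketch. It represents each contoured region as a set of pixel coordinates and uses two hypothetical square regions. It is not the study's evaluation code, and in practice the Hausdorff distance is often computed on contour points rather than on filled regions.

```python
import math
from typing import Set, Tuple

Pixel = Tuple[int, int]


def dice(a: Set[Pixel], b: Set[Pixel]) -> float:
    """Dice similarity coefficient: 2|A ∩ B| / (|A| + |B|)."""
    if not a and not b:
        return 1.0  # two empty regions are treated as identical
    return 2 * len(a & b) / (len(a) + len(b))


def hausdorff(a: Set[Pixel], b: Set[Pixel]) -> float:
    """Symmetric Hausdorff distance between two non-empty point sets."""

    def directed(src: Set[Pixel], dst: Set[Pixel]) -> float:
        # Largest distance from a point in src to its nearest point in dst.
        return max(min(math.dist(p, q) for q in dst) for p in src)

    return max(directed(a, b), directed(b, a))


def square(top: int, left: int, size: int) -> Set[Pixel]:
    """Hypothetical helper: filled square region as a set of (row, col) pixels."""
    return {(r, c) for r in range(top, top + size) for c in range(left, left + size)}


manual = square(0, 0, 4)     # stand-in for a manually drawn region
automated = square(0, 1, 4)  # same region shifted right by one pixel

print("Dice:", dice(manual, automated))
print("Hausdorff:", hausdorff(manual, automated))
```

Both measures quantify geometric agreement only, which is the gap the abstract points to: neither says whether a clinician would accept the contour.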


2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Samar Ali Shilbayeh ◽  
Sunil Vadera

Purpose: This paper describes the use of a meta-learning framework for recommending cost-sensitive classification methods, with the aim of answering an important question that arises in machine learning: “Among all the available classification algorithms, and in considering a specific type of data and cost, which is the best algorithm for my problem?”

Design/methodology/approach: The framework is based on the idea of applying machine learning techniques to discover knowledge about the performance of different machine learning algorithms. It includes components that repeatedly apply different classification methods to data sets and measure their performance. The characteristics of the data sets, combined with the algorithms and their performance, provide the training examples. A decision tree algorithm is applied to the training examples to induce the knowledge, which can then be used to recommend algorithms for new data sets. The paper makes a contribution to both meta-learning and cost-sensitive machine learning approaches. Neither field is new; the contribution is a recommender that recommends the optimal cost-sensitive approach for a given data problem. The proposed solution is implemented in WEKA and evaluated by applying it to different data sets and comparing the results with existing studies available in the literature.

Findings: The results show that the developed meta-learning solution produces better results than METAL, a well-known meta-learning system. The developed solution takes the misclassification cost into consideration during the learning process, which is not available in the compared project.

Originality/value: The paper presents a major piece of new information in writing for the first time. Meta-learning work has been done before, but this paper presents a new meta-learning framework that is cost-sensitive.


Author(s):  
ZAHRA NIKDEL ◽  
HAMID BEIGY

In this paper, we introduce a new hybrid learning algorithm, called DTGP, to construct cost-sensitive classifiers. The algorithm uses a decision tree as its base classifier, and the constructed decision tree is pruned by a genetic programming algorithm whose fitness function is sensitive to misclassification costs. The proposed learning algorithm has been evaluated on six cost-sensitive problems. The experimental results show that it outperforms other well-known learning algorithms such as C4.5 and naïve Bayes.
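As an illustration of what a fitness function that is sensitive to misclassification costs might look like, the sketch below scores a candidate classifier by its negative mean misclassification cost. The abstract does not give DTGP's actual fitness function, so the cost matrix, the labels and the function names here are all hypothetical.

```python
from typing import Dict, Sequence, Tuple

# Hypothetical cost matrix keyed by (actual label, predicted label).
CostMatrix = Dict[Tuple[str, str], float]


def total_cost(actual: Sequence[str], predicted: Sequence[str], costs: CostMatrix) -> float:
    """Sum the cost of every prediction; correct predictions usually cost 0."""
    return sum(costs[(a, p)] for a, p in zip(actual, predicted))


def fitness(actual: Sequence[str], predicted: Sequence[str], costs: CostMatrix) -> float:
    """Higher is better: the negative mean misclassification cost."""
    return -total_cost(actual, predicted, costs) / len(actual)


# Assumed costs: missing a positive is five times worse than a false alarm.
costs: CostMatrix = {
    ("pos", "pos"): 0.0,
    ("neg", "neg"): 0.0,
    ("pos", "neg"): 5.0,
    ("neg", "pos"): 1.0,
}

actual = ["pos", "pos", "neg", "neg"]
tree_a = ["pos", "neg", "neg", "neg"]  # 3/4 correct, one missed positive
tree_b = ["pos", "pos", "pos", "pos"]  # 2/4 correct, two false alarms

print("tree_a fitness:", fitness(actual, tree_a, costs))
print("tree_b fitness:", fitness(actual, tree_b, costs))
```

In this example the less accurate candidate receives the higher fitness because its errors are cheaper, which is the behaviour that separates cost-sensitive selection from accuracy-based selection.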


Author(s):  
Nisha Yadav ◽  
Kakoli Banerjee ◽  
Vikram Bali

In the software industry, where the quality of the output depends on human performance, fatigue can be a cause of performance degradation. Fatigue not only degrades quality but is also a health risk factor. Sleep disorders, depression, and stress are all results of fatigue and can contribute to fatal problems. This article presents a comparative study of different techniques that can be used to detect fatigue in programmers and data miners who spend long periods in front of a computer screen. Machine learning can also be used for worker fatigue detection, but some factors are specific to software workers. One such factor is screen illumination: the light of the computer or laptop screen that is cast on the worker's face, which makes it difficult for the machine learning algorithm to extract facial features. The article compares the techniques that can be used for general fatigue detection and identifies the best ones.


2019 ◽  
Vol 9 (15) ◽  
pp. 3037 ◽  
Author(s):  
Isaac Machorro-Cano ◽  
Giner Alor-Hernández ◽  
Mario Andrés Paredes-Valverde ◽  
Uriel Ramos-Deonati ◽  
José Luis Sánchez-Cervantes ◽  
...  

Overweight and obesity are affecting productivity and quality of life worldwide. The Internet of Things (IoT) makes it possible to interconnect, detect, identify, and process data between objects or services to fulfill a common objective. The main advantages of IoT in healthcare are the monitoring, analysis, diagnosis, and control of conditions such as overweight and obesity and the generation of recommendations to prevent them. However, the objects used in the IoT have limited resources, so it has become necessary to consider other alternatives to analyze the data generated from monitoring, analysis, diagnosis, control, and the generation of recommendations, such as machine learning. This work presents PISIoT: a machine learning and IoT-based smart health platform for the prevention, detection, treatment, and control of overweight and obesity, and other associated conditions or health problems. Weka API and the J48 machine learning algorithm were used to identify critical variables and classify patients, while Apache Mahout and RuleML were used to generate medical recommendations. Finally, to validate the PISIoT platform, we present a case study on the prevention of myocardial infarction in elderly patients with obesity by monitoring biomedical variables.


2019 ◽  
Vol 16 (10) ◽  
pp. 4425-4430 ◽  
Author(s):  
Devendra Prasad ◽  
Sandip Kumar Goyal ◽  
Avinash Sharma ◽  
Amit Bindal ◽  
Virendra Singh Kushwah

Machine Learning is a growing area of computer science. This article focuses on prediction analysis using the K-Nearest Neighbors (KNN) Machine Learning algorithm. Data in the dataset are processed, analyzed and predicted using this algorithm. Various Machine Learning algorithms are introduced, and their pros and cons are discussed. The KNN algorithm is studied in detail and implemented on the specified data with certain parameters. The research work elucidates prediction analysis and explains how the quality of restaurants is predicted.
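A minimal sketch of KNN classification follows: find the k training points closest to the query and take a majority vote over their labels. The features (a food score and a service score), the labels and the choice of k = 3 are hypothetical; the abstract does not specify the dataset or the parameters used.

```python
import math
from collections import Counter
from typing import List, Sequence, Tuple

Sample = Tuple[Sequence[float], str]  # (feature vector, class label)


def knn_predict(train: List[Sample], query: Sequence[float], k: int = 3) -> str:
    """Return the majority label among the k nearest training samples."""
    # Sort by Euclidean distance to the query and keep the k closest.
    neighbours = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Hypothetical restaurant data: (food score, service score) on a 0-5 scale.
train: List[Sample] = [
    ((4.5, 4.0), "good"),
    ((4.0, 4.5), "good"),
    ((4.8, 4.7), "good"),
    ((2.0, 1.5), "poor"),
    ((1.5, 2.5), "poor"),
    ((2.5, 2.0), "poor"),
]

print(knn_predict(train, (4.2, 4.1)))
print(knn_predict(train, (2.0, 2.0)))
```

Using an odd k with two classes avoids tied votes. Features on different scales would normally be normalised before distances are computed.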


2020 ◽  
Author(s):  
Lisa Laux ◽  
Marie F.A. Cutiongco ◽  
Nikolaj Gadegaard ◽  
Bjørn Sand Jensen

Abstract: Automatic profiling of cell morphology is a powerful tool for inferring cell function. However, this technique retains a high barrier to entry. In particular, configuring image processing parameters for optimal cell profiling is susceptible to cognitive biases and dependent on user experience. Here, we use interactive machine learning to identify the optimum cell profiling configuration that maximises the quality of the cell profiling outcome. The process is guided by the user, from whom a rating of the quality of a cell profiling configuration is obtained. We use Bayesian optimisation, an established machine learning algorithm, to learn from this information and automatically recommend the next configuration to examine, with the aim of maximising the quality of the processing or analysis. Compared to existing interactive machine learning tools that require domain expertise for per-class or per-pixel annotations, we rely on users' explicit assessment of the output quality of the cell profiling task at hand. We validated our interactive approach against the standard human trial-and-error scheme to optimise an object segmentation task using the standard software CellProfiler. Our toolkit enabled rapid optimisation of an object segmentation pipeline, increasing the quality of object segmentation over a pipeline optimised through trial-and-error. Users also attested to the ease of use and reduced cognitive load enabled by our machine learning strategy over the standard approach. We envision that our interactive machine learning approach can enhance the quality and efficiency of pipeline optimisation to democratise image-based cell profiling.
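The recommend-rate-update loop can be sketched for a single parameter as follows. This is one common way to set up Bayesian optimisation: a zero-mean Gaussian process surrogate with an RBF kernel and an upper-confidence-bound acquisition rule. The kernel, the acquisition rule, the length scale, the parameter range and the ratings are all assumptions for illustration, not details taken from the toolkit.

```python
import math
from typing import List, Sequence, Tuple


def rbf(a: float, b: float, length: float = 0.2) -> float:
    """RBF kernel (assumed): similarity decays with squared distance."""
    return math.exp(-((a - b) ** 2) / (2 * length ** 2))


def solve(A: List[List[float]], b: Sequence[float]) -> List[float]:
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def posterior(xs: Sequence[float], ys: Sequence[float], x_new: float,
              noise: float = 1e-3) -> Tuple[float, float]:
    """Gaussian process posterior mean and variance at x_new (zero-mean prior)."""
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    k = [rbf(a, x_new) for a in xs]
    mean = sum(ki * ai for ki, ai in zip(k, solve(K, ys)))
    var = rbf(x_new, x_new) - sum(ki * vi for ki, vi in zip(k, solve(K, k)))
    return mean, max(var, 0.0)


def suggest(xs: Sequence[float], ys: Sequence[float],
            candidates: Sequence[float], beta: float = 2.0) -> float:
    """Pick the candidate maximising mean + beta * standard deviation."""
    def ucb(x: float) -> float:
        mean, var = posterior(xs, ys, x)
        return mean + beta * math.sqrt(var)
    return max(candidates, key=ucb)


# Hypothetical session: two configurations already rated by the user.
tried = [0.2, 0.8]      # normalised value of one image-processing parameter
ratings = [1.0, 2.0]    # user ratings of the resulting output quality
next_config = suggest(tried, ratings, candidates=[0.2, 0.5, 0.8])
print("next configuration to rate:", next_config)
```

With these numbers the untried configuration is suggested next, because its predictive uncertainty outweighs the slightly higher mean at the best-rated point. After the user rates it, the value and rating are appended to `tried` and `ratings` and the step repeats.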

