Classification of Biodegradable Substances Using Balanced Random Trees and Boosted C5.0 Decision Trees

Author(s):  
Alaa M. Elsayad ◽  
Ahmed M. Nassef ◽  
Mujahed Al-Dhaifallah ◽  
Khaled A. Elsayad

Substances that do not degrade over time have proven harmful to the environment and dangerous to living organisms. Being able to predict the biodegradability of substances without costly experiments is therefore useful. Recently, quantitative structure–activity relationship (QSAR) models have offered effective solutions to this problem. However, molecular descriptor datasets usually suffer from unbalanced class distributions, which adversely affect the efficiency and generalization of the derived models. Accordingly, this study validates the performance of balanced random trees (RTs) and boosted C5.0 decision trees (DTs) for constructing QSAR models that classify the ready biodegradation of substances, and assesses their ability to deal with unbalanced data. The balanced RTs algorithm builds individual trees using balanced bootstrap samples, while the boosted C5.0 DT is modeled using cost-sensitive learning. We employed the two-dimensional molecular descriptor dataset that is publicly available through the University of California, Irvine (UCI) machine learning repository. The molecular descriptors were ranked according to their contributions to the balanced RTs classification process. The performance of the proposed models was compared with previously reported results. Based on the statistical measures, the experimental results showed that the proposed models outperform the classification results of the support vector machine (SVM), K-nearest neighbors (KNN), and discriminant analysis (DA). Classification measures were analyzed in terms of accuracy, sensitivity, specificity, precision, false positive rate, false negative rate, F1 score, receiver operating characteristic (ROC) curve, and area under the ROC curve (AUROC).
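The abstract states that the balanced RTs algorithm builds each tree from a balanced bootstrap sample but does not give the exact sampling scheme. The following is a minimal sketch of one common way to draw such a sample, assuming each class is resampled with replacement to the size of the smallest class. The function name `balanced_bootstrap` and the toy labels are hypothetical and are not taken from the paper.

```python
import random
from collections import defaultdict


def balanced_bootstrap(labels, rng):
    """Return indices of a bootstrap sample with equal counts per class.

    Assumption: every class is drawn with replacement to the size of the
    smallest class, so each tree trains on a balanced sample.
    """
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    n_per_class = min(len(idxs) for idxs in by_class.values())
    sample = []
    for idxs in by_class.values():
        # random.Random.choices samples with replacement.
        sample.extend(rng.choices(idxs, k=n_per_class))
    return sample


# Toy labels (hypothetical): 8 majority-class and 3 minority-class substances.
labels = ["NRB"] * 8 + ["RB"] * 3
sample = balanced_bootstrap(labels, random.Random(0))
counts = {c: sum(labels[i] == c for i in sample) for c in set(labels)}
```

In an ensemble, a fresh sample like this would be drawn for every tree, so the majority class is seen in full across the forest even though each tree sees only part of it.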

Author(s):  
Srinivas Gutta ◽  
Ibrahim F. Imam ◽  
Harry Wechsler

Hand gestures are the natural form of communication among people, yet human-computer interaction is still limited to mouse movements. The use of hand gestures in the field of human-computer interaction has attracted renewed interest in the past several years. Special glove-based devices have been developed to analyze finger and hand motion and use them to manipulate and explore virtual worlds. To further enrich the naturalness of the interaction, different computer vision-based techniques have been developed. At the same time, the need for more efficient systems has resulted in new gesture recognition approaches. In this paper we present a hybrid intelligent system for hand gesture recognition. The hybrid approach consists of an ensemble of connectionist networks, namely radial basis functions (RBF), and inductive decision trees (AQDT). Cross Validation (CV) experimental results yield a false negative rate of 1.7% and a false positive rate of 1% when the evaluation takes place on a database of 150 images corresponding to 15 gestures of 5 subjects. In order to assess the robustness of the system, the vocabulary of gestures was increased from 15 to 25 and the size of the database from 150 to 750 images, now corresponding to 15 subjects. On this larger database, Cross Validation (CV) experimental results yield a false negative rate of 3.6% and a false positive rate of 1.8%. The benefits of our hybrid architecture include (i) robustness via query by consensus as provided by ensembles of networks when facing the inherent variability of the image formation and data acquisition process, (ii) classifications made using decision trees, (iii) flexible and adaptive thresholds as opposed to ad hoc and hard thresholds and (iv) interpretability of the way classification and retrieval is eventually achieved.


2014 ◽  
Vol 41 (4) ◽  
pp. 294-303 ◽  
Author(s):  
Robert Richard Harvey ◽  
Edward Arthur McBean

Closed-circuit television inspections of sewer condition deterioration as required for proactive management are expensive and hence limited to portions of a sewer network. The data mining approach presented herein is shown capable of unlocking information contained within inspection records and enhances existing pipe inspection practices currently used in the wastewater industry. Predictive models developed using the random forests algorithm are found capable of predicting individual sewer pipe condition so that uninspected pipes in a sewer network with the greatest likelihood of being in a structurally defective condition state are identified for future rounds of inspection. Complications posed by imbalance between classes common within inspection datasets are overcome by first establishing the classification task in a binary format (where pipes are in either good or bad structural condition) and then using the receiver-operating characteristic (ROC) curve to establish alternative cutoffs for the predicted class probability. The random forests algorithm achieved a stratified test set false negative rate of 18%, false positive rate of 27% and an excellent area under the ROC curve of 0.81 in a case study application to the City of Guelph, Ontario, Canada. The novel inclusion of condition information of pipes attached at either the upstream or downstream manholes of an individual pipe enhances the predictive power for bad pipes representing the minority class of interest (reducing the false negative rate to 11%, reducing the false positive rate to 25% and increasing the area under the ROC curve to 0.85). An area under the ROC curve >0.80 indicates random forests are an “excellent” choice for predicting the condition of individual pipes in a sewer network.
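The idea of using the ROC curve to choose an alternative cutoff for the predicted class probability can be sketched as follows. This is an illustrative sketch only: the helper names `rates_at_cutoff` and `pick_cutoff`, the selection rule (highest cutoff that keeps the false negative rate within a bound), and the toy scores are assumptions, not the procedure or data of the study. It also assumes both classes are present in the labels.

```python
def rates_at_cutoff(y_true, scores, cutoff):
    """Return (false_positive_rate, false_negative_rate) when the positive
    ("bad pipe") class is predicted for scores >= cutoff."""
    tp = fp = tn = fn = 0
    for y, s in zip(y_true, scores):
        pred = s >= cutoff
        if y and pred:
            tp += 1
        elif y and not pred:
            fn += 1
        elif not y and pred:
            fp += 1
        else:
            tn += 1
    return fp / (fp + tn), fn / (fn + tp)


def pick_cutoff(y_true, scores, max_fnr):
    """Highest cutoff whose false negative rate stays within max_fnr.

    Returns (cutoff, fpr, fnr), or None if no cutoff qualifies.
    """
    for cutoff in sorted(set(scores), reverse=True):
        fpr, fnr = rates_at_cutoff(y_true, scores, cutoff)
        if fnr <= max_fnr:
            return cutoff, fpr, fnr
    return None


# Toy data (hypothetical): 3 bad pipes (1) among 10, with predicted probabilities.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.3, 0.7, 0.4, 0.2, 0.2, 0.1, 0.1, 0.05]

default_rates = rates_at_cutoff(y_true, scores, 0.5)
chosen = pick_cutoff(y_true, scores, max_fnr=0.0)
```

On this toy data the default cutoff of 0.5 misses one of the three bad pipes, while lowering the cutoff to 0.3 catches all of them at the cost of one more false alarm, which is the trade-off each point on an ROC curve represents.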


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Yun Zuo ◽  
Jianyuan Lin ◽  
Xiangxiang Zeng ◽  
Quan Zou ◽  
Xiangrong Liu

Abstract. Background: Carbonylation is a non-enzymatic, irreversible protein post-translational modification in which the side chains of amino acid residues are attacked by reactive oxygen species and finally converted into carbonyl products. Studies have shown that protein carbonylation caused by reactive oxygen species is involved in the etiology and pathophysiological processes of aging, neurodegenerative diseases, inflammation, diabetes, amyotrophic lateral sclerosis, Huntington’s disease, and tumors. Current experimental approaches used to predict carbonylation sites are expensive, time-consuming, and limited in protein processing abilities. Computational prediction of the carbonylation residue location in protein post-translational modifications enhances the functional characterization of proteins. Results: In this study, an integrated classifier algorithm, CarSite-II, was developed to identify K, P, R, and T carbonylated sites. The resampling method K-means similarity-based undersampling and the synthetic minority oversampling technique (SMOTE-KSU) were incorporated to balance the proportions of K, P, R, and T carbonylated training samples. Next, the integrated classifier system Rotation Forest uses “support vector machine” subclassifications to divide three types of feature spaces into several subsets. By tenfold cross-validation, CarSite-II achieved Matthews correlation coefficient (MCC) values of 0.2287/0.3125/0.2787/0.2814, false positive rate values of 0.2628/0.1084/0.1383/0.1313, and false negative rate values of 0.2252/0.0205/0.0976/0.0608 for K/P/R/T carbonylation sites, respectively. On our independent test dataset, CarSite-II yielded MCC values of 0.6358/0.2910/0.4629/0.3685, false positive rate values of 0.0165/0.0203/0.0188/0.0094, and false negative rate values of 0.1026/0.1875/0.2037/0.3333 for K/P/R/T carbonylation sites. The results show that CarSite-II achieves remarkably better performance than all currently available prediction tools.
Conclusion: The results revealed that CarSite-II achieved better performance than the five currently available programs, and demonstrated the usefulness of the SMOTE-KSU resampling approach and the integration algorithm. For the convenience of experimental scientists, the CarSite-II web tool is available at http://47.100.136.41:8081/
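For readers unfamiliar with the Matthews correlation coefficient reported above, a minimal sketch of the standard formula computed from confusion-matrix counts may help. The function name `mcc` and the example counts are illustrative assumptions and are not taken from the CarSite-II experiments.

```python
import math


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when any marginal total is zero, a common convention
    for the otherwise undefined case.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Balanced toy case (hypothetical): 10% error in each class.
balanced = mcc(tp=90, fp=10, tn=90, fn=10)

# Imbalanced toy case (hypothetical): the same 10% error rates,
# but only 10 positives against 1000 negatives.
imbalanced = mcc(tp=9, fp=100, tn=900, fn=1)
```

With identical false positive and false negative rates, the balanced case scores 0.8 while the imbalanced case scores only about 0.26, which illustrates why MCC can remain modest on imbalanced site-prediction data even when both error rates look low.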


Kybernetes ◽  
2016 ◽  
Vol 45 (6) ◽  
pp. 977-994 ◽  
Author(s):  
Oluyinka Aderemi Adewumi ◽  
Ayobami Andronicus Akinyelu

Purpose – Phishing is one of the major challenges faced by the world of e-commerce today. Owing to phishing attacks, billions of dollars have been lost by many companies and individuals. The global impact of phishing attacks will continue to increase, and thus a more efficient phishing detection technique is required. The purpose of this paper is to investigate and report the use of a nature-inspired machine learning (ML) approach in the classification of phishing e-mails. Design/methodology/approach – ML-based techniques have been shown to be efficient in detecting phishing attacks. In this paper, the firefly algorithm (FFA) was integrated with a support vector machine (SVM) with the primary aim of developing an improved phishing e-mail classifier (known as FFA_SVM), capable of accurately detecting new phishing patterns as they occur. From a data set consisting of 4,000 phishing and ham e-mails, a set of features suitable for phishing e-mail detection was extracted and used to construct the hybrid classifier. Findings – The FFA_SVM was applied to a data set consisting of up to 4,000 phishing and ham e-mails. Simulation experiments were performed to evaluate and compare the performance of the classifier. The tests yielded a classification accuracy of 99.94 percent, a false positive rate of 0.06 percent and a false negative rate of 0.04 percent. Originality/value – To the best of the authors’ knowledge, the hybrid algorithm has not previously been applied, as in this work, to the classification and detection of phishing e-mails.


2018 ◽  
Vol 61 (2) ◽  
pp. 469-479 ◽  
Author(s):  
Chao Zhou ◽  
Chuanheng Sun ◽  
Kai Lin ◽  
Daming Xu ◽  
Qiang Guo ◽  
...  

Abstract. In aquaculture, almost all images collected of an aquaculture scene contain reflections, which often affect the results and accuracy of machine vision. Classifying these images and obtaining images of interest are key to subsequent image processing. The purpose of this study was to identify useful images and remove images that had a substantial effect on the results of image processing for computer vision in aquaculture. In this study, a method for classification of reflective frames based on image texture and a support vector machine (SVM) was proposed for an actual aquaculture site. Objectives of this study were to: (1) develop an algorithm to improve the speed of the method and to ensure that the method has a high classification accuracy, (2) design an algorithm to improve the intelligence and adaptability of the classification, and (3) demonstrate the performance of the method. The results show that the average classification accuracy, false positive rate, and false negative rate for two types of reflective frames (type I and II) were 96.34%, 4.65%, and 2.23%, respectively. In addition, the running time was very low (1.25 s). This strategy also displayed considerable adaptability and could be used to obtain useful images or remove images that have substantial effects on the accuracy of image processing results, thereby improving the applicability of computer vision in aquaculture. Keywords: Aquaculture, Genetic algorithm, Gray level-gradient co-occurrence matrix, Principal component analysis, Reflection frame, Support vector machine.


2020 ◽  
Vol 22 (1) ◽  
pp. 25-29
Author(s):  
Zubayer Ahmad ◽  
Mohammad Ali ◽  
Kazi Israt Jahan ◽  
ABM Khurshid Alam ◽  
G M Morshed

Background: Biliary disease is one of the most common surgical problems encountered all over the world. Ultrasound is widely accepted for the diagnosis of biliary system disease. However, it is a highly operator-dependent imaging modality, and its diagnostic success is also influenced by factors such as non-fasting, obesity, and intestinal gas. Objective: To compare the ultrasonographic findings with the peroperative findings in biliary surgery. Methods: This prospective study was conducted in General Hospital, Comilla, between July 2006 and June 2008 among 300 patients with biliary diseases for which operative treatment was planned. Sonographic findings were compared with operative findings. Results: Right hypochondriac pain and jaundice were the two significant symptoms (93% and 15%). Right hypochondriac tenderness, jaundice and palpable gallbladder were the most valuable physical findings (40%, 15% and 5%, respectively). Out of 252 ultrasonically positive gallbladders, stones were confirmed in 249 cases preoperatively. Sensitivity of USG in the diagnosis of gallstone disease was 100%. There was, however, a 25% false positive rate, so specificity was 75% in this case. USG could demonstrate stones in the common bile duct in only 12 out of 30 cases. Sensitivity of the test in diagnosing common bile duct stones was 40%, with a false negative rate of 60%. In the series, ultrasonography sensitivity was 100% in diagnosing stones in the cystic duct. USG could detect the presence of chronic cholecystitis (92.3%) and worms inside the gallbladder (50%) with relatively good but lower sensitivity. Conclusion: Ultrasonography is the most important investigation in the diagnosis of biliary disease and a useful test for patients undergoing operative management, for planning and for anticipating technical difficulties. Journal of Surgical Sciences (2018) Vol. 22 (1): 25-29
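The common bile duct figures in this abstract (stones demonstrated in 12 of 30 cases) show how sensitivity and false negative rate follow from the same two counts. The short sketch below reproduces that arithmetic; the function name `sensitivity_and_fnr` is a hypothetical helper, and only the counts 12 and 30 come from the abstract.

```python
def sensitivity_and_fnr(detected, total_with_condition):
    """Return (sensitivity, false_negative_rate) for a diagnostic test.

    detected: cases with the condition that the test flagged (true positives)
    total_with_condition: all cases confirmed to have the condition
    """
    missed = total_with_condition - detected  # false negatives
    sensitivity = detected / total_with_condition
    false_negative_rate = missed / total_with_condition
    return sensitivity, false_negative_rate


# Counts from the abstract: 12 of 30 common bile duct stones were detected.
sens, fnr = sensitivity_and_fnr(detected=12, total_with_condition=30)
```

Because both rates share the same denominator, they always sum to 1, which is why the reported 40% sensitivity implies the 60% false negative rate.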


2020 ◽  
Vol 4 (Supplement_1) ◽  
pp. 259-260
Author(s):  
Laura Curtis ◽  
Lauren Opsasnick ◽  
Julia Yoshino Benavente ◽  
Cindy Nowinski ◽  
Rachel O’Conor ◽  
...  

Abstract. Early detection of cognitive impairment (CI) is imperative to identify potentially treatable underlying conditions or to provide supportive services when CI is due to progressive conditions such as Alzheimer’s disease. While primary care settings are ideal for identifying CI, it frequently goes undetected. We developed ‘MyCog’, a brief technology-enabled, 2-step assessment to detect CI and dementia in primary care settings. We piloted MyCog in 80 participants aged 65 and older recruited from an ongoing cognitive aging study. Cases were identified either by a documented diagnosis of dementia or mild cognitive impairment (MCI) or based on a comprehensive cognitive battery. Administered via an iPad, Step 1 consists of a single self-report item indicating concern about memory or other thinking problems, and Step 2 includes two cognitive assessments from the NIH Toolbox: Picture Sequence Memory (PSM) and Dimensional Change Card Sorting (DCCS). Of the participants, 39% (31/80) were considered cognitively impaired. Flagging those who expressed concern in Step 1 (n=52, 66%) resulted in a 37% false positive rate and a 3% false negative rate. With the addition of the PSM and DCCS assessments in Step 2, the paradigm demonstrated 91% sensitivity, 75% specificity and an area under the ROC curve (AUC) of 0.82. Steps 1 and 2 had an average administration time of <7 minutes. We continue to optimize MyCog by 1) examining additional items for Step 1 to reduce the false positive rate and 2) creating a self-administered version to optimize use in clinical settings. With further validation, MyCog offers a practical, scalable paradigm for the routine detection of cognitive impairment and dementia.


2021 ◽  
Vol 19 (1) ◽  
Author(s):  
Baiba Līcīte ◽  
Arvīds Irmejs ◽  
Jeļena Maksimenko ◽  
Pēteris Loža ◽  
Genādijs Trofimovičs ◽  
...  

Abstract. Background: The aim of the study is to evaluate the role of ultrasound-guided fine needle aspiration cytology (FNAC) in the restaging of node-positive breast cancer after preoperative systemic therapy (PST). Methods: From January 2016 to October 2020, 106 node-positive stage IIA-IIIC breast cancer cases undergoing PST were included in the study. Eighteen (17%) were carriers of a pathogenic variant in BRCA1/2. After PST, restaging of the axilla was performed with ultrasound and FNAC of the marked and/or the most suspicious axillary node. Axilla-conserving surgery was performed in 72/106 cases and axillary lymph node dissection (ALND) in 34/106 cases. Results: The false positive rate (FPR) of FNAC after PST in the whole cohort and in the BRCA1/2-positive subgroup was 8% and 0%, and the false negative rate (FNR) was 43% and 18%, respectively. Overall sensitivity was 55%, specificity 93%, and accuracy 70%. Conclusion: FNAC after PST has a low FPR and is useful for predicting residual axillary disease and for streamlining surgical decision making regarding ALND in both BRCA1/2-positive and BRCA1/2-negative subgroups. The FNR is high in the overall cohort, and FNAC alone is not able to predict ypCR and the omission of further axillary surgery. However, FNAC performance in the BRCA1/2-positive subgroup is more promising, and further research with a larger number of cases is necessary to confirm the results.


2021 ◽  
Vol 13 (6) ◽  
pp. 1211
Author(s):  
Pan Fan ◽  
Guodong Lang ◽  
Bin Yan ◽  
Xiaoyan Lei ◽  
Pengju Guo ◽  
...  

In recent years, many agriculture-related problems have been evaluated with the integration of artificial intelligence techniques and remote sensing systems. The rapid and accurate identification of apple targets in an illuminated and unstructured natural orchard is still a key challenge for the picking robot’s vision system. In this paper, by combining local image features and color information, we propose a pixel patch segmentation method based on gray-centered red–green–blue (RGB) color space to address this issue. Different from the existing methods, this method presents a novel color feature selection method that accounts for the influence of illumination and shadow in apple images. By exploring both color features and local variation in apple images, the proposed method could effectively distinguish the apple fruit pixels from other pixels. Compared with the classical segmentation methods and conventional clustering algorithms as well as the popular deep-learning segmentation algorithms, the proposed method can segment apple images more accurately and effectively. The proposed method was tested on 180 apple images. It offered an average accuracy rate of 99.26%, recall rate of 98.69%, false positive rate of 0.06%, and false negative rate of 1.44%. Experimental results demonstrate the outstanding performance of the proposed method.


2013 ◽  
Vol 694-697 ◽  
pp. 1987-1992 ◽  
Author(s):  
Xing Gang Wu ◽  
Cong Guo

This paper proposes an approach to identifying vehicles under variation in image size, illumination, and view angle across different cameras, using a support vector machine with weighted random trees (WRT-SVM). By quantizing the scale-invariant features of image pairs with the weighted random trees, the identification problem is formulated as a same-different classification problem. Results show the efficiency of building the randomized trees, which is attributed to the sample weights, and the control of the false-positive rate of the identification system.

