A Novel Bilingual OCR System Based on Column-Stochastic Features and SVM Classifier for the Specially Enabled

Author(s):  
Bindu Philip ◽  
R.D. Sudhaker Samuel
Keyword(s):  
Author(s):  
M. Jeyanthi ◽  
C. Velayutham

Brain-computer interfaces (BCI) play a vital role in science and technology research. Classification is a data mining technique used to predict group membership for data instances. Analyzing BCI data is challenging because feature extraction and classification of these data are more difficult than when applied to raw data. In this paper, we extract statistical Haralick features from the raw EEG data. The features are then normalized, and binning is used to improve the accuracy of the predictive models by reducing noise and eliminating some irrelevant attributes. Classification is then performed on a BCI dataset using several techniques: Naïve Bayes, a k-nearest neighbor classifier, and an SVM classifier. Finally, we propose the SVM classification algorithm for the BCI dataset.
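
The normalization and binning steps named in this abstract can be illustrated with a minimal sketch. The abstract does not say which normalization or binning scheme the authors used, so min-max scaling and equal-width bins are assumptions here, and the function names and sample values are hypothetical.

```python
def min_max_normalize(values):
    """Rescale a list of numbers to the range [0, 1] (assumed scheme)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant feature has no spread; map every value to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def equal_width_bin(values, n_bins):
    """Map values already in [0, 1] to integer bin indices 0..n_bins-1."""
    # min() keeps the value 1.0 in the last bin instead of opening an extra one.
    return [min(int(v * n_bins), n_bins - 1) for v in values]


# Made-up values of a single feature, not taken from the paper.
feature = [2.0, 4.0, 6.0, 10.0]
normalized = min_max_normalize(feature)
binned = equal_width_bin(normalized, n_bins=4)
print(normalized)
print(binned)
```

The binned indices, rather than the raw values, would then be passed to whichever classifier is being compared.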


2020 ◽  
Author(s):  
Nalika Ulapane ◽  
Karthick Thiyagarajan ◽  
Sarath Kodagoda

Classification has become a vital task in modern machine learning and Artificial Intelligence applications, including smart sensing. Numerous machine learning techniques are available to perform classification. Similarly, numerous practices, such as feature selection (i.e., selection of a subset of descriptor variables that optimally describe the output), are available to improve classifier performance. In this paper, we consider the case of a given supervised learning classification task that has to be performed making use of continuous-valued features. It is assumed that an optimal subset of features has already been selected. Therefore, no further feature reduction, or feature addition, is to be carried out. Then, we attempt to improve the classification performance by passing the given feature set through a transformation that produces a new feature set which we have named the “Binary Spectrum”. Via a case study example done on some Pulsed Eddy Current sensor data captured from an infrastructure monitoring task, we demonstrate how the classification accuracy of a Support Vector Machine (SVM) classifier increases through the use of this Binary Spectrum feature, indicating the feature transformation’s potential for broader usage.


2020 ◽  
Vol 20 ◽  
Author(s):  
Hongwei Zhang ◽  
Steven Wang ◽  
Tao Huang

Aims: We would like to identify the biomarkers for chronic hypersensitivity pneumonitis (CHP) and facilitate the precise gene therapy of CHP. Background: Chronic hypersensitivity pneumonitis (CHP) is an interstitial lung disease caused by hypersensitive reactions to inhaled antigens. Clinically, differentiating between CHP and other interstitial lung diseases, especially idiopathic pulmonary fibrosis (IPF), is challenging. Objective: In this study, we analyzed the publicly available gene expression profiles of 82 CHP patients, 103 IPF patients, and 103 control samples to identify the CHP biomarkers. Method: The CHP biomarkers were selected with advanced feature selection methods: Monte Carlo Feature Selection (MCFS) and Incremental Feature Selection (IFS). A Support Vector Machine (SVM) classifier was built. Then, we analyzed these CHP biomarkers through functional enrichment analysis and differential co-expression analysis. Result: There were 674 identified CHP biomarkers. The co-expression network of these biomarkers in CHP included more negative regulations, and the network structure of CHP was quite different from the networks of IPF and control. Conclusion: The SVM classifier may serve as an important clinical tool to address the challenging task of differentiating between CHP and IPF. Many of the biomarker genes on the differential co-expression network showed great promise in revealing the underlying mechanisms of CHP.
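
As a rough illustration of Incremental Feature Selection, the sketch below scores the nested top-k subsets of a ranked feature list and keeps the best one. It does not reproduce the MCFS ranking or the authors' SVM evaluation: the ranking is taken as given, and `toy_score` and the gene names are hypothetical stand-ins for a cross-validated classifier score.

```python
def incremental_feature_selection(ranked_features, evaluate):
    """Return (best_subset, best_score) over the nested top-k subsets.

    ranked_features: feature names ordered from most to least relevant.
    evaluate: callable scoring a feature subset (higher is better),
              for example cross-validated classifier accuracy.
    """
    best_subset, best_score = [], float("-inf")
    for k in range(1, len(ranked_features) + 1):
        subset = ranked_features[:k]
        score = evaluate(subset)
        # Strict '>' keeps the smallest subset when scores tie.
        if score > best_score:
            best_subset, best_score = subset, score
    return best_subset, best_score


def toy_score(subset):
    """Hypothetical score: two informative genes, small penalty per feature."""
    informative = {"geneA", "geneB"}
    hits = len(informative.intersection(subset))
    return hits - 0.1 * len(subset)


ranking = ["geneA", "geneB", "geneC", "geneD"]
subset, score = incremental_feature_selection(ranking, toy_score)
print(subset, round(score, 2))
```

In practice `evaluate` would train and validate a classifier on the selected columns, which is where most of the run time goes.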


Author(s):  
B. Venkatesh ◽  
J. Anuradha

In microarray data, high classification accuracy is difficult to achieve because of high dimensionality and irrelevant, noisy data. Such data also contain many gene expression values but few samples. To increase the classification accuracy and the processing speed of the model, an optimal number of features needs to be extracted, which can be achieved by applying a feature selection method. In this paper, we propose a hybrid ensemble feature selection method. The proposed method has two phases: a filter phase and a wrapper phase. In the filter phase, an ensemble technique aggregates the feature ranks of the Relief, minimum Redundancy Maximum Relevance (mRMR), and Feature Correlation (FC) filter feature selection methods. Fuzzy Gaussian membership function ordering is used to aggregate the ranks. In the wrapper phase, Improved Binary Particle Swarm Optimization (IBPSO) selects the optimal features, and an RBF kernel-based Support Vector Machine (SVM) classifier is used as the evaluator. The performance of the proposed model is compared with state-of-the-art feature selection methods on five benchmark datasets. Performance is evaluated with Accuracy, Recall, Precision, and F1-Score. The experimental results show that the proposed method outperforms the other feature selection methods.
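
The four evaluation metrics listed in this abstract follow directly from the confusion-matrix counts. The sketch below shows the standard binary definitions. The labels are made-up, and the abstract does not say how multi-class datasets were averaged, so only the binary case is shown.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for a binary task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Guard against empty denominators when a class is never predicted.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Made-up labels for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```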


Diagnostics ◽  
2021 ◽  
Vol 11 (5) ◽  
pp. 739
Author(s):  
Alessandro Bevilacqua ◽  
Margherita Mottola ◽  
Fabio Ferroni ◽  
Alice Rossi ◽  
Giampaolo Gavelli ◽  
...  

Predicting clinically significant prostate cancer (csPCa) is crucial in PCa management. 3T-magnetic resonance (MR) systems may accordingly have a novel role in quantitative imaging and early csPCa prediction. In this study, we develop a radiomic model for predicting csPCa based solely on native b2000 diffusion weighted imaging (DWIb2000) and debate the effectiveness of apparent diffusion coefficient (ADC) in the same task. In total, 105 patients were retrospectively enrolled between January and November 2020, with confirmed csPCa or ncsPCa based on biopsy. DWIb2000 and ADC images acquired with a 3T-MRI were analyzed by computing 84 local first-order radiomic features (RFs). Two predictive models were built based on DWIb2000 and ADC, separately. Relevant RFs were selected through LASSO, and a support vector machine (SVM) classifier was trained using repeated 3-fold cross validation (CV) and validated on a holdout set. The SVM models rely on a single pair of uncorrelated RFs (ρ < 0.15) selected through Wilcoxon rank-sum test (p ≤ 0.05) with Holm–Bonferroni correction. On the holdout set, while the ADC model yielded AUC = 0.76 (95% CI, 0.63–0.96), the DWIb2000 model reached AUC = 0.84 (95% CI, 0.63–0.90), with specificity = 75%, sensitivity = 90%, and informedness = 0.65. This study establishes the primary role of 3T-DWIb2000 in PCa quantitative analyses, whilst ADC can remain the leading sequence for detection.
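
The reported informedness can be checked from the other two figures, since informedness (Youden's J) is sensitivity plus specificity minus one. The short sketch below reproduces that arithmetic with the values from the abstract. The function name is only illustrative.

```python
def informedness(sensitivity, specificity):
    """Youden's J statistic: sensitivity + specificity - 1, in [-1, 1]."""
    return sensitivity + specificity - 1.0


# Values reported for the DWIb2000 model on the holdout set.
j = informedness(sensitivity=0.90, specificity=0.75)
print(round(j, 2))
```

A value of 0 corresponds to chance-level discrimination and 1 to a perfect test.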


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Frederick S. Vizeacoumar ◽  
Hongyu Guo ◽  
Lynn Dwernychuk ◽  
Adnan Zaidi ◽  
Andrew Freywald ◽  
...  

Gastro-esophageal (GE) cancers are one of the major causes of cancer-related death in the world. There is a need for novel biomarkers in the management of GE cancers, to yield predictive response to the available therapies. Our study aims to identify leading genes that are differentially regulated in patients with these cancers. We explored the expression data for those genes whose protein products can be detected in the plasma using the Cancer Genome Atlas to identify leading genes that are differentially regulated in patients with GE cancers. Our work predicted several candidates as potential biomarkers for distinct stages of GE cancers, including previously identified CST1, INHBA, STMN1, whose expression correlated with cancer recurrence, or resistance to adjuvant therapies or surgery. To define the predictive accuracy of these genes as possible biomarkers, we constructed a co-expression network and performed complex network analysis to measure the importance of the genes in terms of a ratio of closeness centrality (RCC). Furthermore, to measure the significance of these differentially regulated genes, we constructed an SVM classifier using a machine learning approach and verified these genes by using the receiver operating characteristic (ROC) curve as an evaluation metric. The area under the curve measure was > 0.9 for both the overexpressed and downregulated genes, suggesting the potential use and reliability of these candidates as biomarkers. In summary, we identified leading differentially expressed genes in GE cancers that can be detected in the plasma proteome. These genes have potential to become diagnostic and therapeutic biomarkers for early detection of cancer, recurrence following surgery and for development of targeted treatment.
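
The area under the ROC curve used as the evaluation metric here equals the probability that a randomly chosen positive sample receives a higher classifier score than a randomly chosen negative one. The sketch below computes it by direct pairwise comparison. The labels and scores are made-up, and this is not the authors' evaluation code.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly.

    labels: 1 for the positive class, 0 for the negative class.
    scores: classifier scores, higher meaning more likely positive.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one sample of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(pos) * len(neg))


# Made-up scores for illustration only.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(round(auc, 3))
```

The pairwise loop is quadratic in the number of samples, which is fine for a small illustration. Rank-based formulas give the same value more efficiently on large datasets.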


2021 ◽  
Vol 1767 (1) ◽  
pp. 012012
Author(s):  
R Muralidharan ◽  
T Kanagasabapathy ◽  
R P Vijai Ganesh
Keyword(s):  
