A systematic feature selection process for a Sinhala character recognition system

Author(s):  
Titus Nanda Kumara ◽  
Roshan Ragel
2020 ◽  
Author(s):  
Damodara Krishna Kishore Galla ◽  
BabuReddy Mukamalla ◽  
Rama Prakasha Reddy Chegireddy

Abstract: Blind people have difficulty identifying objects moving around them, so an object detection and human face recognition system with high accuracy would help them identify their surroundings with ease. Face images remain a difficult task for biometric authentication systems because of variation in characteristics such as dimensions, pose, expression, illumination, and age. Images of faces combined with other content also contain objects of several different classes. This research article presents a minimum distance trainer for feature selection that draws on an SVM feature optimization process. A support vector machine (SVM) was used in the feature selection process to improve feature interpretability and computational efficiency, and a LASSO classifier was then applied to perform object recognition and gender classification. An original face image database was used for the gender classification. The approach was implemented as a dual classification model: (1) recognizing or classifying human faces among various objects and (2) classifying gender through face recognition. It combines a modified SIFT feature with classification based on ridge regression (RR), elastic net (EN), lasso regression (LR), and lasso regression with Gaussian support vector machines (LRGS).


2018 ◽  
Vol 7 (2) ◽  
pp. 43
Author(s):  
Abir Alharbi

Handwriting recognition systems are a dynamic field of research in artificial intelligence. Many smart devices on the market, such as pen-based computers, tablets, and mobiles with handwriting recognition technology, rely on efficient handwriting recognition systems. In this paper we present a novel Arabic handwritten character recognition system based on a hybrid method consisting of a genetic algorithm and a learning vector quantization (LVQ) neural network. Sixty different handwritten Arabic character datasets are used for training the neural network. Each character dataset contains 28 letters written twice with 15 distinct shaped alphabets, and each handwritten Arabic letter is represented by a binary matrix that is used as an input to a genetic algorithm for feature selection and dimension reduction, so that only the most effective features are fed to the LVQ classifier. The recognition process in the system involves several essential steps: handwritten letter acquisition, dataset preparation, feature selection, training, and recognition. Comparing our results to those obtained with the whole feature dataset without selection, and to the results of other classification algorithms, confirms the effectiveness of our proposed handwriting recognition system, which reaches an accuracy of 95.4% and shows promising potential for improving future handwritten Arabic recognition devices on the market.
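The genetic-algorithm feature selection step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the nearest-centroid fitness function stands in for the LVQ classifier, and the toy data, the per-feature penalty of 0.01, and all function names are assumptions made for illustration.

```python
import random

def nearest_centroid_accuracy(samples, labels, mask):
    """Hypothetical fitness helper: training accuracy of a nearest-centroid
    classifier (a stand-in for LVQ) using only features whose mask bit is 1."""
    idx = [i for i, bit in enumerate(mask) if bit]
    if not idx:
        return 0.0
    sums, counts = {}, {}
    for x, y in zip(samples, labels):
        acc = sums.setdefault(y, [0.0] * len(idx))
        for j, i in enumerate(idx):
            acc[j] += x[i]
        counts[y] = counts.get(y, 0) + 1
    centroids = {y: [v / counts[y] for v in s] for y, s in sums.items()}

    def sq_dist(x, c):
        return sum((x[i] - c[j]) ** 2 for j, i in enumerate(idx))

    correct = sum(
        1 for x, y in zip(samples, labels)
        if min(centroids, key=lambda lab: sq_dist(x, centroids[lab])) == y
    )
    return correct / len(samples)

def select_features(samples, labels, generations=20, pop_size=12, seed=0):
    """Evolve binary feature masks; returns (best_mask, best_fitness_per_generation)."""
    rng = random.Random(seed)
    n = len(samples[0])

    def fitness(mask):
        # Assumed objective: accuracy minus a small penalty per selected feature.
        return nearest_centroid_accuracy(samples, labels, mask) - 0.01 * sum(mask)

    population = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))
        parents = population[: pop_size // 2]       # keep the better half (elitism)
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)               # one-point crossover
            child = a[:cut] + b[cut:]
            child[rng.randrange(n)] ^= 1            # flip one random bit
            children.append(child)
        population = parents + children
    return max(population, key=fitness), history

# Toy binary "pixel" vectors: only feature 0 separates the two classes.
samples = [[1, 0, 1, 0], [1, 1, 0, 0], [1, 0, 0, 1],
           [0, 1, 1, 0], [0, 0, 0, 1], [0, 1, 0, 0]]
labels = ["a", "a", "a", "b", "b", "b"]
best_mask, history = select_features(samples, labels)
print(best_mask, history[-1])
```

Because the better half of each generation is carried over unchanged, the best fitness recorded in `history` never decreases from one generation to the next.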


Author(s):  
Marwa Amara ◽  
Kamel Zidi

The recognition of a character begins with analyzing its form and extracting the features that will be exploited for the identification. Primitives can be described as a tool to distinguish an object of one class from another object of another class. It is necessary to define the significant primitives during the development of an optical character recognition system. Primitives are defined by experience or by intuition. Several primitives can be extracted while some are irrelevant or redundant. The size of vector primitives can be large if a large number of primitives are extracted including redundant and irrelevant features. As a result, the performance of the recognition system becomes poor, and as the number of features increases, so does the computing time. Feature selection, therefore, is required to ensure the selection of a subset of features that gives accurate recognition and has low computational overhead. We use feature selection techniques to improve the discrimination capacity of the Multilayer Perceptron Neural Networks (MLPNNs).


2022 ◽  
Vol 13 (2) ◽  
pp. 1-20
Author(s):  
Byron Marshall ◽  
Michael Curry ◽  
Robert E. Crossler ◽  
John Correia

Survey items developed in behavioral Information Security (InfoSec) research should be practically useful in identifying individuals who are likely to create risk by failing to comply with InfoSec guidance. The literature shows that attitudes, beliefs, and perceptions drive compliance behavior and has influenced the creation of a multitude of training programs focused on improving one's InfoSec behaviors. While automated controls and directly observable technical indicators are generally preferred by InfoSec practitioners, difficult-to-monitor user actions can still compromise the effectiveness of automatic controls. For example, despite prohibition, doubtful or skeptical employees often increase organizational risk by using the same password to authenticate to corporate and external services. Analysis of network traffic or device configurations is unlikely to provide evidence of these vulnerabilities, but responses to well-designed surveys might. Guided by the relatively new IPAM model, this study administered 96 survey items from the behavioral InfoSec literature, across three separate points in time, to 217 respondents. Using systematic feature selection techniques, manageable subsets of 29, 20, and 15 items were identified and tested as predictors of non-compliance with security policy. The feature selection process validates IPAM's innovation in using nuanced self-efficacy and planning items across multiple time frames. Prediction models were trained using several ML algorithms. Practically useful levels of prediction accuracy were achieved with, for example, ensemble tree models identifying 69% of the riskiest individuals within the top 25% of the sample. The findings indicate the usefulness of psychometric items from the behavioral InfoSec literature in guiding training programs and other cybersecurity control activities, and demonstrate that they are promising as additional inputs to AI models that monitor networks for security events.
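One plausible reading of the "69% of the riskiest individuals within the top 25% of the sample" figure is a capture rate: rank respondents by predicted risk, flag the top quarter, and count what share of the truly risky respondents were flagged. The abstract does not define the metric precisely, so the sketch below is an assumption, and the scores and labels are made-up toy values.

```python
def capture_rate(scores, is_risky, top_fraction=0.25):
    """Share of truly risky individuals that fall in the top-scoring fraction."""
    ranked = sorted(zip(scores, is_risky), key=lambda pair: pair[0], reverse=True)
    k = max(1, round(len(ranked) * top_fraction))   # size of the flagged group
    total_risky = sum(is_risky)
    if total_risky == 0:
        return 0.0
    flagged_risky = sum(risky for _, risky in ranked[:k])
    return flagged_risky / total_risky

# Hypothetical predicted risk scores and ground-truth labels (1 = risky).
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
is_risky = [1, 1, 0, 0, 1, 0, 0, 0]
rate = capture_rate(scores, is_risky)
print(f"{rate:.1%} of risky respondents are in the top 25%")
```

In this toy sample the top 25% is two respondents, both risky, out of three risky respondents overall.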


Author(s):  
Liang Zhang ◽  
Jin Wen ◽  
Yimin Chen

An accurate building energy forecasting model is a key component for real-time and advanced control of building energy systems and building-to-grid integration. With the fast deployment and advancement of building automation systems, data are collected by hundreds and sometimes thousands of sensors every few minutes in buildings, which provides great potential for data-driven building energy forecasting. To develop building energy forecasting models from a large number of potential inputs, feature selection is a critical procedure to ensure model accuracy and computational efficiency. Though the theory of feature selection is well developed in the statistics and machine learning fields, it is not well studied in the application of building energy modeling. In this paper, a feature selection framework proposed in an earlier study is examined using a real campus building in Philadelphia. This feature selection framework combines domain knowledge and statistical methods and is developed for short-term data-driven building energy forecasting. In this case study, the feasibility of using this feature selection framework to develop a whole-building energy forecasting model and a chiller energy forecasting model is studied. Results show that, for both whole-building and chiller energy forecasting applications, the model with the systematic feature selection process performs better (in terms of cross-validation error of the forecasted output) than other models, including those with conventional inputs and those that use only a single feature selection technique.


Author(s):  
Manish M. Kayasth ◽  
Bharat C. Patel

A character recognition system is logically divided into stages such as scanning, pre-processing, classification, processing, and post-processing. In the targeted system, the scanned image first passes through the pre-processing modules and then through feature extraction and classification in order to achieve a high recognition rate. This paper focuses mainly on feature extraction and classification techniques, which play an important role in identifying offline handwritten characters, specifically in the Gujarati language. Feature extraction provides methods by which characters can be identified uniquely and with a high degree of accuracy, and it helps to find the shape contained in the pattern. Several techniques are available for feature extraction and classification; however, selecting a technique appropriate to the input determines the recognition accuracy.


2018 ◽  
Author(s):  
I Wayan Agus Surya Darma

Balinese character recognition is a technique for recognizing the features or patterns of Balinese characters. The features of a Balinese character are generated through a feature extraction process. This research uses handwritten Balinese characters. Feature extraction is the process of obtaining the features of a character. In this research, the feature extraction process generates semantic and direction features of handwritten Balinese characters. Recognition uses the K-Nearest Neighbor algorithm to recognize 81 handwritten Balinese characters. The features of the test character images are compared with reference features. With K=3 and reference=10, the recognition system achieved a success rate of 97.53%.
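The K-Nearest Neighbor step with K=3 can be sketched as follows. This is a generic illustration, not the paper's code: the two-dimensional feature vectors and the labels "na" and "ka" are made up, and Euclidean distance is an assumption, since the abstract does not say how test features are compared with reference features.

```python
import math
from collections import Counter

def knn_predict(query, references, k=3):
    """Majority vote among the k reference vectors closest to the query.

    references: list of (feature_vector, label) pairs.
    """
    nearest = sorted(references, key=lambda ref: math.dist(query, ref[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical reference features for two character classes.
references = [
    ([0.0, 0.0], "na"), ([0.0, 1.0], "na"), ([1.0, 0.0], "na"),
    ([5.0, 5.0], "ka"), ([5.0, 6.0], "ka"), ([6.0, 5.0], "ka"),
]
print(knn_predict([0.5, 0.5], references))
print(knn_predict([5.2, 5.1], references))
```

Using an odd K such as 3 avoids tied votes when only two classes compete for a query; with many classes, ties are still possible and are resolved here by whichever label was encountered first.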


2020 ◽  
Vol 17 (3) ◽  
pp. 299-305 ◽  
Author(s):  
Riaz Ahmad ◽  
Saeeda Naz ◽  
Muhammad Afzal ◽  
Sheikh Rashid ◽  
Marcus Liwicki ◽  
...  

This paper presents a deep learning benchmark on a complex dataset known as KFUPM Handwritten Arabic TexT (KHATT). The KHATT dataset consists of complex patterns of handwritten Arabic text-lines. This paper contributes mainly in three aspects: (1) pre-processing, (2) a deep learning based approach, and (3) data augmentation. The pre-processing step includes pruning of extra white space and de-skewing of skewed text-lines. We deploy a deep learning approach based on Multi-Dimensional Long Short-Term Memory (MDLSTM) networks and Connectionist Temporal Classification (CTC). The MDLSTM has the advantage of scanning the Arabic text-lines in all directions (horizontal and vertical) to cover dots, diacritics, strokes and fine inflammation. Data augmentation combined with the deep learning approach yields a promising improvement, reaching 80.02% Character Recognition (CR) against a baseline of 75.08%.
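The reported gain can be read in several ways, and a short calculation makes the difference explicit. The two percentages come from the abstract; the variable names and the choice of which ratios to report are illustrative assumptions.

```python
baseline_cr = 75.08    # % character recognition, baseline (from the abstract)
augmented_cr = 80.02   # % character recognition with data augmentation

# Absolute gain in percentage points.
absolute_gain = augmented_cr - baseline_cr
# Gain relative to the baseline recognition rate.
relative_gain = absolute_gain / baseline_cr * 100
# Share of the baseline character error (100 - CR) that was removed.
error_reduction = absolute_gain / (100 - baseline_cr) * 100

print(f"absolute gain:   {absolute_gain:.2f} points")
print(f"relative gain:   {relative_gain:.2f} %")
print(f"error reduction: {error_reduction:.2f} %")
```

The gain is about 4.94 percentage points, which corresponds to roughly a 6.6% relative improvement in recognition rate and a reduction of roughly 19.8% in character error.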


Sensors ◽  
2021 ◽  
Vol 21 (3) ◽  
pp. 692
Author(s):  
Jingcheng Chen ◽  
Yining Sun ◽  
Shaoming Sun

Human activity recognition (HAR) is essential in many health-related fields. A variety of technologies based on different sensors have been developed for HAR. Among them, fusion of heterogeneous wearable sensors has been developed because it is portable, non-interventional and accurate for HAR. To be applied in real-time use with limited resources, the activity recognition system must be compact and reliable. This requirement can be met by feature selection (FS). By eliminating irrelevant and redundant features, the system burden is reduced while good classification performance (CP) is retained. This manuscript proposes a two-stage genetic algorithm-based feature selection algorithm with a fixed activation number (GFSFAN), which is applied to datasets with a variety of time, frequency and time-frequency domain features extracted from the collected raw time series of nine activities of daily living (ADL). Six classifiers are used to evaluate the effects of selected feature subsets from different FS algorithms on HAR performance. The results indicate that GFSFAN can achieve good CP with a small feature subset. A sensor-to-segment coordinate calibration algorithm and a lower-limb joint angle estimation algorithm are introduced. Experiments on the effect of the calibration and the introduction of joint angles on HAR show that both can improve the CP.
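The idea of a fixed activation number, that is, a feature mask in which the count of selected features never changes, can be illustrated with a swap mutation. This is not the GFSFAN algorithm itself, whose operators the abstract does not describe; the function names and the swap strategy are assumptions made for illustration.

```python
import random

def random_mask(n_features, n_active, rng):
    """Binary mask with exactly n_active features switched on."""
    active = set(rng.sample(range(n_features), n_active))
    return [1 if i in active else 0 for i in range(n_features)]

def swap_mutation(mask, rng):
    """Turn one active feature off and one inactive feature on,
    so the number of active features stays the same."""
    ones = [i for i, bit in enumerate(mask) if bit]
    zeros = [i for i, bit in enumerate(mask) if not bit]
    child = mask[:]
    if ones and zeros:
        child[rng.choice(ones)] = 0
        child[rng.choice(zeros)] = 1
    return child

rng = random.Random(0)
parent = random_mask(n_features=12, n_active=4, rng=rng)
child = swap_mutation(parent, rng)
print(sum(parent), sum(child))
```

Fixing the count in the operators themselves means the search only ever visits subsets of the target size, so the fitness function does not need a separate penalty for subset size.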

