Eye Movement Feature Set and Predictive Model for Dyslexia

Dyslexia is a learning disorder that can cause difficulties in reading or writing. Dyslexia is not a visual problem, but many dyslexics have an impaired magnocellular system, which causes poor eye control. Eye-trackers are used to track eye movements. This research work proposes a set of significant eye movement features that are used to build a predictive model for dyslexia. Fixation and saccade eye events are detected using the dispersion-threshold and velocity-threshold algorithms. Various machine learning models are evaluated. Validation is done on 185 subjects using 10-fold cross-validation. Velocity-based features gave higher accuracy than statistical and dispersion features. The highest accuracy, 96%, was achieved using the Hybrid Kernel Support Vector Machine-Particle Swarm Optimization model, followed by the eXtreme Gradient Boosting model with an accuracy of 95%. The best set of features is the first fixation start time, average fixation saccade duration, the total number of fixations, the total number of saccades, and the ratio between saccades and fixations.
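
As a rough illustration of the velocity-threshold idea named in the abstract above, the sketch below labels each interval between gaze samples as a fixation or a saccade by comparing point-to-point speed with a threshold, then derives three of the listed features (number of fixations, number of saccades, and their ratio). It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the `(t, x, y)` sample format, and the threshold value are all hypothetical.

```python
from math import hypot


def classify_ivt(samples, velocity_threshold):
    """Label each inter-sample interval as 'fixation' or 'saccade'.

    samples: list of (t, x, y) tuples; t in seconds, x and y in one consistent unit.
    velocity_threshold: speed (units per second) above which motion counts as a saccade.
    Both the sample layout and the threshold are assumptions made for this sketch.
    """
    labels = []
    for (t0, x0, y0), (t1, x1, y1) in zip(samples, samples[1:]):
        dt = t1 - t0
        if dt <= 0:
            raise ValueError("timestamps must be strictly increasing")
        speed = hypot(x1 - x0, y1 - y0) / dt
        labels.append("saccade" if speed > velocity_threshold else "fixation")
    return labels


def count_events(labels):
    """Collapse runs of identical labels into events and count each kind."""
    counts = {"fixation": 0, "saccade": 0}
    previous = None
    for label in labels:
        if label != previous:
            counts[label] += 1
        previous = label
    return counts


def summary_features(samples, velocity_threshold):
    """Return event counts and the saccade-to-fixation ratio for one recording."""
    counts = count_events(classify_ivt(samples, velocity_threshold))
    n_fix, n_sac = counts["fixation"], counts["saccade"]
    ratio = n_sac / n_fix if n_fix else float("nan")
    return {
        "n_fixations": n_fix,
        "n_saccades": n_sac,
        "saccade_fixation_ratio": ratio,
    }
```

A production detector would typically also merge very short events and convert screen coordinates to visual angle; those steps are omitted here.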

Author(s):  
Jothi Prabha Appadurai ◽  
Bhargavi R.



2018 ◽  
Vol 7 (4) ◽  
pp. 2795
Author(s):  
Jothi Prabha A ◽  
Bhargavi R ◽  
Ramesh Ragala

Dyslexia is a learning disorder characterized by a lack of reading and/or writing skills, difficulty in rapid word naming, and poor spelling. Dyslexic individuals have great difficulty reading and interpreting words or letters. Research has been carried out to distinguish dyslexics from non-dyslexics through various approaches, such as machine learning, image processing, understanding brain behavior through psychology, and studying differences in the anatomy of the brain. In addition, several assistive tools have been developed to support dyslexics. In this work, brain images are used for screening individuals who are at high risk of dyslexia. This work also motivates the application of machine learning in a distributed environment. The proposed predictive model uses the machine learning algorithm Support Vector Machine (SVM). The model is designed in the Apache Spark framework to support voluminous data. A prediction accuracy of 92.5% is achieved using SVM.


2021 ◽  
Vol 4 (2(112)) ◽  
pp. 58-72
Author(s):  
Chingiz Kenshimov ◽  
Zholdas Buribayev ◽  
Yedilkhan Amirgaliyev ◽  
Aisulyu Ataniyazova ◽  
Askhat Aitimov

In the course of our research work, the American, Russian and Turkish sign languages were analyzed. A program for recognizing the Kazakh dactylic sign language using machine learning methods was implemented. A dataset of 5000 images was formed for each gesture, and gesture recognition algorithms such as Random Forest, Support Vector Machine and Extreme Gradient Boosting were applied; two data types were combined into one database, which changed the architecture of the system as a whole. The quality of the algorithms was also evaluated. The research was carried out because existing scientific work on systems for recognizing the Kazakh dactyl sign language is insufficient for a complete representation of the language. The Kazakh language has specific letters, and because of the peculiarities of its spelling, problems arise when developing recognition systems for the Kazakh sign language. The results showed that the Support Vector Machine and Extreme Gradient Boosting algorithms are superior in real-time performance, while the Random Forest algorithm has the highest recognition accuracy. The accuracy of the classification algorithms was 98.86% for Random Forest, 98.68% for Support Vector Machine and 98.54% for Extreme Gradient Boosting. The quality evaluation of the classical algorithms also shows high indicators. The practical significance of this work lies in the fact that scientific research in the field of gesture recognition with the updated alphabet of the Kazakh language has not yet been conducted, and the results can be used by other researchers for further research on recognition of the Kazakh dactyl sign language, as well as by researchers engaged in the development of the international sign language.


Author(s):  
Inssaf El Guabassi ◽  
Zakaria Bousalem ◽  
Rim Marah ◽  
Aimad Qazdar

In recent years, there has been growing demand to predict the future with certainty, and accurate prediction in any area is becoming a necessity. One way to predict the future with certainty is to determine the possible future. In this sense, machine learning is a way to analyze huge datasets to make strong predictions or decisions. The main objective of this research work is to build a predictive model for evaluating students’ performance. The contributions are threefold. The first is to apply several supervised machine learning algorithms (i.e., ANCOVA, Logistic Regression, Support Vector Regression, Log-linear Regression, Decision Tree Regression, Random Forest Regression, and Partial Least Squares Regression) to our education dataset. The second is to compare and evaluate the algorithms used to create a predictive model based on various evaluation metrics. The last is to determine the most important factors that influence the success or failure of the students. The experimental results showed that Log-linear Regression provides the better prediction, and they also revealed the behavioral factors that influence students’ performance.


According to the health statistics of India on Chronic Kidney Disease (CKD), a total of 63538 cases have been registered. The average age of men and women prone to kidney disease lies in the range of 48 to 70 years. CKD is more prevalent among males than among females. India ranked 17th in CKD in 2015 [1]. This paper focuses on a predictive analytics architecture to analyse a CKD dataset using feature engineering and classification algorithms. The proposed model incorporates techniques to validate the feasibility of the data points used for analysis. The main focus of this research work is to analyze the dataset of chronic kidney failure and classify CKD and non-CKD cases. The feasibility of the proposed dataset is determined through learning curve performance. The features that play a vital role in classification are determined using the sequential forward selection algorithm. The training dataset with the selected features is fed into various classifiers to determine which classifier detects CKD most accurately. The proposed dataset is classified using various classification algorithms: Linear Regression (LR), Linear Discriminant Analysis (LDA), K-Nearest Neighbour (KNN), Classification and Regression Tree (CART), Naive Bayes (NB), Support Vector Machine (SVM), Random Forest (RF), eXtreme Gradient Boosting (XGBoost) and Ada Boost Regressor (ABR). It was found that, for the given CKD dataset with 25 attributes (11 numeric and 14 nominal), the classifiers LR, LDA, CART, NB, RF, XGB and ABR provide accuracy ranging from 98% to 100%. The proposed architecture validates the dataset against the rule of thumb for working with a small number of data points, and the classifier is validated against underfitting and overfitting. The performance of the classifier is evaluated using accuracy and F-score. The proposed architecture indicates that LR, RF and ABR provide very high accuracy and F-score.
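
The sequential forward selection step mentioned in the abstract above can be sketched as a greedy loop: start with no features and, at each round, add the single feature that most improves a scoring function. This is a minimal, generic sketch and not the paper's code; the `score` callable is a hypothetical stand-in for whatever the study used (for example, cross-validated classifier accuracy on a feature subset), and `k` is an assumed stopping size.

```python
def sequential_forward_selection(features, score, k):
    """Greedily pick up to k features that maximise score(subset).

    features: iterable of feature names.
    score: hypothetical callable taking a list of feature names and
           returning a number (higher is better).
    k: maximum number of features to select (an assumption of this sketch).
    """
    selected = []
    remaining = list(features)
    while remaining and len(selected) < k:
        # Try adding each remaining feature and keep the best candidate.
        best = max(remaining, key=lambda f: score(selected + [f]))
        selected.append(best)
        remaining.remove(best)
    return selected
```

In practice the loop is often stopped early when the score no longer improves, instead of running to a fixed `k`.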


2021 ◽  
Vol 10 (2) ◽  
pp. 1-20
Author(s):  
Sheik Abdullah A. ◽  
Akash K. ◽  
Bhubesh K. R. A. ◽  
Selvakumar S.

This research work focuses on the development of a predictive model for movie review data using a support vector machine (SVM) classifier, improved with different kernel functions applied after sentiment score estimation. Model development proceeds from user-level data input through processing of the data stream for analysis. A formal TF-IDF calculation is then carried out in conjunction with data clustering using the simple k-means algorithm. Once the labeled data have been sorted out, SVMs with linear, sigmoid, RBF, and polynomial kernels are applied to the clustered data, with specific parameter settings for each type of library function. The performance of each kernel was measured using precision, recall, and F-score, and the analysis found that sentiment analysis using the SVM linear kernel with sentiment score analysis provides an improved accuracy of about 91.18%.
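
For readers unfamiliar with the metrics named in the abstract above, the following minimal sketch computes precision, recall, and F-score for a binary labelling. The function name and the choice of `1` as the positive label are assumptions made for illustration; the paper's own evaluation code is not shown in the source.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall and F1 for one positive class."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Fall back to 0.0 when a denominator is zero (a convention, not a rule).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1
```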


Complexity ◽  
2022 ◽  
Vol 2022 ◽  
pp. 1-20
Author(s):  
Nihad Brahimi ◽  
Huaping Zhang ◽  
Lin Dai ◽  
Jianzi Zhang

The car-sharing system is a popular rental model for cars in shared use. It has become particularly attractive due to its flexibility; that is, the car can be rented and returned anywhere within one of the authorized parking slots. The main objective of this research work is to predict car usage in parking stations and to investigate the factors that help to improve the prediction. Thus, new strategies can be designed to put more cars on the road and fewer in the parking stations. To achieve that, various machine learning models, namely vector autoregression (VAR), support vector regression (SVR), eXtreme gradient boosting (XGBoost), and k-nearest neighbors (kNN), and deep learning models, specifically long short-term memory (LSTM), gated recurrent unit (GRU), convolutional neural network (CNN), CNN-LSTM, and multilayer perceptron (MLP), were applied to different kinds of features. These features include past usage levels, Chongqing’s environmental conditions, and temporal information. After comparing the results using different metrics, we found that CNN-LSTM outperformed the other methods in predicting future car usage. Meanwhile, the model using all the feature categories together yields more precise predictions than any of the models using one feature category at a time.


Data ◽  
2021 ◽  
Vol 6 (8) ◽  
pp. 80
Author(s):  
O. V. Mythreyi ◽  
M. Rohith Srinivaas ◽  
Tigga Amit Kumar ◽  
R. Jayaganthan

This research work focuses on machine-learning-assisted prediction of the corrosion behavior of laser-powder-bed-fused (LPBF) and postprocessed Inconel 718. Corrosion testing data of these specimens were collected and fitted with the following machine learning algorithms: polynomial regression, support vector regression, decision tree, and extreme gradient boosting. The model performance, after hyperparameter optimization, was evaluated using a set of established metrics: R², mean absolute error, and root mean square error. Among the algorithms, extreme gradient boosting performed best in predicting the corrosion behavior, closely followed by the other algorithms. Feature importance analysis was carried out to determine the postprocessing parameters that most influenced the corrosion behavior of Inconel 718 manufactured by LPBF.
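
The three regression metrics named in the abstract above have standard definitions, which the minimal sketch below computes directly. The function name and return format are illustrative choices, not taken from the paper.

```python
from math import sqrt


def regression_metrics(y_true, y_pred):
    """Return R^2, mean absolute error and root mean square error."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)  # residual sum of squares
    rmse = sqrt(ss_res / n)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    # R^2 is undefined when the targets are constant.
    r2 = 1 - ss_res / ss_tot if ss_tot else float("nan")
    return {"r2": r2, "mae": mae, "rmse": rmse}
```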


2019 ◽  
Vol 21 (9) ◽  
pp. 662-669 ◽  
Author(s):  
Junnan Zhao ◽  
Lu Zhu ◽  
Weineng Zhou ◽  
Lingfeng Yin ◽  
Yuchen Wang ◽  
...  

Background: Thrombin is the central protease of the vertebrate blood coagulation cascade, which is closely related to cardiovascular diseases. The inhibitory constant Ki is the most significant property of thrombin inhibitors. Method: This study was carried out to predict Ki values of thrombin inhibitors based on a large data set by using machine learning methods. Because it can find non-intuitive regularities in high-dimensional datasets, machine learning can be used to build effective predictive models. A total of 6554 descriptors for each compound were collected, and an efficient descriptor selection method was chosen to find the appropriate descriptors. Four different methods, namely multiple linear regression (MLR), K Nearest Neighbors (KNN), Gradient Boosting Regression Tree (GBRT) and Support Vector Machine (SVM), were implemented to build prediction models with the selected descriptors. Results: The SVM model was the best among these methods, with R2=0.84, MSE=0.55 for the training set and R2=0.83, MSE=0.56 for the test set. Several validation methods, such as the y-randomization test and applicability domain evaluation, were adopted to assess the robustness and generalization ability of the model. The final model shows excellent stability and predictive ability and can be employed for rapid estimation of the inhibitory constant, which is very helpful for designing novel thrombin inhibitors.

