Decision tree and deep learning based probabilistic model for character recognition

This paper presents a deep learning benchmark on a complex dataset known as KFUPM Handwritten Arabic TexT (KHATT). The KHATT dataset consists of complex patterns of handwritten Arabic text-lines. The paper contributes in three main respects: (1) pre-processing, (2) a deep learning based approach, and (3) data augmentation. The pre-processing step prunes extra white space and de-skews skewed text-lines. The deep learning approach is based on Multi-Dimensional Long Short-Term Memory (MDLSTM) networks and Connectionist Temporal Classification (CTC). The MDLSTM has the advantage of scanning the Arabic text-lines in all directions (horizontal and vertical) to cover dots, diacritics, strokes and fine inflections. Combining data augmentation with the deep learning approach raises the Character Recognition (CR) rate to 80.02%, compared with a 75.08% baseline.
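The abstract does not say how the CTC outputs are turned into text. A common choice is best-path (greedy) decoding, sketched below under that assumption: take the most likely label per frame, collapse consecutive repeats, then drop the blank symbol. The blank index, the toy alphabet and the frame scores are all hypothetical and are not taken from the paper.

```python
from itertools import groupby

BLANK = 0  # assumed index of the CTC blank symbol


def ctc_greedy_decode(frame_scores, alphabet):
    """Best-path CTC decoding for one text-line.

    frame_scores: list of per-frame score lists, one score per label.
    alphabet: list mapping label index to character (index BLANK is unused).
    """
    # 1. Pick the highest-scoring label in every frame.
    best_path = [max(range(len(f)), key=f.__getitem__) for f in frame_scores]
    # 2. Collapse runs of the same label into a single label.
    collapsed = [label for label, _ in groupby(best_path)]
    # 3. Remove blanks; what remains is the decoded character sequence.
    return "".join(alphabet[label] for label in collapsed if label != BLANK)


# Hypothetical three-symbol alphabet and five frames of scores.
alphabet = ["<blank>", "a", "b"]
frames = [
    [0.10, 0.80, 0.10],  # a
    [0.10, 0.70, 0.20],  # a again, collapsed with the previous frame
    [0.90, 0.05, 0.05],  # blank separates the two "a" characters
    [0.20, 0.70, 0.10],  # a
    [0.10, 0.20, 0.70],  # b
]
decoded = ctc_greedy_decode(frames, alphabet)
```

The blank between the second and fourth frames is what lets the decoder emit a doubled character instead of merging both into one.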

Utilizing Twitter Data Analysis and Deep Learning to Identify Drug Use (Preprint)

10.2196/preprints.14681 ◽ 2019

Author(s): Joseph Tassone, Peizhi Yan, Mackenzie Simpson, Chetan Mendhe, Vijay Mago, ...

Keyword(s): Social Media, Logistic Regression, Deep Learning, Decision Tree, Semantic Meaning, Predictive Capability, Logistic Regression Models, Twitter Data, Data Points, Positive Classification

BACKGROUND: The collection and examination of social media has become a useful mechanism for studying the mental activity and behavior tendencies of users. OBJECTIVE: Through the analysis of a collected set of Twitter data, a model will be developed for predicting positively referenced, drug-related tweets. From this, trends and correlations can be determined. METHODS: Tweets and their attribute data were collected and processed using topic-related keywords, such as drug slang and use-conditions (methods of drug consumption). Potential candidates were preprocessed, resulting in a dataset of 3,696,150 rows. The predictive classification power of multiple methods was compared, including logistic regression, decision trees, and CNN-based classifiers. For the latter, a deep learning approach was implemented to screen and analyze the semantic meaning of the tweets. RESULTS: The logistic regression and decision tree models used 12,142 data points for training and 1,041 data points for testing. The logistic regression models achieved accuracies of 54.56% and 57.44%, respectively, and an AUC of 0.58. The decision tree was an improvement, with an accuracy of 63.40% and an AUC of 0.68. All these values implied a low predictive capability with little to no discrimination. Conversely, both CNN-based classifiers tested showed a substantial improvement. The first was trained with 2,661 manually labeled samples, while the other included synthetically generated tweets, for a total of 12,142 samples. The accuracy scores were 76.35% and 82.31%, with AUCs of 0.90 and 0.91. Using association rule mining in conjunction with the CNN-based classifier showed a high likelihood that keywords such as “smoke”, “cocaine”, and “marijuana” trigger a drug-positive classification. CONCLUSIONS: Predictive analysis without a CNN is limited and possibly fruitless. Attribute-based models presented little predictive capability and were not suitable for analyzing this type of data. The semantic meaning of the tweets needed to be utilized, giving the CNN-based classifier an advantage over other solutions. Additionally, commonly mentioned drugs had a level of correspondence with frequently used illicit substances, proving the practical usefulness of this system. Lastly, the synthetically generated set provided increased scores, improving the predictive capability. CLINICALTRIAL: None
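The abstract compares models by accuracy and AUC without stating how either was computed. As a minimal sketch, assuming binary labels (1 for a drug-positive tweet) and a 0.5 decision threshold, the two metrics can be written as follows. The labels and scores are made up for illustration and are unrelated to the study's data.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive example is scored
    above a random negative one (ties count as half a win)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical labels and classifier scores for six tweets.
y_true = [1, 0, 1, 0, 1, 0]
scores = [0.9, 0.2, 0.6, 0.7, 0.8, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

acc = accuracy(y_true, y_pred)
auc = roc_auc(y_true, scores)
```

Accuracy depends on the chosen threshold, while AUC summarizes ranking quality across all thresholds, which is why an abstract can report an AUC near 0.5 as "little to no discrimination" even when accuracy is above 50%.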

The Comparison of Deep Learning Driven Optical Character Recognition for Hard Disk Head Slider Serial Number

2020 International Conference on Power, Energy and Innovations (ICPEI) ◽ 10.1109/icpei49860.2020.9431431 ◽ 2020

Author(s): Palakorn Imsamer, Vorachat Boonyaphon, Somporn Tiacharoen

Keyword(s): Deep Learning, Character Recognition, Optical Character Recognition, Hard Disk, Head Slider, Optical Character, Serial Number

Added value of deep learning-based liver parenchymal CT volumetry for predicting major arterial injury after blunt hepatic trauma: a decision tree analysis

Abdominal Radiology ◽ 10.1007/s00261-020-02892-x ◽ 2021

Author(s): David Dreizin, Tina Chen, Yuanyuan Liang, Yuyin Zhou, Fabio Paes, ...

Keyword(s): Deep Learning, Decision Tree, Arterial Injury, Added Value, Decision Tree Analysis, Hepatic Trauma, CT Volumetry, Tree Analysis, Liver Parenchymal

Insecurity Early Warning for Large Scale Hybrid AC/DC Grids Based on Decision Tree and Semi-Supervised Deep Learning

IEEE Transactions on Power Systems ◽ 10.1109/tpwrs.2021.3071918 ◽ 2021 ◽ pp. 1-1

Author(s): Jiongcheng Yan, Changgang Li, Yutian Liu

Keyword(s): Deep Learning, Decision Tree, Early Warning, Large Scale

Evaluation Efficiency of Hybrid Deep Learning algorithms with Neural Network, Decision Tree and Boosting Methods for Predicting Groundwater Potential

Geocarto International ◽ 10.1080/10106049.2021.1920635 ◽ 2021 ◽ pp. 1-20

Author(s): Yunzhi Chen, Wei Chen, Subodh Chandra Pal, Asish Saha, Indrajit Chowdhuri, ...

Keyword(s): Neural Network, Deep Learning, Decision Tree, Learning Algorithms, Groundwater Potential

Automated Sensing System for Real-Time Recognition of Trucks in River Dredging Areas Using Computer Vision and Convolutional Deep Learning

Sensors ◽ 10.3390/s21020555 ◽ 2021 ◽ Vol 21 (2) ◽ pp. 555

Author(s): Jui-Sheng Chou, Chia-Hsuan Liu

Keyword(s): Deep Learning, Real Time, Character Recognition, Initial Period, Control Point, Recognition Rate, Classification Model, License Plate, Construction Site, Illegal Mining

Sand theft or illegal mining in river dredging areas has been a problem in recent decades. For this reason, increasing the use of artificial intelligence in dredging areas, building automated monitoring systems, and reducing human involvement can effectively deter crime and lighten the workload of security guards. In this investigation, a smart dredging construction site system was developed using automated techniques designed to suit various areas. The aim in the initial period of the smart dredging construction was to automate the audit work at the control point, which manages trucks in river dredging areas. Images of dump trucks entering the control point were captured using monitoring equipment in the construction area. The obtained images and the deep learning technique YOLOv3 were used to detect the positions of the vehicle license plates. Framed images of the vehicle license plates were captured and used as input to an image classification model, C-CNN-L3, to identify the number of characters on the license plate. Based on the classification results, the images of the vehicle license plates were transmitted to a text recognition model, R-CNN-L3, that corresponded to the characters of the license plate. Finally, the models of each stage were integrated into a real-time truck license plate recognition (TLPR) system; the single character recognition rate was 97.59%, the overall recognition rate was 93.73%, and the speed was 0.3271 s/image. The TLPR system reduces the labor and time needed to identify license plates, effectively reducing the probability of crime and increasing the transparency, automation, and efficiency of the frontline personnel’s work. The TLPR is the first step toward an automated operation to manage trucks at the control point. The subsequent and ongoing development of system functions can advance dredging operations toward the goal of being a smart construction site. By providing a vehicle LPR system that supports intelligent and highly efficient management by dredging-related departments, this paper contributes an objective approach to TLPR to the current body of knowledge.
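The staged flow described above (detect the plate, classify how many characters it holds, then hand it to a recognizer) can be sketched as a small dispatch function. This is one reading of the abstract, not the authors' implementation: the assumption here is that the character count selects which recognizer runs. The callables stand in for YOLOv3, C-CNN-L3 and R-CNN-L3, and every name, signature and data value below is hypothetical.

```python
from typing import Any, Callable, Dict, Optional


def recognize_plate(
    image: Any,
    detect_plate: Callable[[Any], Optional[Any]],    # stand-in for the detector
    crop: Callable[[Any, Any], Any],                 # cuts the plate out of the frame
    count_characters: Callable[[Any], int],          # stand-in for the classifier
    recognizers: Dict[int, Callable[[Any], str]],    # one recognizer per plate length
) -> Optional[str]:
    """Run the three stages in order; return None if any stage cannot proceed."""
    box = detect_plate(image)
    if box is None:  # no plate found in this frame
        return None
    plate = crop(image, box)
    n_chars = count_characters(plate)
    recognizer = recognizers.get(n_chars)
    if recognizer is None:  # no model available for this plate length
        return None
    return recognizer(plate)


# Toy stand-ins so the flow can be exercised without any trained model.
def fake_detect(image):
    return (0, 0, 1, 1) if "plate" in image else None


def fake_crop(image, box):
    return image["plate"]


fake_recognizers = {6: lambda plate: plate, 7: lambda plate: plate}

result = recognize_plate(
    {"plate": "ABC1234"}, fake_detect, fake_crop, len, fake_recognizers
)
```

Passing the models in as callables keeps the routing logic testable on its own and makes it easy to swap any single stage.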

Research on Online Scene Teaching Mode of Tobacco Picking Decision Tree Construction Process Integrating Deep Learning

Tobacco Regulatory Science ◽ 10.18001/trs.7.5.1.78 ◽ 2021 ◽ Vol 7 (5) ◽ pp. 3076-3086

Author(s): Zhang Shuili, Zhao Yi, Zheng Kexin, Zhang Jun, Zheng Fuchun

Keyword(s): Deep Learning, Decision Tree, Online Teaching, Information Gain, Teaching Evaluation, Teaching Process, Teaching Mode, Teaching Interaction, Tree Construction, Information And Communication

Objectives: In view of the characteristics of online teaching during the coronavirus pandemic and the importance of practical teaching in training students’ skills during graduate education, this paper proposes an online scene teaching mode that takes projects as the carrier and integrates deep learning. To meet the demand for information and communication engineering professionals in the big data context, the whole teaching process is divided into four stages: topic selection, teaching project setting, online teaching interaction, and teaching evaluation. In the teaching of Python Data Analysis Foundations, the project “establishment process of tobacco picking decision tree based on information gain” is taken as the teaching case. Prior knowledge and references are pushed through the cloud platform before class. In the online classroom, a scene in which tobacco picking is affected by the weather is set to guide students to seek solutions to problems; the results are presented graphically to help students summarize, and the scene is then reset to promote knowledge transfer. In this way deep learning is integrated into the teaching process, and the corresponding stages are modified according to the teaching evaluation results. The content of the scene progresses from easy to difficult, from simple to complex, and from less to more, which enhances students’ learning interest and sense of achievement. Meanwhile, students’ initiative in participating in curriculum research further strengthens the effectiveness of the course in serving scientific research, which gives the approach some value for popularization and application.
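The teaching case builds a decision tree by information gain, with weather affecting whether tobacco is picked. A minimal sketch of that split criterion follows. The abstract gives no data, so the six records and the attribute names `weather`, `wind` and `pick` are invented purely for illustration.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Entropy of the target minus the weighted entropy after splitting
    the rows on the given attribute."""
    base = entropy([row[target] for row in rows])
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return base - remainder


# Hypothetical picking records; not taken from the paper.
rows = [
    {"weather": "sunny", "wind": "weak", "pick": "yes"},
    {"weather": "sunny", "wind": "strong", "pick": "yes"},
    {"weather": "rainy", "wind": "weak", "pick": "no"},
    {"weather": "rainy", "wind": "strong", "pick": "no"},
    {"weather": "cloudy", "wind": "weak", "pick": "yes"},
    {"weather": "cloudy", "wind": "strong", "pick": "no"},
]

gains = {a: information_gain(rows, a, "pick") for a in ("weather", "wind")}
best_attribute = max(gains, key=gains.get)  # attribute chosen for the root split
```

With these toy records the weather split leaves only the cloudy group mixed, so it removes more uncertainty than the wind split and would be chosen as the root of the tree.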
