On automatic text segmentation

In this paper, text mining and statistical models are used to explore the relationship between the Shanghai Stock Exchange Composite Index (SSECI) and the collective emotions of individual investors. Investor emotions are quantified by extracting and aggregating investors' online posts that contain finance-related keywords. To identify the set of finance-related keywords, three years of blogs from a well-known financial blog site are segmented with an automatic text segmentation method; by contrast, studies in the social media literature typically select keywords manually. Posts that discuss the keywords are then extracted, across all topics, from Sina Weibo, the largest microblog platform in China. Statistical results reveal a relationship between daily posts and daily opening prices with a one-day lag, which indicates that information (news) propagates with a delay. This study contributes to the existing literature by demonstrating that microblog sentiment levels can be incorporated quantitatively as a proxy to support portfolio decision making.
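The one-day-lag relationship described in this abstract can be illustrated with a lagged correlation. The sketch below is only an assumption about how such a check might look: the abstract does not name the statistical model used, so Pearson correlation is swapped in here, and the function names `pearson` and `lagged_correlation` as well as the sample numbers are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


def lagged_correlation(posts, prices, lag=1):
    """Correlate posts on day t with prices on day t + lag.

    `posts` and `prices` are daily series aligned by date (hypothetical
    inputs; the paper's actual data and model are not reproduced here).
    """
    if lag < 0:
        raise ValueError("lag must be non-negative")
    if len(posts) != len(prices):
        raise ValueError("series must be aligned and of equal length")
    if lag == 0:
        return pearson(posts, prices)
    # Drop the last `lag` post counts and the first `lag` prices so that
    # each post count is paired with the price `lag` days later.
    return pearson(posts[:-lag], prices[lag:])


if __name__ == "__main__":
    # Made-up numbers for illustration only, not data from the study.
    daily_posts = [120, 150, 130, 180, 160, 200]
    opening_prices = [3000.0, 3010.0, 3035.0, 3020.0, 3060.0, 3045.0]
    print(lagged_correlation(daily_posts, opening_prices, lag=1))
```

Comparing the value at `lag=1` with the value at `lag=0` is one simple way to see whether post volume lines up better with the next day's opening price than with the same day's.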

Context-Driven Corpus-Based Model for Automatic Text Segmentation and Part of Speech Tagging in Setswana Using OpenNLP Tool

Modeling and Using Context - Lecture Notes in Computer Science ◽

10.1007/978-3-030-34974-5_6 ◽

2019 ◽

pp. 62-73

Author(s):

Mary Ambrossine Dibitso ◽

Pius Adewale Owolawi ◽

Sunday Olusegun Ojo

Keyword(s):

Text Segmentation ◽

Part Of Speech Tagging ◽

Part Of Speech ◽

Speech Tagging ◽

Automatic Text

AUTOMATIC TEXT SEGMENTATION AND RECOGNITION IN NATURAL SCENE IMAGES USING MSOCR

Current Signal Transduction Therapy ◽

10.2174/1574362414666190725105748 ◽

2019 ◽

Vol 14 ◽

Author(s):

Surem Samuel SR ◽

Seldev Christopher C ◽

VinilaJinny S

Keyword(s):

Character Recognition ◽

Template Matching ◽

Optical Character Recognition ◽

Recall Rate ◽

Svm Classifier ◽

Text Segmentation ◽

Natural Scene ◽

Scene Image ◽

Innovative Methods ◽

Automatic Text

Segmenting and recognizing text in a scene image is a challenging task because such images are often blurred, low-resolution, and small. Innovative methods have been proposed to address this problem and to recognize text in natural scene images. The acquired image is pre-processed with the YUV channel conversion technique, and the Y channel image is converted to a grayscale image. The Connected Component Based Text Segmentation Algorithm (CCBTSA) and MSER methods are used for segmentation and recognition of text using Optical Character Recognition (OCR). GLCM and FOS features are extracted from the segmented region. A template matching algorithm is used to extract the text characters from the bounding box of the segmented image. A trained SVM classifier is used to classify image regions as text or non-text. Performance is analysed in terms of recall, precision, accuracy, and F-measure. In the experiments, the classifier achieved an accuracy of 95%.
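The four evaluation measures named in this abstract follow from the counts of a binary text / non-text confusion matrix. The sketch below uses the standard definitions; the function name `classification_metrics` and the example counts are hypothetical and are not taken from the paper's experiments.

```python
def classification_metrics(tp, fp, fn, tn):
    """Precision, recall, accuracy and F-measure for a binary classifier.

    tp: text regions correctly labelled as text
    fp: non-text regions wrongly labelled as text
    fn: text regions wrongly labelled as non-text
    tn: non-text regions correctly labelled as non-text
    """
    total = tp + fp + fn + tn
    if total == 0:
        raise ValueError("at least one sample is required")
    # Each ratio falls back to 0.0 when its denominator is zero.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / total
    if precision + recall:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0
    return {
        "precision": precision,
        "recall": recall,
        "accuracy": accuracy,
        "f_measure": f_measure,
    }


if __name__ == "__main__":
    # Illustrative counts only, not results reported in the paper.
    print(classification_metrics(tp=8, fp=2, fn=2, tn=8))
```

Reporting all four together matters because accuracy alone can look high when non-text regions greatly outnumber text regions.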