A Web Text Extraction Method Based on Regular Expressions and Text Density

A video text extraction method for character recognition

Proceedings of the Fifth International Conference on Document Analysis and Recognition. ICDAR '99 (Cat. No.PR00318) ◽

10.1109/icdar.1999.791716 ◽

1999 ◽

Cited By ~ 17

Author(s):

O. Hori

Keyword(s):

Character Recognition ◽

Extraction Method ◽

Text Extraction

An Improved Scene Text Extraction Method Using Conditional Random Field and Optical Character Recognition

2011 International Conference on Document Analysis and Recognition ◽

10.1109/icdar.2011.148 ◽

2011 ◽

Cited By ~ 20

Author(s):

Hongwei Zhang ◽

Changsong Liu ◽

Cheng Yang ◽

Xiaoqing Ding ◽

KongQiao Wang

Keyword(s):

Random Field ◽

Character Recognition ◽

Optical Character Recognition ◽

Extraction Method ◽

Conditional Random Field ◽

Text Extraction ◽

Optical Character ◽

Scene Text

Correction to “Inference of Regular Expressions for Text Extraction from Examples”

IEEE Transactions on Knowledge and Data Engineering ◽

10.1109/tkde.2016.2557978 ◽

2016 ◽

Vol 28 (7) ◽

pp. 1944-1944

Author(s):

Alberto Bartoli ◽

Andrea De Lorenzo ◽

Eric Medvet ◽

Fabiano Tarlao

Keyword(s):

Regular Expressions ◽

Text Extraction

A robust video text extraction method based on text traversing line and stroke connectivity

2008 9th International Conference on Signal Processing ◽

10.1109/icosp.2008.4697297 ◽

2008 ◽

Cited By ~ 2

Author(s):

Peng Tianqiang ◽

Tian Pohuang ◽

Li Bicheng

Keyword(s):

Extraction Method ◽

Text Extraction

Text extraction from digital English comic image using two blobs extraction method

International Conference on Pattern Recognition, Informatics and Medical Engineering (PRIME-2012) ◽

10.1109/icprime.2012.6208388 ◽

2012 ◽

Cited By ~ 4

Author(s):

M. Sundaresan ◽

S. Ranjini

Keyword(s):

Extraction Method ◽

Text Extraction

A Web Information Extraction Method Based on HTML Parser

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.774-776.1802 ◽

2013 ◽

Vol 774-776 ◽

pp. 1802-1806

Author(s):

Zhi Ming Zhang ◽

Shuai Shuai Huang ◽

Ping Li

Keyword(s):

Information Extraction ◽

Extraction Method ◽

Rapid Development ◽

Extraction Time ◽

The Internet ◽

Regular Expressions ◽

Web Information Extraction ◽

Amount Of Information ◽

Web Information ◽

Html Parser

With the rapid development of the Internet and the surge in the amount of information on it, how to accurately and quickly obtain the information users really need, such as titles, links, and pictures, has become a hot research topic. This paper proposes a fast web information extraction method based on an HTML parser and validates it by extracting commodity information from an e-commerce website. The results show that the proposed method is more accurate than an extraction method based on regular expressions, and that the extraction time is greatly shortened.
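As a rough illustration of parser-based extraction of titles, links, and pictures, the sketch below uses Python's standard `html.parser` module. It is not the paper's implementation: the class `ProductInfoParser`, the helper `extract_info`, and the sample page are all hypothetical names and data invented for this example.

```python
from html.parser import HTMLParser


class ProductInfoParser(HTMLParser):
    """Hypothetical parser that collects the page title, link targets, and image sources."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.links = []
        self.images = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "a" and attrs.get("href"):
            # Keep only anchors that actually point somewhere.
            self.links.append(attrs["href"])
        elif tag == "img" and attrs.get("src"):
            self.images.append(attrs["src"])

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Text is accumulated only while inside <title>...</title>.
        if self._in_title:
            self.title += data


def extract_info(html):
    """Parse an HTML string and return the collected fields as a dict."""
    parser = ProductInfoParser()
    parser.feed(html)
    parser.close()
    return {
        "title": parser.title.strip(),
        "links": parser.links,
        "images": parser.images,
    }


# Made-up page standing in for an e-commerce product listing.
SAMPLE_PAGE = """
<html><head><title> Example Shop: Blue Kettle </title></head>
<body>
  <a href="/item/42">Blue Kettle</a>
  <img src="/img/kettle.jpg" alt="kettle">
  <a>no target</a>
</body></html>
"""

info = extract_info(SAMPLE_PAGE)
print(info)
```

Because the parser reacts to tag events rather than matching character patterns, it is unaffected by attribute order or extra whitespace inside tags, which is one plausible reason a parser-based approach could be more accurate than hand-written regular expressions.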

A Novel Image Text Extraction Method Based on K-Means Clustering

Seventh IEEE/ACIS International Conference on Computer and Information Science (icis 2008) ◽

10.1109/icis.2008.31 ◽

2008 ◽

Cited By ~ 20

Author(s):

Yan Song ◽

Anan Liu ◽

Lin Pang ◽

Shouxun Lin ◽

Yongdong Zhang ◽

...

Keyword(s):

Extraction Method ◽

Text Extraction

A Novel Text Extraction Method from Pure Text Images Using Morphological Operations

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.989-994.3768 ◽

2014 ◽

Vol 989-994 ◽

pp. 3768-3772

Author(s):

Xuan Qi Chen ◽

Biao He ◽

Guo Cheng Wang ◽

Yao Xin Li

Keyword(s):

Mathematical Morphology ◽

Extraction Method ◽

New Method ◽

Morphological Operations ◽

Text Extraction ◽

Text Images ◽

Robust To Noise

This paper presents a new method for effective text extraction using mathematical morphology. First, the document is segmented into several parts based on its layout. Then, every part is dilated into large connected regions, whose largest skeleton is extracted to serve as a structuring element (SE). Finally, a proposed region-concatenated operation with the SE is applied, and its result can serve as the input to a subsequent OCR system. Experimentally, the proposed method is robust to noise, text orientation, font style and size, language, and layout.
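The dilation step alone can be sketched in a few lines of plain Python. The example below is only an illustration of how dilating separate strokes merges them into one large connected region; it assumes a square structuring element and 4-connectivity, neither of which is stated in the abstract, and it does not attempt the skeleton extraction or the region-concatenated operation. The function names and the toy image are hypothetical.

```python
from collections import deque


def dilate(grid, radius=1):
    """Binary dilation with a (2*radius+1)-wide square structuring element (an assumption)."""
    rows, cols = len(grid), len(grid[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if grid[r][c]:
                # Turn on every pixel within `radius` of a foreground pixel.
                for rr in range(max(0, r - radius), min(rows, r + radius + 1)):
                    for cc in range(max(0, c - radius), min(cols, c + radius + 1)):
                        out[rr][cc] = 1
    return out


def count_regions(grid):
    """Count 4-connected foreground regions with a breadth-first flood fill."""
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    regions = 0
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] and not seen[r][c]:
                regions += 1
                seen[r][c] = True
                queue = deque([(r, c)])
                while queue:
                    y, x = queue.popleft()
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if (0 <= ny < rows and 0 <= nx < cols
                                and grid[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
    return regions


# Toy "text line": three vertical strokes separated by two-pixel gaps.
TEXT_LINE = [
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 1, 0, 0, 1, 0],
    [0, 1, 0, 0, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
]

before = count_regions(TEXT_LINE)
after = count_regions(dilate(TEXT_LINE))
print(before, after)  # prints "3 1"
```

In this toy image the three strokes start as three separate regions and become a single region after one dilation pass, which is the kind of merged text block that a later stage could then analyze as a unit.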

A robust video text extraction method for character recognition

Systems and Computers in Japan ◽

10.1002/scj.10148 ◽

2005 ◽

Vol 36 (9) ◽

pp. 87-96 ◽

Cited By ~ 2

Author(s):

Osamu Hori ◽

Takeshi Mita

Keyword(s):

Character Recognition ◽

Extraction Method ◽

Text Extraction

FREGEX: A Feature Extraction Method for Biomedical Text Classification using Regular Expressions

2019 41st Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC) ◽

10.1109/embc.2019.8857471 ◽

2019

Author(s):

Christopher A. Flores ◽

Rosa L. Figueroa ◽

Jorge E. Pezoa

Keyword(s):

Feature Extraction ◽

Text Classification ◽

Extraction Method ◽

Biomedical Text ◽

Regular Expressions ◽

Feature Extraction Method ◽

Biomedical Text Classification