Document Image Watermarking Based on Weight-Invariant Partition Using Support Vector Machine

Author(s):  
Shiyan Hu
Author(s):  
YAN ZHANG ◽  
BIN YU ◽  
HAI-MING GU

Document image segmentation is an important research area of document image analysis; it classifies the contents of a document image into a set of text and non-text classes. Existing methods are often designed to classify only text and halftones, so they perform poorly on graphics, tables, circuits, etc. In this paper, we present a robust multi-level classification method using a multi-layer perceptron (MLP) and a support vector machine (SVM) to segment text from non-text and then classify the non-text components as tables, graphics or halftones. The method outperforms existing methods by overcoming various issues associated with the complexity of document images. Experimental results demonstrate the effectiveness of the proposed method. With this multi-level classification approach, text, halftone, graphic and table components are each classified accurately, which should improve OCR accuracy by reducing garbage symbols and also increase the compression ratio.


Author(s):  
Fauziah Kasmin ◽  
Zuraini Othman ◽  
Sharifah Sakinah Syed Ahmad

Binarization of historical documents has become very important now that digital archiving is the preferred solution for storing and retrieving valuable archives. However, the process is made more challenging by the degradation of historical documents. This paper therefore describes a method for binarizing historical documents using a learning approach. A support vector machine (SVM) was used as the classifier. A model was trained on a set of images with the help of ground truth images, and the model was then applied to test images to label each pixel as text or non-text. Grey level and RGB values were chosen as descriptors for a pixel, and the two descriptors were compared. The intensities of the local neighbourhood of every pixel were used in the experiment. To compare the descriptors, the standard datasets HDIBCO2014, DIBCO2012 and DIBCO2016 were used in the training and testing phases. The results of the experiment clearly showed that grey level values performed better than RGB values.
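The per-pixel descriptor described above can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the 3x3 window (`radius=1`), the border handling by clamping coordinates, the label convention (1 = text, 0 = non-text) and the function names `neighbourhood` and `build_training_set` are all hypothetical choices. The SVM itself is left out; the sketch only shows how grey-level neighbourhood vectors and ground-truth labels could be paired for training.

```python
def neighbourhood(image, r, c, radius=1):
    """Grey levels of the (2*radius+1)^2 window centred on pixel (r, c).

    Coordinates outside the image are clamped to the nearest edge pixel
    (an assumption; the abstract does not say how borders are handled).
    """
    h, w = len(image), len(image[0])
    feats = []
    for dr in range(-radius, radius + 1):
        for dc in range(-radius, radius + 1):
            rr = min(max(r + dr, 0), h - 1)
            cc = min(max(c + dc, 0), w - 1)
            feats.append(image[rr][cc])
    return feats


def build_training_set(image, ground_truth, radius=1):
    """Pair every pixel's neighbourhood vector with its ground-truth label."""
    features, labels = [], []
    for r in range(len(image)):
        for c in range(len(image[0])):
            features.append(neighbourhood(image, r, c, radius))
            labels.append(ground_truth[r][c])  # assumed: 1 = text, 0 = non-text
    return features, labels
```

The resulting `features` and `labels` lists could then be passed to any SVM trainer. An RGB variant would concatenate the three channel values per neighbour instead of a single grey level.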


2015 ◽  
Vol 2015 ◽  
pp. 1-14 ◽  
Author(s):  
Lin Sun ◽  
Jiucheng Xu ◽  
Xingxing Zhang ◽  
Yun Tian

With the development of information security, traditional image encryption algorithms are far from sufficient to ensure the security of images in transmission. This paper presents a new image watermarking scheme based on the Arnold Transform (AT) and a Fuzzy Smooth Support Vector Machine (FSSVM). First, an improved AT (IAT) is obtained by adding variables and expanding the transformation space, and the FSSVM is proposed by introducing a fuzzy membership degree. The embedding positions of the watermark are obtained from the IAT, and the watermark pixel values are embedded in the carrier image using quantization embedding rules. To realize blind extraction of the watermark, the FSSVM model is used to find the embedding positions, and the pixel values are extracted using quantization extraction rules. Applying the improved inverse Arnold transformation to the embedding positions yields the watermark coordinates, from which the watermark is reconstructed. Compared with other watermarking techniques, the presented scheme improves security by adding more secret keys, and the imperceptibility of the watermark is improved by introducing quantization rules. The experimental results show that the proposed method outperforms many existing methods against various types of attacks.
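To make the position scrambling and quantization steps concrete, here is a minimal sketch. It is not the scheme from the abstract: it uses the classic Arnold cat map rather than the improved AT, a generic parity-based quantization rule rather than the paper's embedding rules, and it recomputes the embedding positions from the key (`rounds`) instead of locating them blindly with an FSSVM. All function names, the step size of 8 and the number of rounds are assumptions chosen for illustration. The carrier is assumed to be a square grid of 8-bit grey levels.

```python
def arnold(x, y, n, rounds=1):
    """Classic Arnold cat map on an n x n grid, applied `rounds` times."""
    for _ in range(rounds):
        x, y = (x + y) % n, (x + 2 * y) % n
    return x, y


def arnold_inverse(x, y, n, rounds=1):
    """Inverse of `arnold`: recovers the original coordinates."""
    for _ in range(rounds):
        x, y = (2 * x - y) % n, (y - x) % n
    return x, y


def embed_bit(pixel, bit, step=8):
    """Move the pixel to the centre of a quantization bin whose parity is `bit`."""
    q = pixel // step
    if q % 2 != bit:
        # Switch to a neighbouring bin, staying inside the 0..255 range.
        q = q + 1 if (q + 1) * step + step // 2 <= 255 else q - 1
    return q * step + step // 2


def extract_bit(pixel, step=8):
    """Read the bit back as the parity of the pixel's quantization bin."""
    return (pixel // step) % 2


def embed(carrier, watermark, rounds=3, step=8):
    """Embed a binary watermark at Arnold-scrambled positions of the carrier."""
    n = len(carrier)
    marked = [row[:] for row in carrier]
    for i, row in enumerate(watermark):
        for j, bit in enumerate(row):
            x, y = arnold(i, j, n, rounds)
            marked[x][y] = embed_bit(marked[x][y], bit, step)
    return marked


def extract(marked, size, rounds=3, step=8):
    """Recover a size x size watermark using the same key (`rounds`)."""
    n = len(marked)
    bits = []
    for i in range(size):
        row = []
        for j in range(size):
            x, y = arnold(i, j, n, rounds)
            row.append(extract_bit(marked[x][y], step))
        bits.append(row)
    return bits
```

Because the cat map is a bijection on the grid, distinct watermark coordinates never collide in the carrier. A larger `step` makes the embedded bit survive larger pixel perturbations but changes the carrier more visibly.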


Author(s):  
Yuna Sugianela ◽  
Nanik Suciati

Some ancient documents in Indonesia are written in the Javanese script. These documents contain knowledge of the history and culture of Indonesia, especially of Java. However, only a few people can read the Javanese script, so an automated system is needed to translate documents written in it. In this study, the researchers use a classification method to recognize the Javanese script in such documents. The method is a multiclass Support Vector Machine (SVM) with the One Against One (OAO) strategy. Seven variations of the Javanese script, taken from different documents, are used. The dataset has 31 classes and 182 samples for training and testing. The recognition system successfully handles the color variation in the dataset and achieves an accuracy of 81.3%.
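The One Against One strategy trains one binary classifier per pair of classes and lets them vote, so 31 classes require 31 × 30 / 2 = 465 pairwise classifiers. The sketch below illustrates only the pairing and voting logic. As a loudly stated assumption, each pairwise decision is made by a nearest-centroid rule standing in for a binary SVM, and all function names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def pair_count(num_classes):
    """Number of binary classifiers needed by One Against One."""
    return num_classes * (num_classes - 1) // 2


def centroid(points):
    """Component-wise mean of a list of feature vectors."""
    return [sum(column) / len(points) for column in zip(*points)]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((u - v) ** 2 for u, v in zip(a, b))


def train_oao(samples, labels):
    """Build one pairwise decider per unordered pair of classes.

    Each decider is a pair of class centroids, a stand-in for a binary SVM.
    """
    by_class = {}
    for x, label in zip(samples, labels):
        by_class.setdefault(label, []).append(x)
    centroids = {label: centroid(pts) for label, pts in by_class.items()}
    return {
        (a, b): (centroids[a], centroids[b])
        for a, b in combinations(sorted(centroids), 2)
    }


def predict_oao(model, x):
    """Let every pairwise decider vote and return the class with most votes."""
    votes = Counter()
    for (a, b), (ca, cb) in model.items():
        votes[a if sq_dist(x, ca) <= sq_dist(x, cb) else b] += 1
    return votes.most_common(1)[0][0]
```

With real SVMs, each entry of the model would hold a trained binary classifier for that class pair, and the voting step would stay the same.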

