A Novel Reversible Chinese Text Information Hiding Scheme based on Lookalike Traditional and Simplified Chinese Characters

2014 ◽  
Vol 8 (1) ◽  
pp. 269-281

2013 ◽  
Vol 798-799 ◽  
pp. 423-426
Author(s):  
Xiao Feng Wang

Through an analysis of hypertext markup, several new methods of text information hiding are proposed and implemented. These methods conceal information well, and using them in combination yields a large hiding capacity and better concealment. They are also more robust against traditional attacks.


2004 ◽  
Vol 30 (1) ◽  
pp. 75-93 ◽  
Author(s):  
Haodi Feng ◽  
Kang Chen ◽  
Xiaotie Deng ◽  
Weimin Zheng

We are interested in the problem of word extraction from Chinese text collections. We define a word to be a meaningful string composed of several Chinese characters. For example, ‘percent’ and ‘more and more’ are not recognized as traditional Chinese words by some people. However, in our work, they are words because they are very widely used and have specific meanings. We start with the viewpoint that a word is a distinguished linguistic entity that can be used in many different language environments. We consider the characters that are directly before a string (predecessors) and the characters that are directly after a string (successors) as important factors for determining the independence of the string. We call such characters accessors of the string, consider the number of distinct predecessors and successors of a string in a large corpus (TREC 5 and TREC 6 documents), and use them as the measurement of the context independence of a string from the rest of the sentences in the document. Our experiments confirm our hypothesis and show that this simple rule gives quite good results for Chinese word extraction and is comparable to, and for long words outperforms, other iterative methods.
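
The accessor-counting idea described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function name `accessor_variety`, the `max_len` cutoff, the single boundary markers `<S>` and `<E>`, and the choice to combine the two counts by taking their minimum are all assumptions made for illustration; the abstract only states that the numbers of distinct predecessors and successors are used as the measure.

```python
from collections import defaultdict


def accessor_variety(corpus, max_len=4):
    """Score every substring of length 2..max_len by its accessor counts.

    Returns {substring: min(#distinct predecessors, #distinct successors)}.
    Combining the two counts with min() is an assumption of this sketch.
    """
    preds = defaultdict(set)  # characters seen directly before each string
    succs = defaultdict(set)  # characters seen directly after each string
    for sentence in corpus:
        n = len(sentence)
        for i in range(n):
            for j in range(i + 2, min(i + max_len, n) + 1):
                s = sentence[i:j]
                # Assumption: a sentence boundary counts as one accessor.
                preds[s].add(sentence[i - 1] if i > 0 else "<S>")
                succs[s].add(sentence[j] if j < n else "<E>")
    return {s: min(len(preds[s]), len(succs[s])) for s in preds}


# Hypothetical toy corpus; the paper itself uses TREC 5 and TREC 6 documents.
corpus = ["门把手弄坏了", "小明修好了门把手", "这个门把手很漂亮", "这个门把手坏了"]
scores = accessor_variety(corpus)
print(scores["门把手"], scores["把手"], scores["门把"])
```

In this toy corpus the full string 门把手 appears in varied contexts on both sides, while its fragments 把手 and 门把 are each pinned to a single neighbour on one side, so the fragments score lower. A real extractor would then keep only strings whose score exceeds some threshold, which this sketch leaves out.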


Symmetry ◽  
2021 ◽  
Vol 13 (6) ◽  
pp. 964
Author(s):  
Mingshu He ◽  
Xiaojuan Wang ◽  
Chundong Zou ◽  
Bingying Dai ◽  
Lei Jin

Text, voice, images and videos can express intentions and facts in daily life. By understanding these contents, people can identify and analyze behaviors. This paper focuses on the commodity trade declaration process and identifies commodity categories based on the text information in customs declarations. Although text recognition technology is mature in many application fields, there are few studies on the classification and recognition of customs declaration goods. In this paper, we propose a classification framework based on machine learning (ML) models for commodity trade declaration that reaches a high rate of accuracy. We also propose a symmetrical decision fusion method for this task based on a convolutional neural network (CNN) and a transformer. The experimental results show that the fusion model can make up for the shortcomings of the two original models and improves on them. On the two datasets used in this paper, the accuracy reaches 88% and 99%, respectively. To promote the study of the customs declaration business and Chinese text recognition, we have also made public the proprietary datasets used in this study.


Author(s):  
Hanqing Tao ◽  
Shiwei Tong ◽  
Hongke Zhao ◽  
Tong Xu ◽  
Binbin Jin ◽  
...  

In recent years, Chinese text classification has attracted more and more research attention. However, most existing techniques, which specifically target English materials, may lose effectiveness on this task due to the huge difference between Chinese and English. As a special kind of hieroglyphics, Chinese characters and radicals are semantically useful but still unexplored in the task of text classification. To that end, in this paper, we first analyze the motives for using multiple granularity features to represent a Chinese text by inspecting the characteristics of radicals, characters and words. To better represent Chinese text and then perform Chinese text classification, we propose a novel Radical-aware Attention-based Four-Granularity (RAFG) model that takes full advantage of Chinese characters, words, character-level radicals and word-level radicals simultaneously. Specifically, RAFG applies a serialized BLSTM structure, which is context-aware and able to capture long-range information, to model the character-sharing property of Chinese and the sequence characteristics of texts. Further, we design an attention mechanism to enhance the effects of radicals and thus model the radical-sharing property when integrating granularities. Finally, we conduct extensive experiments, whose results not only show the superiority of our model but also validate the effectiveness of radicals in the task of Chinese text classification.


Complexity ◽  
2021 ◽  
Vol 2021 ◽  
pp. 1-14
Author(s):  
Laxmisha Rai ◽  
Hong Li

The majority of Chinese characters are pictographic characters with strong associative ability: when Chinese readers see a character, they usually associate it immediately with the objects or actions related to it. Against this background, we propose a system to visualize simplified Chinese characters, so that developing skills in either reading or writing Chinese characters is not necessary. Considering the extensive use of mobile devices, automatic identification of Chinese characters and display of associative images are made possible on smart devices to facilitate a quick overview of a Chinese text. This work is of practical significance for the research and development of real-time Chinese text recognition and the display of associative images, and for users who would like to visualize the text with images only. The proposed Chinese character recognition system and visualization tool is named MyOcrTool and is developed for the Android platform. The application recognizes Chinese characters through an OCR engine, uses the internal voice playback interface to provide the audio functions, and displays visual images of the Chinese characters in real time.


Author(s):  
Mingjun Zhai ◽  
Hsuan-Chih Chen ◽  
Michael C. W. Yip

Abstract. The present study was conducted to examine whether traditional and simplified Chinese readers (TCRs and SCRs) differed in stroke encoding in character processing by an eye-tracking experiment. We recruited 66 participants (32 TCRs and 34 SCRs) to read sentences comprising characters with different proportions and types of strokes removed in order to explore whether any visual complexity effect existed in their processing of simplified and traditional Chinese characters. The present study found a cross-script visual complexity effect and that SCRs were more influenced by visual complexity change in lexical access than were TCRs. In addition, the stroke-order effect appeared to be more salient for TCRs than for SCRs.

