Novel Linguistic Steganography Based on Character-Level Text Generation

Mathematics ◽  
2020 ◽  
Vol 8 (9) ◽  
pp. 1558 ◽  
Author(s):  
Lingyun Xiang ◽  
Shuanghui Yang ◽  
Yuhang Liu ◽  
Qian Li ◽  
Chengzhang Zhu

With the development of natural language processing, linguistic steganography has become a research hotspot in information security. However, most existing linguistic steganographic methods suffer from low embedding capacity. This paper therefore proposes a character-level linguistic steganographic method (CLLS) that embeds the secret information into characters instead of words, employing a long short-term memory (LSTM) based language model. First, the proposed method uses the LSTM model and a large-scale corpus to construct and train a character-level text generation model; the best model under evaluation is kept as the prediction model for generating stego text. Then, the secret information serves as control information to select the next character from the predictions of the trained character-level text generation model. Because predicted characters with different prediction probability values can be encoded into different secret bit values, the secret information is hidden in the generated text. For the same secret information, the generated stego texts vary with the starting string of the text generation model, so we design a selection strategy that varies the starting string and keeps the highest-quality stego text among a number of candidate stego texts as the final one. The experimental results demonstrate that, compared with other similar methods, the proposed method has the fastest running speed and the highest embedding capacity. Moreover, extensive experiments verify the effect of the number of candidate stego texts on the quality of the final stego text: the quality increases with the number of candidates, but the growth rate gradually slows down.
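The bit-to-character selection described above can be sketched as follows. This is a minimal illustration with a hypothetical toy ranking function standing in for the trained LSTM; `toy_model`, `embed`, and `extract` are illustrative names, not the paper's API. At each step, the next k secret bits pick one of the top 2^k predicted characters, so the receiver can recover the bits by re-ranking with the same model.

```python
def toy_model(prefix):
    """Stand-in for the LSTM: return candidate characters ranked by
    'probability'. Deterministic within one process so that encoding
    and decoding agree."""
    return sorted("etaoinshr", key=lambda c: hash(prefix + c) % 97)

def embed(bits, start, k=2, model=toy_model):
    """Hide a bit string by letting each k-bit chunk choose among the
    top 2**k predicted next characters."""
    text = start
    for i in range(0, len(bits), k):
        chunk = bits[i:i + k].ljust(k, "0")   # pad the final chunk
        idx = int(chunk, 2)                   # k bits -> candidate index
        text += model(text)[idx]
    return text

def extract(stego, start, nbits, k=2, model=toy_model):
    """Recover the bits by re-ranking candidates at every position."""
    bits, text = "", start
    while len(bits) < nbits:
        ranked = model(text)
        c = stego[len(text)]                  # next stego character
        bits += f"{ranked.index(c):0{k}b}"
        text += c
    return bits[:nbits]

secret = "1011001110"
stego = embed(secret, start="th")
recovered = extract(stego, "th", len(secret))
```

With k = 2, each generated character carries two secret bits, which is the capacity advantage of working at the character level.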

2015 ◽  
Vol 2015 ◽  
pp. 1-8 ◽  
Author(s):  
Zuoyong Xiang ◽  
Zhenyu Chen ◽  
Xingyu Gao ◽  
Xinjun Wang ◽  
Fangchun Di ◽  
...  

A new partitioning method, called Wedging Insertion, is proposed for solving the large-scale symmetric Traveling Salesman Problem (TSP). The idea of the proposed algorithm is to cut a TSP tour into four segments by the nodes' coordinates, rather than by rectangles as in Strip, FRP, and Karp. Apart from four particular boundary nodes, each node belongs to exactly one segment, and no segment twists around another. After the partitioning step, the algorithm applies a traditional construction method, namely the insertion method, within each segment to improve tour quality, and then connects the starting and ending nodes of the segments to obtain the complete tour. To test the performance of the proposed algorithm, experiments are conducted on various TSPLIB instances. The experimental results show that the proposed algorithm is efficient for solving large-scale TSPs: it markedly reduces running time while sacrificing only about 10% of tour quality.
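The partition-then-insert idea can be sketched roughly as below. This is a simplified stand-in, not the paper's exact Wedging Insertion: nodes are split into four angular wedges around the centroid (one plausible coordinate-based cut), each wedge's sub-path is built by cheapest insertion, and the sub-paths are chained end to end.

```python
import math

def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

def wedge_partition(nodes):
    """Split nodes into four angular wedges around the centroid."""
    cx = sum(x for x, _ in nodes) / len(nodes)
    cy = sum(y for _, y in nodes) / len(nodes)
    wedges = [[], [], [], []]
    for p in nodes:
        ang = math.atan2(p[1] - cy, p[0] - cx) % (2 * math.pi)
        wedges[min(3, int(ang // (math.pi / 2)))].append(p)
    return wedges

def insertion_cost(path, i, p):
    """Length increase from inserting p at position i of the path."""
    if i == len(path):
        return dist(path[-1], p)
    return (dist(path[i - 1], p) + dist(p, path[i])
            - dist(path[i - 1], path[i]))

def cheapest_insertion(points):
    """Build a sub-path by repeatedly inserting at the cheapest spot."""
    if len(points) < 3:
        return list(points)
    path = list(points[:2])
    for p in points[2:]:
        best = min(range(1, len(path) + 1),
                   key=lambda i: insertion_cost(path, i, p))
        path.insert(best, p)
    return path

def wedging_tour(nodes):
    tour = []
    for w in wedge_partition(nodes):
        tour.extend(cheapest_insertion(w))
    return tour  # the edge from tour[-1] back to tour[0] closes the cycle
```

Because insertion runs independently on four smaller point sets, the dominant cost drops from one large insertion problem to four small ones, which is where the running-time reduction comes from.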


Author(s):  
Wei Wang ◽  
Xiang-Yu Guo ◽  
Shao-Yuan Li ◽  
Yuan Jiang ◽  
Zhi-Hua Zhou

Crowdsourcing systems make it possible to hire voluntary workers to label large-scale data in exchange for small monetary payments. The taskmaster usually needs to collect high-quality labels, while the quality of labels obtained from the crowd may not satisfy this requirement. In this paper, we study the problem of obtaining high-quality labels from the crowd and present an approach that learns the difficulty of items in crowdsourcing: we construct a small training set of items with estimated difficulty and then learn a model to predict the difficulty of future items. With the predicted difficulty, we can distinguish between easy and hard items. For easy items, the quality of labels inferred from the crowd can be high enough to satisfy the requirement; for hard items, where the crowd cannot provide high-quality labels, it is better to choose a more knowledgeable crowd or employ specialized workers to label them. The experimental results demonstrate that the proposed approach of learning to distinguish between easy and hard items can significantly improve label quality.
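The routing idea above can be sketched as follows. This is a hedged simplification, not the paper's estimator: here an item's difficulty is scored directly by crowd disagreement (one common proxy), and items above a threshold are routed to more skilled workers; the threshold value is an assumption.

```python
from collections import Counter

def disagreement(labels):
    """1 minus the majority fraction: 0.0 when all workers agree."""
    n_majority = Counter(labels).most_common(1)[0][1]
    return 1.0 - n_majority / len(labels)

def route(items, threshold=0.3):
    """Split items into easy (crowd-labelable) and hard (needs experts)."""
    easy, hard = [], []
    for item_id, labels in items.items():
        (hard if disagreement(labels) > threshold else easy).append(item_id)
    return easy, hard

items = {
    "img1": ["cat", "cat", "cat"],    # unanimous -> easy
    "img2": ["cat", "dog", "bird"],   # three-way split -> hard
    "img3": ["dog", "dog", "cat"],    # mild disagreement -> hard
}
easy, hard = route(items)
```

In the paper's setting, such disagreement scores would only seed a small training set; a learned model then predicts difficulty for future items before any crowd labels are collected.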


2020 ◽  
Vol 24 ◽  
pp. 63-86
Author(s):  
Francisco Mena ◽  
Ricardo Ñanculef ◽  
Carlos Valle

The lack of annotated data is one of the major barriers facing machine learning applications today. Learning from crowds, i.e., collecting ground-truth data from multiple inexpensive annotators, has become a common way to cope with this issue. It has recently been shown that modeling the varying quality of the annotations obtained in this way is fundamental to obtaining satisfactory performance in tasks where inexpert annotators may form the majority but not the most trusted group. Unfortunately, existing techniques represent the annotation pattern of each annotator individually, making the models difficult to estimate in large-scale scenarios. In this paper, we present two models to address these problems. Both are based on the hypothesis that collective annotation patterns can be learned by introducing confusion matrices that involve groups of data points or annotators. The first approach clusters data points that share a common annotation pattern, regardless of the annotators from which the labels were obtained; implicitly, this method attributes annotation mistakes to the complexity of the data itself rather than to the variable behavior of the annotators. The second approach explicitly maps annotators to latent groups that are collectively parametrized to learn a common annotation pattern. Our experimental results show that, compared with other methods for learning from crowds, both models have advantages in scenarios with a large number of annotators and a small number of annotations per annotator.
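The simplest collective annotation pattern, a single confusion matrix shared by all annotators, can be sketched as below. This is a minimal illustration of the shared-matrix idea, not the paper's clustered or grouped models, and it uses majority-vote pseudo-labels as a crude stand-in for proper latent-variable estimation.

```python
import numpy as np

def shared_confusion(pseudo_truth, annotations, n_classes):
    """Estimate one confusion matrix C[t, a] = P(annotator says a | true t),
    shared by every annotator, from pseudo ground-truth labels."""
    C = np.ones((n_classes, n_classes))          # Laplace smoothing
    for t, labels in zip(pseudo_truth, annotations):
        for a in labels:
            C[t, a] += 1
    return C / C.sum(axis=1, keepdims=True)

def corrected_posterior(labels, C, prior):
    """P(true class | observed crowd labels) under the shared matrix,
    treating annotators as conditionally independent."""
    logp = np.log(prior).copy()
    for a in labels:
        logp = logp + np.log(C[:, a])
    p = np.exp(logp - logp.max())                # stable normalization
    return p / p.sum()
```

The paper's two models replace this single C with several matrices, one per cluster of data points or per latent annotator group, which keeps the parameter count small even with thousands of annotators.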


Author(s):  
Ke Wang ◽  
Xiaojun Wan

Generating texts with different sentiment labels is attracting increasing attention in the area of natural language generation. Recently, the Generative Adversarial Net (GAN) has shown promising results in text generation. However, texts generated by GANs usually suffer from poor quality, lack of diversity, and mode collapse. In this paper, we propose a novel framework, SentiGAN, with multiple generators and one multi-class discriminator, to address these problems. In our framework, multiple generators are trained simultaneously, each aiming to generate texts of a different sentiment label without supervision. We propose a penalty-based objective that forces each generator to produce diversified examples of a specific sentiment label. Moreover, combining multiple generators with one multi-class discriminator lets each generator focus on accurately generating its own examples of a specific sentiment label. Experimental results on four datasets demonstrate that our model consistently outperforms several state-of-the-art text generation methods in the sentiment accuracy and quality of generated texts.
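The interaction between the multi-class discriminator and each generator can be sketched as below. This is a hedged, simplified stand-in for SentiGAN's penalty-based objective, not the paper's exact formulation: the discriminator emits k + 1 probabilities per sample (k sentiment classes plus "fake"), and generator i is penalized in proportion to how far its samples are from being recognized as class i.

```python
import numpy as np

def generator_penalty(disc_probs, i):
    """Mean penalty for generator i over a batch of its own samples.

    disc_probs: (batch, k + 1) rows of discriminator probabilities;
    column i is sentiment class i, the last column is "fake".
    """
    return float(np.mean(1.0 - disc_probs[:, i]))

# A generator whose samples are confidently assigned to its target class
# receives a small penalty; a collapsed or off-label generator does not.
on_label = np.array([[0.90, 0.05, 0.05],
                     [0.85, 0.10, 0.05]])   # generator 0, recognized
off_label = np.array([[0.20, 0.30, 0.50],
                      [0.10, 0.20, 0.70]])  # generator 0, mostly "fake"
```

Because each generator is scored only against its own column, the gradient never rewards drifting toward another generator's sentiment class, which is the intuition behind the label-specialization claim.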


2010 ◽  
Vol 40-41 ◽  
pp. 469-472
Author(s):  
Yan Hui Li ◽  
Yu Liang Gao

Histogram analysis of wavelet coefficients is a powerful steganalysis technique for detecting secret information embedded in the wavelet coefficients. To improve security, a wavelet-based steganography resistant to histogram analysis is presented. First, a cover image is divided into blocks, and every block is decomposed by the wavelet transform. Then, if a secret bit differs from the information denoted by a nonzero wavelet coefficient, the absolute value of that coefficient is decreased by 1; if the coefficient becomes 0 after embedding, the secret bit is embedded into the next wavelet coefficient instead. If the sum of the wavelet coefficients at the current level is large, the secret information is embedded into the coefficients of the next level. Finally, the stego image is obtained by the inverse wavelet transform. The experimental results show that the proposed method effectively preserves the histogram of the wavelet coefficients and maintains good visual quality of the stego image.
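The coefficient-adjustment rule can be sketched as follows, assuming a nonzero coefficient's parity carries one bit; the paper's exact bit mapping may differ. When the secret bit disagrees with the parity, |c| shrinks by 1 (which flips the parity); if that drives the coefficient to zero, the same bit is retried on the next coefficient.

```python
def embed_bits(coeffs, bits):
    """Embed bits into nonzero coefficients by parity adjustment."""
    out = list(coeffs)
    bit_iter = iter(bits)
    bit = next(bit_iter, None)
    for i, c in enumerate(out):
        if bit is None:
            break
        if c == 0:
            continue                     # zeros never carry payload
        if abs(c) % 2 != bit:            # disagreement: move toward zero
            c -= 1 if c > 0 else -1
            out[i] = c
        if c == 0:
            continue                     # collapsed to zero: retry bit
        bit = next(bit_iter, None)       # bit embedded, advance
    return out

def extract_bits(coeffs, nbits):
    """Read the parities of nonzero coefficients in order."""
    return [abs(c) % 2 for c in coeffs if c != 0][:nbits]
```

Shrinking magnitudes by at most 1 and skipping zeros is what keeps the coefficient histogram close to the cover's, since values only migrate to an adjacent bin.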


2018 ◽  
Vol 27 (11) ◽  
pp. 1850175 ◽  
Author(s):  
Neeraj Kumar Jain ◽  
Singara Singh Kasana

The proposed reversible data hiding technique is an extension of Peng et al.'s technique [F. Peng, X. Li and B. Yang, Improved PVO-based reversible data hiding, Digit. Signal Process. 25 (2014) 255–265]. In this technique, a cover image is segmented into nonoverlapping blocks of equal size. Each block is sorted in ascending order, and differences are then calculated from the locations of its largest and second-largest pixel values. Negative predicted differences are utilized to create empty spaces, which further enhances the embedding capacity of the proposed technique. The sorted blocks also improve the visual quality of the marked images, as the pixels of a sorted block are more correlated than unsorted ones. Experimental results show the effectiveness of the proposed technique.


2018 ◽  
Vol 2018 ◽  
pp. 1-11
Author(s):  
Junhui He ◽  
Junxi Chen ◽  
Shichang Xiao ◽  
Xiaoyu Huang ◽  
Shaohua Tang

Steganography is a means of covert communication that conceals both the occurrence and the real purpose of communication. The adaptive multi-rate wideband (AMR-WB) codec is widely adopted in mobile handsets and is the recommended speech codec for VoLTE. In this paper, a novel AMR-WB speech steganography is proposed based on a diameter-neighbor codebook partition algorithm. Different embedding capacities can be achieved by adjusting the iterative parameters during codebook division. The experimental results show that the presented AMR-WB steganography provides higher and more flexible embedding capacity without inducing perceptible distortion compared with state-of-the-art methods. With 48 iterations of cluster merging, twice the embedding capacity of the complementary-neighbor-vertices-based embedding method can be obtained with a decrease of only around 2% in speech quality and much the same undetectability. Moreover, both the quality of the stego speech and the security against statistical steganalysis are better than those of the recent speech steganography based on neighbor-index-division codebook partition.
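The general codebook-partition embedding idea can be sketched as below. This is a generic, heavily hedged stand-in, not the paper's diameter-neighbor algorithm: the codebook is split into two groups, and to hide one bit per quantized frame the encoder keeps the chosen codeword if its group already matches the bit, otherwise swapping to the nearest codeword in the other group; the decoder simply reads each codeword's group label.

```python
import numpy as np

def partition(codebook):
    """Label each codeword 0/1 by alternating rank along dimension 0.

    A toy grouping: the real method partitions by a diameter-neighbor
    criterion so that swapped codewords stay perceptually close.
    """
    order = np.argsort(codebook[:, 0])
    labels = np.empty(len(codebook), dtype=int)
    labels[order] = np.arange(len(codebook)) % 2
    return labels

def embed(indices, bits, codebook, labels):
    """Replace codeword indices so each frame's group encodes one bit."""
    out = []
    for idx, bit in zip(indices, bits):
        if labels[idx] == bit:
            out.append(int(idx))
        else:                      # nearest codeword in the other group
            cand = np.where(labels == bit)[0]
            d = np.linalg.norm(codebook[cand] - codebook[idx], axis=1)
            out.append(int(cand[np.argmin(d)]))
    return out

def extract(indices, labels):
    return [int(labels[i]) for i in indices]
```

Finer partitions (more groups, as produced by more merge iterations) raise capacity at the cost of larger substitution distances, which matches the capacity-versus-quality trade-off reported above.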


2020 ◽  
Vol 10 (6) ◽  
pp. 1967
Author(s):  
Qiangqaing Guo ◽  
Zhenfang Zhu ◽  
Qiang Lu ◽  
Dianyuan Zhang ◽  
Wenqing Wu

With the development of deep learning, large-scale dialogue generation based on deep learning has received extensive attention. Current research has aimed to improve the quality of generated dialogue content, but has failed to fully consider its emotional factors. To address the problem of emotional response in open-domain dialogue systems, we propose a dynamic emotional session generation model (DESG). On top of the Seq2Seq (sequence-to-sequence) framework, the model incorporates a dictionary-based attention mechanism that encourages the substitution of words in the response with synonyms from emotion dictionaries. Meanwhile, to improve the model, internal emotion regulator and emotion classifier mechanisms are introduced to build a large-scale emotion-session generation model. Experimental results show that our DESG model can not only produce an appropriate output sequence in terms of content (relevant grammar) for a given post and emotion category, but can also express the expected emotional response explicitly or implicitly.
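The intended effect of the dictionary-driven substitution can be illustrated with a toy sketch. This is an assumption about the mechanism's outcome, not the paper's attention formulation: words in a draft response are swapped for synonyms drawn from an emotion dictionary for the requested category (the miniature dictionary below is entirely hypothetical).

```python
# Hypothetical miniature emotion dictionary: emotion -> word -> synonym.
EMOTION_SYNONYMS = {
    "happy": {"good": "wonderful", "like": "love"},
    "sad":   {"good": "tolerable", "like": "miss"},
}

def emotionalize(response, emotion):
    """Swap words for emotion-bearing synonyms; unknown words pass through."""
    table = EMOTION_SYNONYMS.get(emotion, {})
    return " ".join(table.get(w, w) for w in response.split())
```

In DESG itself, this preference is learned softly through attention over the dictionary during decoding rather than applied as a hard post-hoc replacement.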


2021 ◽  
Vol 2 (2) ◽  
pp. 1-13
Author(s):  
Yamin Li ◽  
Jun Zhang ◽  
Zhongliang Yang ◽  
Ru Zhang

The core challenge of steganography is always how to improve the hidden capacity and the concealment. Most current generation-based linguistic steganography methods consider only the probability distribution over text characters, so the emotion and topic of the generated steganographic text are uncontrollable. Especially for long texts, generating several sentences related to one topic with overall coherence and discourse-relatedness ensures better concealment. In this article, we address the problem of generating coherent multi-sentence texts for better concealment, and a topic-aware neural linguistic steganography method that can generate a steganographic paragraph on a specific topic is presented. We achieve topic-controllable steganographic long-text generation by encoding related entities and their relationships from knowledge graphs. Experimental results illustrate that the proposed method guarantees both the quality of the generated steganographic text and its relevance to a specific topic. The proposed model can be widely used in covert communication, privacy protection, and many other areas of information security.
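The topic-conditioned selection idea can be sketched with a toy stand-in for the neural generator. This is a loose illustration of the principle only, not the paper's method: the secret bits pick, at each step, one of the current entity's knowledge-graph neighbours (sorted for a stable order shared by sender and receiver), so the sequence of mentioned entities both stays on topic and carries the payload. The miniature graph and function names are hypothetical.

```python
# Hypothetical miniature knowledge graph: entity -> related entities.
TOY_KG = {
    "football": ["coach", "goal", "league", "stadium"],
    "coach": ["football", "player", "team", "training"],
}

def embed_topic(bits, topic, kg, k=2):
    """Each k-bit chunk selects one of 2**k neighbours of the
    current entity; the chosen entity becomes the next hop."""
    entity, mentions = topic, []
    for i in range(0, len(bits), k):
        idx = int(bits[i:i + k], 2)
        entity = sorted(kg[entity])[idx]
        mentions.append(entity)
    return mentions

def extract_topic(mentions, topic, kg, k=2):
    """Invert the walk by re-sorting each neighbour list."""
    bits, entity = "", topic
    for m in mentions:
        bits += f"{sorted(kg[entity]).index(m):0{k}b}"
        entity = m
    return bits
```

In the actual model, these graph-guided entity choices condition a neural text generator, so the payload-bearing entities surface inside fluent, topic-coherent sentences rather than as a bare entity list.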


Author(s):  
A. Babirad

Cerebrovascular diseases are a problem of today's world and, according to forecasts, of the near future. The main risk factors for the development of ischemic disorders of the cerebral circulation include obesity and aging, arterial hypertension, smoking, diabetes mellitus, and heart disease. An effective strategy for the prevention of cerebrovascular events is based on the implementation of large-scale risk-control measures, including antiplatelet and anticoagulant therapy and invasive interventions such as atherectomy, angioplasty, and stenting. The combined efforts of neurologists, cardiologists, vascular surgeons, endocrinologists, and other specialists are therefore the basis for achieving an acceptable clinical outcome. A review of the SF-36 method for assessing quality of life in patients after transient ischemic attack is presented. Quality-of-life assessment is recognized in world medical practice and research as an indicator that is also used to evaluate the quality of health systems and in general sociological research.

