Research on Improvement of N-grams Based Text Classification by Applying Pointwise Mutual Information Measures

2021 ◽  
Vol 9 (3) ◽  
Author(s):  
Tsvetanka Georgieva-Trifonova

This chapter presents a higher-order-logic formalization of the main concepts of information theory (Cover & Thomas, 1991), such as the Shannon entropy and mutual information, built on formalizations of the foundational theories of measure, Lebesgue integration, and probability. The main results of the chapter include formalizations of the Radon-Nikodym derivative and the Kullback-Leibler (KL) divergence (Coble, 2010); the latter provides a unified framework in which most of the commonly used measures of information can be defined. The chapter gives general definitions that are valid for both the discrete and continuous cases, and then proves the corresponding reduced expressions when the measures considered are absolutely continuous over finite spaces.
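For reference, the standard definitions behind this unified framework can be stated as follows (a sketch following Cover & Thomas; the notation is not taken from the formalization itself):

% General KL divergence via the Radon-Nikodym derivative d\mu/d\nu,
% for \mu absolutely continuous with respect to \nu:
D(\mu \,\|\, \nu) = \int \log \frac{d\mu}{d\nu} \, d\mu

% Reduced expression over a finite space:
D(p \,\|\, q) = \sum_{x} p(x) \log \frac{p(x)}{q(x)}

% Entropy, and mutual information as a KL divergence between the joint
% distribution and the product of the marginals:
H(X) = -\sum_{x} p(x) \log p(x), \qquad I(X;Y) = D\big(p_{XY} \,\|\, p_X \, p_Y\big)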


2020 ◽  
pp. 1586-1597
Author(s):  
Yasen Aizezi ◽  
Anwar Jamal ◽  
Ruxianguli Abudurexiti ◽  
Mutalipu Muming

This paper discusses the use of mutual information (MI) and Support Vector Machines (SVMs) for Uyghur Web text classification and the digital-forensics process of web text categorization: automatic classification and identification, and the conversion and pretreatment of plain text based on the encoding features of existing Uyghur Web documents. It also introduces the preparatory work for Uyghur Web text encoding. Focusing on filtering non-Uyghur characters and stop words from the web texts, we propose a Multi-feature Space Normalized Mutual Information (M-FNMI) algorithm that replaces the MI between a single feature and a category with the MI between a feature combination and a category, so as to extract more accurate feature words; finally, we classify the features with an SVM. The experimental results show that this scheme achieves high classification precision and can provide a criterion for purpose-specific digital forensics.
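As a minimal sketch of the general MI-plus-SVM pipeline described here (plain per-feature MI, not the paper's M-FNMI algorithm, which scores feature combinations rather than single features), the following Python example uses scikit-learn; the toy corpus, labels, and the number of selected features are placeholders:

# Sketch: bag-of-words features, MI-based feature selection, SVM classifier.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Placeholder corpus standing in for preprocessed Uyghur web texts.
docs = [
    "sports match team win",
    "team score goal match",
    "economy market stock price",
    "stock trade market growth",
]
labels = [0, 0, 1, 1]  # placeholder category labels, one per document

pipeline = make_pipeline(
    CountVectorizer(),                         # term-count features
    SelectKBest(mutual_info_classif, k=4),     # keep top-k features by MI with the class
    LinearSVC(),                               # linear SVM classifier
)
pipeline.fit(docs, labels)
print(pipeline.predict(docs))

The M-FNMI step would replace the per-feature scoring function with one evaluated on feature combinations, which scikit-learn does not provide out of the box.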


Author(s):  
M. D. MADULARA ◽  
P. A. B. FRANCISCO ◽  
S. NAWANG ◽  
D. C. AROGANCIA ◽  
C. J. CELLUCCI ◽  
...  

We investigate the pairwise mutual information and transfer entropy of ten-channel, free-running electroencephalograms (EEGs) recorded from thirteen subjects under two behavioral conditions: eyes-open resting and eyes-closed resting. Mutual information measures correlations, including nonlinear ones; transfer entropy determines the directionality of information transfer. For all channel pairs, mutual information is generally lower with eyes open than with eyes closed, indicating that EEG signals at different scalp sites become more dissimilar as the visual system is engaged. Transfer entropy, on the other hand, increases on average almost twofold when the eyes are opened. The largest one-way transfer entropies are to and from the Oz site, consistent with the involvement of the occipital lobe in vision. The largest net transfer entropies are from F3 and F4 to almost all the other scalp sites.
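A minimal Python sketch of the two quantities, using naive histogram estimators with a history length of one; the bin count, seed, and toy coupled signals are assumptions, and real EEG analyses typically use more careful estimators and embedding parameters:

import numpy as np

def discretize(x, bins=8):
    """Map a continuous signal to integer bin labels."""
    edges = np.histogram_bin_edges(x, bins=bins)
    return np.digitize(x, edges[1:-1])

def entropy(*labels):
    """Joint Shannon entropy (bits) of one or more discrete label arrays."""
    joint = np.stack(labels, axis=1)
    _, counts = np.unique(joint, axis=0, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def mutual_information(x, y, bins=8):
    xd, yd = discretize(x, bins), discretize(y, bins)
    return entropy(xd) + entropy(yd) - entropy(xd, yd)

def transfer_entropy(x, y, bins=8):
    """TE from x to y: H(y_{t+1}|y_t) - H(y_{t+1}|y_t, x_t), history length 1."""
    xd, yd = discretize(x, bins), discretize(y, bins)
    x_past, y_past, y_next = xd[:-1], yd[:-1], yd[1:]
    return (entropy(y_next, y_past) - entropy(y_past)
            + entropy(y_past, x_past) - entropy(y_next, y_past, x_past))

# Hypothetical demo signals: y is driven by the past of x.
rng = np.random.default_rng(0)
x = rng.normal(size=5000)
noise = rng.normal(size=5000)
y = np.empty_like(x)
y[0] = noise[0]
y[1:] = 0.7 * x[:-1] + 0.5 * noise[1:]

print(mutual_information(x, y))   # modest: the coupling is lagged
print(transfer_entropy(x, y))     # larger: information flows x -> y
print(transfer_entropy(y, x))     # near the estimator's bias floor

The asymmetry between the last two values is what makes transfer entropy directional, in contrast to the symmetric mutual information.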

