Modeling method of internet public information data mining based on probabilistic topic model

2019 ◽  
Vol 75 (9) ◽  
pp. 5882-5897 ◽  
Author(s):  
Shaofei Wu ◽  
Jun Liu ◽  
Lizhi Liu

2014 ◽  
Vol 912-914 ◽  
pp. 1710-1713
Author(s):  
Qing Zhang ◽  
Sui Huai Yu ◽  
Ming Jiu Yu

During the design of future exploratory products, user requirements appear to be a key factor in achieving product availability. As a practical user modeling method, the persona can effectively mine data on potential needs by analyzing users. This review focuses on how to apply the persona when investigating exploratory products, in order to acquire information that is useful for product design. The method for establishing a persona and its operating rules are also discussed. A concept for a future mobile internet device is used as a case to demonstrate the persona method.


2021 ◽  
Vol 2021 ◽  
pp. 1-17
Author(s):  
Hongcheng Zou ◽  
Ziling Wei ◽  
Jinshu Su ◽  
Baokang Zhao ◽  
Yusheng Xia ◽  
...  

A website fingerprinting (WFP) attack can identify the websites a user is browsing even under the protection of privacy-enhancing technologies (PETs). Previous studies show that most machine-learning attacks need multiple types of features as input, which entails substantial feature engineering work. We show an alternative: Probabilistic Fingerprinting (PF), a new website fingerprinting attack that relies on only one type of feature. These features are produced by a mathematical model, PWFP, that combines a probabilistic topic model with WFP for the first time, motivated by the finding that a plain text and the sequence file generated from a traffic instance are essentially the same. Experimental results show that the proposed features are more discriminative than existing features. In a closed-world setting, PF attains better accuracy (up to 99.79%) than prior attacks on datasets gathered under Shadowsocks, SSH, and TLS, respectively. Moreover, even when the number of training instances drops to as few as 4, PF still reaches an accuracy above 90%. In the more realistic open-world setting, PF attains a high true positive rate (TPR) and Bayes detection rate (BDR) and a low false positive rate (FPR) in all evaluations, outperforming the other attacks. These results highlight that exploring new features to improve the accuracy of WFP attacks is both meaningful and feasible.
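The open-world metrics named in the abstract can be illustrated with a short sketch. It assumes the common binary framing (monitored versus unmonitored sites) and the usual definitions TPR = TP / (TP + FN), FPR = FP / (FP + TN), and BDR = TPR·p / (TPR·p + FPR·(1 − p)), where p is the prior probability that a visit goes to a monitored site. The abstract does not state the exact definitions or the prior used by the authors, so the function name `open_world_metrics`, the parameter `monitored_prior`, and the counts below are hypothetical.

```python
def _rate(numerator: float, denominator: float) -> float:
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def open_world_metrics(tp: int, fn: int, fp: int, tn: int,
                       monitored_prior: float) -> tuple[float, float, float]:
    """Compute (TPR, FPR, BDR) from binary confusion counts.

    monitored_prior is the assumed probability that a visit targets a
    monitored site; it is an input of this sketch, not a value from the paper.
    """
    if not 0.0 <= monitored_prior <= 1.0:
        raise ValueError("monitored_prior must lie in [0, 1]")

    tpr = _rate(tp, tp + fn)  # monitored visits correctly flagged
    fpr = _rate(fp, fp + tn)  # unmonitored visits wrongly flagged

    # Bayes' rule: P(monitored | classifier says monitored)
    flagged = tpr * monitored_prior + fpr * (1.0 - monitored_prior)
    bdr = _rate(tpr * monitored_prior, flagged)
    return tpr, fpr, bdr


if __name__ == "__main__":
    # Hypothetical counts, evaluated under two different priors.
    for prior in (0.5, 0.01):
        tpr, fpr, bdr = open_world_metrics(90, 10, 5, 95, prior)
        print(f"prior={prior}: TPR={tpr:.3f} FPR={fpr:.3f} BDR={bdr:.3f}")
```

With the same TPR and FPR, the BDR falls as the prior shrinks, which is why a low FPR matters when monitored sites make up only a small share of all visits.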

