A Security-Related Reputation Scheme of Android Apps Based on NLP Analysis of Comments
Product vendors use comments to measure consumer satisfaction. With the advent of Natural Language Processing (NLP), comments on Google Play can be processed to extract knowledge about applications, such as their reputation. Existing proposals in that direction are either informal or concerned only with functionality. In contrast, this work aims to determine the reputation of Android applications in terms of confidentiality, integrity, availability and authentication (CIAA). It proposes a model for assessing app reputation that relies on sentiment analysis and text analysis of comments. Assuming that comments are reliable, we collect Google Play applications whose comments include security keywords. An in-depth analysis of keywords based on Naive Bayes classification provides the polarity of each comment. Based on comment polarity, reputation is then evaluated for the whole application. Experiments on real applications, with dozens to billions of comments, reveal that developers make insufficient efforts to guarantee CIAA services. A fine-grained analysis shows that applications with a poor overall security reputation can still be well reputed for specific CIAA services. Results also show that applications with negative security polarities generally display positive functional polarities. This result suggests that security checking should include careful comment analysis to improve the security of applications.
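The pipeline summarized above (keyword-based selection of comments, Naive Bayes polarity, aggregation per application) can be illustrated with a minimal sketch. It is not the authors' implementation: the keyword lists in `CIAA_KEYWORDS`, the toy training set `TRAIN`, the `pos`/`neg` labels, and the aggregation rule (mean polarity per CIAA service, in the range -1 to 1) are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict

# Hypothetical keyword lists; the actual security vocabulary is not given here.
CIAA_KEYWORDS = {
    "confidentiality": {"privacy", "leak", "spy"},
    "integrity": {"tamper", "corrupt", "modified"},
    "availability": {"crash", "freeze", "unavailable"},
    "authentication": {"login", "password", "account"},
}

# Toy labelled comments (assumed), standing in for a real training corpus.
TRAIN = [
    ("safe login and good privacy", "pos"),
    ("secure account works great", "pos"),
    ("app crash and password leak", "neg"),
    ("terrible freeze lost my account", "neg"),
]


def tokenize(text):
    """Lowercase, split on whitespace, strip surrounding punctuation."""
    words = (w.strip(".,!?;:").lower() for w in text.split())
    return [w for w in words if w]


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, samples):
        self.word_counts = defaultdict(Counter)
        self.class_counts = Counter()
        self.vocab = set()
        for text, label in samples:
            self.class_counts[label] += 1
            for w in tokenize(text):
                self.word_counts[label][w] += 1
                self.vocab.add(w)
        return self

    def predict(self, text):
        total = sum(self.class_counts.values())
        best, best_score = None, -math.inf
        for label in sorted(self.class_counts):
            # Log prior plus the log likelihood of each known word.
            score = math.log(self.class_counts[label] / total)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for w in tokenize(text):
                if w in self.vocab:  # words unseen in training are ignored
                    score += math.log((self.word_counts[label][w] + 1) / denom)
            if score > best_score:
                best, best_score = label, score
        return best


def services_of(comment):
    """Return the CIAA services whose keywords appear in the comment."""
    words = set(tokenize(comment))
    return [s for s, kws in CIAA_KEYWORDS.items() if words & kws]


def reputation(comments, model):
    """Mean polarity (+1 / -1) per CIAA service over security-related comments."""
    votes = defaultdict(list)
    for comment in comments:
        services = services_of(comment)
        if not services:
            continue  # comment contains no security keyword
        polarity = 1 if model.predict(comment) == "pos" else -1
        for service in services:
            votes[service].append(polarity)
    return {s: sum(v) / len(v) for s, v in votes.items()}


if __name__ == "__main__":
    model = NaiveBayes().fit(TRAIN)
    app_comments = [
        "great secure login",
        "constant crash and freeze",
        "password leak everywhere",
        "nice colors",
    ]
    print(reputation(app_comments, model))
```

Scoring each CIAA service separately, rather than producing one number per application, mirrors the fine-grained analysis mentioned in the abstract: an application may score poorly overall while still scoring well on one service.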