Fingerprinting Keywords in Search Queries over Tor

2017 ◽  
Vol 2017 (4) ◽  
pp. 251-270 ◽  
Author(s):  
Se Eun Oh ◽  
Shuai Li ◽  
Nicholas Hopper

Abstract
Search engine queries contain a great deal of private and potentially compromising information about users. One technique to prevent search engines from identifying the source of a query, and Internet service providers (ISPs) from identifying the contents of queries, is to query the search engine over an anonymous network such as Tor. In this paper, we study the extent to which Website Fingerprinting can be extended to fingerprint individual queries or keywords to web applications, a task we call Keyword Fingerprinting (KF). We show that by augmenting traffic analysis using a two-stage approach with new task-specific feature sets, a passive network adversary can in many cases defeat the use of Tor to protect search engine queries. We explore three popular search engines, Google, Bing, and DuckDuckGo, and several machine learning techniques under various experimental scenarios. Our experimental results show that KF can identify Google queries containing one of 300 targeted keywords with 80% recall and 91% precision, while identifying the specific monitored keyword among 300 search keywords with 48% accuracy. We further investigate the factors that contribute to keyword fingerprintability to understand how search engines and users might protect against KF.
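The two-stage approach the abstract describes can be sketched in miniature: a first stage separates monitored traces from background traffic, and a second stage guesses which monitored keyword produced the trace. This is only an illustrative sketch; the feature vectors (toy packet-length statistics) and the nearest-centroid rule are stand-ins for the paper's actual feature sets and classifiers.

```python
def centroid(vectors):
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def dist(a, b):
    """Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

class TwoStageKF:
    """Toy two-stage keyword-fingerprinting classifier (nearest centroid)."""

    def fit(self, monitored, background):
        # monitored: {keyword: [feature vectors]}; background: [feature vectors]
        self.keyword_centroids = {k: centroid(v) for k, v in monitored.items()}
        self.background_centroid = centroid(background)
        self.monitored_centroid = centroid(
            [v for vs in monitored.values() for v in vs])

    def predict(self, trace):
        # Stage 1: is this trace closer to background or to monitored traffic?
        if dist(trace, self.background_centroid) < dist(trace, self.monitored_centroid):
            return None  # background traffic, not a monitored query
        # Stage 2: which monitored keyword is the nearest match?
        return min(self.keyword_centroids,
                   key=lambda k: dist(trace, self.keyword_centroids[k]))
```

For example, after fitting on two hypothetical keywords ("flu", "visa") and some background traces, `predict` returns the keyword for a monitored-looking trace and `None` for a background-looking one.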

Author(s):  
RajKishore Sahni

The upsurge in the volume of unwanted emails, called spam, has created an intense need for the development of more dependable and robust antispam filters. Machine learning methods have recently been used to successfully detect and filter spam emails. We present a systematic review of some of the popular machine learning based email spam filtering approaches. Our review covers a survey of the important concepts, attempts, efficiency, and research trends in spam filtering. The preliminary discussion in the study background examines the application of machine learning techniques to the email spam filtering processes of leading internet service providers (ISPs) such as Gmail, Yahoo, and Outlook. We discuss the general email spam filtering process and the various efforts by different researchers to combat spam through machine learning techniques. Our review compares the strengths and drawbacks of existing machine learning approaches and identifies open research problems in spam filtering. We recommend deep learning and deep adversarial learning as future techniques that can effectively handle the menace of spam emails.
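One of the classic machine learning approaches such reviews cover is the multinomial Naive Bayes filter. The following is a minimal from-scratch sketch with Laplace smoothing, using an invented toy training set rather than any real spam corpus:

```python
import math
from collections import Counter

class NaiveBayesSpamFilter:
    """Minimal multinomial Naive Bayes spam filter with Laplace smoothing."""

    def fit(self, emails, labels):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.doc_counts = Counter(labels)
        for text, label in zip(emails, labels):
            self.word_counts[label].update(text.lower().split())
        self.vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])

    def predict(self, text):
        scores = {}
        for label in ("spam", "ham"):
            total = sum(self.word_counts[label].values())
            # log prior from class frequencies
            score = math.log(self.doc_counts[label] / sum(self.doc_counts.values()))
            # add-one-smoothed log likelihood of each word
            for word in text.lower().split():
                score += math.log((self.word_counts[label][word] + 1) /
                                  (total + len(self.vocab)))
            scores[label] = score
        return max(scores, key=scores.get)
```

Trained on a handful of labelled messages, the filter assigns unseen text to whichever class gives it the higher smoothed log probability.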


10.2196/20995 ◽  
2020 ◽  
Vol 8 (9) ◽  
pp. e20995
Author(s):  
Debbie Rankin ◽  
Michaela Black ◽  
Bronac Flanagan ◽  
Catherine F Hughes ◽  
Adrian Moore ◽  
...  

Background Machine learning techniques, specifically classification algorithms, may be effective to help understand key health, nutritional, and environmental factors associated with cognitive function in aging populations. Objective This study aims to use classification techniques to identify the key patient predictors that are considered most important in the classification of poorer cognitive performance, which is an early risk factor for dementia. Methods Data were used from the Trinity-Ulster and Department of Agriculture study, which included detailed information on sociodemographic, clinical, biochemical, nutritional, and lifestyle factors in 5186 older adults recruited from the Republic of Ireland and Northern Ireland, a proportion of whom (987/5186, 19.03%) were followed up 5-7 years later for reassessment. Cognitive function at both time points was assessed using a battery of tests, including the Repeatable Battery for the Assessment of Neuropsychological Status (RBANS), with a score <70 classed as poorer cognitive performance. This study trained 3 classifiers—decision trees, Naïve Bayes, and random forests—to classify the RBANS score and to identify key health, nutritional, and environmental predictors of cognitive performance and cognitive decline over the follow-up period. It assessed their performance, noting the variables deemed most important by each optimized classifier. Results In the classification of a low RBANS score (<70), our models performed well (F1 score range 0.73-0.93), all highlighting the individual’s score from the Timed Up and Go (TUG) test, the age at which the participant stopped education, and whether or not the participant’s family reported memory concerns to be of key importance. 
The classification models performed well in classifying a greater rate of decline in the RBANS score (F1 score range 0.66-0.85), also indicating the TUG score to be of key importance, followed by blood indicators: plasma homocysteine, vitamin B6 biomarker (plasma pyridoxal-5-phosphate), and glycated hemoglobin. Conclusions The results suggest that it may be possible for a health care professional to make an initial evaluation, with a high level of confidence, of the potential for cognitive dysfunction using only a few short, noninvasive questions, thus providing a quick, efficient, and noninvasive way to help them decide whether or not a patient requires a full cognitive evaluation. This approach has the potential benefits of making time and cost savings for health service providers and avoiding stress created through unnecessary cognitive assessments in low-risk patients.
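The study's "few short, noninvasive questions" idea can be illustrated with a one-feature decision stump on a hypothetical TUG time, evaluated with the F1 score the paper reports. The 12-second threshold and the toy data below are invented for illustration, not taken from the study:

```python
def f1_score(y_true, y_pred, positive=1):
    """F1 = harmonic mean of precision and recall for the positive class."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return (2 * precision * recall / (precision + recall)
            if precision + recall else 0.0)

def stump_predict(tug_seconds, threshold=12.0):
    """1 = flag as at risk of poorer cognitive performance (RBANS < 70)."""
    return 1 if tug_seconds > threshold else 0
```

A stump like this is far cruder than the trained decision trees and random forests in the study, but it shows how a single quick measurement can feed a screening decision that is then scored against true outcomes.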


Author(s):  
Rizwan Ur Rahman ◽  
Rishu Verma ◽  
Himani Bansal ◽  
Deepak Singh Tomar

With the explosive expansion of information on the world wide web, search engines are becoming more significant in the day-to-day lives of humans. Even though a search engine generally returns a huge number of results for a given query, the majority of search engine users simply view the first few web pages in result lists. Consequently, ranking position has become a major concern of internet service providers. This article addresses the vulnerabilities, spamming attacks, and countermeasures in blogging sites. In the first part, the article explores the types of spamming and provides a detailed discussion of vulnerabilities. In the next part, an attack scenario for form spamming is presented, along with a defense approach. The aim of this article is thus to provide a review of the vulnerabilities and spamming threats associated with blogging websites, and of effective measures to counter them.
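The article's specific defense approach is not detailed here, but a common, widely used defense against automated form spam is a hidden "honeypot" field, combined with a submission-time check: humans never see the hidden field, while bots fill it in and submit almost instantly. A generic sketch (field names and thresholds are hypothetical):

```python
def is_form_spam(form_data, honeypot_field="website_url", min_seconds=2.0):
    """Flag a form submission as likely bot-generated spam.

    form_data is a dict of submitted fields, assumed to include
    'elapsed_seconds' (time between page load and submit).
    """
    # A hidden field that humans never see was filled in: almost certainly a bot.
    if form_data.get(honeypot_field):
        return True
    # Submitted faster than any human could plausibly fill the form.
    if form_data.get("elapsed_seconds", 0) < min_seconds:
        return True
    return False
```

Server-side checks like this are cheap and invisible to legitimate users, which is why they are often layered with CAPTCHAs and rate limiting rather than used alone.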


Author(s):  
Jaani Riordan

Internet intermediaries are essential features of modern commerce, social and political life, and the dissemination of ideas. Some act as conduits through which our transmissions pass; others are custodians of our personal data and gatekeepers of the world’s knowledge. They supply the infrastructure and tools which make electronic communications possible. These services encompass a vast ecosystem of different entities: internet service providers, website operators, hosts, data centres, social networks, media platforms, search engines, app developers, marketplaces, app stores, and others—many of which are household names.


Author(s):  
Lucas Logan

Intermediary liability is at the center of the debate over free expression, free speech, and an open Internet. The underlying policies form network regulation that governs the extent to which websites, search engines, and Internet service providers that host user content are legally responsible for what their users post or upload. Levels of intermediary liability are commonly categorized as providing broad immunity, limited liability, or strict liability. In the United States, intermediaries are given broad immunity through Section 230 of the Communications Decency Act. In practice, this means that search engines cannot be held liable for the speech of individuals appearing in search results, and a news site is not responsible for what people type in its comment section. Immunity is important to the existence of free expression because it ensures that intermediaries do not have incentives to censor content out of fear of the law. The millions of users continuously generating content through Facebook and YouTube, for instance, would not be able to do so if those intermediaries were fearful of legal consequences due to the actions of any given user. Privacy policy online is most evidently showcased by the European Union’s Right to Be Forgotten policy, which forces search engines to delist an individual’s information that is deemed harmful to reputation. Hateful and harmful speech is also regulated online through intermediary liability, although social media services often decide when and how to remove this type of content based on company policy.


Author(s):  
Leena N ◽  
K. K. Saju

Detection of nutritional deficiencies in plants is vital for improving crop productivity. Timely identification of nutrient deficiency through visual symptoms in the plants can help farmers take quick corrective action by appropriate nutrient management strategies. The application of computer vision and machine learning techniques offers new prospects in non-destructive field-based analysis for nutrient deficiency. Color and shape are important parameters in feature extraction. In this work, two different techniques are used for image segmentation and feature extraction to generate two different feature sets from the same image sets. These are then used for classification using different machine learning techniques. The experimental results are analyzed and compared in terms of classification accuracy to find the best algorithm for the two feature sets.
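The color-based half of such a pipeline can be sketched as a coarse RGB histogram feature followed by nearest-centroid classification. This is a hedged illustration only: pixels are toy RGB tuples standing in for segmented leaf regions, and the class names are hypothetical.

```python
def color_histogram(pixels, bins=4):
    """Coarse per-channel RGB histogram as a flat, normalised feature vector."""
    hist = [0.0] * (3 * bins)
    step = 256 / bins
    for r, g, b in pixels:
        hist[int(r // step)] += 1              # red channel bins
        hist[bins + int(g // step)] += 1       # green channel bins
        hist[2 * bins + int(b // step)] += 1   # blue channel bins
    total = len(pixels)
    return [h / total for h in hist]

def classify(feature, centroids):
    """Return the label of the nearest class centroid (squared distance)."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda label: dist(feature, centroids[label]))
```

With class centroids built from example "healthy" (mostly green) and "nitrogen-deficient" (yellowish) pixel sets, a new leaf region is labelled by whichever histogram it sits closest to; shape features would be appended to the same vector in a fuller pipeline.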

