The Impact of the Mode of Data Representation for the Result Quality of the Detection and Filtering of Spam

Author(s):  
Reda Mohamed Hamou ◽  
Abdelmalek Amine ◽  
Moulay Tahar

Spam has reached phenomenal proportions: it represents a high percentage of all email exchanged on the Internet. To fight spam, this article develops a hybrid algorithm that first uses a probabilistic model, in this case Naïve Bayes, to weight the terms of the term-category matrix, and then uses an unsupervised learning algorithm (K-means) to separate two classes, namely spam and ham (legitimate email). To determine the sensitive parameters that shape the classification, we study message content using language-independent word and character n-gram representations (because a message may be received in any language), in order to decide which representation yields a good classification. We chose several evaluation metrics to validate our results.
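As an illustration of the two message representations mentioned above, here is a minimal sketch of character and word n-gram extraction. The function names, the lowercasing, and the whitespace tokenization are assumptions for the example; the abstract does not specify the preprocessing that was used.

```python
from collections import Counter


def char_ngrams(text, n):
    """Count overlapping character n-grams; works for any language or script."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def word_ngrams(text, n):
    """Count overlapping word n-grams (assumed: lowercase, split on whitespace)."""
    tokens = text.lower().split()
    return Counter(" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


if __name__ == "__main__":
    print(char_ngrams("spam", 2))
    print(word_ngrams("Buy cheap pills now", 2))
```

Character n-grams need no tokenizer or stop-word list, which is why they are a common choice when the language of a message is not known in advance.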

2013 ◽  
Vol 3 (1) ◽  
pp. 43-59
Author(s):  
Reda Mohamed Hamou ◽  
Abdelmalek Amine

Spam has taken over the Internet in phenomenal proportions, since it represents a high percentage of all email exchanged on the Internet. To fight spam, the authors develop a hybrid algorithm that first uses a probabilistic model, in this case Naïve Bayes, to weight the terms of the term-category matrix, and then uses an unsupervised learning algorithm (K-means) to separate two classes, namely spam and ham. To determine the sensitive parameters that improve the classification, the authors study message content using language-independent word and character n-gram representations (because a message may be received in any language), in order to decide which representation to adopt for a good classification. The authors chose several evaluation metrics to validate their results.
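One way to picture the two-stage idea described above is the following minimal sketch. It is not the authors' implementation: the abstract does not say how the Naïve Bayes weights feed K-means, so the feature construction (mean log-probability per category), the Laplace smoothing constant `alpha`, the centroid seeding, and all function names are assumptions made for illustration.

```python
import math
from collections import Counter


def term_category_weights(docs, labels, alpha=1.0):
    """Laplace-smoothed P(term | category): one column of weights per category."""
    counts = {c: Counter() for c in set(labels)}
    for tokens, c in zip(docs, labels):
        counts[c].update(tokens)
    vocab = {t for cnt in counts.values() for t in cnt}
    weights = {}
    for c, cnt in counts.items():
        total = sum(cnt.values()) + alpha * len(vocab)
        weights[c] = {t: (cnt[t] + alpha) / total for t in vocab}
    return weights


def message_features(tokens, weights):
    """Assumed feature: mean log P(term | category), one value per category."""
    feats = []
    for c in sorted(weights):
        logs = [math.log(weights[c][t]) for t in tokens if t in weights[c]]
        feats.append(sum(logs) / len(logs) if logs else 0.0)
    return feats


def two_means(points, iters=20):
    """Plain K-means with k=2, seeded (by assumption) with the first and last points."""
    centroids = [list(points[0]), list(points[-1])]
    assign = [0] * len(points)
    for _ in range(iters):
        for i, p in enumerate(points):
            dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            assign[i] = dists.index(min(dists))
        for j in range(2):
            members = [p for p, a in zip(points, assign) if a == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return assign


if __name__ == "__main__":
    docs = [["free", "money", "now"], ["win", "free", "prize"],
            ["meeting", "at", "noon"], ["project", "meeting", "notes"]]
    labels = ["spam", "spam", "ham", "ham"]
    w = term_category_weights(docs, labels)
    print(two_means([message_features(d, w) for d in docs]))
```

The cluster indices returned by `two_means` carry no meaning on their own; mapping a cluster to "spam" or "ham" needs a separate step, for example a majority vote over labeled messages.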


Author(s):  
Reda Mohamed Hamou ◽  
Abdelmalek Amine

This chapter studies a boosting algorithm based, first, on Bayesian filters, which establish a correlation between the presence of certain elements in a message and whether those elements generally appear in unsolicited messages (spam) or in legitimate email (ham) in order to calculate the probability that the message is spam, and, second, on an unsupervised learning algorithm, in this case K-means. A probabilistic technique is used to weight the terms of the term-category matrix, and K-means is used to separate the two classes (spam and ham). To determine the sensitive parameters that improve the classification, the authors study message content using language-independent word and character n-gram representations, in order to decide which representation yields a good classification. The work was validated with several measures based on recall and precision.
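For reference, recall and precision for the spam class can be computed as in the short sketch below. The F1 score is included as one common measure built on the two; the abstract does not name the specific derived measures, so treating F1 as one of them is an assumption, as are the function name and the label strings.

```python
def precision_recall_f1(y_true, y_pred, positive="spam"):
    """Precision, recall and F1 for the class given by `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)  # spam caught
    fp = sum(t != positive and p == positive for t, p in pairs)  # ham flagged as spam
    fn = sum(t == positive and p != positive for t, p in pairs)  # spam missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    truth = ["spam", "spam", "ham", "ham", "spam"]
    guess = ["spam", "ham", "ham", "spam", "spam"]
    print(precision_recall_f1(truth, guess))
```

In spam filtering, a false positive (legitimate mail flagged as spam) is usually the costlier error, so precision on the spam class is often watched more closely than recall.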


Author(s):  
Patrícia Rossini ◽  
Jennifer Stromer-Galley

Political conversation is at the heart of democratic societies, and it is an important precursor of political engagement. As society has become intertwined with the communication infrastructure of the Internet, we need to understand its uses and the implications of those uses for democracy. This chapter provides an overview of the core topics of scholarly concern around online citizen deliberation, focusing on three key areas of research: the standards of quality of communication and the normative stance on citizen deliberation online; the impact and importance of digital platforms in structuring political talk; and the differences between formal and informal political talk spaces. After providing a critical review of these three major areas of research, we outline directions for future research on online citizen deliberation.


2021 ◽  
Author(s):  
Ilya Mishev ◽  
Ruslan Rin

Combining Perpendicular Bisector (PEBI) grids with the Two Point Flux Approximation (TPFA) scheme shows potential for accurate modeling on unstructured grids that conform to the geological and engineering features of real grids. However, as the complexity and resolution of the grids increase, the PEBI conditions will inevitably be violated in some cells and the approximation properties will be compromised. The objective is to develop accurate and practical grid quality measures that quantify such errors. We critically evaluated the existing grid quality measures and found them lacking predictive power in several areas. The available k-orthogonality measures predict error for flow along the strata, although TPFA provides an accurate approximation there. These false-positive results are not only misleading but can also overwhelm further analysis. We developed the so-called "truncation error" grid measure, which is probably the most accurate measure for flow through a plane face and accurately measures the error along the strata. We also quantified the error due to face curvature; curved faces are bound to exist in any real grid. The impact of the quality of the 2-D Delaunay triangulation on TPFA approximation properties is usually not taken into account. We investigate the impact of the size of the smallest angles, which can cause a considerable increase in the condition number of the matrix and an eventual loss of accuracy, and demonstrate it with simple examples. Based on the analysis, we provide recommendations. We also show how the size of the largest angles impacts the approximation quality of TPFA. Furthermore, we discuss the impact of changes in permeability on the TPFA approximation. Finally, we present simple tools that reservoir engineers can use to incorporate the above-mentioned grid quality measures into a workflow. The grid quality measures discussed up to this point are static. We also sketch a further extension to dynamic measures, that is, how the static measures can be used to detect changes in flow behavior that potentially lead to increased error. We investigate a comprehensive set of methods, several of them new, to measure the static grid quality of TPFA on PEBI grids, and a possible extension to dynamic measures. All measures can be easily implemented in production reservoir simulators and examined using the suggested tools in a workflow.
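The smallest and largest angles of the underlying 2-D triangulation, which the abstract ties to conditioning and approximation quality, are simple to inspect. The sketch below is only a generic geometric helper, not one of the paper's measures: the function names are hypothetical, and no threshold for a "too small" or "too large" angle is implied, since the abstract gives none.

```python
import math


def triangle_angles(a, b, c):
    """Interior angles (degrees) at vertices a, b, c of a 2-D triangle."""
    def angle_at(p, q, r):
        v1 = (q[0] - p[0], q[1] - p[1])
        v2 = (r[0] - p[0], r[1] - p[1])
        cross = v1[0] * v2[1] - v1[1] * v2[0]
        dot = v1[0] * v2[0] + v1[1] * v2[1]
        # atan2 on |cross| and dot stays well conditioned for very small angles
        return math.degrees(math.atan2(abs(cross), dot))
    return angle_at(a, b, c), angle_at(b, c, a), angle_at(c, a, b)


def angle_extremes(triangles):
    """Smallest and largest interior angle over a list of triangles."""
    angles = [ang for tri in triangles for ang in triangle_angles(*tri)]
    return min(angles), max(angles)


if __name__ == "__main__":
    mesh = [((0.0, 0.0), (1.0, 0.0), (0.0, 1.0)),
            ((1.0, 0.0), (1.0, 1.0), (0.0, 1.0))]
    print(angle_extremes(mesh))
```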


Author(s):  
Mahmoud Elkhodr ◽  
Seyed Shahrestani ◽  
Hon Cheung

The Internet of Things (IoT) brings connectivity to almost every object found in the physical space. It extends connectivity not only to computers and mobile devices but also to everyday objects. From connected fridges to cars and cities, the IoT creates opportunities in numerous domains. This chapter briefly surveys some IoT applications and the impact the IoT could have on societies. It shows how the various applications of the IoT enhance the overall quality of life and reduce management burden and costs in various sectors.


Author(s):  
David L. Scott

Outcomes evaluate the impact of disease. In rheumatology they span measures of disease activity, end-organ damage, and quality of life. Some outcomes are categorical, such as the presence or absence of remission. Other outcomes involve extended numeric scales such as joint counts, radiographic scores, and quality of life measures. Outcomes can be measured in the short term—weeks and months—or over years and decades. Short-term outcomes, though readily related to treatment, may have less relevance for patients. Clinical trials focus on short-term outcomes whereas observational studies explore longer-term outcomes. The matrix of rheumatic disease outcomes is exemplified by rheumatoid arthritis. Its outcomes span disease activity assessments like joint counts, damage assessed by erosive scores, quality of life evaluated by disease-specific measures like the Health Assessment Questionnaire (HAQ) or generic measures like the Short Form 36 (SF-36), overall assessments like remission, and end result such as joint replacement or death. Outcome measures capture the impact of treating rheumatic diseases. They are influenced by disease severity and effective treatment. They also reflect many confounding factors. These include demographic factors like age, gender, and ethnicity and also deprivation, as poverty worsens outcomes. Comorbidities affect outcomes and patients with multiple comorbid conditions have worse quality of life with poorer outcomes. Patient self-assessment has grown in importance; it is simple and understandable. However, self-assessment can vary over time and does not always reflect assessors’ perspectives. Caution is needed comparing outcomes across units; the various confounding factors and measurement complexities make such comparative analyses challenging.


Author(s):  
Izabella Lejbkowicz

The exponential development of information technologies has revolutionized healthcare. A significant aspect of this revolution is access to health information on the Internet. Internet World Stats estimates that 56.8% of the world population used the Internet in March 2019, an increase of 1,066% from 2000. According to a 2012 Pew Research Center survey, 81% of Americans used the Internet and 72% of them searched for health information. Although more recent data on the percentage of online health information seekers are lacking, it is clear that this trend is on the rise. This chapter focuses on the characteristics of the search for online health information by patients and providers, investigates features related to the quality of health websites, and discusses the impact of these searches on healthcare.


2021 ◽  
Vol 19 (34) ◽  
Author(s):  
Jelena Jevtić ◽  
Milan S. Dajić

Social networks are a way of creating a virtual identity and entering into relationships with strangers through kinds of interaction that were unknown before the Internet existed. Mobile phones and the virtual world often create a persona that differs from the person's real-world personality. It can be said that technology has changed the course of humanity and human consciousness and contributed to many changes in the mentality of society, especially among the youth. Children are often overwhelmed by materialism and jealousy, which further encourages them to become an unconscious, immoral and unambitious population. One of the negative effects of social networks is the abuse of privacy, which is a growing problem everywhere in the world and should not be ignored. However, a positive attitude toward social networks should be maintained, because they facilitate communication, access to information and learning, greater availability of services, and free advertising of some products or services. High school students use the Internet intensively every day, and this work raises the question of whether they use it constructively or destructively. The research was conducted in 2019 on the population of high school students; 100 students from Belgrade, Niš and Vitina were included.


2008 ◽  
Vol 55 (2) ◽  
pp. 233-248 ◽  
Author(s):  
Tassos Patokos

Since its early days, the Internet has been used by the music industry as a powerful marketing tool to promote artists and their products. Nevertheless, technology developments of the past ten years, and especially the ever-growing phenomenon of file sharing, have created the general impression that the Internet is responsible for a crisis within the industry, on the grounds that music piracy has become more serious than it has ever been. The purpose of this paper is to present the impact of new technologies and the Internet on the three main actors of the music industry: consumers, artists and record companies. It is claimed that the Internet has changed the way music is valued, and also, that it may have a direct effect on the quality of the music produced, as perceived by both artists and consumers alike.


Author(s):  
David L. Scott

Outcomes evaluate the impact of disease. In rheumatology they span measures of disease activity, end-organ damage, and quality of life. Some outcomes are categorical, such as the presence or absence of remission. Other outcomes involve extended numeric scales such as joint counts, radiographic scores, and quality of life measures. Outcomes can be measured in the short term—weeks and months—or over years and decades. Short-term outcomes, though readily related to treatment, may have less relevance for patients. Clinical trials focus on short-term outcomes whereas observational studies explore longer-term outcomes. The matrix of rheumatic disease outcomes is exemplified by rheumatoid arthritis. Its outcomes span disease activity assessments like joint counts, damage assessed by erosive scores, quality of life evaluated by disease-specific measures like the Health Assessment Questionnaire (HAQ) or generic measures like the Short Form 36 (SF-36), overall assessments like remission, and end result such as joint replacement or death. Outcome measures are used to capture the impact of treating rheumatic diseases, and are influenced by both disease severity and the effectiveness of treatment. However, they are also influenced by a range of confounding factors. Demographic factors like age, gender, and ethnicity can all have crucial impacts. Deprivation is important, as poverty invariably worsens outcomes. Finally, comorbidities affect outcomes and patients with multiple comorbid conditions usually have worse quality of life with poorer outcomes for all diseases. These multiple confounding factors mean comparing outcomes across units without adjustment will invariably show major differences.

