The Impact of the Mode of Data Representation for the Result Quality of the Detection and Filtering of Spam

Ontologies and Big Data Considerations for Effective Intelligence - Advances in Information Quality and Management ◽

10.4018/978-1-5225-2058-0.ch004 ◽

2017 ◽

pp. 150-168

Author(s):

Reda Mohamed Hamou ◽

Abdelmalek Amine ◽

Moulay Tahar

Keyword(s):

Unsupervised Learning ◽

Probabilistic Model ◽

Hybrid Algorithm ◽

Data Representation ◽

The Internet ◽

The Matrix ◽

N Gram ◽

Sensitive Parameters ◽

The Impact

Spam has reached phenomenal proportions, accounting for a high percentage of all email exchanged on the Internet. To fight it, in this article we develop a hybrid algorithm that first uses a probabilistic model, in this case Naïve Bayes, to weight the terms of the term-category matrix, and then uses an unsupervised learning algorithm (K-means) to separate messages into two classes, namely spam and ham (legitimate email). To determine the sensitive parameters that shape the classification, we study the content of the messages using language-independent word and character n-gram representations (because a message may be received in any language), and then decide which representation to use to get a good classification. We have chosen several evaluation metrics to validate our results.
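
The two message representations named in the abstract, character n-grams and word n-grams, can be illustrated with a minimal sketch. The function names and the lowercase/whitespace normalization are assumptions made for illustration; the abstract does not say how the authors preprocess messages.

```python
def char_ngrams(text, n):
    """Return overlapping character n-grams of a message.

    Assumption: text is lowercased and runs of whitespace are collapsed,
    which keeps the representation independent of any language-specific
    tokenizer.
    """
    normalized = " ".join(text.lower().split())
    return [normalized[i:i + n] for i in range(len(normalized) - n + 1)]


def word_ngrams(text, n):
    """Return overlapping word n-grams, splitting on whitespace only."""
    words = text.lower().split()
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


if __name__ == "__main__":
    message = "Win a free prize"
    print(char_ngrams(message, 3))
    print(word_ngrams(message, 2))
```

Character n-grams need no word segmentation, which is one reason they are often chosen when the language of a message is unknown.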


The Impact of the Mode of Data Representation for the Result Quality of the Detection and Filtering of Spam

International Journal of Information Retrieval Research ◽

10.4018/ijirr.2013010103 ◽

2013 ◽

Vol 3 (1) ◽

pp. 43-59

Author(s):

Reda Mohamed Hamou ◽

Abdelmalek Amine

Keyword(s):

Probabilistic Model ◽

Hybrid Algorithm ◽

Data Representation ◽

The Internet ◽

Good Classification ◽

The Matrix ◽

N Gram ◽

Sensitive Parameters ◽

The Impact

Spam has reached phenomenal proportions, since it represents a high percentage of all email exchanged on the Internet. To fight it, the authors develop in this article a hybrid algorithm that first uses a probabilistic model, in this case Naïve Bayes, to weight the terms of the term-category matrix, and then uses an unsupervised learning algorithm (K-means) to separate messages into two classes, namely spam and ham. To determine the sensitive parameters that improve the classification, the authors study the content of the messages using language-independent word and character n-gram representations (because a message may be received in any language), and then decide which representation to choose to get a good classification. The authors have chosen several evaluation metrics to validate their results.
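
The two-stage idea described above (probabilistic term weighting, then K-means with two clusters) can be sketched roughly as follows. This is not the authors' implementation: the Laplace smoothing, the mean log-weight message vector, the caller-supplied initial centroids, and every function name are assumptions chosen to keep the example small and deterministic.

```python
import math
from collections import Counter


def term_category_weights(docs, labels, alpha=1.0):
    """Laplace-smoothed estimate of P(term | category).

    Assumption: a small labeled corpus is available for the weighting step.
    """
    vocab = sorted({term for doc in docs for term in doc})
    counts = {label: Counter() for label in set(labels)}
    for doc, label in zip(docs, labels):
        counts[label].update(doc)
    weights = {}
    for label, counter in counts.items():
        total = sum(counter.values()) + alpha * len(vocab)
        weights[label] = {t: (counter[t] + alpha) / total for t in vocab}
    return weights


def message_vector(doc, weights, categories):
    """One mean log-weight per category; terms outside the vocabulary are skipped."""
    vector = []
    for category in categories:
        known = [t for t in doc if t in weights[category]]
        score = sum(math.log(weights[category][t]) for t in known)
        vector.append(score / len(known) if known else 0.0)
    return vector


def nearest(point, centroids):
    """Index of the centroid closest to point (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(point, centroids[i])),
    )


def kmeans(points, centroids, iterations=20):
    """Plain Lloyd iterations starting from caller-supplied centroids."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for point in points:
            clusters[nearest(point, centroids)].append(point)
        updated = [
            [sum(col) / len(cluster) for col in zip(*cluster)] if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        if updated == centroids:
            break
        centroids = updated
    return [nearest(p, centroids) for p in points], centroids
```

In use, each message (as a list of n-grams) would be mapped through `message_vector` and the resulting points passed to `kmeans` with two initial centroids, one per expected class.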


Using Data Mining Techniques and the Choice of Mode of Text Representation for Improving the Detection and Filtering of Spam

Advances in Business Information Systems and Analytics - Handbook of Research on Organizational Transformations through Big Data Analytics ◽

10.4018/978-1-4666-7272-7.ch018 ◽

2015 ◽

pp. 300-319

Author(s):

Reda Mohamed Hamou ◽

Abdelmalek Amine

Keyword(s):

Data Mining ◽

Unsupervised Learning ◽

Learning Algorithm ◽

Data Mining Techniques ◽

Probabilistic Technique ◽

The Matrix ◽

Boosting Algorithm ◽

N Gram ◽

Using Data ◽

Sensitive Parameters

This chapter studies a boosting algorithm based, first, on Bayesian filters, which calculate the probability that a message is spam by establishing a correlation between the presence of certain elements in the message and whether those elements generally appear in unsolicited messages (spam) or in legitimate email (ham), and, second, on an unsupervised learning algorithm, in this case K-means. A probabilistic technique is used to weight the terms of the term-category matrix, and K-means is used to separate the two classes (spam and ham). To determine the sensitive parameters that improve the classification, the authors study the content of the messages using language-independent word and character n-gram representations, and then decide which representation yields a good classification. The work was validated with several measures based on recall and precision.
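
As a minimal sketch of validation measures based on recall and precision, the function below computes both for the spam class. The abstract does not list the exact measures used; the F1 score is included here only as one common measure derived from precision and recall, and the function name and label strings are illustrative assumptions.

```python
def precision_recall_f1(true_labels, predicted_labels, positive="spam"):
    """Compute precision, recall and F1 for the positive class.

    Assumption: labels are plain strings and "spam" is the positive class.
    """
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1
```

Precision penalizes legitimate mail flagged as spam, while recall penalizes spam that slips through, so reporting both shows which kind of error a filter makes.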


Citizen Deliberation Online

The Oxford Handbook of Electoral Persuasion ◽

10.1093/oxfordhb/9780190860806.013.14 ◽

2019 ◽

pp. 689-712

Author(s):

Patrícia Rossini ◽

Jennifer Stromer-Galley

Keyword(s):

Political Engagement ◽

Future Research ◽

The Internet ◽

Digital Platforms ◽

The Core ◽

Political Conversation ◽

Political Talk ◽

Citizen Deliberation ◽

The Impact

Political conversation is at the heart of democratic societies, and it is an important precursor of political engagement. As society has become intertwined with the communication infrastructure of the Internet, we need to understand its uses and the implications of those uses for democracy. This chapter provides an overview of the core topics of scholarly concern around online citizen deliberation, focusing on three key areas of research: the standards of quality of communication and the normative stance on citizen deliberation online; the impact and importance of digital platforms in structuring political talk; and the differences between formal and informal political talk spaces. After providing a critical review of these three major areas of research, we outline directions for future research on online citizen deliberation.


Grid Quality Measures for PEBI Grids

10.2118/203961-ms ◽

2021 ◽

Author(s):

Ilya Mishev ◽

Ruslan Rin

Keyword(s):

Flow Behavior ◽

Quality Measures ◽

Approximation Properties ◽

Approximation Quality ◽

Flux Approximation ◽

The Matrix ◽

Eventual Loss ◽

Dynamic Measures ◽

The Impact

Combining the Perpendicular Bisector (PEBI) grids with the Two Point Flux Approximation (TPFA) scheme demonstrates a potential to accurately model on unstructured grids, conforming to the geological and engineering features of real grids. However, with the increased complexity and resolution of the grids, the PEBI conditions will inevitably be violated in some cells and the approximation properties will be compromised. The objective is to develop accurate and practical grid quality measures that quantify such errors. We critically evaluated the existing grid quality measures and found them lacking predictive power in several areas. The available k-orthogonality measures predict error for flow along the strata, although TPFA provides an accurate approximation. The false-positive results are not only misleading but can overwhelm further analysis. We developed the so-called "truncation error" grid measure which is probably the most accurate measure for flow through a plane face and accurately measures the error along the strata. We also quantified the error due to the face curvature. Curved faces are bound to exist in any real grid. The impact of the quality of the 2-D Delaunay triangulation on TPFA approximation properties is usually not taken into account. We investigate the impact of the size of the smallest angles that can cause considerable increase of the condition number of the matrix and an eventual loss of accuracy, demonstrated with simple examples. Based on the analysis, we provide recommendations. We also show how the size of the largest angles impacts the approximation quality of TPFA. Furthermore, we discuss the impact of the change of the permeability on the TPFA approximation. Finally, we present simple tools that reservoir engineers can use to incorporate the above-mentioned grid quality measures into a workflow. The grid quality measures discussed up to now are static. We also sketch the further extension to dynamic measures, that is, how the static measures can be used to detect change in the flow behavior, potentially leading to increased error. We investigate a comprehensive set of methods, several of them new, to measure the static grid quality of TPFA on PEBI grids and possible extension to dynamic measures. All measures can be easily implemented in production reservoir simulators and examined using the suggested tools in a workflow.
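
For readers unfamiliar with TPFA, the sketch below shows the textbook two-point flux between two neighboring cells, using a harmonic average of half-transmissibilities. It assumes a flat face orthogonal to the line joining the cell centers and a scalar (isotropic) permeability in each cell, which is the idealized situation in which the PEBI conditions hold. It does not implement any of the grid quality measures from the abstract, and all names and parameters are illustrative.

```python
def half_transmissibility(permeability, face_area, distance):
    """Half-transmissibility of one cell toward a shared face.

    Assumption: the face is orthogonal to the segment from the cell
    center to the face, and permeability is a scalar.
    """
    return permeability * face_area / distance


def tpfa_flux(p_i, p_j, k_i, k_j, face_area, d_i, d_j, viscosity=1.0):
    """Two-point flux from cell i to cell j across their shared face."""
    t_i = half_transmissibility(k_i, face_area, d_i)
    t_j = half_transmissibility(k_j, face_area, d_j)
    # Harmonic combination of the two half-transmissibilities.
    t_ij = t_i * t_j / (t_i + t_j)
    return t_ij * (p_i - p_j) / viscosity
```

Because the flux depends on only the two cell pressures, any departure from orthogonality or any face curvature goes unseen by the scheme itself, which is why separate grid quality measures are needed to estimate the resulting error.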


Internet of Things Applications

Innovative Research and Applications in Next-Generation High Performance Computing - Advances in Systems Analysis, Software Engineering, and High Performance Computing ◽

10.4018/978-1-5225-0287-6.ch016 ◽

2016 ◽

pp. 397-427 ◽

Cited By ~ 2

Author(s):

Mahmoud Elkhodr ◽

Seyed Shahrestani ◽

Hon Cheung

Keyword(s):

Quality Of Life ◽

Internet Of Things ◽

Mobile Devices ◽

Physical Space ◽

The Internet ◽

Everyday Objects ◽

Iot Applications ◽

The Impact ◽

The Internet Of Things

The Internet of Things (IoT) brings connectivity to almost every object found in the physical space. It extends connectivity not only to computers and mobile devices but also to everyday objects. From connected fridges to cars and cities, the IoT creates opportunities in numerous domains. This chapter briefly surveys some IoT applications and the impact the IoT could have on societies. It shows how the various applications of the IoT enhance the overall quality of life and reduce management and costs in various sectors.


Outcomes

Oxford Textbook of Rheumatology ◽

10.1093/med/9780199642489.003.0029_update_002 ◽

2013 ◽

pp. 221-226

Author(s):

David L. Scott

Keyword(s):

Quality Of Life ◽

Disease Activity ◽

Short Form ◽

Organ Damage ◽

Confounding Factors ◽

Short Term ◽

Self Assessment ◽

The Matrix ◽

The Impact

Outcomes evaluate the impact of disease. In rheumatology they span measures of disease activity, end-organ damage, and quality of life. Some outcomes are categorical, such as the presence or absence of remission. Other outcomes involve extended numeric scales such as joint counts, radiographic scores, and quality of life measures. Outcomes can be measured in the short term—weeks and months—or over years and decades. Short-term outcomes, though readily related to treatment, may have less relevance for patients. Clinical trials focus on short-term outcomes whereas observational studies explore longer-term outcomes. The matrix of rheumatic disease outcomes is exemplified by rheumatoid arthritis. Its outcomes span disease activity assessments like joint counts, damage assessed by erosive scores, quality of life evaluated by disease-specific measures like the Health Assessment Questionnaire (HAQ) or generic measures like the Short Form 36 (SF-36), overall assessments like remission, and end result such as joint replacement or death. Outcome measures capture the impact of treating rheumatic diseases. They are influenced by disease severity and effective treatment. They also reflect many confounding factors. These include demographic factors like age, gender, and ethnicity and also deprivation, as poverty worsens outcomes. Comorbidities affect outcomes and patients with multiple comorbid conditions have worse quality of life with poorer outcomes. Patient self-assessment has grown in importance; it is simple and understandable. However, self-assessment can vary over time and does not always reflect assessors’ perspectives. Caution is needed comparing outcomes across units; the various confounding factors and measurement complexities make such comparative analyses challenging.


Web-Based Information for Patients and Providers

Advances in Medical Technologies and Clinical Practice - Impacts of Information Technology on Patient Care and Empowerment ◽

10.4018/978-1-7998-0047-7.ch002 ◽

2020 ◽

pp. 19-33

Author(s):

Izabella Lejbkowicz

Keyword(s):

Health Information ◽

Web Sites ◽

Information Technologies ◽

The Internet ◽

Online Health Information ◽

World Population ◽

Web Based ◽

Access To Health ◽

The Impact

The exponential development of Information Technologies has revolutionized healthcare. A significant aspect of this revolution is access to health information on the Internet. Internet World Stats estimates that 56.8% of the world population used the Internet in March 2019, an increase of 1,066% from 2000. According to a 2012 Pew Research Center survey, 81% of Americans used the Internet and 72% of them searched for health information. Even though more recent data on the percentage of online health information seekers is lacking, it is clear that this trend is on the rise. This chapter focuses on the characteristics of the search for online health information by patients and providers, investigates features related to the quality of health web sites, and discusses the impact of these searches on healthcare.


ANALYSIS OF THE IMPACT OF SOCIAL NETWORKS ON THE QUALITY OF LIFE AND EDUCATION OF YOUNG PEOPLE

ACTA ECONOMICA ◽

10.7251/ace2134239j ◽

2021 ◽

Vol 19 (34) ◽

Author(s):

Jelena Jevtić ◽

Milan S. Dajić

Keyword(s):

High School ◽

Social Networks ◽

High School Students ◽

Virtual World ◽

The Internet ◽

Access To Information ◽

Negative Effects ◽

School Students ◽

The Impact

Social networks are a way of creating a virtual identity and entering into relationships with strangers through a series of interactions that were unknown before the existence of the Internet. Mobile phones and the virtual world often create a persona that differs from the person in the real world. It can be said that technology has changed the course of humanity and human consciousness and contributed to many changes in the mentality of society, especially among the youth. Children are often overwhelmed by materialism and jealousy, which further encourages them to become an unconscious, immoral and unambitious population. One of the negative effects of social networks is the abuse of privacy, which is becoming a growing problem everywhere in the world and should not be ignored. However, a positive attitude toward social networks should be maintained, because they facilitate communication, access to information and learning, greater availability of services, and free advertising of some products or services. High school students use the Internet intensively every day, and this work raises the question of whether they use it constructively or destructively. The research was conducted in 2019 on the population of high school students and included 100 students from Belgrade, Niš and Vitina.


A new era for the music industry: How new technologies and the internet affect the way music is valued and have an impact on output quality

Panoeconomicus ◽

10.2298/pan0802233p ◽

2008 ◽

Vol 55 (2) ◽

pp. 233-248 ◽

Cited By ~ 2

Author(s):

Tassos Patokos

Keyword(s):

New Technologies ◽

Music Industry ◽

The Internet ◽

New Era ◽

The Past ◽

Output Quality ◽

Marketing Tool ◽

The Impact ◽

The Way

Since its early days, the Internet has been used by the music industry as a powerful marketing tool to promote artists and their products. Nevertheless, technology developments of the past ten years, and especially the ever-growing phenomenon of file sharing, have created the general impression that the Internet is responsible for a crisis within the industry, on the grounds that music piracy has become more serious than it has ever been. The purpose of this paper is to present the impact of new technologies and the Internet on the three main actors of the music industry: consumers, artists and record companies. It is claimed that the Internet has changed the way music is valued, and also, that it may have a direct effect on the quality of the music produced, as perceived by both artists and consumers alike.


Outcomes

10.1093/med/9780199642489.003.0029_update_001 ◽

2016 ◽

Author(s):

David L. Scott

Keyword(s):

Quality Of Life ◽

Disease Activity ◽

Short Form ◽

Health Assessment ◽

Organ Damage ◽

Confounding Factors ◽

Short Term ◽

The Matrix ◽

The Impact

Outcomes evaluate the impact of disease. In rheumatology they span measures of disease activity, end-organ damage, and quality of life. Some outcomes are categorical, such as the presence or absence of remission. Other outcomes involve extended numeric scales such as joint counts, radiographic scores, and quality of life measures. Outcomes can be measured in the short term—weeks and months—or over years and decades. Short-term outcomes, though readily related to treatment, may have less relevance for patients. Clinical trials focus on short-term outcomes whereas observational studies explore longer-term outcomes. The matrix of rheumatic disease outcomes is exemplified by rheumatoid arthritis. Its outcomes span disease activity assessments like joint counts, damage assessed by erosive scores, quality of life evaluated by disease-specific measures like the Health Assessment Questionnaire (HAQ) or generic measures like the Short Form 36 (SF-36), overall assessments like remission, and end result such as joint replacement or death. Outcome measures are used to capture the impact of treating rheumatic diseases, and are influenced by both disease severity and the effectiveness of treatment. However, they are also influenced by a range of confounding factors. Demographic factors like age, gender, and ethnicity can all have crucial impacts. Deprivation is important, as poverty invariably worsens outcomes. Finally, comorbidities affect outcomes and patients with multiple comorbid conditions usually have worse quality of life with poorer outcomes for all diseases. These multiple confounding factors mean comparing outcomes across units without adjustment will invariably show major differences.
