Majority Vote method for preferences detection: Application for Social Networks

Social networks play an important role in today’s society and in our relationships with others. They give Internet users the opportunity to play an active role: a user can relay information via a blog post, a comment, or even a vote, and can share any content at any time. However, some malicious users take advantage of this freedom to share fake news in order to manipulate or mislead an audience, to invade the privacy of others, or to harm certain institutions. Fake news seeks to resemble traditional media to establish its credibility with the public, and its apparent seriousness pushes the public to share it. As a result, fake news can spread quickly and cause enormous difficulties for users and institutions. Several authors have proposed systems that detect fake news in social networks using crowd signals gathered through crowdsourcing. Unfortunately, these systems do not combine the expertise of the crowd with the expertise of a third party when making decisions. Crowds are useful in indicating whether or not a story should be fact-checked. This work proposes a new method for the binary aggregation of crowd opinions and the knowledge of a third-party expert. The aggregator is based on majority voting on the crowd side and weighted averaging on the third-party side. An experiment was conducted on 25 posts with 50 voters. A quantitative comparison with the majority vote model reveals that our aggregation model provides slightly better results, owing to the weights assigned to accredited users. A qualitative comparison against existing aggregation models shows that the proposed approach meets the properties expected of a crowdsourcing system and a voting system.
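As a rough illustration of the aggregation idea, the sketch below pairs a crowd majority vote with a weighted average of expert votes. The abstract does not say how the two sides are combined, so the function names, the mixing parameter `alpha`, and the 0.5 decision threshold are all assumptions made for this example, not the authors' method.

```python
def crowd_majority(votes):
    """Return 1 if strictly more than half of the binary crowd votes are 1, else 0."""
    return 1 if sum(votes) * 2 > len(votes) else 0


def expert_weighted_average(votes, weights):
    """Weighted mean of binary expert votes; weights are assumed to be positive."""
    total = sum(weights)
    return sum(v * w for v, w in zip(votes, weights)) / total


def aggregate(crowd_votes, expert_votes, expert_weights, alpha=0.5):
    """Hypothetical combination rule: blend both sides, then threshold at 0.5.

    alpha is an assumed mixing weight for the expert side (not from the source).
    """
    crowd = crowd_majority(crowd_votes)
    expert = expert_weighted_average(expert_votes, expert_weights)
    score = alpha * expert + (1 - alpha) * crowd
    return 1 if score > 0.5 else 0
```

With `alpha` close to 1 the expert side dominates the decision; with `alpha` close to 0 the result reduces to the plain crowd majority.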


Label Noise Cleaning with an Adaptive Ensemble Method Based on Noise Detection Metric

Sensors ◽

10.3390/s20236718 ◽

2020 ◽

Vol 20 (23) ◽

pp. 6718

Author(s):

Wei Feng ◽

Yinghui Quan ◽

Gabriel Dauphin

Keyword(s):

Majority Vote ◽

Ensemble Method ◽

Validation Dataset ◽

Ensemble Classifiers ◽

Noise Detection ◽

K Nearest Neighbor ◽

Training Set ◽

Label Noise ◽

Vote Method ◽

Real World Datasets

Real-world datasets are often contaminated with label noise; labeling is not a clear-cut process, and reliable methods tend to be expensive or time-consuming. Depending on the learning technique used, such label noise is potentially harmful: it requires a larger training set, makes the trained model more complex and more prone to overfitting, and yields less accurate predictions. This work proposes a cleaning technique called the ensemble method based on the noise detection metric (ENDM). From the corrupted training set, an ensemble classifier is first learned and used to derive four metrics assessing the likelihood that a sample is mislabeled. For each metric, three thresholds are set to maximize the classification performance on a corrupted validation dataset when using three different ensemble classifiers, namely Bagging, AdaBoost and k-nearest neighbor (k-NN). These thresholds are used to identify and then either remove or correct the corrupted samples. The effectiveness of the ENDM is demonstrated on the classification of 15 public datasets. A comparative analysis is conducted against the homogeneous-ensembles-based majority vote method and consensus vote method, two popular ensemble-based label noise filters.
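The two baseline filters named at the end of the abstract can be sketched as follows. This is a minimal illustration of majority vote and consensus vote filtering in general, not of the ENDM itself; the function name `flag_noisy` and the input layout are assumptions, and the predictions are assumed to come from classifiers that did not train on the sample being judged (for example via cross-validation).

```python
def flag_noisy(labels, predictions, rule="majority"):
    """Return indices of samples whose given label the ensemble disputes.

    labels: given (possibly noisy) labels, one per sample.
    predictions: one list of predicted labels per classifier, aligned with labels.
    rule: "majority" flags a sample when more than half of the classifiers
          disagree with its label; "consensus" requires all of them to disagree.
    """
    n_classifiers = len(predictions)
    flagged = []
    for i, label in enumerate(labels):
        # Count the classifiers whose prediction contradicts the given label.
        errors = sum(1 for preds in predictions if preds[i] != label)
        if rule == "majority":
            noisy = errors * 2 > n_classifiers
        elif rule == "consensus":
            noisy = errors == n_classifiers
        else:
            raise ValueError("rule must be 'majority' or 'consensus'")
        if noisy:
            flagged.append(i)
    return flagged
```

The consensus rule is the more conservative of the two: it flags fewer samples, so it removes less clean data but also leaves more noisy labels in place.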


Majority Vote in Social Networks: Make Random Friends or Be Stubborn to Overpower Elites

Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2021/49 ◽

2021 ◽

Author(s):

Charlotte Out ◽

Ahad N. Zehmakan

Keyword(s):

Social Networks ◽

Real World ◽

Majority Vote ◽

Graph Models ◽

Stabilization Time ◽

The Real ◽

Random Graph Models ◽

Small Set ◽

High Degree ◽

Majority Model

Consider a graph G representing a social network. Assume that initially each node is colored either black or white, which corresponds to a positive or negative opinion regarding a consumer product or a technological innovation. In the majority model, in each round all nodes simultaneously update their color to the most frequent color among their connections. Experiments on graph data from real-world social networks (SNs) suggest that if all nodes in an extremely small set of high-degree nodes, often referred to as the elites, agree on a color, that color becomes the dominant color at the end of the process. We propose two countermeasures that can be adopted by individual nodes relatively easily and that guarantee the elites will not have this disproportionate power to engineer the dominant output color. The first countermeasure essentially requires each node to make some new connections at random, while the second requires the nodes to be more reluctant to change their color (opinion). We verify their effectiveness and correctness both theoretically and experimentally. We also investigate the majority model and a variant of it when the initial coloring is random, on the real-world SNs and on several random graph models. In particular, our results on Erdős-Rényi and regular random graphs confirm or support several theoretical findings or conjectures from prior work regarding the threshold behavior of the process. Finally, we provide theoretical and experimental evidence for the existence of a poly-logarithmic bound on the expected stabilization time of the majority model.
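The synchronous update rule of the majority model can be sketched in a few lines. The adjacency-dictionary representation, the function names, and the tie-breaking rule (a node keeps its current color on a tie) are assumptions for this example; the abstract does not state how ties are handled.

```python
def majority_step(adjacency, colors):
    """One synchronous round: each node adopts its neighbors' most frequent color.

    adjacency: dict mapping each node to a list of its neighbors.
    colors: dict mapping each node to "black" or "white".
    Ties keep the node's current color (an assumption of this sketch).
    """
    new_colors = {}
    for node, neighbors in adjacency.items():
        black = sum(1 for n in neighbors if colors[n] == "black")
        white = len(neighbors) - black
        if black > white:
            new_colors[node] = "black"
        elif white > black:
            new_colors[node] = "white"
        else:
            new_colors[node] = colors[node]
    return new_colors


def run_majority(adjacency, colors, max_rounds=100):
    """Iterate until the coloring stops changing or max_rounds is reached.

    Returns the final coloring and the number of rounds that changed it.
    """
    for rounds in range(max_rounds):
        next_colors = majority_step(adjacency, colors)
        if next_colors == colors:
            return colors, rounds
        colors = next_colors
    return colors, max_rounds
```

The `max_rounds` cap matters because the synchronous process need not reach a fixed point: on a star graph whose center disagrees with all of its leaves, the center and the leaves swap colors every round.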


The Impact of Cross-Species Gene Flow on Species Tree Estimation

Systematic Biology ◽

10.1093/sysbio/syaa001 ◽

2020 ◽

Vol 69 (5) ◽

pp. 830-847 ◽

Cited By ~ 1

Author(s):

Xiyun Jiao ◽

Tomáš Flouri ◽

Bruce Rannala ◽

Ziheng Yang

Keyword(s):

Gene Flow ◽

Sequence Data ◽

Majority Vote ◽

Gene Tree ◽

Species Tree ◽

Likelihood Method ◽

Estimation Methods ◽

Vote Method ◽

Multispecies Coalescent ◽

Tree Estimation

Recent analyses of genomic sequence data suggest cross-species gene flow is common in both plants and animals, posing challenges to species tree estimation. We examine the levels of gene flow needed to mislead species tree estimation with three species and either episodic introgressive hybridization or continuous migration between an outgroup and one ingroup species. Several species tree estimation methods are examined, including the majority-vote method based on the most common gene tree topology (with either the true or reconstructed gene trees used), the UPGMA method based on the average sequence distances (or average coalescent times) between species, and the full-likelihood method based on multilocus sequence data. Our results suggest that the majority-vote method based on gene tree topologies is more robust to gene flow than the UPGMA method based on coalescent times and both are more robust than likelihood assuming a multispecies coalescent (MSC) model with no cross-species gene flow. Comparison of the continuous migration model with the episodic introgression model suggests that a small amount of gene flow per generation can cause drastic changes to the genetic history of the species and mislead species tree methods, especially if the species diverged through radiative speciation events. Estimates of parameters under the MSC with gene flow suggest that African mosquito species in the Anopheles gambiae species complex constitute such an example of extreme impact of gene flow on species phylogeny. [IM; introgression; migration; MSci; multispecies coalescent; species tree.]
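The majority-vote estimator described above amounts to picking the most common gene tree topology across loci. The sketch below assumes each gene tree has already been reduced to a canonical topology string such as `((A,B),C)`; the function name and this string encoding are illustrative choices, not taken from the paper.

```python
from collections import Counter


def majority_vote_species_tree(gene_tree_topologies):
    """Return the most common topology and the fraction of loci supporting it.

    gene_tree_topologies: one canonical topology string per locus (assumed
    encoding). On a tie, the topology that appears first in the input wins,
    which is an arbitrary choice of this sketch.
    """
    counts = Counter(gene_tree_topologies)
    topology, count = counts.most_common(1)[0]
    return topology, count / len(gene_tree_topologies)
```

With three species there are only three rooted topologies, so the estimate is misled whenever gene flow makes a topology other than the species tree the most frequent one.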


Malicious URL Detection Using Majority Vote Method with Machine Learning and Deep Learning Models

2020 International Conference on Interdisciplinary Cyber Physical Systems (ICPS) ◽

10.1109/icps51508.2020.00013 ◽

2020 ◽

Author(s):

A.C Rakotoasimbahoaka ◽

I. Randria ◽

N.R Razafindrakoto

Keyword(s):

Machine Learning ◽

Deep Learning ◽

Majority Vote ◽

Learning Models ◽

Vote Method


The Impact of Cross-Species Gene Flow on Species Tree Estimation

10.1101/820019 ◽

2019 ◽

Cited By ~ 1

Author(s):

Xiyun Jiao ◽

Tomáš Flouri ◽

Bruce Rannala ◽

Ziheng Yang

Keyword(s):

Gene Flow ◽

Sequence Data ◽

Majority Vote ◽

Gene Tree ◽

Species Tree ◽

Likelihood Method ◽

Estimation Methods ◽

Vote Method ◽

Tree Estimation ◽

The Impact

Recent analyses of genomic sequence data suggest cross-species gene flow is common in both plants and animals, posing challenges to species tree inference. We examine the levels of gene flow needed to mislead species tree estimation with three species and either episodic introgressive hybridization or continuous migration between an outgroup and one ingroup species. Several species tree estimation methods are examined, including the majority-vote method based on the most common gene tree topology (with either the true or reconstructed gene trees used), the UPGMA method based on the average sequence distances (or average coalescent times) between species, and the full-likelihood method based on multi-locus sequence data. Our results suggest that the majority-vote method is more robust to gene flow than the UPGMA method and both are more robust than likelihood assuming a multispecies coalescent (MSC) model with no cross-species gene flow. A small amount of introgression or migration can mislead species tree methods if the species diverged through speciation events separated by short time intervals. Estimates of parameters under the MSC with gene flow suggest the Anopheles gambiae African mosquito species complex is an example where gene flow greatly impacts species phylogeny.
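For the three-species case, the UPGMA estimator based on average sequence distances reduces to joining the pair of species with the smallest distance averaged over loci. The sketch below assumes species labels `A`, `B`, `C` and one dictionary of pairwise distances per locus; these names and the data layout are hypothetical.

```python
def upgma_three_species(per_locus_distances):
    """Infer a rooted three-species tree from distances averaged over loci.

    per_locus_distances: list of dicts, one per locus, each mapping the pairs
    ("A", "B"), ("A", "C") and ("B", "C") to a sequence distance (assumed layout).
    Returns the topology string and the averaged distances.
    """
    pairs = [("A", "B"), ("A", "C"), ("B", "C")]
    n_loci = len(per_locus_distances)
    average = {
        pair: sum(locus[pair] for locus in per_locus_distances) / n_loci
        for pair in pairs
    }
    # UPGMA joins the closest pair first; with three species this single
    # join already fixes the rooted topology.
    x, y = min(pairs, key=lambda pair: average[pair])
    (z,) = {"A", "B", "C"} - {x, y}
    return f"(({x},{y}),{z})", average
```

Because the distances are averaged before the tree is built, a single locus with an unusually short distance between two species can outweigh several loci that favor a different pairing, unlike in a vote over per-locus topologies.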
