Classification of unlabeled online media

2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Sakthi Kumar Arul Prakash ◽  
Conrad Tucker

Abstract: This work investigates the ability to classify misinformation in online social media networks in a manner that avoids the need for ground truth labels. Rather than approach the classification problem as a task for humans or machine learning algorithms, this work leverages user–user and user–media (i.e., media likes) interactions to infer the type of information (fake vs. authentic) being spread, without needing to know the actual details of the information itself. To study the inception and evolution of user–user and user–media interactions over time, we create an experimental platform that mimics the functionality of real-world social media networks. We develop a graphical model that considers the evolution of this network topology to model the uncertainty (entropy) propagation when fake and authentic media disseminate across the network. The creation of a real-world social media network enables a wide range of hypotheses to be tested pertaining to users, their interactions with other users, and their interactions with media content. The discovery that the entropy of user–user and user–media interactions approximates fake and authentic media likes enables us to classify fake media in an unsupervised manner.
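As a rough illustration of the uncertainty (entropy) measure mentioned in the abstract above, the following sketch computes the Shannon entropy of how likes are distributed across media items. It is not the authors' graphical model; the like log, the user and media identifiers, and the function name `shannon_entropy` are all assumptions made for this example.

```python
import math
from collections import Counter


def shannon_entropy(counts):
    """Shannon entropy, in bits, of a discrete distribution given as raw counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    # Zero counts contribute nothing to the sum, so they are skipped.
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


# Hypothetical like log of (user, media) pairs; not data from the paper.
likes = [
    ("u1", "m1"), ("u2", "m1"), ("u3", "m1"), ("u4", "m1"),
    ("u1", "m2"), ("u2", "m2"),
    ("u3", "m3"),
    ("u4", "m4"),
]

# Count how many likes each media item received: m1=4, m2=2, m3=1, m4=1.
likes_per_media = Counter(media for _, media in likes)

# Probabilities 1/2, 1/4, 1/8, 1/8 give an entropy of 1.75 bits.
entropy_bits = shannon_entropy(likes_per_media.values())
print(f"entropy of likes across media: {entropy_bits:.2f} bits")
```

Likes concentrated on a single item would give an entropy of 0 bits, while likes spread evenly over four items would give 2 bits, so the value summarizes how dispersed the interactions are.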


Sensors ◽  
2020 ◽  
Vol 20 (13) ◽  
pp. 3784 ◽  
Author(s):  
Morteza Homayounfar ◽  
Amirhossein Malekijoo ◽  
Aku Visuri ◽  
Chelsea Dobbins ◽  
Ella Peltonen ◽  
...  

Smartwatch battery limitations are one of the biggest hurdles to their acceptability in the consumer market. To our knowledge, despite promising studies analyzing smartwatch battery data, there has been little research that has analyzed the battery usage of a diverse set of smartwatches in a real-world setting. To address this challenge, this paper utilizes a smartwatch dataset collected from 832 real-world users, including different smartwatch brands and geographic locations. First, we employ clustering to identify common patterns of smartwatch battery utilization; second, we introduce a transparent low-parameter convolutional neural network model, which allows us to identify the latent patterns of smartwatch battery utilization. Our model converts the battery consumption rate into a binary classification problem; i.e., low and high consumption. Our model has 85.3% accuracy in predicting high battery discharge events, outperforming other machine learning algorithms that have been used in state-of-the-art research. Besides this, it can be used to extract information from filters of our deep learning model, based on learned filters of the feature extractor, which is impossible for other models. Third, we introduce an indexing method that includes a longitudinal study to quantify smartwatch battery quality changes over time. Our novel findings can assist device manufacturers, vendors and application developers, as well as end-users, to improve smartwatch battery utilization.


Proceedings ◽  
2019 ◽  
Vol 19 (1) ◽  
pp. 20
Author(s):  
Diego Pacheco Prado ◽  
Luis Ángel Ruiz

GEOBIA (geographic object-based image analysis) is an alternative approach for creating and updating land cover maps. In this work we assessed combinations of geographic datasets of the Cajas National Park (Ecuador) to determine the most appropriate dataset-algorithm combination for classification tasks in the Ecuadorian Andean region. The datasets included high-resolution data such as a photogrammetric orthomosaic, a DEM and the derived slope. These data were compared with free Sentinel imagery for classifying natural land covers. We evaluated two aspects of the classification problem: the appropriate algorithm and the dataset combination. We evaluated the SMO, C4.5 and Random Forest algorithms for attribute selection and object classification. In the comparison of classification algorithms, the best kappa values were obtained with SMO (0.8182) and Random Forest (0.8117). In the evaluation of datasets, the photogrammetric orthomosaic and the combination of Sentinel 1 and 2 yielded similar kappa values using the C4.5 algorithm.
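The kappa values reported above are presumably Cohen's kappa, which corrects observed agreement for the agreement expected by chance. The sketch below shows how that statistic is computed from a confusion matrix; the matrix is a made-up two-class example, not data from the study, and the function name `cohens_kappa` is chosen for illustration.

```python
def cohens_kappa(confusion):
    """Cohen's kappa from a square confusion matrix.

    Rows are reference classes, columns are predicted classes.
    """
    k = len(confusion)
    n = sum(sum(row) for row in confusion)
    # Observed agreement: fraction of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(k)) / n
    # Chance agreement: product of matching row and column marginals.
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(k)) for j in range(k)]
    expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical two-class confusion matrix (100 objects in total).
confusion = [
    [45, 5],
    [10, 40],
]

# Observed agreement 0.85, chance agreement 0.50, so kappa = 0.35 / 0.50 = 0.70.
kappa = cohens_kappa(confusion)
print(f"kappa = {kappa:.4f}")
```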


Drones ◽  
2021 ◽  
Vol 5 (4) ◽  
pp. 104
Author(s):  
Zaide Duran ◽  
Kubra Ozcan ◽  
Muhammed Enes Atik

With the development of photogrammetry technologies, point clouds have found a wide range of uses in academic and commercial areas. This has made it essential to extract information from point clouds. In particular, artificial intelligence applications have been used to extract information from point clouds of complex structures. Point cloud classification is one of the leading areas where these applications are used. In this study, point clouds of the same region obtained by aerial photogrammetry and by Light Detection and Ranging (LiDAR) technology are classified using machine learning. For this purpose, nine popular machine learning methods were used. Geometric features obtained from the point clouds formed the feature spaces created for classification; color information was added to these for the photogrammetric point cloud. For the LiDAR point cloud, the highest overall accuracy (0.96) was obtained with the Multilayer Perceptron (MLP) method and the lowest (0.50) with the AdaBoost method. For the photogrammetric point cloud, the highest overall accuracy was achieved with the MLP method (0.90) and the lowest with the GNB method (0.25).


2020 ◽  
Author(s):  
Vincent Prost ◽  
Stéphane Gazut ◽  
Thomas Brüls

Abstract: The advent of high-throughput metagenomic sequencing has prompted the development of efficient taxonomic profiling methods that make it possible to measure the presence, abundance and phylogeny of organisms in a wide range of environmental samples. Multivariate sequence-derived abundance data further has the potential to enable inference of ecological associations between microbial populations, but several technical issues need to be accounted for, like the compositional nature of the data, its extreme sparsity and overdispersion, as well as the frequent need to operate in under-determined regimes. The ecological network reconstruction problem is frequently cast into the paradigm of Gaussian Graphical Models (GGMs) for which efficient structure inference algorithms are available, like the graphical lasso and neighborhood selection. Unfortunately, GGMs or variants thereof cannot properly account for the extremely sparse patterns occurring in real-world metagenomic taxonomic profiles. In particular, structural zeros (as opposed to sampling zeros) corresponding to true absences of biological signals fail to be properly handled by most statistical methods. We present here a zero-inflated log-normal graphical model (available at https://github.com/vincentprost/Zi-LN) specifically aimed at handling such “biological” zeros, and demonstrate significant performance gains over state-of-the-art statistical methods for the inference of microbial association networks, with most notable gains obtained when analyzing taxonomic profiles displaying sparsity levels on par with real-world metagenomic datasets.
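To make the idea of a zero-inflated log-normal distribution concrete, the sketch below draws abundances for a single taxon: with some probability the value is a structural zero, otherwise it is log-normally distributed. This is only a univariate illustration under assumed parameters, not the authors' graphical model or the code in the Zi-LN repository; the function name and all parameter values are hypothetical.

```python
import random


def sample_zero_inflated_lognormal(n, zero_prob, mu, sigma, seed=0):
    """Draw n values that are 0 with probability zero_prob, else log-normal(mu, sigma)."""
    rng = random.Random(seed)  # fixed seed so the example is reproducible
    samples = []
    for _ in range(n):
        if rng.random() < zero_prob:
            # Structural zero: the taxon is truly absent from this sample.
            samples.append(0.0)
        else:
            # Taxon present: abundance follows a log-normal distribution.
            samples.append(rng.lognormvariate(mu, sigma))
    return samples


# Hypothetical parameters: 70% structural zeros, log-abundance ~ N(0, 1).
abundances = sample_zero_inflated_lognormal(n=2000, zero_prob=0.7, mu=0.0, sigma=1.0)
zero_fraction = sum(1 for x in abundances if x == 0.0) / len(abundances)
print(f"fraction of zeros: {zero_fraction:.3f}")
```

A plain Gaussian model fitted to such data would treat the large spike at zero as ordinary low values, which is the mismatch the abstract describes.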


2016 ◽  
Vol 12 (S325) ◽  
pp. 173-179 ◽  
Author(s):  
Qi Feng ◽  
Tony T. Y. Lin ◽  

Abstract: Imaging atmospheric Cherenkov telescopes (IACTs) are sensitive to rare gamma-ray photons, buried in the background of charged cosmic-ray (CR) particles, the flux of which is several orders of magnitude greater. The ability to separate gamma rays from CR particles is important, as it is directly related to the sensitivity of the instrument. This gamma-ray/CR-particle classification problem in IACT data analysis can be treated with rapidly advancing machine learning algorithms, which have the potential to outperform the traditional box-cut methods on image parameters. We present preliminary results of a precise classification of a small set of muon events using a convolutional neural network model with the raw images as input features. We also show the possibility of using the convolutional neural network model for regression problems, such as the radius and brightness measurement of muon events, which can be used to calibrate the throughput efficiency of IACTs.


2021 ◽  
Vol 35 (2) ◽  
pp. 139-144
Author(s):  
Ashok Kumar Nanduri ◽  
G.L. Sravanthi ◽  
K.V.K.V.L. Pavan Kumar ◽  
Sadhu Ratna Babu ◽  
K.V.S.S. Rama Krishna

The extensive use of online media and data sharing has brought considerable benefits to humankind. Sentiment analysis has become one of the most active and popular application areas, used mainly to gauge public opinion. Machine learning algorithms are the principal methods for sentiment analysis. Although several classification methods are available, they assign each instance to a single class out of several; methods for classifying multi-class instances are lacking. Fuzzy methods are therefore used to classify instances that belong to multiple classes, giving a clearer view by assigning suitable labels to objects during text classification. This paper addresses the categorization of cyberhate information. A growth in hate speech on online social networks may harm social activity and cause tensions between communities and regions, so there is strong demand for automatic detection of cyberhate conversations in online social media. A modified fuzzy approach is designed with two stages of training to classify cyberhate conversations into four types: race, disability, sexual orientation, and religion. Experiments were conducted on these four types using conversations gathered from online media. Rule-based fuzzy systems are used, with features extracted by machine learning techniques such as word embeddings and bag-of-words. The cyberhate conversations are taken from online social networks (OSNs) according to the attributes defined in a dataset, using the rule-based fuzzy approach.


2021 ◽  
Vol 17 (6) ◽  
pp. e1009089
Author(s):  
Vincent Prost ◽  
Stéphane Gazut ◽  
Thomas Brüls

The advent of high-throughput metagenomic sequencing has prompted the development of efficient taxonomic profiling methods that make it possible to measure the presence, abundance and phylogeny of organisms in a wide range of environmental samples. Multivariate sequence-derived abundance data further has the potential to enable inference of ecological associations between microbial populations, but several technical issues need to be accounted for, like the compositional nature of the data, its extreme sparsity and overdispersion, as well as the frequent need to operate in under-determined regimes. The ecological network reconstruction problem is frequently cast into the paradigm of Gaussian Graphical Models (GGMs) for which efficient structure inference algorithms are available, like the graphical lasso and neighborhood selection. Unfortunately, GGMs or variants thereof cannot properly account for the extremely sparse patterns occurring in real-world metagenomic taxonomic profiles. In particular, structural zeros (as opposed to sampling zeros) corresponding to true absences of biological signals fail to be properly handled by most statistical methods. We present here a zero-inflated log-normal graphical model (available at https://github.com/vincentprost/Zi-LN) specifically aimed at handling such “biological” zeros, and demonstrate significant performance gains over state-of-the-art statistical methods for the inference of microbial association networks, with most notable gains obtained when analyzing taxonomic profiles displaying sparsity levels on par with real-world metagenomic datasets.


PLoS ONE ◽  
2021 ◽  
Vol 16 (2) ◽  
pp. e0247059
Author(s):  
Yoshitake Kitanishi ◽  
Masakazu Fujiwara ◽  
Bruce Binkowitz

Health insurance and acute hospital-based claims have recently become available as real-world data after marketing in Japan and, thus, classification and prediction using the machine learning approach can be applied to them. However, the methodology used for the analysis of real-world data has been hitherto under debate and research on visualizing the patient journey is still inconclusive. So far, to classify diseases based on medical histories and patient demographic background and to predict the patient prognosis for each disease, the correlation structure of real-world data has been estimated by machine learning. Therefore, we applied association analysis to real-world data to consider a combination of disease events as the patient journey for depression diagnoses. However, association analysis makes it difficult to interpret multiple outcome measures simultaneously and comprehensively. To address this issue, we applied the Topological Data Analysis (TDA) Mapper to sequentially interpret multiple indices, thus obtaining a visual classification of the diseases commonly associated with depression. Under this approach, the visual and continuous classification of related diseases may contribute to precision medicine research and can help pharmaceutical companies provide appropriate personalized medical care.

