Description and Initial Analysis of Cyberbullying Dataset

In this chapter, the authors focus on datasets used in cyberbullying detection research. They describe and compare several datasets used in previous research and describe in detail the dataset they chose for their own study. They also perform an initial analysis of the dataset to identify its characteristics. They preprocess the dataset in several ways for further use and perform affect analysis to determine whether emotion-related features are characteristic of cyberbullying. Based on the results of the affect analysis, they make an initial attempt to classify cyberbullying data using a simple machine learning approach, which serves as a baseline in forthcoming chapters.

Author(s): Ofer M Springer, Eran O Ofek, Yair Weiss, Julian Merten

Abstract: Weak lensing shear estimation typically results in per-galaxy statistical errors significantly larger than the sought-after gravitational signal of only a few percent. These statistical errors are mostly a result of shape noise, an estimation error due to the diverse (and a priori unknown) morphology of individual background galaxies. These errors are inversely proportional to the limiting angular resolution at which localized objects, such as galaxy clusters, can be probed with weak lensing shear. In this work we report on our initial attempt to reduce statistical errors in weak lensing shear estimation using a machine learning approach: training a multi-layered convolutional neural network to directly estimate the shear given an observed background galaxy image. We train, calibrate, and evaluate the performance and stability of our estimator using simulated galaxy images designed to mimic the distribution of HST observations of lensed background sources in the CLASH galaxy cluster survey. Using the trained estimator, we produce weak lensing shear maps of the cores of 20 galaxy clusters in the CLASH survey, demonstrating an RMS scatter reduced by approximately 26% when compared to maps produced with a commonly used shape estimator. This is equivalent to a survey speed enhancement of approximately 60%. However, given the non-transparent nature of the machine learning approach, this result requires further testing and validation. We provide Python code to train and test this estimator on both simulated and real galaxy cluster observations. We also provide updated weak lensing catalogues for the 20 CLASH galaxy clusters studied.


Diabetes, 2020, Vol 69 (Supplement 1), pp. 1552-P
Author(s): KAZUYA FUJIHARA, MAYUKO H. YAMADA, YASUHIRO MATSUBAYASHI, MASAHIKO YAMAMOTO, TOSHIHIRO IIZUKA, ...
