Unsupervised Detection of Sub-Events in Large Scale Disasters

2020, Vol. 34 (01), pp. 354-361
Author(s): Chidubem Arachie, Manas Gaur, Sam Anzaroot, William Groves, Ke Zhang, ...

Social media plays a major role during and after major natural disasters (e.g., hurricanes, large-scale fires, etc.), as people “on the ground” post useful information about what is actually happening. Given the large volume of posts, a major challenge is identifying the information that is useful and actionable. Emergency responders are largely interested in finding out what events are taking place so they can properly plan and deploy resources. In this paper we address the problem of automatically identifying important sub-events (within a large-scale emergency “event”, such as a hurricane). In particular, we present a novel, unsupervised learning framework to detect sub-events in tweets for retrospective crisis analysis. We first extract noun-verb pairs and phrases from raw tweets as sub-event candidates. Then, we learn a semantic embedding of the extracted noun-verb pairs and phrases, and rank them against a crisis-specific ontology. We filter out noisy and irrelevant information, then cluster the noun-verb pairs and phrases so that the top-ranked ones describe the most important sub-events. Through quantitative experiments on two large crisis data sets (Hurricane Harvey and the 2015 Nepal Earthquake), we demonstrate the effectiveness of our approach over the state-of-the-art. Our qualitative evaluation also shows better performance compared to our baseline.
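A minimal Python sketch of this pipeline, assuming spaCy for noun-verb extraction and TF-IDF similarity as a stand-in for the learned semantic embedding (none of this is the authors' code), might look as follows; the tweets and ontology terms are illustrative.

```python
# Sketch: extract noun-verb candidates, rank them against crisis-ontology terms,
# and cluster the top-ranked candidates into putative sub-events.
import spacy
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
from sklearn.cluster import KMeans

nlp = spacy.load("en_core_web_sm")  # requires the small English model

tweets = [
    "Rescue teams evacuate families from flooded homes in Houston",
    "Bridge collapsed near downtown, roads closed",
    "Volunteers distribute water and food at the shelter",
]
ontology_terms = ["evacuation", "flooding", "shelter", "road closure", "rescue"]

# 1. Extract noun-verb pair candidates from each tweet.
candidates = []
for doc in nlp.pipe(tweets):
    for tok in doc:
        if tok.pos_ == "VERB":
            nouns = [c.text for c in tok.children if c.pos_ in ("NOUN", "PROPN")]
            candidates += [f"{n} {tok.lemma_}" for n in nouns]

# 2. Rank candidates by similarity to the crisis ontology (TF-IDF stand-in
#    for the semantic embedding used in the paper).
vec = TfidfVectorizer().fit(candidates + ontology_terms)
sims = cosine_similarity(vec.transform(candidates), vec.transform(ontology_terms))
scores = sims.max(axis=1)
top = [c for c, s in sorted(zip(candidates, scores), key=lambda x: -x[1]) if s > 0]

# 3. Cluster the surviving candidates; each cluster describes a sub-event.
if top:
    labels = KMeans(n_clusters=min(2, len(top)), n_init=10).fit_predict(vec.transform(top))
    print(list(zip(top, labels)))
```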

Author(s): Emma S. Spiro

Social media have become critical components of all phases of crisis management, including preparedness, response, and recovery. Numerous recent events have demonstrated that during extreme occurrences (such as natural hazards, civil unrest, and domestic terrorist attacks), social media platforms are appropriated for response activities, providing new infrastructure for official responders to disseminate event-related information, interact with members of the public, and monitor public opinion. Emergency responders recognize the potential of social media platforms and actively use these technologies to share information and connect with constituents; however, many questions remain about the effectiveness of social media platforms in reaching members of the public during times of crisis. Moreover, there is a strong tendency for research to focus on the behavior of the public rather than on that of official emergency responders. This chapter reviews prior and ongoing work that contributes to our understanding of usage practices and the effectiveness of networked online communication during times of crisis. In particular, it focuses on empirically driven research that utilizes large-scale data sets of behavioral traces captured from social media platforms. Together this body of work demonstrates how computational techniques combined with rich, curated data sets can be used to explore information and communication behaviors in online networks.


Author(s): Zhou Zhao, Lingtao Meng, Jun Xiao, Min Yang, Fei Wu, ...

Retweet prediction is a challenging problem on social media sites (SMS). In this paper, we study the problem of image retweet prediction in social media, which predicts whether a user will repost the image tweets of their followees. Unlike previous studies, we learn a user preference ranking model from users' past retweeted image tweets in SMS. We first propose a heterogeneous image retweet modeling (IRM) network that exploits users' past retweeted image tweets with associated contexts, their following relations in SMS, and the preferences of their followees. We then develop a novel attentional multi-faceted ranking network learning framework with multi-modal neural networks for the proposed heterogeneous IRM network, which learns joint image tweet representations and user preference representations for the prediction task. Extensive experiments on a large-scale dataset from the Twitter site show that our method achieves better performance than other state-of-the-art solutions to the problem.
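The pairwise ranking idea can be illustrated with a short PyTorch sketch: a user embedding is scored against a fused multi-modal tweet representation, and a BPR-style loss ranks a retweeted image tweet above a non-retweeted one. The dimensions, the simple attention over image/text facets, and the dot-product scorer are illustrative assumptions, not the paper's exact IRM architecture.

```python
import torch
import torch.nn as nn

class PairwiseRetweetRanker(nn.Module):
    def __init__(self, n_users, img_dim=2048, txt_dim=300, dim=128):
        super().__init__()
        self.user_emb = nn.Embedding(n_users, dim)
        self.img_proj = nn.Linear(img_dim, dim)
        self.txt_proj = nn.Linear(txt_dim, dim)
        self.attn = nn.Linear(dim, 1)  # weights the image vs. text facet

    def tweet_repr(self, img_feat, txt_feat):
        facets = torch.stack([self.img_proj(img_feat), self.txt_proj(txt_feat)], dim=1)
        weights = torch.softmax(self.attn(torch.tanh(facets)), dim=1)
        return (weights * facets).sum(dim=1)          # fused multi-modal tweet vector

    def score(self, users, img_feat, txt_feat):
        return (self.user_emb(users) * self.tweet_repr(img_feat, txt_feat)).sum(-1)

# BPR-style loss: a retweeted (positive) tweet should score above a non-retweeted one.
model = PairwiseRetweetRanker(n_users=1000)
u = torch.randint(0, 1000, (32,))
pos = model.score(u, torch.randn(32, 2048), torch.randn(32, 300))
neg = model.score(u, torch.randn(32, 2048), torch.randn(32, 300))
loss = -torch.log(torch.sigmoid(pos - neg)).mean()
loss.backward()
```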


Author(s): Yuheng Hu, Yili Hong

Residents often rely on newspapers and television to gather hyperlocal news for community awareness and engagement. More recently, social media have emerged as an increasingly important source of hyperlocal news. Thus far, the literature on using social media to create desirable societal benefits, such as civic awareness and engagement, is still in its infancy. One key challenge in this research stream is to distill information from noisy social media data streams to community members in a timely and accurate manner. In this work, we develop SHEDR (social media–based hyperlocal event detection and recommendation), an end-to-end neural event detection and recommendation framework with a particular use case for Twitter to facilitate residents’ information seeking of hyperlocal events. The key model innovation in SHEDR lies in the design of the hyperlocal event detector and the event recommender. First, we harness the power of two popular deep neural network models, the convolutional neural network (CNN) and long short-term memory (LSTM), in a novel joint CNN-LSTM model to characterize spatiotemporal dependencies for capturing unusualness in a region of interest, which is classified as a hyperlocal event. Next, we develop a neural pairwise ranking algorithm for recommending detected hyperlocal events to residents based on their interests. To alleviate the sparsity issue and improve personalization, our algorithm incorporates several types of contextual information covering topic, social, and geographical proximities. We perform comprehensive evaluations based on two large-scale data sets comprising geotagged tweets covering Seattle and Chicago. We demonstrate the effectiveness of our framework in comparison with several state-of-the-art approaches. We show that our hyperlocal event detection and recommendation models consistently and significantly outperform other approaches in terms of precision, recall, and F-1 scores. Summary of Contribution: In this paper, we focus on a novel and important, yet largely underexplored application of computing—how to improve civic engagement in local neighborhoods via local news sharing and consumption based on social media feeds. To address this question, we propose two new computational and data-driven methods: (1) a deep learning–based hyperlocal event detection algorithm that scans spatially and temporally to detect hyperlocal events from geotagged Twitter feeds; and (2) a personalized deep learning–based hyperlocal event recommender system that systematically integrates several contextual cues, such as topical, geographical, and social proximity, to recommend the detected hyperlocal events to potential users. We conduct a series of experiments to examine our proposed models. The outcomes demonstrate that our algorithms are significantly better than the state-of-the-art models and can provide users with more relevant information about the local neighborhoods that they live in, which in turn may boost their community engagement.
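As a rough illustration of the joint CNN-LSTM detector idea (not the SHEDR implementation), the PyTorch sketch below encodes a per-time-step spatial grid of tweet activity with a small CNN, models the temporal sequence with an LSTM, and outputs an event probability for the window; the grid size, channels, and feature choices are assumptions.

```python
import torch
import torch.nn as nn

class CnnLstmEventDetector(nn.Module):
    def __init__(self, hidden=64):
        super().__init__()
        self.cnn = nn.Sequential(                    # per-time-step spatial encoder
            nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4), nn.Flatten(),   # -> 8 * 4 * 4 = 128 features
        )
        self.lstm = nn.LSTM(input_size=128, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)             # event / no-event for the window

    def forward(self, x):                            # x: (batch, time, grid, grid)
        b, t, h, w = x.shape
        feats = self.cnn(x.reshape(b * t, 1, h, w)).reshape(b, t, -1)
        out, _ = self.lstm(feats)
        return torch.sigmoid(self.head(out[:, -1]))  # probability of a hyperlocal event

detector = CnnLstmEventDetector()
window = torch.randn(4, 12, 16, 16)                  # 4 regions, 12 time steps of tweet activity
print(detector(window).shape)                        # -> torch.Size([4, 1])
```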


2020, Vol. 34 (04), pp. 4412-4419
Author(s): Zhao Kang, Wangtao Zhou, Zhitong Zhao, Junming Shao, Meng Han, ...

A plethora of multi-view subspace clustering (MVSC) methods have been proposed over the past few years. Researchers manage to boost clustering accuracy from different points of view. However, many state-of-the-art MVSC algorithms typically have quadratic or even cubic complexity, making them inefficient and inherently difficult to apply at large scale. In the era of big data, this computational issue becomes critical. To fill this gap, we propose a large-scale MVSC (LMVSC) algorithm with linear-order complexity. Inspired by the idea of anchor graphs, we first learn a smaller graph for each view. Then, a novel approach is designed to integrate those graphs so that we can perform spectral clustering on a smaller graph. Interestingly, it turns out that our model also applies to the single-view scenario. Extensive experiments on various large-scale benchmark data sets validate the effectiveness and efficiency of our approach with respect to state-of-the-art clustering methods.
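A numpy/scikit-learn sketch of the anchor-graph idea for a single view is shown below: sampling a few anchors keeps the graph n-by-m instead of n-by-n, so the spectral step reduces to an SVD of the anchor graph and the overall cost stays linear in n. The anchor selection, kernel, and normalization here are illustrative simplifications, not the LMVSC optimization.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics.pairwise import rbf_kernel

def anchor_spectral_clustering(X, n_clusters=3, n_anchors=50, seed=0):
    rng = np.random.default_rng(seed)
    anchors = X[rng.choice(len(X), size=n_anchors, replace=False)]
    Z = rbf_kernel(X, anchors)                       # n x m anchor graph
    Z = Z / Z.sum(axis=1, keepdims=True)             # row-normalize
    U, _, _ = np.linalg.svd(Z, full_matrices=False)  # left singular vectors of Z
    emb = U[:, :n_clusters]                          # spectral embedding, n x k
    return KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit_predict(emb)

X = np.random.default_rng(0).normal(size=(2000, 20))
print(np.bincount(anchor_spectral_clustering(X)))    # cluster sizes
```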


2021
Author(s): Sergey Levchenko, Yaqiong Zhong, Xiaojuan Hu, Debalaya Sarker, Qingrui Xia, ...

Thermoelectric (TE) materials are among the very few sustainable yet feasible energy solutions available at present. This huge promise of energy harvesting is contingent on identifying or designing materials with higher efficiency than presently available ones. However, due to the vastness of the chemical space of materials, only a small fraction of it has been scanned experimentally and/or computationally so far. Employing compressed-sensing-based symbolic regression in an active-learning framework, we have not only identified a trend in materials’ compositions for superior TE performance, but have also predicted and experimentally synthesized several novel, extremely high-performing TE materials. Among these, we found polycrystalline p-type Cu0.45Ag0.55GaTe2 to possess an experimental figure of merit as high as ~2.8 at 827 K. This is a breakthrough in the field, because all previously known thermoelectric materials with a comparable figure of merit are either unstable or much more difficult to synthesize, rendering them unusable in large-scale applications. The presented methodology demonstrates the importance and tremendous potential of physically informed descriptors in materials science, in particular for the relatively small data sets typically available from experiments under well-controlled conditions.
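For context, the dimensionless thermoelectric figure of merit quoted above (~2.8 at 827 K) follows the standard definition below, where S is the Seebeck coefficient, σ the electrical conductivity, κ the total thermal conductivity, and T the absolute temperature; this is background, not a formula introduced by the paper.

```latex
zT = \frac{S^{2}\,\sigma\,T}{\kappa}
```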


Author(s): Sonali Gaikwad, Tejashri Borate, Nandpriya Ashtekar, Umadevi Lade

Social media platforms involve not millions but billions of users around the globe. Interactions on easily accessible social media sites such as Twitter have a huge impact on people, and nowadays they also have an undesirable negative impact on daily life. These widely used communication platforms have become a major source of spreading unwanted and irrelevant information. Twitter, one of the most popular microblogging services of our time, is now used as a weapon to share unethical opinions and media in unreasonable volumes. In this proposed work, shaming comments and tweets directed at people are categorized into nine types, and each tweet is classified either into one of these types or as a non-shaming tweet. Our observations show that, of the multitude of interested users who post comments on a particular event, the majority are likely to shame the victim. Moreover, the follower counts of shamers grow faster on Twitter than those of non-shamers.
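As a hedged illustration of the classification step described above (the paper's actual model and category definitions are not detailed here), a simple TF-IDF plus logistic-regression pipeline can assign a tweet to a shaming or non-shaming class; the example tweets and labels below are invented.

```python
from sklearn.pipeline import make_pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Toy training data: in the paper, shaming tweets fall into nine finer categories.
train_tweets = [
    "you should be ashamed of what you did",        # shaming
    "thoughts and prayers for everyone affected",   # non-shaming
    "typical of people like him, disgusting",       # shaming
    "here is the official statement on the event",  # non-shaming
]
labels = ["shaming", "non-shaming", "shaming", "non-shaming"]

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)),
                    LogisticRegression(max_iter=1000))
clf.fit(train_tweets, labels)
print(clf.predict(["what an absolute disgrace she is"]))
```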


Author(s): Zhou Zhao, Ben Gao, Vincent W. Zheng, Deng Cai, Xiaofei He, ...

Link prediction is a challenging problem for complex network analysis, arising in many disciplines such as social networks and telecommunication networks. Currently, many existing approaches estimate the proximity of the link endpoints from their features or the local neighborhoods around them, which suffers from a localized view of network connections and insufficiently discriminative feature representations. In this paper, we consider the problem of link prediction from the viewpoint of learning a discriminative, path-based proximity ranking metric embedding. We propose a novel ranking metric network learning framework that jointly exploits both node-level and path-level attentional proximity of the endpoints for link prediction. We then develop a path-based dual-level reasoning attentional learning method with a recurrent neural network for proximity ranking metric embedding. Extensive experiments on two large-scale datasets show that our method achieves better performance than other state-of-the-art solutions to the problem.
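The sketch below illustrates path-based proximity ranking in PyTorch under simplifying assumptions: a GRU encodes a sampled path between the endpoints (path-level signal), the endpoint embeddings provide the node-level signal, and a margin ranking loss pushes linked pairs above unlinked pairs. It is not the authors' dual-level attentional architecture.

```python
import torch
import torch.nn as nn

class PathRankingScorer(nn.Module):
    def __init__(self, n_nodes, dim=64):
        super().__init__()
        self.node_emb = nn.Embedding(n_nodes, dim)
        self.path_rnn = nn.GRU(dim, dim, batch_first=True)
        self.score = nn.Linear(3 * dim, 1)

    def forward(self, u, v, paths):                       # paths: (batch, path_len) node ids
        _, h = self.path_rnn(self.node_emb(paths))        # path-level proximity signal
        feats = torch.cat([self.node_emb(u), self.node_emb(v), h[-1]], dim=-1)
        return self.score(feats).squeeze(-1)

model = PathRankingScorer(n_nodes=500)
u = torch.randint(0, 500, (16,))
v = torch.randint(0, 500, (16,))
pos_paths = torch.randint(0, 500, (16, 4))                # paths between linked endpoints
neg_paths = torch.randint(0, 500, (16, 4))                # paths for unlinked endpoints
pos = model(u, v, pos_paths)
neg = model(u, torch.randint(0, 500, (16,)), neg_paths)
loss = nn.MarginRankingLoss(margin=1.0)(pos, neg, torch.ones(16))
loss.backward()
```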


Author(s): Xiaoxiao Sun, Liyi Chen, Jufeng Yang

Fine-grained classification focuses on recognizing the subordinate categories within one field, which requires a large number of labeled images, yet labeling these images is expensive. Utilizing web data has become an attractive option to meet the demand for training data for convolutional neural networks (CNNs), especially when well-labeled data are not enough. However, directly training on such easily obtained images often leads to unsatisfactory performance due to factors such as noisy labels. This has conventionally been addressed by reducing the noise level of the web data. In this paper, we take a fundamentally different view and propose an adversarial discriminative loss to advocate representation coherence between standard and web data. This is further encapsulated in a simple, scalable, and end-to-end trainable multi-task learning framework. We experiment on three public datasets using large-scale web data to evaluate the effectiveness and generalizability of the proposed approach. Extensive experiments demonstrate that our approach performs favorably against state-of-the-art methods.
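One common way to realize an adversarial discriminative loss of this kind (sketched below, not necessarily the paper's exact formulation) is a source discriminator trained to separate standard-data features from web-data features, while the shared feature extractor is trained to fool it alongside the usual classification objective; the backbone and dimensions are illustrative.

```python
import torch
import torch.nn as nn

features = nn.Sequential(nn.Linear(512, 256), nn.ReLU())   # shared backbone (stand-in)
classifier = nn.Linear(256, 10)                             # fine-grained class head
discriminator = nn.Sequential(nn.Linear(256, 64), nn.ReLU(), nn.Linear(64, 1))

std_x, std_y = torch.randn(32, 512), torch.randint(0, 10, (32,))   # well-labeled batch
web_x, web_y = torch.randn(32, 512), torch.randint(0, 10, (32,))   # noisy web batch

f_std, f_web = features(std_x), features(web_x)
cls_loss = nn.CrossEntropyLoss()(classifier(f_std), std_y) + \
           nn.CrossEntropyLoss()(classifier(f_web), web_y)

bce = nn.BCEWithLogitsLoss()
# Discriminator objective: tell standard features (1) from web features (0).
d_loss = bce(discriminator(f_std.detach()), torch.ones(32, 1)) + \
         bce(discriminator(f_web.detach()), torch.zeros(32, 1))
# Adversarial term for the extractor: make web features indistinguishable from standard ones.
adv_loss = bce(discriminator(f_web), torch.ones(32, 1))

total = cls_loss + 0.1 * adv_loss   # extractor/classifier objective (d_loss is optimized separately)
total.backward()
```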


2021
Author(s): Gabriele Scalia, Chiara Francalanci, Barbara Pernici

Information extracted from social media has proven to be very useful in the domain of emergency management. An important task in emergency management is rapid crisis mapping, which aims to produce timely and reliable maps of affected areas. During an emergency, the volume of emergency-related posts is typically large, but only a small fraction is relevant and helps rapid mapping effectively. Furthermore, posts are not useful for mapping purposes unless they are correctly geolocated and, on average, less than 2% of posts are natively georeferenced. This paper presents an algorithm, called CIME, that aims to identify and geolocate emergency-related posts that are relevant for mapping purposes. While native geocoordinates are most often missing, many posts contain geographical references in their metadata, such as texts or links, that can be used by CIME to filter and geolocate information. In addition, social media creates a social network, and each post can be enhanced with indirect information from the post’s network of relationships with other posts (for example, a retweet can be associated with other geographical references which are useful to geolocate the original tweet). To exploit all this information, CIME uses the concept of context, defined as the information characterizing a post both directly (the post’s metadata) and indirectly (the post’s network of relationships). The algorithm was evaluated on a recent major emergency event, demonstrating better performance with respect to the state of the art in terms of the total number of geolocated posts, geolocation accuracy, and relevance for rapid mapping.
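A toy Python sketch of the context idea follows (not the CIME implementation): geolocate a post from geographical references in its own text or metadata when possible, and otherwise fall back to references carried by related posts, such as the retweeted original. The gazetteer, post fields, and example places are hypothetical.

```python
GAZETTEER = {"genoa": (44.41, 8.93), "milan": (45.46, 9.19)}  # place -> (lat, lon)

def direct_location(post):
    """Direct context: geographical reference in the post's own text or metadata."""
    text = (post.get("text", "") + " " + post.get("user_location", "")).lower()
    return next((coords for place, coords in GAZETTEER.items() if place in text), None)

def geolocate(post, posts_by_id):
    """Fall back to indirect context: references carried by related posts (e.g., the retweeted original)."""
    loc = direct_location(post)
    if loc is None and post.get("retweet_of"):
        parent = posts_by_id.get(post["retweet_of"])
        loc = direct_location(parent) if parent else None
    return loc

posts = {
    "1": {"text": "Bridge collapsed in Genoa, avoid the area", "user_location": ""},
    "2": {"text": "RT please spread the word", "user_location": "", "retweet_of": "1"},
}
print(geolocate(posts["2"], posts))   # -> (44.41, 8.93), inherited from the original tweet
```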


2015
Author(s): Stinus Lindgreen, Karen L Adair, Paul Gardner

Metagenome studies are becoming increasingly widespread, yielding important insights into microbial communities covering diverse environments from terrestrial and aquatic ecosystems to human skin and gut. With the advent of high-throughput sequencing platforms, the use of large scale shotgun sequencing approaches is now commonplace. However, a thorough independent benchmark comparing state-of-the-art metagenome analysis tools is lacking. Here, we present a benchmark where the most widely used tools are tested on complex, realistic data sets. Our results clearly show that the most widely used tools are not necessarily the most accurate, that the most accurate tool is not necessarily the most time consuming, and that there is a high degree of variability between available tools. These findings are important as the conclusions of any metagenomics study are affected by errors in the predicted community composition. Data sets and results are freely available from http://www.ucbioinformatics.org/metabenchmark.html

