Event Prediction in Online Social Networks

2021 ◽  
Vol 2 (1) ◽  
pp. 64-94
Author(s):  
Leonard Tan ◽  
Thuan Pham ◽  
Kei Ho Hang ◽  
Seng Kok Tan

Event prediction is an important task in numerous application domains such as fintech, medicine, and security. However, it is also a highly complex task: events are challenging to classify, and the discussions surrounding them carry temporally changing themes and heavy topic drift. In this research, we present a novel approach that leverages the RFT framework developed in \cite{tan2020discovering}. This study addresses the challenge of accurately representing relational features in observed complex social communication behavior for the event prediction task, a challenge that recent graph learning methodologies struggle with. The idea is to first learn the turbulent patterns of relational state transitions between actors preceding an event, and then to evolve these profiles temporally during the event prediction process. The event prediction model built on the RFT framework discovers, identifies, and adaptively ranks relational turbulence as likelihood predictions of event occurrences. Extensive experiments on large-scale social datasets, across important indicator tests for validation, show that the RFT framework outperforms HPM \cite{amodeo2011hybrid} and other state-of-the-art baselines in event prediction by more than 10%.
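As a loose illustration of the intuition only (not the RFT framework itself, whose details are in \cite{tan2020discovering}): relational turbulence can be read as the shift between interaction-state transition distributions in consecutive time windows, and windows can be ranked by that shift as a proxy for event likelihood. The window size, the number of relational states, and the KL-based turbulence score below are all illustrative assumptions.

import numpy as np

def transition_matrix(states, n_states):
    # Row-stochastic transition estimates for one window of relational states,
    # with Laplace smoothing so the KL divergence below is well defined.
    T = np.ones((n_states, n_states))
    for a, b in zip(states[:-1], states[1:]):
        T[a, b] += 1
    return T / T.sum(axis=1, keepdims=True)

def turbulence(prev, curr):
    # Mean KL divergence between corresponding rows of two transition matrices.
    return float(np.mean(np.sum(curr * np.log(curr / prev), axis=1)))

rng = np.random.default_rng(1)
windows = [rng.integers(0, 3, size=50) for _ in range(6)]  # toy state sequences
mats = [transition_matrix(w, 3) for w in windows]
scores = [turbulence(p, c) for p, c in zip(mats[:-1], mats[1:])]
print(np.argsort(scores)[::-1])  # windows ranked by turbulence (event-likelihood proxy)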


2020 ◽  
Vol 34 (04) ◽  
pp. 4412-4419 ◽  
Author(s):  
Zhao Kang ◽  
Wangtao Zhou ◽  
Zhitong Zhao ◽  
Junming Shao ◽  
Meng Han ◽  
...  

A plethora of multi-view subspace clustering (MVSC) methods have been proposed over the past few years. Researchers have managed to boost clustering accuracy from different points of view. However, many state-of-the-art MVSC algorithms, which typically have quadratic or even cubic complexity, are inefficient and inherently difficult to apply at large scales. In the era of big data, this computational issue becomes critical. To fill the gap, we propose a large-scale MVSC (LMVSC) algorithm with linear-order complexity. Inspired by the idea of anchor graphs, we first learn a smaller graph for each view. Then, a novel approach is designed to integrate those graphs so that we can implement spectral clustering on a smaller graph. Interestingly, it turns out that our model also applies to the single-view scenario. Extensive experiments on various large-scale benchmark datasets validate the effectiveness and efficiency of our approach with respect to state-of-the-art clustering methods.
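A minimal sketch of the anchor-graph recipe the abstract describes, under simplifying assumptions of mine (Gaussian anchor assignments, graph integration by concatenation): each view gets an n x m anchor graph with m much smaller than n, and the spectral step runs on the SVD of that thin matrix rather than on an n x n affinity, which is what keeps the cost linear in n.

import numpy as np
from sklearn.cluster import KMeans

def anchor_graph(X, anchors, sigma=1.0):
    # Soft assignment of each of n points to m anchors: an n x m graph.
    d2 = ((X[:, None, :] - anchors[None, :, :]) ** 2).sum(-1)
    Z = np.exp(-d2 / (2 * sigma ** 2))
    return Z / Z.sum(axis=1, keepdims=True)

def lmvsc_sketch(views, n_clusters, n_anchors=100, seed=0):
    Zs = []
    for X in views:
        # k-means centers as anchors keep each per-view graph small.
        anchors = KMeans(n_anchors, n_init=3, random_state=seed).fit(X).cluster_centers_
        Zs.append(anchor_graph(X, anchors))
    # Integrate the per-view graphs; concatenation is one simple choice.
    Z = np.hstack(Zs) / np.sqrt(len(views))
    # Left singular vectors of Z are the eigenvectors of Z Z^T, so the SVD of
    # this thin n x (m * n_views) matrix replaces the usual n x n eigenproblem.
    U, _, _ = np.linalg.svd(Z, full_matrices=False)
    return KMeans(n_clusters, n_init=10, random_state=seed).fit_predict(U[:, :n_clusters])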



Author(s):  
Xing Hu ◽  
Ge Li ◽  
Xin Xia ◽  
David Lo ◽  
Shuai Lu ◽  
...  

Code summarization, which aims to generate a succinct natural language description of source code, is extremely useful for code search and code comprehension, and it has played an important role in software maintenance and evolution. Previous approaches generate summaries by retrieving them from similar code snippets. However, these approaches depend heavily on whether similar snippets can be retrieved and on how similar those snippets are, and they fail to capture the API knowledge in the source code, which carries vital information about its functionality. In this paper, we propose a novel approach, named TL-CodeSum, which successfully transfers API knowledge learned on a different but related task to code summarization. Experiments on large-scale real-world industry Java projects indicate that our approach is effective and outperforms the state of the art in code summarization.
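As a hedged sketch of the dual-encoder idea (layer sizes, the GRU choice, and the late fusion are illustrative assumptions, not the paper's exact architecture): one encoder reads code tokens, a second encoder reads the API sequence and would carry the knowledge transferred from the related API-summarization task, and their states are fused to initialize the summary decoder.

import torch
import torch.nn as nn

class DualEncoderSummarizer(nn.Module):
    def __init__(self, code_vocab, api_vocab, sum_vocab, dim=256):
        super().__init__()
        self.code_emb = nn.Embedding(code_vocab, dim)
        self.api_emb = nn.Embedding(api_vocab, dim)
        self.sum_emb = nn.Embedding(sum_vocab, dim)
        self.code_enc = nn.GRU(dim, dim, batch_first=True)
        # In TL-CodeSum's setting this encoder would be pretrained on an
        # API-sequence summarization task and reused here (the transfer step).
        self.api_enc = nn.GRU(dim, dim, batch_first=True)
        self.fuse = nn.Linear(2 * dim, dim)
        self.dec = nn.GRU(dim, dim, batch_first=True)
        self.out = nn.Linear(dim, sum_vocab)

    def forward(self, code_ids, api_ids, sum_ids):
        _, h_code = self.code_enc(self.code_emb(code_ids))
        _, h_api = self.api_enc(self.api_emb(api_ids))
        # Fuse the two encodings into the decoder's initial hidden state.
        h0 = torch.tanh(self.fuse(torch.cat([h_code, h_api], dim=-1)))
        dec_out, _ = self.dec(self.sum_emb(sum_ids), h0)
        return self.out(dec_out)  # logits over the summary vocabulary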



2020 ◽  
Vol 12 (9) ◽  
pp. 1451
Author(s):  
Qianhan Wu ◽  
Chunqiao Song ◽  
Kai Liu ◽  
Linghong Ke

Land use and land cover (LULC) is a key variable of the Earth’s system and has become an important indicator for evaluating the impact of human activities on the Earth’s ecosystems. With the increasing demand for mineral resources, widespread opencast mining has led to significant changes in LULC and caused substantial damage to the environment. An efficient approach for detecting mining activities at large scales is of critical importance in mitigating their potential impacts on downstream settlements and in assessing LULC characteristics. In this study, we present a novel approach for enabling large-scale automatic detection of opencast mining areas by integrating multitemporal digital elevation models (DEMs, including the SRTM DEM and the recently released TanDEM-X DEM) and multispectral imagery in object-based image analysis and random forest (RF) algorithms. A sequence of data preparation, image segmentation, threshold analysis, calculation of metrics, and influence-factor regulation was developed and tested on a Landsat 8 sample dataset in Inner Mongolia, China, a mineral-rich area. Aside from spectral metrics, such as brightness and reflectance values, the introduced topographical features enhanced the modeling and classification significantly, and the overall performance is greatly influenced by feature selection (the out-of-bag error rate of the RF algorithm is 7.54% for the integrated DEM method, compared with 12.70% for the optical-images-only method). The integrated use of spectral imagery and multitemporal DEMs reveals that the identified mining area is about 1100 km2 in the study area and period, and the topographic change of opencast mining, in terms of elevation difference, ranges between −258 and 162 m. The results show that the method can map the locations and extents of mining areas automatically from spectral and DEM data and can potentially be applied to larger areas.
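The classification step reduces to a standard random forest over per-object features; a minimal sketch with placeholder data follows. The feature layout (spectral metrics plus the multitemporal DEM elevation difference) follows the abstract, while the synthetic arrays are stand-ins for the segmented image objects.

import numpy as np
from sklearn.ensemble import RandomForestClassifier

# X: one row per segmented image object; columns would hold brightness,
# per-band reflectance, and the SRTM-minus-TanDEM-X elevation difference
# (the topographic signal of excavation pits and waste dumps).
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 8))              # placeholder object features
y = (X[:, 0] + X[:, 7] > 0.5).astype(int)   # placeholder mining/background labels

rf = RandomForestClassifier(n_estimators=500, oob_score=True, random_state=0)
rf.fit(X, y)
print(f"out-of-bag error rate: {1 - rf.oob_score_:.2%}")  # paper: 7.54% with DEM features
print("feature importances:", rf.feature_importances_.round(3))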



Author(s):  
Jipeng Zhang ◽  
Roy Ka-Wei Lee ◽  
Ee-Peng Lim ◽  
Wei Qin ◽  
Lei Wang ◽  
...  

Math word problem (MWP) solving is challenging due to a limitation of the training data: only one “standard” solution is available for each problem. MWP models often simply fit this solution rather than truly understand or solve the problem, so their generalization to diverse word scenarios is limited. To address this problem, this paper proposes a novel approach, TSN-MD, that leverages a teacher network to integrate the knowledge of equivalent solution expressions and then regularize the learning behavior of the student network. In addition, we introduce a multiple-decoder student network that generates multiple candidate solution expressions, from which the final answer is selected by voting. In experiments, we conduct extensive comparisons and ablative studies on two large-scale MWP benchmarks and show that TSN-MD surpasses state-of-the-art works by a large margin. More intriguingly, the visualization results demonstrate that TSN-MD not only produces correct final answers but also generates diverse equivalent expressions of the solution.
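Two of the abstract's ingredients are easy to make concrete: distilling the teacher's softened output distribution into a student decoder, and voting the final answer across the candidate expressions from the multiple decoders. A minimal sketch, with the temperature-scaled KL loss as one common distillation choice rather than the paper's exact objective:

import torch
import torch.nn.functional as F
from collections import Counter

def distill_loss(student_logits, teacher_logits, T=2.0):
    # KL between softened teacher and student token distributions.
    p_teacher = F.softmax(teacher_logits / T, dim=-1)
    log_p_student = F.log_softmax(student_logits / T, dim=-1)
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * T * T

def vote_answer(candidate_expressions, evaluate):
    # Each decoder proposes an expression; the most common *answer* wins,
    # so equivalent expressions reinforce each other.
    answers = [evaluate(e) for e in candidate_expressions]
    return Counter(answers).most_common(1)[0][0]

# Three decoders proposing two equivalent expressions and one wrong one:
print(vote_answer(["3*(2+4)", "3*2+3*4", "3*2+4"], evaluate=lambda e: eval(e)))  # -> 18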



Author(s):  
Nicola Messina ◽  
Giuseppe Amato ◽  
Andrea Esuli ◽  
Fabrizio Falchi ◽  
Claudio Gennaro ◽  
...  

Despite the evolution of deep-learning-based visual-textual processing systems, precise multi-modal matching remains a challenging task. In this work, we tackle the task of cross-modal retrieval through image-sentence matching based on word-region alignments, using supervision only at the global image-sentence level. Specifically, we present a novel approach called the Transformer Encoder Reasoning and Alignment Network (TERAN). TERAN enforces a fine-grained match between the underlying components of images and sentences (i.e., image regions and words, respectively) to preserve the informative richness of both modalities. TERAN obtains state-of-the-art results on the image retrieval task on both the MS-COCO and Flickr30k datasets. Moreover, on MS-COCO, it also outperforms current approaches on the sentence retrieval task. Focusing on scalable cross-modal information retrieval, TERAN is designed to keep the visual and textual data pipelines well separated: cross-attention links would rule out separately extracting the visual and textual features needed for the online search and offline indexing steps of large-scale retrieval systems. TERAN therefore merges the information from the two domains only during the final alignment phase, immediately before the loss computation. We argue that the fine-grained alignments produced by TERAN pave the way for research into effective and efficient methods for large-scale cross-modal information retrieval. We compare the effectiveness of our approach against relevant state-of-the-art methods. On the MS-COCO 1K test set, we obtain improvements of 5.7% and 3.5% on the Recall@1 metric for the image and sentence retrieval tasks, respectively. The code used for the experiments is publicly available on GitHub at https://github.com/mesnico/TERAN.
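A minimal sketch of the final alignment phase, where the separately computed pipelines first meet. The max-over-regions, sum-over-words pooling below is one standard way to turn word-region similarities into a global image-sentence score; TERAN's exact pooling may differ, and the feature shapes are illustrative.

import torch
import torch.nn.functional as F

def global_similarity(regions, words):
    # regions: (n_r, d) visual tokens, words: (n_w, d) textual tokens.
    r = F.normalize(regions, dim=-1)
    w = F.normalize(words, dim=-1)
    sim = w @ r.t()                     # (n_w, n_r) word-region cosine similarities
    return sim.max(dim=1).values.sum()  # best region per word, summed over words

regions = torch.randn(36, 1024)  # e.g. precomputed, offline-indexable image features
words = torch.randn(12, 1024)    # query sentence features computed at search time
print(global_similarity(regions, words))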



2019 ◽  
Author(s):  
Chem Int

This research work presents a facile and green route for the synthesis of silver sulfide nanoparticles (Ag2S NPs) from silver nitrate (AgNO3) and sodium sulfide nonahydrate (Na2S·9H2O) in the presence of rosemary leaf aqueous extract at ambient temperature (27 °C). The structural and morphological properties of the Ag2S NPs were analyzed by X-ray diffraction (XRD) and transmission electron microscopy (TEM). The surface plasmon resonance of the Ag2S NPs was observed around 355 nm. The Ag2S NPs were spherical in shape with an effective diameter of 14 nm. Our novel approach represents a promising and effective method for the large-scale synthesis of eco-friendly silver sulfide nanoparticles with antibacterial activity.



2018 ◽  
Vol 14 (12) ◽  
pp. 1915-1960 ◽  
Author(s):  
Rudolf Brázdil ◽  
Andrea Kiss ◽  
Jürg Luterbacher ◽  
David J. Nash ◽  
Ladislava Řezníčková

Abstract. The use of documentary evidence to investigate past climatic trends and events has become a recognised approach in recent decades. This contribution presents the state of the art in its application to droughts. The range of documentary evidence is very wide, including general annals, chronicles, memoirs and diaries kept by missionaries, travellers and those specifically interested in the weather; records kept by administrators tasked with keeping accounts and other financial and economic records; legal-administrative evidence; religious sources; letters; songs; newspapers and journals; pictographic evidence; chronograms; epigraphic evidence; early instrumental observations; society commentaries; and compilations and books. These are available from many parts of the world. This variety of documentary information is evaluated with respect to the reconstruction of hydroclimatic conditions (precipitation, drought frequency and drought indices). Documentary-based drought reconstructions are then addressed in terms of long-term spatio-temporal fluctuations, major drought events, relationships with external forcing and large-scale climate drivers, socio-economic impacts and human responses. Documentary-based drought series are also considered from the viewpoint of spatio-temporal variability for certain continents, and their employment together with hydroclimate reconstructions from other proxies (in particular tree rings) is discussed. Finally, conclusions are drawn, and challenges for the future use of documentary evidence in the study of droughts are presented.



Sensors ◽  
2021 ◽  
Vol 21 (6) ◽  
pp. 1962
Author(s):  
Enrico Buratto ◽  
Adriano Simonetto ◽  
Gianluca Agresti ◽  
Henrik Schäfer ◽  
Pietro Zanuttigh

In this work, we propose a novel approach for correcting multi-path interference (MPI) in Time-of-Flight (ToF) cameras by estimating the direct and global components of the incoming light. MPI is an error source linked to the multiple reflections of light inside a scene; each sensor pixel receives information coming from different light paths, which generally leads to an overestimation of the depth. We introduce a novel deep learning approach that estimates the structure of the time-dependent scene impulse response and from it recovers a depth image with a reduced amount of MPI. The model consists of two main blocks: a predictive model that learns a compact encoded representation of the backscattering vector from the noisy input data, and a fixed backscattering model that translates the encoded representation into the high-dimensional light response. Experimental results on real data show the effectiveness of the proposed approach, which reaches state-of-the-art performance.
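A hedged sketch of the two-block structure: a small learned network maps the per-pixel ToF measurements to a compact code, and a fixed (non-learned) backscattering model expands that code into the dense time-domain light response. The measurement count, code size, random basis dictionary, and argmax peak-picking are all illustrative assumptions, not the paper's components.

import torch
import torch.nn as nn

class MPICorrector(nn.Module):
    def __init__(self, n_meas=9, code_dim=16, n_bins=512):
        super().__init__()
        # Predictive block: learned, compact encoding of the backscattering.
        self.encoder = nn.Sequential(
            nn.Linear(n_meas, 64), nn.ReLU(),
            nn.Linear(64, code_dim),
        )
        # Fixed backscattering block: a frozen dictionary of temporal bases;
        # the code weights these bases to form the scene impulse response.
        self.register_buffer("basis", torch.randn(code_dim, n_bins))

    def forward(self, x):
        code = self.encoder(x)           # compact encoded representation
        backscatter = code @ self.basis  # high-dimensional light response
        # The direct component's bin gives the MPI-reduced depth; argmax is a
        # crude stand-in for a proper peak-finding step.
        return backscatter, backscatter.argmax(dim=-1)

model = MPICorrector()
backscatter, direct_bin = model(torch.randn(4, 9))  # 4 pixels, 9 raw measurements
print(backscatter.shape, direct_bin)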



GigaScience ◽  
2020 ◽  
Vol 9 (12) ◽  
Author(s):  
Ariel Rokem ◽  
Kendrick Kay

Abstract
Background: Ridge regression is a regularization technique that penalizes the L2-norm of the coefficients in linear regression. One of the challenges of using ridge regression is the need to set a hyperparameter (α) that controls the amount of regularization. Cross-validation is typically used to select the best α from a set of candidates, but efficient and appropriate selection of α can be challenging, and it becomes prohibitive when large amounts of data are analyzed. Because the selected α depends on the scale of the data and the correlations across predictors, it is also not straightforwardly interpretable.
Results: The present work addresses these challenges through a novel approach to ridge regression. We propose to reparameterize ridge regression in terms of the ratio γ between the L2-norms of the regularized and unregularized coefficients. We provide an algorithm that efficiently implements this approach, called fractional ridge regression, as well as open-source software implementations in Python and MATLAB (https://github.com/nrdg/fracridge). We show that the proposed method is fast and scalable for large-scale data problems. On brain imaging data, we demonstrate that this approach delivers results that are straightforward to interpret and compare across models and datasets.
Conclusion: Fractional ridge regression has several benefits: the solutions obtained for different γ are guaranteed to vary, guarding against wasted calculations, and they automatically span the relevant range of regularization, avoiding the need for arduous manual exploration. These properties make fractional ridge regression particularly suitable for the analysis of large complex datasets.
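The reparameterization is simple to state from first principles; a from-scratch sketch follows (the released fracridge package implements this far more efficiently via the SVD of the design matrix). Given a target fraction γ, find the α whose ridge solution has that fraction of the OLS coefficient norm:

import numpy as np

def ridge_coef(X, y, alpha):
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(d), X.T @ y)

def alpha_for_fraction(X, y, gamma, alphas=np.logspace(-6, 6, 200)):
    # ||beta(alpha)|| / ||beta_OLS|| falls monotonically from 1 toward 0 as
    # alpha grows, so the target fraction can be located by interpolation.
    ols_norm = np.linalg.norm(ridge_coef(X, y, 0.0))
    fracs = np.array([np.linalg.norm(ridge_coef(X, y, a)) / ols_norm for a in alphas])
    return np.interp(gamma, fracs[::-1], alphas[::-1])

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))
y = X @ rng.normal(size=10) + rng.normal(size=100)
alpha = alpha_for_fraction(X, y, gamma=0.5)
beta = ridge_coef(X, y, alpha)
print(np.linalg.norm(beta) / np.linalg.norm(ridge_coef(X, y, 0.0)))  # ~0.5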



Author(s):  
Silvia Huber ◽  
Lars B. Hansen ◽  
Lisbeth T. Nielsen ◽  
Mikkel L. Rasmussen ◽  
Jonas Sølvsteen ◽  
...  

