Mobility in Unsupervised Word Embeddings for Knowledge Extraction—The Scholars’ Trajectories across Research Topics

2022 ◽  
Vol 14 (1) ◽  
pp. 25
Author(s):  
Gianfranco Lombardo ◽  
Michele Tomaiuolo ◽  
Monica Mordonini ◽  
Gaia Codeluppi ◽  
Agostino Poggi

In the knowledge discovery field of the Big Data domain, the analysis of geographic positioning and mobility information plays a key role. At the same time, in the Natural Language Processing (NLP) domain, pre-trained models such as BERT and word embedding algorithms such as Word2Vec have enabled a rich encoding of words that maps textual data into points of an arbitrary multi-dimensional space, in which the notion of proximity reflects an association among terms or topics. The main contribution of this paper is to show how analytical tools traditionally applied to geographic data to measure the mobility of an agent in a time interval can also be effectively applied to extract knowledge in a semantic realm, such as a semantic space of words and topics, looking for latent trajectories that can benefit from the properties of neural network latent representations. As a case study, the Scopus database was queried about works of highly cited researchers in recent years. On this basis, we performed a dynamic analysis, measuring the Radius of Gyration as an index of the mobility of researchers across scientific topics. The semantic space is built from the automatic analysis of the paper abstracts of each author. In particular, we evaluated two different methodologies to build the semantic space and found that Word2Vec embeddings perform better than the BERT ones for this task. Finally, the scholars’ trajectories show some latent properties of this model, which also represent new scientific contributions of this work. These properties include (i) the correlation between the scientific mobility and the achievement of scientific results, measured through the H-index; (ii) differences in the behavior of researchers working in different countries and subjects; and (iii) some interesting similarities between mobility patterns in this semantic realm and those typically observed in the case of human mobility.
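As a rough illustration of the Radius of Gyration named in the abstract above, the sketch below uses the standard definition: the root-mean-square distance of an agent's visited points from their centroid. The abstract does not say how points are weighted or which distance is used, so the unweighted Euclidean form, the function name `radius_of_gyration`, and the toy 2-D coordinates are assumptions for illustration only.

```python
import math


def radius_of_gyration(points):
    """Root-mean-square Euclidean distance of points from their centroid.

    `points` is a non-empty list of equal-length coordinate tuples, e.g.
    embedding vectors of the papers one author published (assumed input).
    """
    if not points:
        raise ValueError("need at least one point")
    n = len(points)
    dim = len(points[0])
    centroid = [sum(p[d] for p in points) / n for d in range(dim)]
    mean_sq_dist = sum(
        sum((p[d] - centroid[d]) ** 2 for d in range(dim)) for p in points
    ) / n
    return math.sqrt(mean_sq_dist)


# Toy data (hypothetical): an author who never changes topic versus one
# whose papers are spread over the corners of a square in topic space.
stationary = [(0.0, 0.0), (0.0, 0.0), (0.0, 0.0)]
mobile = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]

print(radius_of_gyration(stationary))
print(radius_of_gyration(mobile))
```

A larger value indicates that the points are spread further from their centre, which is the sense in which the index can be read as mobility across topics.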

Urban Science ◽  
2019 ◽  
Vol 3 (3) ◽  
pp. 87 ◽  
Author(s):  
Ahmed Ahmouda ◽  
Hartwig H. Hochmair ◽  
Sreten Cvetojevic

Understanding human mobility patterns becomes essential in crisis management and response. This study analyzes the effect of two hurricanes in the United States on human mobility patterns, more specifically on trip distance (displacement), radius of gyration, and mean square displacement, using Twitter data. The study examines three geographical regions which include urbanized areas (Houston, Texas; Miami-Dade County, Florida) and both rural and urbanized areas (North and South Carolina) affected by hurricanes Matthew (2016) and Harvey (2017). Comparison of movement patterns before, during, and after each hurricane shows that displacement and activity space decreased during the events in the regions. Part of this decline can be potentially tied to observed lower tweet numbers around supply facilities during hurricanes, when many of them are closed, as well as to numerous flooded and blocked roads reported in the affected regions. Furthermore, it is shown that displacement patterns can be modeled through a truncated power-law before, during, and after the analyzed hurricanes, which demonstrates the resilience of human mobility behavior in this regard. Analysis of hashtag use in the three study areas indicates that Twitter contributors post about the events primarily during the hurricane landfall and to some extent also during hurricane preparation. This increase in hurricane-related Twitter topics and decrease in activity space provides a tie between changed travel behavior in affected areas and user perception of hurricanes in the Twitter community. Overall, this study adds to the body of knowledge that connects human mobility to natural crises at the local level. It suggests that governmental and rescue operations need to respond to and be prepared for reduced mobility of residents in affected regions during natural crisis events.
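The trip distance (displacement) measure in the abstract above can be sketched as the great-circle distance between consecutive geotagged posts of one user. The haversine formula below is a standard way to compute that distance; the abstract does not state which distance formula the study used, and the function names and sample coordinates are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, a common approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def displacements_km(track):
    """Distances between consecutive (lat, lon) points of one user's track."""
    return [haversine_km(*a, *b) for a, b in zip(track, track[1:])]


# Hypothetical track of three geotagged posts.
track = [(29.76, -95.37), (29.80, -95.40), (29.76, -95.37)]
print(displacements_km(track))
```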


2019 ◽  
Vol 8 (7) ◽  
pp. 308 ◽  
Author(s):  
Zhenzhou Xu ◽  
Ge Cui ◽  
Ming Zhong ◽  
Xin Wang

An anomalous urban mobility pattern is an abnormal flow of human mobility in a city, and detecting such patterns is important in the study of urban mobility. In this paper, a framework is proposed to identify anomalous urban mobility patterns based on taxi GPS trajectories and Point of Interest (POI) data. In the framework, functional regions are first generated from the distribution of POIs by the DBSCAN clustering algorithm. A Weighted Term Frequency-Inverse Document Frequency (WTF-IDF) method is proposed to identify function values in each region. Then, the Origin-Destination (OD) of trips between functional regions is extracted from GPS trajectories to detect anomalous urban mobility patterns. Mobility vectors are established for each time interval based on the OD of trips and are classified into clusters by the mean shift algorithm. Abnormal urban mobility patterns are identified by processing the mobility vectors. A case study in the city of Wuhan, China, is conducted; the experimental results show that the proposed method can effectively identify daily and hourly anomalous urban mobility patterns.
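The abstract above does not describe the weighting used in WTF-IDF, so the sketch below shows only plain TF-IDF applied to POI categories, treating each region as a "document" and each POI category as a "term". The function name `tf_idf` and the sample regions are hypothetical.

```python
import math
from collections import Counter


def tf_idf(regions):
    """Plain TF-IDF scores of POI categories per region.

    `regions` maps a region id to the list of POI category labels found
    in it. This is the unweighted textbook form, not the paper's WTF-IDF.
    """
    n_regions = len(regions)
    # Number of regions in which each category appears at least once.
    doc_freq = Counter()
    for categories in regions.values():
        doc_freq.update(set(categories))

    scores = {}
    for region, categories in regions.items():
        counts = Counter(categories)
        total = len(categories)
        scores[region] = {
            cat: (count / total) * math.log(n_regions / doc_freq[cat])
            for cat, count in counts.items()
        }
    return scores


# Toy data (hypothetical): "restaurant" occurs everywhere, so it carries
# no information about what makes a region distinctive.
regions = {
    "A": ["restaurant", "restaurant", "school"],
    "B": ["office", "office", "restaurant"],
}
scores = tf_idf(regions)
print(scores)
```

A category present in every region gets a score of zero, so the highest-scoring category in a region is the one that is both frequent there and rare elsewhere.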


2019 ◽  
Vol 8 (1) ◽  
Author(s):  
Lorenzo Lucchini ◽  
Sara Tonelli ◽  
Bruno Lepri

The steady growth of digitized historical information is continuously stimulating new approaches in the fields of Digital Humanities and Computational Social Science. In this work we use Natural Language Processing techniques to retrieve large amounts of historical information from Wikipedia. In particular, the pages of a set of historically notable individuals are processed to capture the locations and dates of people’s movements. This information is then structured in a geographical network of mobility patterns. We analyze the mobility of historically notable individuals from different perspectives to better understand the role of migrations and international collaborations in the context of innovation and cultural development. In this work, we first present some general characteristics of the dataset from a social and geographical perspective. Then, we build a spatial network of cities, and we model and quantify the tendency to explore of a set of people who can be considered historically and culturally notable. In this framework, we show that by using a multilevel radiation model for human mobility, we are able to capture important features of migration behavior. Results show that the choice of the target migration place for historically and culturally relevant people is limited to a small number of locations and that it depends on the discipline a notable person is interested in and on the number of opportunities she/he can find there.
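For orientation, the basic single-level radiation model of human mobility estimates the flow from an origin i to a destination j as T_ij = T_i * m_i * n_j / ((m_i + s_ij) * (m_i + n_j + s_ij)), where m_i and n_j are the origin and destination populations and s_ij is the population within the circle of radius r_ij centred on i, excluding both endpoints. The abstract above does not describe the multilevel variant the authors use, so the sketch below covers only this basic form; the function name and the numbers are hypothetical.

```python
def radiation_flux(t_i, m_i, n_j, s_ij):
    """Expected flow from origin i to destination j (basic radiation model).

    t_i  : total number of movers leaving the origin
    m_i  : population (or opportunities) at the origin
    n_j  : population (or opportunities) at the destination
    s_ij : population in the circle centred on i with radius r_ij,
           excluding the origin and the destination
    """
    return t_i * (m_i * n_j) / ((m_i + s_ij) * (m_i + n_j + s_ij))


# Toy numbers (hypothetical): the same destination attracts fewer movers
# when more intervening opportunities lie between it and the origin.
print(radiation_flux(100, 10, 10, 0))
print(radiation_flux(100, 10, 10, 80))
```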


2021 ◽  
Author(s):  
Diego Kozlowski ◽  
Jennifer Dusdal ◽  
Jun Pang ◽  
Andreas Zilian

Over the last century, we have observed a steady and exponential growth of scientific publications globally. The overwhelming amount of available literature makes a holistic analysis of the research within a field and between fields based on manual inspection impossible. Automatic techniques to support the process of literature review are required to find the epistemic and social patterns that are embedded in scientific publications. In computer science, new tools have been developed to deal with large volumes of data. In particular, deep learning techniques open the possibility of automated end-to-end models to project observations to a new, low-dimensional space where the most relevant information of each observation is highlighted. Using deep learning to build new representations of scientific publications is a growing but still emerging field of research. The aim of this paper is to discuss the potential and limits of deep learning for gathering insights about scientific research articles. We focus on document-level embeddings based on the semantic and relational aspects of articles, using Natural Language Processing (NLP) and Graph Neural Networks (GNNs). We explore the different outcomes generated by those techniques. Our results show that using NLP we can encode a semantic space of articles, while GNNs enable us to build a relational space where the social practices of a research community are also encoded.


2019 ◽  
Vol 33 (22) ◽  
pp. 1950251
Author(s):  
Qing-Chao Shan ◽  
Hong-Hui Dong ◽  
Hai-Jian Li ◽  
Li-Min Jia

With changes in people’s lifestyles and travel modes, understanding individual and population mobility patterns in urban areas remains an outstanding problem. Pervasive mobile communication technologies generate voluminous data related to human mobility, such as mobile phone data. To further study the characteristics of returning and exploration patterns of human movement in urban space, a multi-index model is proposed based on the original radius of gyration index. In this paper, the classification mechanism that uses a single ratio of the radius of gyration to separate k-explorers from k-returners is illustrated, and some disadvantages of this mechanism are noted. A few indices of the model are proposed for deep mining of data on human mobility exploration and returning characteristics. Taking mobile phone data from an entire month as a sample, and after data processing on the Spark platform, the characteristics of various indicators and their correlations are analyzed. The classification effects of different spatial indices for human exploration and returning are compared by using a support vector machine and the binary classification algorithm and are further compared with existing research results. The differences in the classification effects of these indicators are analyzed, which is helpful for in-depth studies of urban mobility patterns.
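The single-ratio mechanism mentioned in the abstract above can be sketched as follows: compute the radius of gyration over only the k most visited locations, divide it by the radius of gyration over all locations, and label the person a k-returner when the ratio exceeds one half and a k-explorer otherwise. The one-half threshold is the convention commonly used in the literature on returners and explorers and is assumed here; the abstract does not state it. All names and the planar toy coordinates are hypothetical.

```python
import math


def weighted_rg(visits):
    """Visit-weighted radius of gyration of [((x, y), count), ...]."""
    total = sum(count for _, count in visits)
    cx = sum(p[0] * count for p, count in visits) / total
    cy = sum(p[1] * count for p, count in visits) / total
    mean_sq = sum(
        count * ((p[0] - cx) ** 2 + (p[1] - cy) ** 2) for p, count in visits
    ) / total
    return math.sqrt(mean_sq)


def k_ratio(visits, k):
    """Ratio of the k-radius of gyration to the total radius of gyration."""
    total_rg = weighted_rg(visits)
    if total_rg == 0:
        return 1.0  # convention chosen here: no movement at all
    top_k = sorted(visits, key=lambda v: v[1], reverse=True)[:k]
    return weighted_rg(top_k) / total_rg


def classify(visits, k=2, threshold=0.5):
    """Label a person using the assumed one-half threshold."""
    return "k-returner" if k_ratio(visits, k) > threshold else "k-explorer"


# Toy data (hypothetical planar coordinates with visit counts).
returner = [((0, 0), 50), ((10, 0), 45), ((12, 0), 1)]
explorer = [((0, 0), 10), ((1, 0), 10), ((50, 0), 8), ((0, 60), 8)]
print(classify(returner), classify(explorer))
```

Because a single threshold on one ratio decides the label, people close to the threshold are sensitive to small changes in their data, which is one reason additional indices can be useful.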


2021 ◽  
Vol 13 (4) ◽  
pp. 2178
Author(s):  
Songkorn Siangsuebchart ◽  
Sarawut Ninsawat ◽  
Apichon Witayangkurn ◽  
Surachet Pravinvongvuth

Bangkok, the capital city of Thailand, is one of the most developed and expansive cities. Due to the ongoing development and expansion of Bangkok, urbanization has continued to expand into adjacent provinces, creating the Bangkok Metropolitan Region (BMR). Continuous monitoring of human mobility in BMR aids in public transport planning and design, and efficient performance assessment. The purpose of this study is to design and develop a process to derive human mobility patterns from the real movement of people who use both fixed-route and non-fixed-route public transport modes, including taxis, vans, and electric rail. Taxi GPS open data were collected by the Intelligent Traffic Information Center Foundation (iTIC) from all GPS-equipped taxis of one operator in BMR. GPS probe data of all operating GPS-equipped vans were collected by the Ministry of Transport’s Department of Land Transport for daily speed and driving behavior monitoring. Finally, the ridership data of all electric rail lines were collected from smartcards by the Automated Fare Collection (AFC). None of the previous works on human mobility extraction from multi-sourced big data have used van data; therefore, it is a challenge to use this data with other sources in the study of human mobility. Each public transport mode has traveling characteristics unique to its passengers and, therefore, specific analytical tools. Firstly, the taxi trip extraction process was developed using Hadoop Hive to process a large quantity of data spanning a one-month period to derive the origin and destination (OD) of each trip. Secondly, for van data, a Java program was used to construct the ODs of van trips. Thirdly, another Java program was used to create the ODs of the electric rail lines. All OD locations of these three modes were aggregated into transportation analysis zones (TAZ). The major taxi trip destinations were found to be international airports and provincial bus terminals. 
The significant trip destinations of vans were provincial bus terminals in Bangkok, electric rail stations, and the industrial estates in other provinces of BMR. In contrast, electric rail destinations were electric rail line interchange stations, the central business district (CBD), and commercial office areas. Therefore, these significant destinations of taxis and vans should be considered in electric rail planning to reduce the air pollution from gasoline vehicles (taxis and vans). Using the designed procedures, the up-to-date dataset of public transport can be processed to derive a time series of human mobility as an input into continuous and sustainable public transport planning and performance assessment. Based on the results of the study, the procedures can benefit other cities in Thailand and other countries.
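The aggregation step described above, in which trip origins and destinations are grouped into transportation analysis zones, can be sketched as a count of trips per (origin zone, destination zone) pair. The study used Hadoop Hive and Java with real TAZ boundaries; the version below is a stand-in that replaces the TAZ lookup with a simple square grid, and the cell size, function names, and coordinates are all hypothetical.

```python
import math
from collections import Counter


def zone_of(lon, lat, cell_deg=0.05):
    """Map a coordinate to a grid cell id (stand-in for a real TAZ lookup)."""
    return (math.floor(lon / cell_deg), math.floor(lat / cell_deg))


def od_matrix(trips, cell_deg=0.05):
    """Count trips per (origin zone, destination zone) pair.

    `trips` is a list of ((lon, lat), (lon, lat)) origin/destination pairs.
    """
    return Counter(
        (zone_of(*origin, cell_deg), zone_of(*dest, cell_deg))
        for origin, dest in trips
    )


# Toy trips (hypothetical coordinates): two trips share the same zone pair.
trips = [
    ((100.52, 13.72), (100.76, 13.68)),
    ((100.53, 13.73), (100.76, 13.68)),
    ((100.76, 13.68), (100.52, 13.72)),
]
matrix = od_matrix(trips)
print(matrix.most_common(1))
```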


2021 ◽  
Vol 94 ◽  
pp. 103117
Author(s):  
Rongxiang Su ◽  
Jingyi Xiao ◽  
Elizabeth C. McBride ◽  
Konstadinos G. Goulias

2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Alexandru Topîrceanu ◽  
Radu-Emil Precup

Computational models for large, resurgent epidemics are recognized as a crucial tool for predicting the spread of infectious diseases. It is widely agreed that such models can be augmented with realistic multiscale population models and by incorporating human mobility patterns. Nevertheless, a large proportion of recent studies aimed at better understanding global epidemics, like influenza, measles, H1N1, SARS, and COVID-19, underestimate the role of heterogeneous mixing in populations characterized by strong social structures and geography. Motivated by the reduced tractability of studies employing homogeneous mixing, which makes conclusions hard to deduce, we propose a new, very fine-grained model incorporating the spatial distribution of population into geographical settlements, with a hierarchical organization down to the level of households (inside which we assume homogeneous mixing). In addition, population is organized heterogeneously outside households, and we model the movement of individuals using travel distance and frequency parameters for inter- and intra-settlement movement. Discrete event simulation, employing an adapted SIR model with relapse, reproduces important qualitative characteristics of real epidemics, like high variation in size and temporal heterogeneity (e.g., waves), that are challenging to reproduce and to quantify with existing measures. Our results pinpoint an important aspect: epidemic size is more sensitive to an increase in the distance of travel than to an increase in the frequency of travel. Finally, we discuss implications for the control of epidemics by integrating human mobility restrictions, as well as progressive vaccination of individuals.
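For readers unfamiliar with the SIR family, the sketch below steps a homogeneous-mixing SIR model forward in discrete time. It is far simpler than the household-level discrete event simulation described above. The optional `relapse` rate, which moves recovered individuals back to the infected compartment, is only one guess at what "with relapse" could mean; the abstract does not define it, and all parameter values are hypothetical.

```python
def simulate_sir(s, i, r, beta, gamma, relapse=0.0, steps=100, dt=0.1):
    """Forward-Euler steps of a homogeneous-mixing SIR model.

    beta    : transmission rate
    gamma   : recovery rate
    relapse : assumed rate at which recovered individuals become
              infected again (hypothetical reading of "relapse")
    Returns the list of (S, I, R) states, including the initial one.
    """
    n = s + i + r
    history = [(s, i, r)]
    for _ in range(steps):
        new_infections = beta * s * i / n * dt
        new_recoveries = gamma * i * dt
        new_relapses = relapse * r * dt
        s -= new_infections
        i += new_infections - new_recoveries + new_relapses
        r += new_recoveries - new_relapses
        history.append((s, i, r))
    return history


# Toy run (hypothetical parameters): 10 initial cases in 1000 people.
history = simulate_sir(990.0, 10.0, 0.0, beta=0.5, gamma=0.1, steps=500)
peak_infected = max(state[1] for state in history)
print(round(peak_infected), [round(x) for x in history[-1]])
```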


Electronics ◽  
2021 ◽  
Vol 10 (12) ◽  
pp. 1372
Author(s):  
Sanjanasri JP ◽  
Vijay Krishna Menon ◽  
Soman KP ◽  
Rajendran S ◽  
Agnieszka Wolk

Linguists have been focused on a qualitative comparison of the semantics from different languages. Evaluation of the semantic interpretation among disparate language pairs like English and Tamil is an even more formidable task than for Slavic languages. The concept of word embedding in Natural Language Processing (NLP) has enabled a felicitous opportunity to quantify linguistic semantics. Multi-lingual tasks can be performed by projecting the word embeddings of one language onto the semantic space of the other. This research presents a suite of data-efficient deep learning approaches to deduce the transfer function from the embedding space of English to that of Tamil, deploying three popular embedding algorithms: Word2Vec, GloVe and FastText. A novel evaluation paradigm was devised for the generation of embeddings to assess their effectiveness, using the original embeddings as ground truths. Transferability across other target languages of the proposed model was assessed via pre-trained Word2Vec embeddings from Hindi and Chinese languages. We empirically prove that with a bilingual dictionary of a thousand words and a corresponding small monolingual target (Tamil) corpus, useful embeddings can be generated by transfer learning from a well-trained source (English) embedding. Furthermore, we demonstrate the usability of generated target embeddings in a few NLP use-case tasks, such as text summarization, part-of-speech (POS) tagging, and bilingual dictionary induction (BDI), bearing in mind that those are not the only possible applications.
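Once source-language embeddings have been projected into the target space, bilingual dictionary induction is commonly done by nearest-neighbour search under cosine similarity. The sketch below shows that retrieval step only. It does not implement the transfer function itself, and the abstract does not confirm that the authors use cosine similarity; the vectors and word labels are placeholders.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def nearest_word(vector, vocab):
    """Return the vocabulary word whose embedding is most similar to `vector`."""
    return max(vocab, key=lambda word: cosine(vector, vocab[word]))


# Placeholder target-language vocabulary (hypothetical 3-D embeddings).
target_vocab = {
    "target_word_a": (1.0, 0.0, 0.0),
    "target_word_b": (0.0, 1.0, 0.0),
    "target_word_c": (0.0, 0.0, 1.0),
}
# A projected source-language vector that lands closest to target_word_b.
projected = (0.1, 0.9, 0.2)
print(nearest_word(projected, target_vocab))
```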


Author(s):  
Shuhei Nomura ◽  
Yuta Tanoue ◽  
Daisuke Yoneoka ◽  
Stuart Gilmour ◽  
Takayuki Kawashima ◽  
...  

In the COVID-19 era, movement restrictions are crucial to slow virus transmission and have been implemented in most parts of the world, including Japan. To find new insights on human mobility and movement restrictions encouraged (but not forced) by the emergency declaration in Japan, we analyzed mobility data at 35 major stations and downtown areas in Japan (each defined as an area overlaid by several 125-meter grids) from September 1, 2019 to March 19, 2021. Data on the total number of unique individuals per hour passing through each area were obtained from Yahoo Japan Corporation (i.e., more than 13,500 data points for each area). We examined the temporal trend in the ratio of the rolling seven-day daily average of the total population to a baseline on January 16, 2020, by ten-year age groups in five time frames. We demonstrated that the degree and trend of mobility decline after the declaration of a state of emergency vary across age groups and even at the subregional level. We also demonstrated that monitoring dynamic geographic and temporal mobility information stratified by detailed population characteristics can help guide not only exit strategies from an ongoing emergency declaration, but also initial response strategies before the next possible resurgence. Combining such detailed data with data on vaccination coverage and COVID-19 incidence (including the status of the health care delivery system) can help governments and local authorities develop community-specific mobility restriction policies. This could include strengthening incentives to stay home and raising awareness of cognitive errors that weaken people's resolve to refrain from nonessential movement.
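The mobility index described above, a rolling seven-day average expressed relative to a baseline date, can be sketched as below. The abstract does not say whether the window is trailing or centred, or whether the baseline is the smoothed or the raw value on the baseline date; this sketch assumes a trailing window and a smoothed baseline. The function names and daily counts are hypothetical.

```python
def rolling_mean(values, window=7):
    """Trailing rolling mean; the first output covers days 0..window-1."""
    return [
        sum(values[end - window + 1:end + 1]) / window
        for end in range(window - 1, len(values))
    ]


def ratio_to_baseline(daily_counts, baseline_index, window=7):
    """Rolling average divided by the rolling average on the baseline day.

    Output element 0 corresponds to day index `window - 1` of the input.
    """
    if baseline_index < window - 1:
        raise ValueError("baseline day needs a full trailing window")
    smoothed = rolling_mean(daily_counts, window)
    baseline = smoothed[baseline_index - (window - 1)]
    return [value / baseline for value in smoothed]


# Toy data (hypothetical): footfall halves in the second week.
daily_counts = [100] * 7 + [50] * 7
ratios = ratio_to_baseline(daily_counts, baseline_index=6)
print(ratios)
```

A ratio below 1 means fewer people passed through the area than around the baseline date.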

