Trajectory-User Linking via Variational AutoEncoder

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2018/446 ◽

2018 ◽

Cited By ~ 20

Author(s):

Fan Zhou ◽

Qiang Gao ◽

Goce Trajcevski ◽

Kunpeng Zhang ◽

Ting Zhong ◽

...

Keyword(s):

Latent Variables ◽

Large Scale ◽

Human Mobility ◽

Hierarchical Structures ◽

High Dimensional ◽

Generation Process ◽

Mobility Patterns ◽

Variational Autoencoder ◽

Hidden States ◽

Structural Semantics

Trajectory-User Linking (TUL) is an essential task in Geo-tagged social media (GTSM) applications, enabling personalized Point of Interest (POI) recommendation and activity identification. Existing works on mining mobility patterns often model trajectories using Markov Chains (MC) or recurrent neural networks (RNN) -- either assuming independence between non-adjacent locations or following a shallow generation process. However, most of them ignore the fact that human trajectories are often sparse, high-dimensional and may contain embedded hierarchical structures. We tackle the TUL problem with a semi-supervised learning framework, called TULVAE (TUL via Variational AutoEncoder), which learns the human mobility in a neural generative architecture with stochastic latent variables that span hidden states in RNN. TULVAE alleviates the data sparsity problem by leveraging large-scale unlabeled data and represents the hierarchical and structural semantics of trajectories with high-dimensional latent variables. Our experiments demonstrate that TULVAE improves efficiency and linking performance in real GTSM datasets, in comparison to existing methods.
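
The abstract describes a variational autoencoder with stochastic latent variables but gives no formulas. As a minimal sketch of the two standard VAE ingredients (not TULVAE's actual architecture), the Python below computes the closed-form KL divergence between a diagonal Gaussian posterior and a standard normal prior, and draws a latent sample with the reparameterization trick. The function names and the 3-dimensional example values are assumptions made for illustration.

```python
import math
import random


def gaussian_kl(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ), summed over latent dimensions."""
    return 0.5 * sum(m * m + math.exp(lv) - 1.0 - lv for m, lv in zip(mu, log_var))


def reparameterize(mu, log_var, rng):
    """Draw z = mu + sigma * eps with eps ~ N(0, 1) (the reparameterization trick)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, log_var)]


# Hypothetical posterior parameters for one trajectory (illustrative values only).
mu = [0.0, 1.0, -1.0]
log_var = [0.0, 0.0, 0.0]

kl = gaussian_kl(mu, log_var)                      # regularization term of the ELBO
z = reparameterize(mu, log_var, random.Random(0))  # one latent sample
```

In a full model the KL term would be added to a reconstruction loss over the trajectory; that part depends on the encoder and decoder, which the abstract does not specify.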


Facetto: Combining Unsupervised and Supervised Learning for Hierarchical Phenotype Analysis in Multi-Channel Image Data

10.1101/722918 ◽

2019 ◽

Cited By ~ 1

Author(s):

Robert Krueger ◽

Johanna Beyer ◽

Won-Dong Jang ◽

Nam Wook Kim ◽

Artem Sokolov ◽

...

Keyword(s):

Supervised Learning ◽

Visual Analytics ◽

Cancer Biology ◽

Large Scale ◽

Hierarchical Structures ◽

Image Data ◽

Automated Analysis ◽

Cell Types ◽

High Dimensional ◽

Exploration Process

Facetto is a scalable visual analytics application that is used to discover single-cell phenotypes in high-dimensional multi-channel microscopy images of human tumors and tissues. Such images represent the cutting edge of digital histology and promise to revolutionize how diseases such as cancer are studied, diagnosed, and treated. Highly multiplexed tissue images are complex, comprising 10^9 or more pixels, 60-plus channels, and millions of individual cells. This makes manual analysis challenging and error-prone. Existing automated approaches are also inadequate, in large part, because they are unable to effectively exploit the deep knowledge of human tissue biology available to anatomic pathologists. To overcome these challenges, Facetto enables a semi-automated analysis of cell types and states. It integrates unsupervised and supervised learning into the image and feature exploration process and offers tools for analytical provenance. Experts can cluster the data to discover new types of cancer and immune cells and use clustering results to train a convolutional neural network that classifies new cells accordingly. Likewise, the output of classifiers can be clustered to discover aggregate patterns and phenotype subsets. We also introduce a new hierarchical approach to keep track of analysis steps and data subsets created by users; this assists in the identification of cell types. Users can build phenotype trees and interact with the resulting hierarchical structures of both high-dimensional feature and image spaces. We report on use-cases in which domain scientists explore various large-scale fluorescence imaging datasets. We demonstrate how Facetto assists users in steering the clustering and classification process, inspecting analysis results, and gaining new scientific insights into cancer biology.


Identifying Foreign Tourists’ Nationality from Mobility Traces via LSTM Neural Network and Location Embeddings

Applied Sciences ◽

10.3390/app9142861 ◽

2019 ◽

Vol 9 (14) ◽

pp. 2861 ◽

Cited By ~ 2

Author(s):

Alessandro Crivellari ◽

Euro Beinat

Keyword(s):

Neural Network ◽

Motion Tracking ◽

Large Scale ◽

Short Term Memory ◽

Human Mobility ◽

User Profiling ◽

Personalized Recommendation ◽

Mobility Patterns ◽

Trajectory Data ◽

Short Term

The interest in human mobility analysis has increased with the rapid growth of positioning technology and motion tracking, leading to a variety of studies based on trajectory recordings. Mapping the routes that people commonly take has proven very useful for location-based service applications, where individual mobility behaviors can potentially disclose meaningful information about each customer and be fruitfully used for personalized recommendation systems. This paper tackles a novel trajectory labeling problem related to the context of user profiling in “smart” tourism, inferring the nationality of individual users on the basis of their motion trajectories. In particular, we use large-scale motion traces of short-term foreign visitors as a way of detecting the nationality of individuals. This task is not trivial, relying on the hypothesis that foreign tourists of different nationalities may not only visit different locations, but also move in a different way between the same locations. The problem is defined as a multinomial classification with a few tens of classes (nationalities) and sparse location-based trajectory data. We hereby propose a machine learning-based methodology, consisting of a long short-term memory (LSTM) neural network trained on vector representations of locations, in order to capture the underlying semantics of user mobility patterns. Experiments conducted on a large real-world dataset demonstrate that our method achieves considerably higher performance than baseline and traditional approaches.


Non-compulsory measures sufficiently reduced human mobility in Tokyo during the COVID-19 epidemic

Scientific Reports ◽

10.1038/s41598-020-75033-5 ◽

2020 ◽

Vol 10 (1) ◽

Cited By ~ 4

Author(s):

Takahiro Yabe ◽

Kota Tsubouchi ◽

Naoya Fujiwara ◽

Takayuki Wada ◽

Yoshihide Sekimoto ◽

...

Keyword(s):

Large Scale ◽

Human Mobility ◽

Social Contact ◽

Mobility Patterns ◽

State Of Emergency ◽

Social Contacts ◽

Policy Makers ◽

Mobility Data ◽

Contact Rates ◽

Mobility Behavior

While large-scale mobility data have become a popular tool for monitoring mobility patterns during the COVID-19 pandemic, the impacts of non-compulsory measures in Tokyo, Japan on human mobility patterns have been under-studied. Here, we analyze the temporal changes in human mobility behavior, social contact rates, and their correlations with the transmissibility of COVID-19, using mobility data collected from more than 200K anonymized mobile phone users in Tokyo. The analysis concludes that by April 15th (1 week into the state of emergency), human mobility behavior decreased by around 50%, resulting in a 70% reduction of social contacts in Tokyo, showing strong relationships with non-compulsory measures. Furthermore, the reduction in data-driven human mobility metrics showed correlation with the decrease in the estimated effective reproduction number of COVID-19 in Tokyo. Such empirical insights could inform policy makers on deciding sufficient levels of mobility reduction to contain the disease.


Human Mobility Patterns at the Smallest Scales

Communications in Computational Physics ◽

10.4208/cicp.120614.190115a ◽

2015 ◽

Vol 18 (2) ◽

pp. 417-428 ◽

Cited By ~ 3

Author(s):

Pedro G. Lind ◽

Adriano Moreira

Keyword(s):

Large Scale ◽

Spatial Scales ◽

Human Mobility ◽

Numerical Approach ◽

Human Motion ◽

Mobile Phone Data ◽

Data Sets ◽

Mobility Patterns ◽

Access Points ◽

Dollar Bill

We present a study on human mobility at small spatial scales. Unlike large-scale mobility, recently studied through dollar-bill tracking and mobile phone data sets within one big country or continent, we report Brownian features of human mobility at smaller scales. In particular, the scaling exponent found at the smallest scales is typically close to one-half, unlike the larger values of the exponent characterizing mobility at larger scales. We carefully analyze 12 months of data from the Eduroam database at the Portuguese University of Minho. A full procedure is introduced with the aim of properly characterizing human mobility within the network of access points composing the wireless system of the university. In particular, measures of flux are introduced for estimating a distance between access points. This distance is typically non-Euclidean, since the spatial constraints at such small scales distort the continuum space on which human mobility occurs. Since two different exponents are found depending on the scale at which human motion takes place, we raise the question of the scale at which the transition from Brownian to non-Brownian motion occurs. In this context, we discuss how the numerical approach can be extended to larger scales, using the full Eduroam in Europe and in Asia, for uncovering the transition between both dynamical regimes.
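
The one-half exponent mentioned above is the textbook signature of Brownian motion: the root-mean-square displacement grows as the square root of elapsed time. The following Python is a minimal sketch of how such an exponent can be estimated from a log-log fit, using simulated unbiased random walks rather than the Eduroam data. The function names, the number of walkers, and the lags are assumptions chosen for illustration, and the paper's own flux-based distance measure is not reproduced here.

```python
import math
import random


def rms_displacement(num_walkers, lag, rng):
    """Root-mean-square displacement of unbiased +/-1 random walks after `lag` steps."""
    total = 0.0
    for _ in range(num_walkers):
        position = sum(rng.choice((-1, 1)) for _ in range(lag))
        total += position * position
    return math.sqrt(total / num_walkers)


def scaling_exponent(lags, rms_values):
    """Least-squares slope of log(rms) against log(lag)."""
    xs = [math.log(t) for t in lags]
    ys = [math.log(r) for r in rms_values]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var


rng = random.Random(42)
lags = [1, 2, 4, 8, 16, 32, 64]
rms_values = [rms_displacement(2000, t, rng) for t in lags]
exponent = scaling_exponent(lags, rms_values)  # expected to be close to 0.5
```

An exponent noticeably above one-half on real data would indicate the non-Brownian regime that the abstract associates with larger scales.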


Assessing the use of mobile phone data to describe recurrent mobility patterns in spatial epidemic models

Royal Society Open Science ◽

10.1098/rsos.160950 ◽

2017 ◽

Vol 4 (5) ◽

pp. 160950 ◽

Cited By ~ 19

Author(s):

Cecilia Panigutti ◽

Michele Tizzoni ◽

Paolo Bajardi ◽

Zbigniew Smoreda ◽

Vittoria Colizza

Keyword(s):

Infectious Disease ◽

Mobile Phone ◽

Urban Areas ◽

Large Scale ◽

Human Mobility ◽

Mobile Phone Data ◽

Disease Models ◽

Mobility Patterns ◽

Mobility Data ◽

Infectious Disease Models

The recent availability of large-scale call detail record data has substantially improved our ability to quantify human travel patterns, with broad applications in epidemiology. Notwithstanding a number of successful case studies, previous works have shown that using different mobility data sources, such as mobile phone data or census surveys, to parametrize infectious disease models can generate divergent outcomes. Thus, it remains unclear to what extent epidemic modelling results may vary when using different proxies for human movements. Here, we systematically compare 658 000 simulated outbreaks generated with a spatially structured epidemic model based on two different human mobility networks: a commuting network of France extracted from mobile phone data and another extracted from a census survey. We compare epidemic patterns originating from all the 329 possible outbreak seed locations and identify the structural network properties of the seeding nodes that best predict spatial and temporal epidemic patterns to be alike. We find that similarity of simulated epidemics is significantly correlated to connectivity, traffic and population size of the seeding nodes, suggesting that the adequacy of mobile phone data for infectious disease models becomes higher when epidemics spread between highly connected and heavily populated locations, such as large urban areas.


Exploring the Citywide Human Mobility Patterns of Taxi Trips through a Topic-Modeling Analysis

Journal of Advanced Transportation ◽

10.1155/2021/6697827 ◽

2021 ◽

Vol 2021 ◽

pp. 1-10

Author(s):

Hui Xiong ◽

Kaiqiang Xie ◽

Lu Ma ◽

Feng Yuan ◽

Rui Shen

Keyword(s):

Power Law ◽

Traffic Management ◽

Large Scale ◽

Latent Dirichlet Allocation ◽

Topic Model ◽

Human Mobility ◽

Mobility Patterns ◽

Power Law Distribution ◽

Modeling Analysis ◽

Wide Range

Understanding human mobility patterns is of great importance for a wide range of applications from social networks to transportation planning. Toward this end, the spatial-temporal information of a large-scale dataset of taxi trips was collected via GPS, from March 10 to 23, 2014, in Beijing. The data contain trips generated by a large portion of taxi vehicles citywide. We revealed that the geographic displacement of those trips follows a power law distribution and the corresponding travel time follows a mixture of exponential and power law distributions. To identify human mobility patterns, a topic model with the latent Dirichlet allocation (LDA) algorithm was proposed to infer the sixty-five key topics. By measuring the variation of trip displacement over time, we find that the travel distance in the morning rush hour is much shorter than that at other times. As for daily patterns, taxi mobility presents weekly regularity both on weekdays and on weekends. Among different days in the same week, mobility patterns on Tuesday and Wednesday are quite similar. By quantifying the trip distance over time, we find that Topic 44 exhibits dominant patterns, which means that distances of less than 10 km are predominant regardless of the time of day. The findings could serve as references for travelers arranging trips and for policymakers formulating sound traffic management policies.
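
The abstract reports a power law for trip displacement without stating how the exponent was fitted. One common choice, shown here as a sketch under that assumption, is the continuous maximum-likelihood estimator alpha = 1 + n / sum(ln(x_i / x_min)). The Python below applies it to synthetic samples; the exponent 2.5, the cutoff x_min = 1.0, and the function names are hypothetical and are not taken from the paper.

```python
import math
import random


def sample_power_law(alpha, x_min, size, rng):
    """Inverse-transform samples from p(x) ~ x^(-alpha) for x >= x_min (alpha > 1)."""
    return [x_min * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0)) for _ in range(size)]


def fit_power_law_exponent(values, x_min):
    """Maximum-likelihood estimate: alpha = 1 + n / sum(ln(x / x_min)) over x >= x_min."""
    tail = [x for x in values if x >= x_min]
    return 1.0 + len(tail) / sum(math.log(x / x_min) for x in tail)


# Synthetic "trip displacements" with an assumed exponent; not the Beijing taxi data.
rng = random.Random(7)
displacements = sample_power_law(alpha=2.5, x_min=1.0, size=20000, rng=rng)
alpha_hat = fit_power_law_exponent(displacements, x_min=1.0)
```

On real trip data the cutoff x_min would itself have to be chosen, since a power law usually describes only the tail of the distribution.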


Automatically Neutralizing Subjective Bias in Text

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i01.5385 ◽

2020 ◽

Vol 34 (01) ◽

pp. 480-489 ◽

Cited By ~ 2

Author(s):

Reid Pryzant ◽

Richard Diehl Martinez ◽

Nathan Dass ◽

Sadao Kurohashi ◽

Dan Jurafsky ◽

...

Keyword(s):

Large Scale ◽

Point Of View ◽

Automatic Identification ◽

Generation Process ◽

Modular Algorithm ◽

Political Speeches ◽

Collective Trust ◽

Biased Language ◽

Hidden States ◽

News Headlines

Texts like news, encyclopedias, and some social media strive for objectivity. Yet bias in the form of inappropriate subjectivity — introducing attitudes via framing, presupposing truth, and casting doubt — remains ubiquitous. This kind of bias erodes our collective trust and fuels social conflict. To address this issue, we introduce a novel testbed for natural language generation: automatically bringing inappropriately subjective text into a neutral point of view (“neutralizing” biased text). We also offer the first parallel corpus of biased language. The corpus contains 180,000 sentence pairs and originates from Wikipedia edits that removed various framings, presuppositions, and attitudes from biased sentences. Last, we propose two strong encoder-decoder baselines for the task. A straightforward yet opaque concurrent system uses a BERT encoder to identify subjective words as part of the generation process. An interpretable and controllable modular algorithm separates these steps, using (1) a BERT-based classifier to identify problematic words and (2) a novel join embedding through which the classifier can edit the hidden states of the encoder. Large-scale human evaluation across four domains (encyclopedias, news headlines, books, and political speeches) suggests that these algorithms are a first step towards the automatic identification and reduction of bias.


Identifying multiscale spatio-temporal patterns in human mobility using manifold learning

PeerJ Computer Science ◽

10.7717/peerj-cs.276 ◽

2020 ◽

Vol 6 ◽

pp. e276 ◽

Cited By ~ 1

Author(s):

James R. Watson ◽

Zach Gelbaum ◽

Mathew Titus ◽

Grant Zoch ◽

David Wrathall

Keyword(s):

Manifold Learning ◽

Large Scale ◽

Human Mobility ◽

Mobile Network ◽

Human Migration ◽

Mobility Patterns ◽

Great Opportunity ◽

Spectral Graph ◽

Spectral Graph Wavelets ◽

Spatio Temporal

When, where and how people move is a fundamental part of how human societies organize around every-day needs as well as how people adapt to risks, such as economic scarcity or instability, and natural disasters. Our ability to characterize and predict the diversity of human mobility patterns has been greatly expanded by the availability of Call Detail Records (CDR) from mobile phone cellular networks. The size and richness of these datasets are at the same time a blessing and a curse: while there is great opportunity to extract useful information from these datasets, it remains a challenge to do so in a meaningful way. In particular, human mobility is multiscale, meaning a diversity of patterns of mobility occur simultaneously, which vary according to timing, magnitude and spatial extent. To identify and characterize the main spatio-temporal scales and patterns of human mobility we examined CDR data from the Orange mobile network in Senegal using a new form of spectral graph wavelets, an approach from manifold learning. This unsupervised analysis reduces the dimensionality of the data to reveal seasonal changes in human mobility, as well as mobility patterns associated with large-scale but short-term religious events. The novel insight into human mobility patterns afforded by manifold learning methods like spectral graph wavelets has clear applications for urban planning, infrastructure design as well as hazard risk management, especially as climate change alters the biophysical landscape on which people work and live, leading to new patterns of human migration around the world.


Passive Wi-Fi monitoring in the wild: a long-term study across multiple location typologies

Personal and Ubiquitous Computing ◽

10.1007/s00779-020-01441-z ◽

2020 ◽

Author(s):

Miguel Ribeiro ◽

Nuno Nunes ◽

Valentina Nisi ◽

Johannes Schöning

Keyword(s):

Public Transportation ◽

Rural Areas ◽

Urban Areas ◽

Large Scale ◽

Tracking System ◽

Human Mobility ◽

Transportation Systems ◽

Mobility Patterns ◽

Systematic Analysis ◽

Long Term Study

In this paper, we present a systematic analysis of large-scale human mobility patterns obtained from a passive Wi-Fi tracking system, deployed across different location typologies. We have deployed a system to cover urban areas served by public transportation systems as well as very isolated and rural areas. Over 4 years, we collected 572 million data points from a total of 82 routers covering an area of 2.8 km². In this paper we provide a systematic analysis of the data and discuss how our low-cost approach can be used to help communities and policymakers to make decisions to improve people’s mobility at high temporal and spatial resolution by inferring presence characteristics against several sources of ground truth. Also, we present an automatic classification technique that can identify location types based on collected data.


CellSense

Proceedings of the ACM on Interactive Mobile Wearable and Ubiquitous Technologies ◽

10.1145/3478087 ◽

2021 ◽

Vol 5 (3) ◽

pp. 1-22

Author(s):

Zhihan Fang ◽

Yu Yang ◽

Guang Yang ◽

Yikuan Xian ◽

Fan Zhang ◽

...

Keyword(s):

Cellular Network ◽

Large Scale ◽

Human Mobility ◽

Ground Truth ◽

Mobility Patterns ◽

Billing Data ◽

Ground Truth Data ◽

Mobility Modeling ◽

High Penetration ◽

Recovery System

Data from the cellular network have proven to be one of the most promising ways to understand large-scale human mobility for various ubiquitous computing applications due to the high penetration of cellphones and low collection cost. Existing mobility models driven by cellular network data suffer from sparse spatial-temporal observations because user locations are recorded with cellphone activities, e.g., calls, texts, or internet access. In this paper, we design a human mobility recovery system called CellSense that takes sparse cellular billing data (CBR) as input and outputs dense continuous records to recover the sensing gap when using cellular networks as sensing systems to sense human mobility. There is limited work on this kind of recovery system at large scale because, even though it is straightforward to design a recovery system based on regression models, it is very challenging to evaluate these models at large scale due to the lack of ground truth data. In this paper, we explore a new opportunity based on the upgrade of cellular infrastructures to obtain cellular network signaling data as the ground truth data, which log the interaction between cellphones and cellular towers at signal levels (e.g., attaching, detaching, paging) even without billable activities. Based on the signaling data, we design a system CellSense for human mobility recovery by integrating collective mobility patterns with individual mobility modeling, which achieves a 35.3% improvement over the state-of-the-art models. The key application of our recovery model is to take regular sparse CBR data that a researcher already has, and to recover the data missing due to sensing gaps in the CBR data, producing dense cellular data with which they can train a machine learning model for their use cases, e.g., next location prediction.
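
To make the sparse-to-dense recovery task concrete, the Python below sketches the simplest possible baseline: linear interpolation of a user's position between consecutive sparse observations. This is not the CellSense method, which the abstract says combines collective and individual mobility patterns. The record layout `(time, x, y)`, the coordinates, and the 5-unit output grid are assumptions made for illustration.

```python
from bisect import bisect_right


def interpolate_position(records, t):
    """Linearly interpolate (x, y) at time t from time-sorted (time, x, y) records."""
    times = [r[0] for r in records]
    if t <= times[0]:
        return records[0][1:]   # before the first observation: hold the first position
    if t >= times[-1]:
        return records[-1][1:]  # after the last observation: hold the last position
    i = bisect_right(times, t)  # index of the first record strictly later than t
    t0, x0, y0 = records[i - 1]
    t1, x1, y1 = records[i]
    w = (t - t0) / (t1 - t0)
    return (x0 + w * (x1 - x0), y0 + w * (y1 - y0))


# Hypothetical sparse billing-style observations: (time, x, y) with strictly increasing times.
sparse = [(0, 0.0, 0.0), (10, 10.0, 20.0), (30, 10.0, 0.0)]

# Dense track sampled every 5 time units.
dense = [(t, *interpolate_position(sparse, t)) for t in range(0, 31, 5)]
```

A baseline like this assumes straight-line movement between observations, which is exactly the kind of assumption that dense ground truth data would be needed to evaluate.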
