multiple data sources Latest Research Papers

An inclusive city water account by integrating multiple data sources for South-East Queensland (SEQ), Australia

10.36334/modsim.2021.j7.islam ◽

2021 ◽

Keyword(s):

Data Sources ◽

Multiple Data Sources ◽

Multiple Data ◽

City Water

#BlackGirlMagic: Using multiple data sources to learn about Black adolescent girls’ identities, intersectionality, and media socialization

Journal of Social Issues ◽

10.1111/josi.12483 ◽

2021 ◽

Author(s):

Leoandra Onnie Rogers ◽

Sheretta Butler Barnes ◽

Lily Sahaguian ◽

Dayanara Padilla ◽

Imani Minor

Keyword(s):

Adolescent Girls ◽

Data Sources ◽

Multiple Data Sources ◽

Black Adolescent ◽

Multiple Data

Outlier detection from multiple data sources

Information Sciences ◽

10.1016/j.ins.2021.09.053 ◽

2021 ◽

Vol 580 ◽

pp. 819-837

Author(s):

Yang Ma ◽

Xujun Zhao ◽

Chaowei Zhang ◽

Jifu Zhang ◽

Xiao Qin

Keyword(s):

Outlier Detection ◽

Data Sources ◽

Multiple Data Sources ◽

Multiple Data

Identifying Illicit Drug Dealers on Instagram with Large-scale Multimodal Data Fusion

ACM Transactions on Intelligent Systems and Technology ◽

10.1145/3472713 ◽

2021 ◽

Vol 12 (5) ◽

pp. 1-23

Author(s):

Chuanbo Hu ◽

Minglei Yin ◽

Bin Liu ◽

Xin Li ◽

Yanfang Ye

Keyword(s):

Social Media ◽

Drug Users ◽

Large Scale ◽

Illicit Drug ◽

Data Sources ◽

Multimodal Fusion ◽

Severe Problem ◽

Drug Dealer ◽

Multiple Data Sources ◽

Multiple Data

Illicit drug trafficking via social media sites such as Instagram has become a severe problem, drawing a great deal of attention from law enforcement and public health agencies. Identifying illicit drug dealers from social media data has remained a technical challenge for two reasons. On the one hand, the available data are limited because of privacy concerns with crawling social media sites; on the other hand, the diversity of drug dealing patterns makes it difficult to reliably distinguish drug dealers from common drug users. Unlike existing methods that focus on posting-based detection, we propose to tackle the problem of illicit drug dealer identification by constructing a large-scale multimodal dataset named Identifying Drug Dealers on Instagram (IDDIG). Nearly 4,000 user accounts, of which more than 1,400 are drug dealers, have been collected from Instagram with multiple data sources including post comments, post images, homepage bio, and homepage images. We then design a quadruple-based multimodal fusion method to combine the multiple data sources associated with each user account for drug dealer identification. Experimental results on the constructed IDDIG dataset demonstrate the effectiveness of the proposed method in identifying drug dealers (almost 95% accuracy). Moreover, we have developed a hashtag-based community detection technique for discovering evolving patterns, especially those related to geography and drug types.
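The abstract does not spell out how the quadruple-based fusion works, so the sketch below is not the authors' method. It only illustrates the general idea of combining the four per-account sources with a generic weighted late fusion: each source is assumed to yield its own dealer score in [0, 1], and the scores are averaged. The source keys, the `fuse_scores` and `is_dealer` helpers, the weights, and the 0.5 threshold are all hypothetical.

```python
# Generic late fusion over four per-account data sources.
# NOT the paper's quadruple-based method; names and threshold are assumptions.

SOURCES = ("post_comments", "post_images", "homepage_bio", "homepage_images")


def fuse_scores(scores, weights=None):
    """Return the weighted mean of per-source scores, skipping missing sources.

    `scores` maps a source name to a score in [0, 1], or to None when that
    source is unavailable for the account.
    """
    if weights is None:
        weights = {source: 1.0 for source in SOURCES}
    present = [s for s in SOURCES if scores.get(s) is not None]
    if not present:
        raise ValueError("no source scores available for this account")
    total_weight = sum(weights[s] for s in present)
    return sum(weights[s] * scores[s] for s in present) / total_weight


def is_dealer(scores, threshold=0.5):
    """Flag an account when its fused score reaches the (assumed) threshold."""
    return fuse_scores(scores) >= threshold


if __name__ == "__main__":
    account = {
        "post_comments": 0.8,
        "post_images": 0.6,
        "homepage_bio": None,  # bio missing for this account
        "homepage_images": 1.0,
    }
    print(fuse_scores(account), is_dealer(account))
```

Skipping missing sources rather than treating them as zero keeps an account with an empty bio from being scored as less suspicious than it otherwise would be.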

Particular matter prediction using synergy of multiple source urban big data in smart cities

Intelligent Decision Technologies ◽

10.3233/idt-200147 ◽

2021 ◽

pp. 1-15

Author(s):

Ali Reza Honarvar ◽

Ashkan Sami

Keyword(s):

Air Pollution ◽

Big Data ◽

Air Quality ◽

Smart Cities ◽

Data Sources ◽

Quality Data ◽

Data Set ◽

Multiple Data Sources ◽

Multiple Data ◽

Air Quality Data

At present, air quality in populated urban areas is recognized as an environmental crisis. Air pollution affects the sustainability of the city. Air quality data are very important for controlling air pollution and protecting humans from its hazards. However, constructing and maintaining air quality recording infrastructure is very expensive, and air quality data recorded at one point cannot be generalized to locations even a few kilometers away. Some gains come from the integration of multiple data sources and can never be achieved through independent single-source processing. Urban organizations in each city independently produce and record data relevant to their own goals and objectives. This creates separate data silos associated with an urban system. These data vary in model and structure, and their integration provides an opportunity to discover knowledge that can be useful in urban planning and decision making. This paper aims to show the generality of our previous research, which proposed a novel model that integrates urban big data resources to predict Particulate Matter (PM), the main factor of air quality, in regions of cities where air quality sensors are not available. To do so, it extends the model and experiments with various configurations for different settings in smart cities. This work extends the evaluation scenarios of the model with the extended dataset of the city of Aarhus, Denmark, and compares the model's performance against various specified baselines. It also presents details of removing the heterogeneity of multiple data sources in the Multiple Data Set Aggregator & Heterogeneity Remover (MDA&HR) and of improving the operation of the Train Data Splitter (TDS) part of the model by focusing on finding more similar patterns of air quality. The acceptable accuracy of the results shows the generality of the model.

The value of multiple data sources in machine learning models for power system event prediction

10.1109/sest50973.2021.9543226 ◽

2021 ◽

Author(s):

Volker Hoffmann ◽

Jonatan Ralf Axel Klemets ◽

Bendik Nybakk Torsaeter ◽

Gjert H. Rosenlund ◽

Christian A. Andresen

Keyword(s):

Machine Learning ◽

Power System ◽

Data Sources ◽

Learning Models ◽

Multiple Data Sources ◽

Event Prediction ◽

Multiple Data ◽

Machine Learning Models

Extending density surface models to include multiple and double-observer survey data

PeerJ ◽

10.7717/peerj.12113 ◽

2021 ◽

Vol 9 ◽

pp. e12113

Author(s):

David L. Miller ◽

David Fifield ◽

Ewan Wakefield ◽

Douglas B. Sigourney

Keyword(s):

Spatial Model ◽

Distance Sampling ◽

Additive Model ◽

Data Sources ◽

Multiple Sources ◽

Multiple Data Sources ◽

Multiple Data ◽

Surface Models ◽

Density Surface ◽

Double Observer

Spatial models of density and abundance are widely used in both ecological research (e.g., to study habitat use) and wildlife management (e.g., for population monitoring and environmental impact assessment). Increasingly, modellers are tasked with integrating data from multiple sources, collected via different observation processes. Distance sampling is an efficient and widely used survey and analysis technique. Within this framework, observation processes are modelled via detection functions. We seek to take multiple data sources and fit them in a single spatial model. Density surface models (DSMs) are a two-stage approach: first accounting for detectability via distance sampling methods, then modelling distribution via a generalized additive model. However, current software and theory do not address the issue of multiple data sources. We extend the DSM approach to accommodate data from multiple surveys, collected via conventional distance sampling, double-observer distance sampling (used to account for incomplete detection at zero distance) and strip transects. Variance propagation ensures that uncertainty is correctly accounted for in final estimates of abundance. Methods described here are implemented in the dsm R package. We briefly analyse two datasets to illustrate these new developments. Our new methodology enables data from multiple distance sampling surveys of different types to be treated in a single spatial model, enabling more robust abundance estimation, potentially over wider geographical or temporal domains.
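As a rough illustration of the first DSM stage (accounting for detectability), the sketch below uses a half-normal detection function, a common choice in conventional line-transect distance sampling. It is not the dsm R package and it omits the second stage (the generalized additive model) entirely. The function names, the scale parameter `sigma`, and the truncation distance are assumptions made for the example; in practice `sigma` would be estimated from the observed distances.

```python
import math


def half_normal_detection(distance, sigma):
    """Probability of detecting an animal at a given perpendicular distance."""
    return math.exp(-distance ** 2 / (2 * sigma ** 2))


def effective_strip_half_width(sigma, truncation):
    """Integral of the half-normal detection function from 0 to `truncation`."""
    return (
        sigma
        * math.sqrt(math.pi / 2)
        * math.erf(truncation / (sigma * math.sqrt(2)))
    )


def estimate_density(n_detected, line_length, sigma, truncation):
    """Detectability-corrected density (animals per unit area) for one transect.

    The surveyed strip extends `truncation` on both sides of the line, so the
    effective area is 2 * line_length * effective strip half-width.
    """
    esw = effective_strip_half_width(sigma, truncation)
    return n_detected / (2 * line_length * esw)


if __name__ == "__main__":
    # Hypothetical survey: 40 detections on a 1000 m line, truncated at 100 m.
    print(estimate_density(n_detected=40, line_length=1000.0, sigma=50.0, truncation=100.0))
```

When `sigma` is very large the detection probability approaches 1 across the whole strip, and the estimate reduces to a plain strip-transect count divided by the strip area, which shows how the two survey types fit the same formula.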

Understanding the Cycling Behaviour in Melbourne, Australia, through Analysis of Multiple Data Sources

Journal of Transport & Health ◽

10.1016/j.jth.2021.101187 ◽

2021 ◽

Vol 22 ◽

pp. 101187

Author(s):

Afshin Jafari ◽

Dhirendra Singh ◽

Billie Giles-Corti

Keyword(s):

Data Sources ◽

Multiple Data Sources ◽

Multiple Data

ALGORITHM FOR COMPLEXING MULTIPLE DATA SOURCES INTO A SINGLE OCCUPANCY MAP

IZVESTIYA SFedU ENGINEERING SCIENCES ◽

10.18522/2311-3103-2021-3-64-71 ◽

2021 ◽

pp. 64-71

Author(s):

I.O. Shepel

Keyword(s):

Data Sources ◽

Multiple Data Sources ◽

Multiple Data

A dataset on affiliation of venture capitalists in China between 2000 and 2016

Scientific Data ◽

10.1038/s41597-021-00993-w ◽

2021 ◽

Vol 8 (1) ◽

Author(s):

Jin Chen ◽

Tianyuan Chen ◽

Yifei Song ◽

Bin Hao ◽

Ling Ma

Keyword(s):

Data Sources ◽

Venture Capitalists ◽

High Quality ◽

Public Agency ◽

Multiple Data Sources ◽

Multiple Data ◽

Multi Stage ◽

The World ◽

Prior Literature ◽

Innovation And Entrepreneurship

Prior literature emphasizes the distinct roles of differently affiliated venture capitalists (VCs) in nurturing innovation and entrepreneurship. Although China has become the second largest VC market in the world, the unavailability of high-quality datasets on VC affiliation in China’s market hinders such research efforts. To fill this important gap, we compiled a new panel dataset of VC affiliation in China’s market from multiple data sources. Specifically, we drew on a list of 6,553 VCs that invested in China between 2000 and 2016 from the CVSource database, collected the VCs’ shareholder information from public sources, and developed a multi-stage procedure to label each VC as one of the following types: GVC (public agency-affiliated, state-owned enterprise-affiliated), CVC (corporate VC), IVC (independent VC), BVC (bank-affiliated VC), FVC (financial/non-bank-affiliated VC), UVC (university endowment/spin-out unit), and PenVC (pension-affiliated VC). We also denoted whether a VC has a foreign background. This dataset helps researchers conduct more nuanced investigations into the investment behaviors of different VCs and their distinct impacts on innovation and entrepreneurship in China’s context.
