source data
Recently Published Documents

TOTAL DOCUMENTS: 2084 (FIVE YEARS: 768)
H-INDEX: 41 (FIVE YEARS: 4)

2022 ◽  
Vol 16 (2) ◽  
pp. 1-28
Author(s):  
Liang Zhao ◽  
Yuyang Gao ◽  
Jieping Ye ◽  
Feng Chen ◽  
Yanfang Ye ◽  
...  

Forecasting significant societal events such as civil unrest and economic crises is an interesting and challenging problem that requires timeliness, precision, and comprehensiveness. Significant societal events are influenced and indicated jointly by multiple aspects of a society, including its economics, politics, and culture. Traditional forecasting methods based on a single data source struggle to cover all these aspects comprehensively, which limits model performance. Multi-source event forecasting has proven promising but still faces several challenges, including (1) geographical hierarchies in multi-source data features, (2) hierarchical missing values, (3) characterization of structured feature sparsity, and (4) difficulty in updating the model online with incomplete multiple sources. This article proposes a novel feature learning model that addresses all of these challenges concurrently. Specifically, given multi-source data from different geographical levels, we design a new forecasting model that characterizes the dependence of lower-level features on higher-level features. To handle the correlations among structured feature sets and the missing values among the coupled features, we propose a feature learning model based on an Nth-order strong hierarchy and a fused-overlapping group Lasso. An efficient algorithm is developed to optimize the model parameters and ensure a global optimum. More importantly, to enable model updates in real time, an online learning algorithm is formulated, and active-set techniques are leveraged to resolve the crucial challenge of new patterns of missing features appearing in real time. Extensive experiments on 10 datasets in different domains demonstrate the effectiveness and efficiency of the proposed models.
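
The fused-overlapping group Lasso mentioned in this abstract builds on group-wise shrinkage of coupled features. Below is a minimal NumPy sketch of the block soft-thresholding (proximal) step for a plain, non-overlapping group Lasso penalty, shown only to illustrate that building block; it is not the paper's full Nth-order strong-hierarchy model, and the feature grouping, penalty strength, and step size are illustrative assumptions.

```python
import numpy as np

def group_soft_threshold(w, groups, lam, step):
    """Proximal step for a (non-overlapping) group Lasso penalty.

    w      : parameter vector (1-D array)
    groups : list of index arrays, one per feature group
    lam    : group-sparsity strength
    step   : gradient step size used in the proximal update
    """
    w = w.copy()
    for idx in groups:
        norm = np.linalg.norm(w[idx])
        # Shrink the whole group toward zero; drop it entirely if it is weak.
        if norm <= lam * step:
            w[idx] = 0.0
        else:
            w[idx] *= 1.0 - lam * step / norm
    return w

# Illustrative use: 6 features split into two hypothetical geographic-level groups.
rng = np.random.default_rng(0)
w = rng.normal(size=6)
groups = [np.arange(0, 3), np.arange(3, 6)]
print(group_soft_threshold(w, groups, lam=1.0, step=0.5))
```

In a proximal-gradient loop, this step is what zeroes out entire feature groups at once, which is how group-structured sparsity of the kind described above is typically enforced.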


PLoS ONE ◽  
2022 ◽  
Vol 17 (1) ◽  
pp. e0262535
Author(s):  
Xinhuan Zhang ◽  
Les Lauber ◽  
Hongjie Liu ◽  
Junqing Shi ◽  
Meili Xie ◽  
...  

Improving travel time prediction for public transit effectively enhances service reliability, optimizes travel structure, and alleviates traffic problems. Because of their greater time-variance and uncertainty, predictions of short travel times (≤35 min) are more susceptible to random factors; they require higher precision and are more complicated than long-term predictions. Effectively extracting and mining real-time, accurate, reliable, and low-cost multi-source data such as GPS, AFC, and IC data can provide data support for travel time prediction. The Kalman filter achieves high accuracy in one-step prediction and can process large volumes of data. This paper adopts the Kalman filter as a travel time prediction model for a single bus based on single-line detection, comprising a route travel time prediction model (RTM) and a stop dwell time prediction model (DTM); evaluation criteria and indexes for the models are given. An error analysis of the prediction results is carried out on AVL data in a case study. Results show that, given multi-source data, the public transportation prediction model can meet the accuracy requirement for travel time prediction, and the prediction effect for the whole route is superior to that for the route segment between stops.
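
A minimal sketch of the one-step predict/update cycle that a Kalman-filter travel time predictor relies on, for a scalar travel-time state. The random-walk state transition, the noise variances, and the observed travel times below are hypothetical and are not taken from the paper's RTM or DTM formulations.

```python
import numpy as np

def kalman_one_step(x, P, z, Q=4.0, R=9.0):
    """One predict/update cycle for a scalar travel-time state (seconds).

    x, P : previous state estimate and its variance
    z    : newly observed segment travel time (e.g., from AVL/GPS)
    Q, R : process and measurement noise variances (assumed values)
    """
    # Predict: assume travel time evolves as a random walk.
    x_pred = x
    P_pred = P + Q
    # Update with the new observation.
    K = P_pred / (P_pred + R)          # Kalman gain
    x_new = x_pred + K * (z - x_pred)  # corrected estimate
    P_new = (1.0 - K) * P_pred
    return x_new, P_new

# Illustrative run over a few hypothetical observed travel times.
x, P = 300.0, 25.0
for z in [310.0, 295.0, 320.0]:
    x, P = kalman_one_step(x, P, z)
    print(f"predicted travel time: {x:.1f} s (var {P:.1f})")
```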


2022 ◽  
Vol 15 ◽  
Author(s):  
Marcel Peter Zwiers ◽  
Stefano Moia ◽  
Robert Oostenveld

Analyses of brain function and anatomy using shared neuroimaging data are an important development and have the potential to be scaled up with the specification of the Brain Imaging Data Structure (BIDS) standard. To date, a variety of software tools help researchers convert their source data to BIDS, but they often require programming skills or are tailored to specific institutes, data sets, or data formats. In this paper, we introduce BIDScoin, a cross-platform, flexible, and user-friendly converter that provides a graphical user interface (GUI) to help users find their way in the BIDS standard. BIDScoin does not require programming skills to be set up and used, and it supports plugins to extend its functionality. We show its design and demonstrate how it can be applied to a downloadable tutorial data set. BIDScoin is distributed as free and open-source software to foster the community-driven effort to promote and facilitate the use of the BIDS standard.
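
To make the target layout concrete, here is a small Python sketch that assembles a BIDS-style path for an anatomical scan. It follows the general BIDS naming convention (subject/session folders, modality subfolder, entity-labelled filename) rather than any BIDScoin API, and the labels and root directory are made-up example values.

```python
from pathlib import Path

def bids_anat_path(root, sub, ses, suffix="T1w"):
    """Build a BIDS-style path such as sub-01/ses-01/anat/sub-01_ses-01_T1w.nii.gz."""
    stem = f"sub-{sub}_ses-{ses}_{suffix}.nii.gz"
    return Path(root) / f"sub-{sub}" / f"ses-{ses}" / "anat" / stem

print(bids_anat_path("/data/bids", "01", "01"))
# /data/bids/sub-01/ses-01/anat/sub-01_ses-01_T1w.nii.gz
```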


2022 ◽  
Author(s):  
M. Hongchul Sohn ◽  
Sonia Yuxiao Lai ◽  
Matthew L. Elwin ◽  
Julius P. A. Dewald

Myoelectric control uses electromyography (EMG) signals as human-originated input to enable intuitive interfaces with machines. As such, recent rehabilitation robotics employs myoelectric control to autonomously classify user intent or operation mode using machine learning. However, performance in such applications inherently suffers from the non-stationarity of EMG signals across measurement conditions. Current laboratory-based solutions rely on careful, time-consuming control of the recordings or periodic recalibration, impeding real-world deployment. We propose that robust yet seamless myoelectric control can be achieved using a low-end wearable EMG sensor that is easy to don and doff, combined with unsupervised transfer learning. Here, we test the feasibility of one such application with a consumer-grade sensor (Myo armband, 8 EMG channels at 200 Hz) for gesture classification across measurement conditions, using an existing dataset of 5 users × 10 days × 3 sensor locations. Specifically, we first train a deep neural network using Temporal-Spatial Descriptors (TSD) with labeled source data from any particular user, day, or location. We then apply the Self-Calibrating Asynchronous Domain Adversarial Neural Network (SCADANN), which automatically adjusts the trained TSD to improve classification performance for unlabeled target data from a different user, day, or sensor location. Compared to the original TSD, SCADANN improves accuracy by 12±5.2% (avg±sd), 9.6±5.0%, and 8.6±3.3% across all possible user-to-user, day-to-day, and location-to-location cases, respectively. In one best-case scenario, accuracy improves by 26% (from 67% to 93%), whereas sometimes the gain is modest (e.g., from 76% to 78%). We also show that the performance of transfer learning can be improved by using a better model trained with good (e.g., incremental) source data. We postulate that the proposed approach is feasible and promising and can be further tailored for seamless myoelectric control of powered prosthetics or exoskeletons.
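
As a concrete starting point, the NumPy sketch below shows how 8-channel, 200 Hz EMG such as the Myo armband's might be cut into overlapping windows and summarized with a simple time-domain feature (root mean square). This is illustrative preprocessing only, not the paper's TSD descriptors or the SCADANN adaptation step, and the window and hop lengths are assumed values.

```python
import numpy as np

def rms_windows(emg, win=40, hop=20):
    """Segment EMG into overlapping windows and compute per-channel RMS.

    emg : array of shape (n_samples, n_channels), e.g., 200 Hz x 8 channels
    win : window length in samples (40 samples = 200 ms at 200 Hz, assumed)
    hop : hop size in samples between consecutive windows
    """
    feats = []
    for start in range(0, emg.shape[0] - win + 1, hop):
        seg = emg[start:start + win]
        feats.append(np.sqrt(np.mean(seg ** 2, axis=0)))  # RMS per channel
    return np.asarray(feats)  # shape: (n_windows, n_channels)

# Illustrative run on 2 s of synthetic 8-channel EMG sampled at 200 Hz.
rng = np.random.default_rng(1)
emg = rng.normal(size=(400, 8))
print(rms_windows(emg).shape)  # (19, 8)
```

Feature windows of this kind are what a classifier (and any domain-adaptation step applied on top of it) would consume, one label per window.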


2022 ◽  
Vol 15 (1) ◽  
pp. 1-14
Author(s):  
Galina Wind ◽  
Arlindo M. da Silva ◽  
Kerry G. Meyer ◽  
Steven Platnick ◽  
Peter M. Norris

Abstract. The Multi-sensor Cloud and Aerosol Retrieval Simulator (MCARS) presently produces synthetic radiance data from Goddard Earth Observing System version 5 (GEOS-5) model output as if the Moderate Resolution Imaging Spectroradiometer (MODIS) were viewing the combined atmospheric column (clouds, aerosols, and a variety of gases) and the land–ocean surface at a specific location. In this paper we use MCARS to study the MODIS Above-Cloud AEROsol retrieval algorithm (MOD06ACAERO). MOD06ACAERO is presently a regional research algorithm able to retrieve aerosol optical thickness over clouds, in particular absorbing biomass-burning aerosols overlying marine boundary layer clouds in the southeastern Atlantic Ocean. The algorithm's ability to provide aerosol information in cloudy conditions makes it a valuable source of information for modeling and climate studies in an area where current clear-sky-only operational MODIS aerosol retrievals effectively have a data gap between June and October. We use MCARS for a verification and closure study of the MOD06ACAERO algorithm. The purpose of this study is to develop a set of constraints a model developer might use during assimilation of MOD06ACAERO data. Our simulations indicate that the MOD06ACAERO algorithm performs well for marine boundary layer clouds in the SE Atlantic provided some specific screening rules are observed. For the present study, a combination of five simulated MODIS data granules was used, yielding a dataset of 13.5 million samples with known input conditions. When pixel retrieval uncertainty was less than 30 %, optical thickness of the underlying cloud layer was greater than 4, and the scattering-angle range within the cloud bow was excluded, MOD06ACAERO retrievals agreed with the underlying ground truth (the GEOS-5 cloud and aerosol profiles used to generate the synthetic radiances) with a slope of 0.913, an offset of 0.06, and an RMSE of 0.107. When only near-nadir pixels were considered (view zenith angle within ±20°), the agreement with source data further improved (0.977, 0.051, and 0.096, respectively). Algorithm closure was examined using a single case out of the five used for verification. For closure, the MOD06ACAERO code was modified to use GEOS-5 temperature and moisture profiles as an ancillary. Agreement of MOD06ACAERO retrievals with source data for the closure study had a slope of 0.996 with an offset of −0.007 and an RMSE of 0.097 at a pixel uncertainty level of less than 40 %, illustrating the benefits of high-quality ancillary atmospheric data for such retrievals.
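
A small NumPy sketch of the kind of pixel screening described above: keep retrievals with uncertainty below 30 %, underlying cloud optical thickness above 4, scattering angles outside the cloud bow, and, optionally, near-nadir view angles only. The cloud-bow scattering-angle bounds used here are placeholder values, not ones given in the paper, and the sample pixel values are invented for illustration.

```python
import numpy as np

def acaero_screen(uncert, cot, scat_ang, vza,
                  bow=(135.0, 165.0), near_nadir=False):
    """Boolean mask of above-cloud aerosol retrievals that pass the screening.

    uncert   : pixel retrieval uncertainty (%)
    cot      : optical thickness of the underlying cloud layer
    scat_ang : scattering angle (degrees)
    vza      : view zenith angle (degrees)
    bow      : assumed cloud-bow scattering-angle range to exclude (placeholder)
    """
    keep = (uncert < 30.0) & (cot > 4.0)
    keep &= ~((scat_ang >= bow[0]) & (scat_ang <= bow[1]))  # drop cloud-bow angles
    if near_nadir:
        keep &= np.abs(vza) <= 20.0
    return keep

# Illustrative use on a handful of hypothetical pixels.
uncert = np.array([10.0, 35.0, 20.0])
cot = np.array([6.0, 5.0, 3.0])
scat_ang = np.array([120.0, 150.0, 170.0])
vza = np.array([5.0, 25.0, 10.0])
print(acaero_screen(uncert, cot, scat_ang, vza, near_nadir=True))
# [ True False False]
```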


2022 ◽  
pp. 106527
Author(s):  
Roberta Maffucci ◽  
Giancarlo Ciotoli ◽  
Andrea Pietrosante ◽  
Gian Paolo Cavinato ◽  
Salvatore Milli ◽  
...  

2022 ◽  
pp. 209-232
Author(s):  
Xiang Li ◽  
Jingxi Liao ◽  
Tianchuan Gao

Machine learning is a broad field that draws on multiple disciplines, including mathematics, computer science, and data science. Some of its concepts, such as deep neural networks, can be complicated and difficult to explain in a few words. This chapter focuses on essential methods, namely classification from supervised learning, clustering, and dimensionality reduction, that can be easily interpreted and explained in a way accessible to beginners. In this chapter, data for Airbnb (Air Bed and Breakfast) listings in London are used as the source data to study the effect of each machine learning technique. By using K-means clustering, principal component analysis (PCA), random forest, and other methods to build classification models from the features, the chapter predicts classification results and provides performance measurements to test the models.
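
A minimal scikit-learn sketch of the workflow the chapter describes (PCA for dimensionality reduction, K-means for clustering, and a random forest classifier). It runs on synthetic data because the London Airbnb listings file, its column names, and the target label are not specified in the abstract; swapping in a real feature matrix and labels would follow the same steps.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

# Synthetic stand-in for numeric Airbnb listing features (price, reviews, ...).
X, y = make_classification(n_samples=1000, n_features=12, n_informative=6,
                           random_state=0)

# Dimensionality reduction with PCA.
X_pca = PCA(n_components=5, random_state=0).fit_transform(X)

# Unsupervised structure with K-means (e.g., grouping similar listings).
clusters = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(X_pca)

# Supervised classification with a random forest.
X_tr, X_te, y_tr, y_te = train_test_split(X_pca, y, test_size=0.2, random_state=0)
clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
print("cluster sizes:", np.bincount(clusters))
print("test accuracy:", accuracy_score(y_te, clf.predict(X_te)))
```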

