scholarly journals Measuring Commuting and Economic Activity Inside Cities with Cell Phone Records

2021 ◽  
pp. 1-48
Author(s):  
Gabriel E. Kreindler ◽  
Yuhei Miyauchi

Abstract We show how to use commuting flows to infer the spatial distribution of income within a city. A simple workplace choice model predicts a gravity equation for commuting flows whose destination fixed effects correspond to wages. We implement this method with cell phone transaction data from Dhaka and Colombo. Model-predicted income predicts separate income data, at the workplace and residential level, and by skill group. Unlike machine learning approaches, our method does not require training data, yet achieves comparable predictive power. We show that hartals (transportation strikes) in Dhaka reduce commuting more for high model-predicted wage and high skill commuters.

2019 ◽  
Vol 11 (3) ◽  
pp. 284 ◽  
Author(s):  
Linglin Zeng ◽  
Shun Hu ◽  
Daxiang Xiang ◽  
Xiang Zhang ◽  
Deren Li ◽  
...  

Soil moisture mapping at a regional scale is commonplace since these data are required in many applications, such as hydrological and agricultural analyses. The use of remotely sensed data for the estimation of deep soil moisture at a regional scale has received far less emphasis. The objective of this study was to map the 500-m, 8-day average and daily soil moisture at different soil depths in Oklahoma from remotely sensed and ground-measured data using the random forest (RF) method, which is one of the machine-learning approaches. In order to investigate the estimation accuracy of the RF method at both a spatial and a temporal scale, two independent soil moisture estimation experiments were conducted using data from 2010 to 2014: a year-to-year experiment (with a root mean square error (RMSE) ranging from 0.038 to 0.050 m3/m3) and a station-to-station experiment (with an RMSE ranging from 0.044 to 0.057 m3/m3). Then, the data requirements, importance factors, and spatial and temporal variations in estimation accuracy were discussed based on the results using the training data selected by iterated random sampling. The highly accurate estimations of both the surface and the deep soil moisture for the study area reveal the potential of RF methods when mapping soil moisture at a regional scale, especially when considering the high heterogeneity of land-cover types and topography in the study area.


2021 ◽  
Vol 3 (1) ◽  
Author(s):  
Meisam Ghasedi ◽  
Maryam Sarfjoo ◽  
Iraj Bargegol

AbstractThe purpose of this study is to investigate and determine the factors affecting vehicle and pedestrian accidents taking place in the busiest suburban highway of Guilan Province located in the north of Iran and provide the most accurate prediction model. Therefore, the effective principal variables and the probability of occurrence of each category of crashes are analyzed and computed utilizing the factor analysis, logit, and Machine Learning approaches simultaneously. This method not only could contribute to achieving the most comprehensive and efficient model to specify the major contributing factor, but also it can provide officials with suggestions to take effective measures with higher precision to lessen accident impacts and improve road safety. Both the factor analysis and logit model show the significant roles of exceeding lawful speed, rainy weather and driver age (30–50) variables in the severity of vehicle accidents. On the other hand, the rainy weather and lighting condition variables as the most contributing factors in pedestrian accidents severity, underline the dominant role of environmental factors in the severity of all vehicle-pedestrian accidents. Moreover, considering both utilized methods, the machine-learning model has higher predictive power in all cases, especially in pedestrian accidents, with 41.6% increase in the predictive power of fatal accidents and 12.4% in whole accidents. Thus, the Artificial Neural Network model is chosen as the superior approach in predicting the number and severity of crashes. Besides, the good performance and validation of the machine learning is proved through performance and sensitivity analysis.


Author(s):  
Karsten Müller

AbstractBased on German business cycle forecast reports covering 10 German institutions for the period 1993–2017, the paper analyses the information content of German forecasters’ narratives for German business cycle forecasts. The paper applies textual analysis to convert qualitative text data into quantitative sentiment indices. First, a sentiment analysis utilizes dictionary methods and text regression methods, using recursive estimation. Next, the paper analyses the different characteristics of sentiments. In a third step, sentiment indices are used to test the efficiency of numerical forecasts. Using 12-month-ahead fixed horizon forecasts, fixed-effects panel regression results suggest some informational content of sentiment indices for growth and inflation forecasts. Finally, a forecasting exercise analyses the predictive power of sentiment indices for GDP growth and inflation. The results suggest weak evidence, at best, for in-sample and out-of-sample predictive power of the sentiment indices.


Electronics ◽  
2020 ◽  
Vol 9 (11) ◽  
pp. 1757
Author(s):  
María J. Gómez-Silva ◽  
Arturo de la Escalera ◽  
José M. Armingol

Recognizing the identity of a query individual in a surveillance sequence is the core of Multi-Object Tracking (MOT) and Re-Identification (Re-Id) algorithms. Both tasks can be addressed by measuring the appearance affinity between people observations with a deep neural model. Nevertheless, the differences in their specifications and, consequently, in the characteristics and constraints of the available training data for each one of these tasks, arise from the necessity of employing different learning approaches to attain each one of them. This article offers a comparative view of the Double-Margin-Contrastive and the Triplet loss function, and analyzes the benefits and drawbacks of applying each one of them to learn an Appearance Affinity model for Tracking and Re-Identification. A batch of experiments have been conducted, and their results support the hypothesis concluded from the presented study: Triplet loss function is more effective than the Contrastive one when an Re-Id model is learnt, and, conversely, in the MOT domain, the Contrastive loss can better discriminate between pairs of images rendering the same person or not.


2021 ◽  
Vol 69 (4) ◽  
pp. 297-306
Author(s):  
Julius Krause ◽  
Maurice Günder ◽  
Daniel Schulz ◽  
Robin Gruna

Abstract The selection of training data determines the quality of a chemometric calibration model. In order to cover the entire parameter space of known influencing parameters, an experimental design is usually created. Nevertheless, even with a carefully prepared Design of Experiment (DoE), redundant reference analyses are often performed during the analysis of agricultural products. Because the number of possible reference analyses is usually very limited, the presented active learning approaches are intended to provide a tool for better selection of training samples.


2020 ◽  
Author(s):  
Paul Francoeur ◽  
Tomohide Masuda ◽  
David R. Koes

One of the main challenges in drug discovery is predicting protein-ligand binding affinity. Recently, machine learning approaches have made substantial progress on this task. However, current methods of model evaluation are overly optimistic in measuring generalization to new targets, and there does not exist a standard dataset of sufficient size to compare performance between models. We present a new dataset for structure-based machine learning, the CrossDocked2020 set, with 22.5 million poses of ligands docked into multiple similar binding pockets across the Protein Data Bank and perform a comprehensive evaluation of grid-based convolutional neural network models on this dataset. We also demonstrate how the partitioning of the training data and test data can impact the results of models trained with the PDBbind dataset, how performance improves by adding more, lower-quality training data, and how training with docked poses imparts pose sensitivity to the predicted affinity of a complex. Our best performing model, an ensemble of 5 densely connected convolutional newtworks, achieves a root mean squared error of 1.42 and Pearson R of 0.612 on the affinity prediction task, an AUC of 0.956 at binding pose classification, and a 68.4% accuracy at pose selection on the CrossDocked2020 set. By providing data splits for clustered cross-validation and the raw data for the CrossDocked2020 set, we establish the first standardized dataset for training machine learning models to recognize ligands in non-cognate target structures while also greatly expanding the number of poses available for training. In order to facilitate community adoption of this dataset for benchmarking protein-ligand binding affinity prediction, we provide our models, weights, and the CrossDocked2020 set at https://github.com/gnina/models.


2021 ◽  
Vol 13 (19) ◽  
pp. 3859
Author(s):  
Joby M. Prince Czarnecki ◽  
Sathishkumar Samiappan ◽  
Meilun Zhou ◽  
Cary Daniel McCraine ◽  
Louis L. Wasson

The radiometric quality of remotely sensed imagery is crucial for precision agriculture applications because estimations of plant health rely on the underlying quality. Sky conditions, and specifically shadowing from clouds, are critical determinants in the quality of images that can be obtained from low-altitude sensing platforms. In this work, we first compare common deep learning approaches to classify sky conditions with regard to cloud shadows in agricultural fields using a visible spectrum camera. We then develop an artificial-intelligence-based edge computing system to fully automate the classification process. Training data consisting of 100 oblique angle images of the sky were provided to a convolutional neural network and two deep residual neural networks (ResNet18 and ResNet34) to facilitate learning two classes, namely (1) good image quality expected, and (2) degraded image quality expected. The expectation of quality stemmed from the sky condition (i.e., density, coverage, and thickness of clouds) present at the time of the image capture. These networks were tested using a set of 13,000 images. Our results demonstrated that ResNet18 and ResNet34 classifiers produced better classification accuracy when compared to a convolutional neural network classifier. The best overall accuracy was obtained by ResNet34, which was 92% accurate, with a Kappa statistic of 0.77. These results demonstrate a low-cost solution to quality control for future autonomous farming systems that will operate without human intervention and supervision.


2015 ◽  
Vol 3 (3) ◽  
pp. 1-13 ◽  
Author(s):  
Hiroki Nomiya ◽  
Atsushi Morikuni ◽  
Teruhisa Hochin

A lifelog video retrieval framework is proposed for the better utilization of a large amount of lifelog video data. The proposed method retrieves emotional scenes such as the scenes in which a person in the video is smiling, considering that a certain important event could happen in most of emotional scenes. The emotional scene is detected on the basis of facial expression recognition using a wide variety of facial features. The authors adopt an unsupervised learning approach called ensemble clustering in order to recognize the facial expressions because supervised learning approaches require sufficient training data, which make it quite troublesome to apply to large-scale video databases. The retrieval performance of the proposed method is evaluated by means of an emotional scene detection experiment from the viewpoints of accuracy and efficiency. In addition, a prototype retrieval system is implemented based on the proposed emotional scene detection method.


2020 ◽  
Vol 6 (10) ◽  
pp. 101
Author(s):  
Mauricio Alberto Ortega-Ruiz ◽  
Cefa Karabağ ◽  
Victor García Garduño ◽  
Constantino Carlos Reyes-Aldasoro

This paper describes a methodology that extracts key morphological features from histological breast cancer images in order to automatically assess Tumour Cellularity (TC) in Neo-Adjuvant treatment (NAT) patients. The response to NAT gives information on therapy efficacy and it is measured by the residual cancer burden index, which is composed of two metrics: TC and the assessment of lymph nodes. The data consist of whole slide images (WSIs) of breast tissue stained with Hematoxylin and Eosin (H&E) released in the 2019 SPIE Breast Challenge. The methodology proposed is based on traditional computer vision methods (K-means, watershed segmentation, Otsu’s binarisation, and morphological operations), implementing colour separation, segmentation, and feature extraction. Correlation between morphological features and the residual TC after a NAT treatment was examined. Linear regression and statistical methods were used and twenty-two key morphological parameters from the nuclei, epithelial region, and the full image were extracted. Subsequently, an automated TC assessment that was based on Machine Learning (ML) algorithms was implemented and trained with only selected key parameters. The methodology was validated with the score assigned by two pathologists through the intra-class correlation coefficient (ICC). The selection of key morphological parameters improved the results reported over other ML methodologies and it was very close to deep learning methodologies. These results are encouraging, as a traditionally-trained ML algorithm can be useful when limited training data are available preventing the use of deep learning approaches.


Database ◽  
2019 ◽  
Vol 2019 ◽  
Author(s):  
Tao Chen ◽  
Mingfen Wu ◽  
Hexi Li

Abstract The automatic extraction of meaningful relations from biomedical literature or clinical records is crucial in various biomedical applications. Most of the current deep learning approaches for medical relation extraction require large-scale training data to prevent overfitting of the training model. We propose using a pre-trained model and a fine-tuning technique to improve these approaches without additional time-consuming human labeling. Firstly, we show the architecture of Bidirectional Encoder Representations from Transformers (BERT), an approach for pre-training a model on large-scale unstructured text. We then combine BERT with a one-dimensional convolutional neural network (1d-CNN) to fine-tune the pre-trained model for relation extraction. Extensive experiments on three datasets, namely the BioCreative V chemical disease relation corpus, traditional Chinese medicine literature corpus and i2b2 2012 temporal relation challenge corpus, show that the proposed approach achieves state-of-the-art results (giving a relative improvement of 22.2, 7.77, and 38.5% in F1 score, respectively, compared with a traditional 1d-CNN classifier). The source code is available at https://github.com/chentao1999/MedicalRelationExtraction.


Sign in / Sign up

Export Citation Format

Share Document