Analysis and Prediction of Human Mobility in the United States during the Early Stages of the COVID-19 Pandemic using Regularized Linear Models

Author(s):  
Meghna Chakraborty ◽  
Md Shakir Mahmud ◽  
Timothy J. Gates ◽  
Subhrajit Sinha

Since the United States started grappling with the COVID-19 pandemic, with the highest number of confirmed cases and deaths in the world as of August 2020, most states have enforced travel restrictions resulting in drastic reductions in mobility and travel. However, the long-term implications of this crisis to mobility still remain uncertain. To this end, this study proposes an analytical framework that determines the most significant factors affecting human mobility in the United States during the early days of the pandemic. Particularly, the study uses least absolute shrinkage and selection operator (LASSO) regularization to identify the most significant variables influencing human mobility and uses linear regularization algorithms, including ridge, LASSO, and elastic net modeling techniques, to predict human mobility. State-level data were obtained from various sources from January 1, 2020 to June 13, 2020. The entire data set was divided into a training and a test data set, and the variables selected by LASSO were used to train models by the linear regularization algorithms, using the training data set. Finally, the prediction accuracy of the developed models was examined on the test data. The results indicate that several factors, including the number of new cases, social distancing, stay-at-home orders, domestic travel restrictions, mask-wearing policy, socioeconomic status, unemployment rate, transit mode share, percent of population working from home, and percent of older (60+ years) and African and Hispanic American populations, among others, significantly influence daily trips. Moreover, among all models, ridge regression provides the most superior performance with the least error, whereas both LASSO and elastic net performed better than the ordinary linear model.
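As an illustration of how the three regularized models named in the abstract differ, the following is a minimal sketch, not the authors' code. It assumes centered features with no intercept and one common elastic net parameterization, (1/(2n))·||y − Xw||² + alpha·l1_ratio·||w||₁ + 0.5·alpha·(1 − l1_ratio)·||w||², solved by coordinate descent. The function names, the toy data, and the `alpha` and `l1_ratio` values are hypothetical; setting `l1_ratio=1` gives LASSO and `l1_ratio=0` gives ridge.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the L1 proximal step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def elastic_net_fit(X, y, alpha, l1_ratio, n_iter=200):
    """Coordinate descent for the elastic net objective stated above.

    X is a list of rows, y a list of targets; no intercept is fitted,
    so the data are assumed to be centered beforehand.
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # correlation of feature j with the partial residual
            z = 0.0    # mean squared value of feature j
            for i in range(n):
                pred_without_j = sum(X[i][k] * w[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                z += X[i][j] ** 2
            rho /= n
            z /= n
            denom = z + alpha * (1.0 - l1_ratio)
            w[j] = 0.0 if denom == 0 else soft_threshold(rho, alpha * l1_ratio) / denom
    return w


# Toy data (hypothetical): y depends strongly on feature 0, weakly on feature 1.
X = [[1.0, 1.0], [-1.0, 1.0], [1.0, -1.0], [-1.0, -1.0]]
y = [2.1, -1.9, 1.9, -2.1]

lasso_w = elastic_net_fit(X, y, alpha=0.5, l1_ratio=1.0)  # L1 penalty only
ridge_w = elastic_net_fit(X, y, alpha=0.5, l1_ratio=0.0)  # L2 penalty only
enet_w = elastic_net_fit(X, y, alpha=0.5, l1_ratio=0.5)   # mix of both
```

On this toy data the L1 penalty sets the weak coefficient exactly to zero, which is the variable selection behavior the abstract relies on, while ridge only shrinks both coefficients.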

2020 ◽  
Author(s):  
Meghna Chakraborty ◽  
Shakir Mahmud ◽  
Timothy Gates ◽  
Subhrajit Sinha

Since the increasing spread of COVID-19 in the U.S., with currently the highest number of confirmed cases and deaths in the world, most states in the nation have enforced travel restrictions resulting in drastic reductions in mobility and travel. However, the overall impact and long-term implications of this crisis to mobility still remain uncertain. To this end, this study develops an analytical framework that determines the most significant factors impacting human mobility and travel in the U.S. during the pandemic. In particular, we use Least Absolute Shrinkage and Selection Operator (LASSO) to identify the significant variables influencing human mobility and utilize linear regularization algorithms, including Ridge, LASSO, and Elastic Net modeling techniques to model and predict human mobility and travel. State-level data were obtained from various open-access sources for the period from January 1, 2020 to June 13, 2020. The entire data set was divided into a training data-set and a test data-set and the variables selected by LASSO were used to train four different models by ordinary linear regression, Ridge regression, LASSO and Elastic Net regression algorithms, using the training data-set. Finally, the prediction accuracy of the developed models was examined on the test data. The results indicate that among all models, the Ridge regression provides the most superior performance with the least error, while both LASSO and Elastic Net performed better than the ordinary linear model.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Chao Fan ◽  
Ronald Lee ◽  
Yang Yang ◽  
Ali Mostafavi

Abstract Deriving effective mobility control measures is critical for the control of COVID-19 spreading. In response to the COVID-19 pandemic, many countries and regions implemented travel restrictions and quarantines to reduce human mobility and thus reduce virus transmission. But since human mobility decreased heterogeneously, we lack empirical evidence of the extent to which the reductions in mobility alter the way people from different regions of cities are connected, and what containment policies could complement mobility reductions to conquer the pandemic. Here, we examined individual movements in 21 of the most affected counties in the United States, showing that mobility reduction leads to a segregated place network and alters its relationship with pandemic spread. Our findings suggest localized area-specific policies, such as geo-fencing, as viable alternatives to city-wide lockdown for conquering the pandemic after mobility was reduced.


2021 ◽  
Author(s):  
Hye-Won Hwang ◽  
Jun-Ho Moon ◽  
Min-Gyu Kim ◽  
Richard E. Donatelli ◽  
Shin-Jae Lee

ABSTRACT Objectives To compare an automated cephalometric analysis based on the latest deep learning method of automatically identifying cephalometric landmarks (AI) with previously published AI according to the test style of the worldwide AI challenges at the International Symposium on Biomedical Imaging conferences held by the Institute of Electrical and Electronics Engineers (IEEE ISBI). Materials and Methods This latest AI was developed by using a total of 1983 cephalograms as training data. In the training procedures, a modification of a contemporary deep learning method, YOLO version 3 algorithm, was applied. Test data consisted of 200 cephalograms. To follow the same test style of the AI challenges at IEEE ISBI, a human examiner manually identified the IEEE ISBI-designated 19 cephalometric landmarks, both in training and test data sets, which were used as references for comparison. Then, the latest AI and another human examiner independently detected the same landmarks in the test data set. The test results were compared by the measures that appeared at IEEE ISBI: the success detection rate (SDR) and the success classification rates (SCR). Results SDR of the latest AI in the 2-mm range was 75.5% and SCR was 81.5%. These were greater than any other previous AIs. Compared to the human examiners, AI showed a superior success classification rate in some cephalometric analysis measures. Conclusions This latest AI seems to have superior performance compared to previous AI methods. It also seems to demonstrate cephalometric analysis comparable to human examiners.
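The success detection rate mentioned above can be sketched as the percentage of landmarks whose predicted position lies within a given radius of the reference position. This is a minimal sketch under stated assumptions: coordinates are already in millimetres, a landmark on the boundary counts as detected, and the function name and sample points are hypothetical rather than taken from the study.

```python
import math


def success_detection_rate(predicted, reference, radius_mm=2.0):
    """Percentage of landmarks detected within radius_mm of the reference.

    predicted and reference are equal-length lists of (x, y) points in mm.
    """
    hits = sum(
        1
        for (px, py), (rx, ry) in zip(predicted, reference)
        if math.hypot(px - rx, py - ry) <= radius_mm
    )
    return 100.0 * hits / len(reference)


# Hypothetical landmarks: radial errors are 1, 5, 0 and 3 mm.
predicted = [(0.0, 0.0), (3.0, 4.0), (1.0, 1.0), (10.0, 10.0)]
reference = [(0.0, 1.0), (0.0, 0.0), (1.0, 1.0), (10.0, 13.0)]
sdr_2mm = success_detection_rate(predicted, reference)
```

Two of the four hypothetical landmarks fall within 2 mm, so `sdr_2mm` is 50.0.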


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Richard Johnston ◽  
Xiaohan Yan ◽  
Tatiana M. Anderson ◽  
Edwin A. Mitchell

Abstract The effect of altitude on the risk of sudden infant death syndrome (SIDS) has been reported previously, but with conflicting findings. We aimed to examine whether the risk of sudden unexpected infant death (SUID) varies with altitude in the United States. Data from the Centers for Disease Control and Prevention (CDC)’s Cohort Linked Birth/Infant Death Data Set for births between 2005 and 2010 were examined. County of birth was used to estimate altitude. Logistic regression and Generalized Additive Model (GAM) were used, adjusting for year, mother’s race, Hispanic origin, marital status, age, education and smoking, father’s age and race, number of prenatal visits, plurality, live birth order, and infant’s sex, birthweight and gestation. There were 25,305,778 live births over the 6-year study period. The total number of deaths from SUID in this period was 23,673 (rate = 0.94/1000 live births). In the logistic regression model there was a small, but statistically significant, increased risk of SUID associated with birth at > 8000 feet compared with < 6000 feet (aOR = 1.93; 95% CI 1.00–3.71). The GAM showed a similar increased risk over 8000 feet, but this was not statistically significant. Only 9245 (0.037%) of mothers gave birth at > 8000 feet during the study period and 10 deaths (0.042%) were attributed to SUID. The number of SUID deaths at this altitude in the United States is very small (10 deaths in 6 years).


2014 ◽  
Vol 7 (5) ◽  
pp. 2477-2484 ◽  
Author(s):  
J. C. Kathilankal ◽  
T. L. O'Halloran ◽  
A. Schmidt ◽  
C. V. Hanson ◽  
B. E. Law

Abstract. A semi-parametric PAR diffuse radiation model was developed using commonly measured climatic variables from 108 site-years of data from 17 AmeriFlux sites. The model has a logistic form and improves upon previous efforts using a larger data set and physically viable climate variables as predictors, including relative humidity, clearness index, surface albedo and solar elevation angle. Model performance was evaluated by comparison with a simple cubic polynomial model developed for the PAR spectral range. The logistic model outperformed the polynomial model with an improved coefficient of determination and slope relative to measured data (logistic: R2 = 0.76; slope = 0.76; cubic: R2 = 0.73; slope = 0.72), making this the most robust PAR-partitioning model for the United States currently available.


2021 ◽  
pp. 215336872110389
Author(s):  
Andrew J. Baranauskas

In the effort to prevent school shootings in the United States, policies that aim to arm teachers with guns have received considerable attention. Recent research on public support for these policies finds that African Americans are substantially less likely to support them, indicating that support for arming teachers is a racial issue. Given the racialized nature of support for punitive crime policies in the United States, it is possible that racial sentiment shapes support for arming teachers as well. This study aims to determine the association between two types of racial sentiment—explicit negative feelings toward racial/ethnic minority groups and racial resentment—and support for arming teachers using a nationally representative data set. While explicit negative feelings toward African Americans and Hispanics are not associated with support for arming teachers, those with racial resentments are significantly more likely to support arming teachers. Racial resentment also weakens the effect of other variables found to be associated with support for arming teachers, including conservative ideology and economic pessimism. Implications for policy and research are discussed.


2021 ◽  
Author(s):  
Marni Mack ◽  
Argo Easston

In the United States, sepsis, the body's response to infection in a typically sterile circulation, is a leading cause of death (1). To assess the primary transcriptional alterations associated with each illness state, I utilized a microarray data set from a cohort of thirty-one individuals with septic shock or systemic inflammatory response syndrome (SIRS) (2). At the transcriptional level, I discovered that the granulocytes of patients with SIRS were similar to those of patients with septic shock. SIRS showed an “intermediate” gene expression state between that of control patients and that of septic shock patients for numerous genes expressed in the granulocyte. The discovery of the most differentially expressed genes in the granulocytic immune cells of patients with septic shock might aid the development of new therapies or diagnostics for an illness with a 14.7% to 29.9% in-hospital death rate despite decades of study (1).


Author(s):  
Yanxiang Yu ◽  
Chicheng Xu ◽  
Siddharth Misra ◽  
Weichang Li ◽  
...  

Compressional and shear sonic traveltime logs (DTC and DTS, respectively) are crucial for subsurface characterization and seismic-well tie. However, these two logs are often missing or incomplete in many oil and gas wells. Therefore, many petrophysical and geophysical workflows include sonic log synthetization or pseudo-log generation based on multivariate regression or rock physics relations. Started on March 1, 2020, and concluded on May 7, 2020, the SPWLA PDDA SIG hosted a contest aiming to predict the DTC and DTS logs from seven “easy-to-acquire” conventional logs using machine-learning methods (GitHub, 2020). In the contest, a total of 20,525 data points with half-foot resolution from three wells was collected to train regression models using machine-learning techniques. Each data point had seven features, consisting of the conventional “easy-to-acquire” logs (caliper, neutron porosity, gamma ray (GR), deep resistivity, medium resistivity, photoelectric factor, and bulk density), as well as two sonic logs (DTC and DTS) as the targets. A separate data set of 11,089 samples from a fourth well was then used as the blind test data set. The prediction performance of the model was evaluated using root mean square error (RMSE) as the metric, shown in the equation below: $\mathrm{RMSE} = \sqrt{\frac{1}{2}\cdot\frac{1}{m}\sum_{i=1}^{m}\left[\left(\mathrm{DTC}_{\mathrm{pred}}^{i}-\mathrm{DTC}_{\mathrm{true}}^{i}\right)^{2}+\left(\mathrm{DTS}_{\mathrm{pred}}^{i}-\mathrm{DTS}_{\mathrm{true}}^{i}\right)^{2}\right]}$ In the benchmark model (Yu et al., 2020), we used a Random Forest regressor and conducted minimal preprocessing of the training data set; an RMSE score of 17.93 was achieved on the test data set. The top five models from the contest, on average, beat the performance of our benchmark model by 27% in the RMSE score. In the paper, we will review these five solutions, including preprocessing techniques and different machine-learning models, including neural network, long short-term memory (LSTM), and ensemble trees. We found that data cleaning and clustering were critical for improving the performance in all models.
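The contest metric described in the abstract pools the squared errors of both sonic logs over all m test samples before taking the square root. A minimal sketch of that reading follows; the function name and the sample values are hypothetical, and it assumes the sum in the published formula covers both the DTC and DTS terms.

```python
import math


def contest_rmse(dtc_pred, dtc_true, dts_pred, dts_true):
    """Joint RMSE over DTC and DTS: sqrt of the summed squared errors / (2 * m)."""
    m = len(dtc_true)
    total = 0.0
    for i in range(m):
        total += (dtc_pred[i] - dtc_true[i]) ** 2
        total += (dts_pred[i] - dts_true[i]) ** 2
    return math.sqrt(total / (2 * m))


# Hypothetical two-sample example: squared errors are 1, 0 (DTC) and 0, 9 (DTS).
score = contest_rmse([1.0, 2.0], [0.0, 2.0], [3.0, 3.0], [3.0, 0.0])
```

Here the summed squared error is 10 over 2 × 2 values, so `score` equals the square root of 2.5.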


1976 ◽  
Vol 38 (3_suppl) ◽  
pp. 1023-1050 ◽  
Author(s):  
Wilma J. Knox

The literature published in 1971 and 1972 on alcoholics in the United States was reviewed for objective psychological test data or behavioral measurements. The review was organized to facilitate further research by assembling information according to problem area and by including tests employed, significant findings (p = .05), critical comments, and inferences for therapy. An appendix of references from 1968–1970 employing objective psychological measurements is included and cross-indexed.


2021 ◽  
Author(s):  
Louise Bloch ◽  
Christoph M. Friedrich

Abstract Background: The prediction of whether Mild Cognitive Impaired (MCI) subjects will prospectively develop Alzheimer's Disease (AD) is important for the recruitment and monitoring of subjects for therapy studies. Machine Learning (ML) is suitable to improve early AD prediction. The etiology of AD is heterogeneous, which leads to noisy data sets. Additional noise is introduced by multicentric study designs and varying acquisition protocols. This article examines whether an automatic and fair data valuation method based on Shapley values can identify subjects with noisy data. Methods: An ML workflow was developed and trained for a subset of the Alzheimer's Disease Neuroimaging Initiative (ADNI) cohort. The validation was executed for an independent ADNI test data set and for the Australian Imaging, Biomarker and Lifestyle Flagship Study of Ageing (AIBL) cohort. The workflow included volumetric Magnetic Resonance Imaging (MRI) feature extraction, subject sample selection using data Shapley, Random Forest (RF) and eXtreme Gradient Boosting (XGBoost) for model training and Kernel SHapley Additive exPlanations (SHAP) values for model interpretation. This model interpretation enables clinically relevant explanation of individual predictions. Results: The XGBoost models which excluded 116 of the 467 subjects from the training data set based on their Logistic Regression (LR) data Shapley values outperformed the models which were trained on the entire training data set and which reached a mean classification accuracy of 58.54 % by 14.13 % (8.27 percentage points) on the independent ADNI test data set. The XGBoost models, which were trained on the entire training data set reached a mean accuracy of 60.35 % for the AIBL data set. An improvement of 24.86 % (15.00 percentage points) could be reached for the XGBoost models if those 72 subjects with the smallest RF data Shapley values were excluded from the training data set. 
Conclusion: The data Shapley method was able to improve the classification accuracies for the test data sets. Noisy data was associated with the number of ApoEϵ4 alleles and volumetric MRI measurements. Kernel SHAP showed that the black-box models learned biologically plausible associations.
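The results above report each improvement twice, once as a relative gain and once in percentage points. The short sketch below shows how the two figures relate, using only the accuracies quoted in the abstract; the function name is hypothetical.

```python
def relative_gain_percent(baseline_accuracy, gain_in_points):
    """Relative improvement (%) implied by an absolute gain in percentage points."""
    return 100.0 * gain_in_points / baseline_accuracy


# ADNI test set: baseline 58.54 %, gain of 8.27 percentage points.
adni_gain = relative_gain_percent(58.54, 8.27)
# AIBL data set: baseline 60.35 %, gain of 15.00 percentage points.
aibl_gain = relative_gain_percent(60.35, 15.00)
```

Rounded to two decimals, these give the 14.13 % and 24.86 % relative improvements stated in the results, corresponding to mean accuracies of 66.81 % and 75.35 % after exclusion.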

