ESTIMATING CORN YIELD IN THE UNITED STATES WITH MODIS EVI AND MACHINE LEARNING METHODS

Author(s):  
K. Kuwata ◽  
R. Shibasaki

Satellite remote sensing is commonly used to monitor crop yield over wide areas. Because many parameters feed into crop yield estimation, modelling the relationships between those parameters and yield is generally complicated. Several machine learning methodologies have been proposed to address this, but county-level estimation accuracy still needs improvement, and county-level crop yield has not yet been estimated across an entire country. In this study, we applied a deep neural network (DNN) to estimate corn yield and evaluated its accuracy against models trained with other machine learning algorithms. We also prepared two time-series datasets of differing duration and used each as input to compare the models' feature extraction performance. As a result, the DNN estimated county-level corn yield across the entire United States with a coefficient of determination (R²) of 0.780 and a root mean square error (RMSE) of 18.2 bushels/acre. In addition, our results showed that estimation models trained as neural networks extracted features from the input data better than an existing machine learning algorithm.
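The abstract leaves the DNN's architecture and inputs unspecified, so the following is only a hedged sketch of the general idea: a tiny one-hidden-layer feed-forward network trained by plain gradient descent to regress a per-county yield value from a season-long series of EVI composites. All data, layer sizes, and hyperparameters below are invented for illustration.

```python
import numpy as np

# Hedged sketch on synthetic data: regress county yield from a season of
# EVI composites with a one-hidden-layer ReLU network and manual backprop.
# The paper's actual DNN and MODIS-derived inputs are not specified here.
rng = np.random.default_rng(0)
n_counties, n_steps = 200, 16                       # assumed 16 composites/season
X = rng.uniform(0.2, 0.8, (n_counties, n_steps))    # fake EVI time series
true_w = rng.normal(size=n_steps)
y = X @ true_w + 0.1 * rng.normal(size=n_counties)  # fake yield signal

W1 = rng.normal(scale=0.1, size=(n_steps, 32)); b1 = np.zeros(32)
W2 = rng.normal(scale=0.1, size=(32, 1));       b2 = np.zeros(1)

losses, lr = [], 0.05
for _ in range(500):
    h = np.maximum(X @ W1 + b1, 0.0)                # ReLU hidden layer
    pred = (h @ W2 + b2).ravel()
    err = pred - y
    losses.append(float(np.mean(err ** 2)))         # mean squared error
    g = 2.0 * err[:, None] / n_counties             # dLoss/dPred
    gh = (g @ W2.T) * (h > 0.0)                     # backprop through ReLU
    W2 -= lr * (h.T @ g); b2 -= lr * g.sum(0)
    W1 -= lr * (X.T @ gh); b1 -= lr * gh.sum(0)

print(f"train MSE: {losses[0]:.3f} -> {losses[-1]:.3f}")
```

With a fixed seed the training loss falls steadily, which is all the sketch is meant to show; a real yield model would need held-out counties and tuned architecture.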


10.2196/18401 ◽  
2020 ◽  
Vol 22 (8) ◽  
pp. e18401
Author(s):  
Jane M Zhu ◽  
Abeed Sarker ◽  
Sarah Gollust ◽  
Raina Merchant ◽  
David Grande

Background Twitter is a potentially valuable tool for public health officials and state Medicaid programs in the United States, which provide public health insurance to 72 million Americans. Objective We aim to characterize how Medicaid agencies and managed care organization (MCO) health plans are using Twitter to communicate with the public. Methods Using Twitter’s public application programming interface, we collected 158,714 public posts (“tweets”) from active Twitter profiles of state Medicaid agencies and MCOs, spanning March 2014 through June 2019. Manual content analyses identified 5 broad categories of content, and these coded tweets were used to train supervised machine learning algorithms to classify all collected posts. Results We identified 15 state Medicaid agencies and 81 Medicaid MCOs on Twitter. The mean number of followers was 1784, the mean number of those followed was 542, and the mean number of posts was 2476. Approximately 39% of tweets came from just 10 accounts. Of all posts, 39.8% (63,168/158,714) were classified as general public health education and outreach; 23.5% (n=37,298) were about specific Medicaid policies, programs, services, or events; 18.4% (n=29,203) were organizational promotion of staff and activities; and 11.6% (n=18,411) contained general news and news links. Only 4.5% (n=7142) of posts were responses to specific questions, concerns, or complaints from the public. Conclusions Twitter has the potential to enhance community building, beneficiary engagement, and public health outreach, but appears to be underutilized by the Medicaid program.
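As a rough illustration of the supervised-classification step described above, the sketch below trains a multinomial naive Bayes classifier on a handful of invented tweets. The study's actual algorithms, coding scheme, and category labels are not specified in this abstract; the categories and examples here are stand-ins.

```python
from collections import Counter, defaultdict
import math

# Hedged sketch: manually coded tweets train a classifier over content
# categories. The tweets and labels below are invented examples.
train = [
    ("get your flu shot today at a clinic near you", "health_education"),
    ("wash your hands to stop the spread of germs", "health_education"),
    ("medicaid open enrollment starts monday", "medicaid_program"),
    ("new dental benefits added for medicaid members", "medicaid_program"),
    ("congrats to our staff on the community award", "org_promotion"),
]

word_counts = defaultdict(Counter)   # label -> word frequencies
label_counts = Counter()
for text, label in train:
    label_counts[label] += 1
    word_counts[label].update(text.split())

vocab = {w for c in word_counts.values() for w in c}

def predict(text):
    # argmax over labels of log P(label) + sum log P(word | label),
    # with Laplace (add-one) smoothing for unseen words
    def score(label):
        total = sum(word_counts[label].values())
        s = math.log(label_counts[label] / len(train))
        for w in text.split():
            s += math.log((word_counts[label][w] + 1) / (total + len(vocab)))
        return s
    return max(label_counts, key=score)

print(predict("enrollment for medicaid benefits opens soon"))
```

On this toy data the classifier routes Medicaid-specific wording to the program category and hygiene wording to the education category, mirroring the kind of split the coded categories describe.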


2021 ◽  
Author(s):  
Kunal Menda ◽  
Lucas Laird ◽  
Mykel J. Kochenderfer ◽  
Rajmonda S. Caceres

Abstract COVID-19 epidemics have varied dramatically in nature across the United States: some counties have clear peaks in infections, while others have had a multitude of unpredictable and non-distinct peaks. In this work, we seek to explain this diversity in epidemic progressions by considering an extension to the compartmental SEIRD model. The model we propose uses a neural network to predict the infection rate as a function of time and of the prevalence of the disease. We provide a methodology for fitting this model to available county-level data describing aggregate cases and deaths. Our method uses Expectation-Maximization to overcome the challenge of partial observability: the system's state is only partially reflected in available data. We fit a single model to data from multiple counties in the United States exhibiting different behavior. By simulating the model, we show that it is capable of exhibiting both single-peak and multi-peak behavior, reproducing behavior observed in counties both in and out of the training set. We also numerically compare the error of simulations from our model with that of a standard SEIRD model, showing that the proposed extensions are necessary to explain the spread of COVID-19.


Author(s):  
Abolfazl Mollalo ◽  
Kiara M. Rivera ◽  
Behzad Vahedi

Prediction of the COVID-19 incidence rate is a matter of global importance, particularly in the United States. As of 4 June 2020, more than 1.8 million confirmed cases and over 108 thousand deaths had been reported in this country. Few studies have examined nationwide modeling of COVID-19 incidence in the United States, particularly with machine learning algorithms. We therefore collected and prepared a database of 57 candidate explanatory variables to examine the performance of a multilayer perceptron (MLP) neural network in predicting cumulative COVID-19 incidence rates across the continental United States. Our results indicated that a single-hidden-layer MLP could explain almost 65% of the correlation with ground truth for the holdout samples. Sensitivity analysis on this model showed that the age-adjusted mortality rates of ischemic heart disease, pancreatic cancer, and leukemia, together with two socioeconomic and environmental factors (median household income and total precipitation), are among the most substantial predictors of COVID-19 incidence rates. Moreover, results of a logistic regression model indicated that these variables could explain the presence or absence of the disease-incidence hotspots identified by Getis-Ord Gi* (p < 0.05) in a geographic information system environment. The findings may provide useful insights for public health decision makers regarding the influence of potential risk factors associated with county-level COVID-19 incidence.
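The hotspot step above relies on the Getis-Ord Gi* statistic. As a rough, self-contained illustration (the study computed it over county polygons in a GIS environment; here the "counties" are cells on a 1-D line with invented incidence values), a minimal implementation of the standard Gi* z-score is:

```python
import math

# Illustrative sketch of the Getis-Ord Gi* hotspot statistic. Weights are
# binary adjacency including the focal cell itself (the "star" variant);
# real analyses use county polygons and richer spatial weight matrices.
def gi_star(x, i, neighbors):
    n = len(x)
    xbar = sum(x) / n
    s = math.sqrt(sum(v * v for v in x) / n - xbar * xbar)
    w = [1.0 if (j in neighbors or j == i) else 0.0 for j in range(n)]
    wsum = sum(w)
    lag = sum(wj * xj for wj, xj in zip(w, x))        # weighted local sum
    denom = s * math.sqrt((n * sum(wj * wj for wj in w) - wsum ** 2) / (n - 1))
    return (lag - xbar * wsum) / denom                # z-score

rates = [1.0] * 20
rates[9] = rates[10] = rates[11] = 10.0               # invented incidence hotspot

z_hot = gi_star(rates, 10, {9, 11})                   # center of the cluster
z_cold = gi_star(rates, 0, {1})                       # far from the cluster
print(f"Gi* hot={z_hot:.2f}, cold={z_cold:.2f}")
```

Cells inside the invented cluster score well above the 1.96 threshold the p < 0.05 cutoff implies, while distant cells do not, which is the binary hotspot/non-hotspot outcome the logistic regression in the study then tries to explain.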


2021 ◽  
Vol 8 (Supplement_1) ◽  
pp. S759-S759
Author(s):  
Stephanie Kujawski ◽  
Boshu Ru ◽  
Amar K Das ◽  
Nelson L Afanador ◽  
Richard Baumgartner ◽  
...  

Abstract Background Although measles is still rare in the United States (U.S.), there have been recent resurgent outbreaks. To improve prediction accuracy given the rarity of measles events, we used machine learning (ML) algorithms to model measles case predictions at the U.S. county level. Methods The main outcome was occurrence of ≥1 measles case at the U.S. county level. Two ML prediction models were developed (HDBSCAN, a clustering algorithm, and XGBoost, a gradient boosting algorithm) and compared with traditional logistic regression. We included 28 predictors in the following categories: sociodemographics, population statistics, measles vaccination coverage, healthcare access, and exposure to measles via international air travel. The models were trained on 2014 case data and validated on 2018 case data. Models were compared using area under the receiver operating characteristic curve (AUC), sensitivity, specificity, positive predictive value (PPV), and F2 score (a combined measure of sensitivity and PPV). Results There were 667 measles cases in 2014 and 375 in 2018 in the U.S. We identified U.S. counties for 635 (95.2%) cases in 2014 and 366 (97.6%) cases in 2018 through published sources, corresponding to 81/3143 (2.6%) counties in 2014 and 64/3143 (2.0%) counties in 2018 with ≥1 measles case. HDBSCAN had the highest sensitivity (0.92) but the lowest AUC (0.68) and PPV (0.04) (Table). XGBoost had the highest F2 score (0.49), the best balance of sensitivity (0.72) and specificity (0.94), and an AUC of 0.92. Logistic regression had a high AUC (0.91) and specificity (1.00) but the lowest sensitivity (0.16). Conclusion Machine learning approaches outperformed logistic regression by maximizing sensitivity to predict counties with measles cases, an important criterion for preventing or preparing for future outbreaks. XGBoost or logistic regression could be considered to maximize specificity.
Prioritizing sensitivity versus specificity may depend on county resources, priorities, and measles risk. Different modeling approaches could be considered to optimize surveillance efforts and develop effective interventions for timely response. Disclosures Stephanie Kujawski, PhD MPH, Merck & Co., Inc. (Employee, Shareholder) Boshu Ru, Ph.D., Merck & Co., Kenilworth, NJ (NYSE: MRK) (Employee, Shareholder) Amar K. Das, MD, PhD, Merck (Employee) Richard Baumgartner, PhD, Merck (Employee) Shuang Lu, MBA, MS, Merck (Employee) Matthew Pillsbury, PhD, Merck & Co. (Employee, Shareholder) Joseph Lewnard, PhD, Merck (Consultant, Grant/Research Support) James H. Conway, MD, FAAP, GSK (Advisor or Review Panel member) Merck (Advisor or Review Panel member) Moderna (Advisor or Review Panel member) Pfizer (Advisor or Review Panel member) Sanofi Pasteur (Research Grant or Support) Manjiri D. Pawaskar, PhD, Merck & Co., Inc. (Employee, Shareholder)
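The evaluation metrics named in this abstract (sensitivity, specificity, PPV, and the F2 score, which weights sensitivity more heavily than PPV) all follow from a confusion matrix. A minimal sketch, using invented counts rather than the study's results:

```python
# Hedged sketch of the comparison metrics from the abstract. The
# confusion-matrix counts below are invented, not the study's numbers.
def classification_metrics(tp, fp, fn, tn):
    sensitivity = tp / (tp + fn)                 # recall
    specificity = tn / (tn + fp)
    ppv = tp / (tp + fp)                         # precision
    # F-beta with beta=2: sensitivity counts four times as much as PPV
    f2 = 5 * ppv * sensitivity / (4 * ppv + sensitivity)
    return sensitivity, specificity, ppv, f2

sens, spec, ppv, f2 = classification_metrics(tp=8, fp=10, fn=2, tn=80)
print(f"sens={sens:.2f} spec={spec:.2f} ppv={ppv:.2f} F2={f2:.2f}")
```

Because measles counties are so rare (about 2% of counties), a model can post a high AUC and specificity while missing most true cases, which is why the authors lean on sensitivity and F2 rather than accuracy.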


Author(s):  
Guan Zheng ◽  
Hong Wu

Abstract The widespread use of algorithmic technologies makes rules on tacit collusion, which are already controversial in antitrust law, more complicated. These rules have obvious limitations in effectively regulating algorithmic collusion. Although some scholars and practitioners within antitrust circles in the United States, Europe and beyond have taken notice of this problem, they have failed to a large extent to make clear its specific manifestations, root causes, and effective legal solutions. In this article, the authors make a strong argument that it is no longer appropriate to regard algorithms as mere tools of firms, and that the distinct features of machine learning algorithms as super-tools and as legal persons may inevitably bring about two new cracks in antitrust law. This article clarifies the root causes why these rules are inapplicable to a large extent to algorithmic collusion particularly in the case of machine learning algorithms, classifies the new legal cracks, and provides sound legal criteria for the courts and competition authorities to assess the legality of algorithmic collusion much more accurately. More importantly, this article proposes an efficacious solution to revive the market pricing mechanism for the purposes of resolving the two new cracks identified in antitrust law.


2019 ◽  
Author(s):  
Sing-Chun Wang ◽  
Yuxuan Wang

Abstract. Occurrences of devastating wildfires have been on the rise in the United States for the past decades. While the environmental controls, including weather, climate, and fuels, are known to play important roles in controlling wildfires, the interrelationships between fires and the environmental controls are highly complex and may not be well represented by traditional parametric regressions. Here we develop a model integrating multiple machine learning algorithms to predict gridded monthly wildfire burned area during 2002–2015 over the South Central United States, and we identify the relative importance of the environmental drivers of burned area for both the winter-spring and summer fire seasons of that region. The developed model is able to alleviate the issue of unevenly distributed burned-area data and achieves cross-validation (CV) R² values of 0.42 and 0.40 for the two fire seasons. For the total burned area over the study domain, the model can explain 50% and 79% of interannual total burned area for the winter-spring and summer fire seasons, respectively. The prediction model ranks relative humidity (RH) anomalies and the preceding months' drought severity as the top two most important predictors of gridded burned area for both fire seasons. Sensitivity experiments with the model show that the effect of climate change, represented by a group of climate-anomaly variables, contributes the most to the burned area for both fire seasons. Antecedent fuel amount and conditions are found to outweigh weather effects on burned area in the winter-spring fire season, while current-month fire weather is more important in the summer fire season, likely due to the controlling effect of weather on fuel moisture in that season. The developed model allows us to predict gridded burned area and to assess specific fire management strategies for the different fire mechanisms of the two seasons.
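The cross-validation R² reported above is a generic skill score, independent of the authors' particular algorithms. A minimal sketch of how it is computed, using an invented one-predictor least-squares model as a stand-in for the multi-algorithm burned-area model:

```python
# Hedged sketch of cross-validated R². The model and data are invented
# stand-ins, not the paper's burned-area model or inputs.
def r2(y_true, y_pred):
    ybar = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - ybar) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

def fit_ols(xs, ys):
    # slope and intercept by ordinary least squares
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    slope = (sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
             / sum((x - xbar) ** 2 for x in xs))
    return slope, ybar - slope * xbar

# toy data: burned area roughly doubling with an invented dryness index
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.1, 3.9, 6.2, 8.0, 9.8, 12.1, 14.2, 15.9]

# 4-fold CV: hold out two points at a time, predict them from the rest
preds = [None] * len(x)
for fold in range(4):
    test_idx = {2 * fold, 2 * fold + 1}
    train = [(xi, yi) for k, (xi, yi) in enumerate(zip(x, y))
             if k not in test_idx]
    m, b = fit_ols([t[0] for t in train], [t[1] for t in train])
    for k in test_idx:
        preds[k] = m * x[k] + b

print(f"CV R2 = {r2(y, preds):.3f}")
```

Each point is scored only by a model that never saw it, which is what makes the 0.42 and 0.40 values above out-of-sample skill rather than in-sample fit.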


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Kunal Menda ◽  
Lucas Laird ◽  
Mykel J. Kochenderfer ◽  
Rajmonda S. Caceres

Abstract COVID-19 epidemics have varied dramatically in nature across the United States: some counties have clear peaks in infections, while others have had a multitude of unpredictable and non-distinct peaks. Our lack of understanding of how the pandemic has evolved leads to increasing errors in our ability to predict the spread of the disease. This work seeks to explain this diversity in epidemic progressions by considering an extension to the compartmental SEIRD model. The model we propose uses a neural network to predict the infection rate as a function of both time and the disease's prevalence. We provide a methodology for fitting this model to available county-level data describing aggregate cases and deaths. Our method uses Expectation-Maximization to overcome the challenge of partial observability: the system's state is only partially reflected in available data. We fit a single model to data from multiple counties in the United States exhibiting different behavior. By simulating the model, we show that it can exhibit both single-peak and multi-peak behavior, reproducing behavior observed in counties both in and out of the training set. We then compare the error of simulations from our model with that of a standard SEIRD model and show that ours substantially reduces errors. We also use simulated data to compare our methodology for handling partial observability with a standard approach, showing that ours estimates the values of unobserved quantities significantly better.
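The paper's key move is replacing the SEIRD model's constant infection rate with a neural network over time and prevalence. The sketch below shows only the SEIRD backbone that extension plugs into, with a hand-picked piecewise infection rate as a crude stand-in for the learned one; all rates and population numbers are invented.

```python
# Hedged sketch of a discrete-time (Euler) SEIRD model. beta here is a
# hand-picked function of time, standing in for the paper's fitted neural
# network beta(t, prevalence); sigma/gamma/mu values are invented.
def seird_step(s, e, i, r, d, beta, sigma=1/5.2, gamma=1/10, mu=0.01, dt=1.0):
    n = s + e + i + r + d
    new_exposed = beta * s * i / n      # S -> E
    new_infectious = sigma * e          # E -> I
    new_removed = gamma * i             # I -> R or D
    ds = -new_exposed
    de = new_exposed - new_infectious
    di = new_infectious - new_removed
    dr = (1 - mu) * new_removed         # recoveries
    dd = mu * new_removed               # deaths
    return (s + dt*ds, e + dt*de, i + dt*di, r + dt*dr, d + dt*dd)

state = (99_990.0, 0.0, 10.0, 0.0, 0.0)   # S, E, I, R, D
history = [state]
for t in range(200):
    beta = 0.4 if t < 60 else 0.15        # e.g. a distancing policy kicks in
    state = seird_step(*state, beta=beta)
    history.append(state)

print(f"population: {sum(history[-1]):.1f}, deaths: {history[-1][4]:.0f}")
```

Even this crude time-varying beta changes the epidemic's shape, which hints at why letting beta depend flexibly on time and prevalence can reproduce the multi-peak county trajectories a constant-beta SEIRD cannot.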

