A geospatial mapping pipeline for ecologists

2021 ◽  
Author(s):  
Johan van den Hoogen ◽  
Niamh Robmann ◽  
Devin Routh ◽  
Thomas Lauber ◽  
Nina van Tiel ◽  
...  

Geospatial modelling can give fundamental insights into the biogeography of life, providing key information about the living world in current and future climate scenarios. Emerging statistical and machine learning approaches can help us to generate new levels of predictive accuracy in exploring the spatial patterns in ecological and biophysical processes. Although these statistical models cannot necessarily represent the essential mechanistic insights that are needed to understand global biogeochemical processes under ever-changing environmental conditions, they can provide unparalleled predictive insights that can be useful for exploring the variation in biophysical processes across space. As such, these emerging tools can be a valuable approach to complement existing mechanistic approaches as we aim to understand the biogeography of Earth's ecosystems. Here, we present a comprehensive methodology that efficiently handles large datasets to produce global predictions. This mapping pipeline can be used to generate quantitative, spatially explicit predictions, with a particular emphasis on spatially explicit insights into the evaluation of model uncertainties and inaccuracies.

Eye ◽  
2021 ◽  
Author(s):  
Lutfiah Al-Turk ◽  
James Wawrzynski ◽  
Su Wang ◽  
Paul Krause ◽  
George M. Saleh ◽  
...  

Background: In diabetic retinopathy (DR) screening programmes, feature-based grading guidelines are used by human graders. However, recent deep learning approaches have focused on end-to-end learning based on data labelled at the whole-image level. Most predictions from such software offer a direct grading output without information about the retinal features responsible for the grade. In this work, we demonstrate a feature-based retinal image analysis system, which aims to support flexible grading and monitor progression. Methods: The system was evaluated against images that had been graded according to two different grading systems: the International Clinical Diabetic Retinopathy and Diabetic Macular Oedema Severity Scale and the UK’s National Screening Committee guidelines. Results: External evaluation on large datasets collected from three nations (Kenya, Saudi Arabia and China) was carried out. At the referable-DR level, sensitivity did not vary significantly between different DR grading schemes (91.2–94.2%), and specificity was above 93% in all image sets. More importantly, no cases of severe non-proliferative DR, proliferative DR or diabetic macular oedema (DMO) were missed. Conclusions: We demonstrate the potential of an AI feature-based DR grading system that is not constrained to any specific grading scheme.
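
For readers less familiar with the two metrics reported above, the following minimal sketch shows how sensitivity and specificity are computed from a confusion table. The function name and all counts are hypothetical and chosen only for illustration; they are not data from the study.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-table counts."""
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    sensitivity = tp / (tp + fn)  # share of truly referable cases that were flagged
    specificity = tn / (tn + fp)  # share of non-referable cases that were cleared
    return sensitivity, specificity


# Hypothetical counts, not taken from the paper.
sens, spec = sensitivity_specificity(tp=92, fn=8, tn=188, fp=12)
print(f"sensitivity={sens:.3f}, specificity={spec:.3f}")
```

A screening tool is usually tuned for high sensitivity first, since a missed referable case is costlier than an unnecessary referral.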


2020 ◽  
Author(s):  
Pedro Ballester

Interest in docking technologies has grown in parallel with the ever-increasing number and diversity of 3D models for macromolecular therapeutic targets. Structure-Based Virtual Screening (SBVS) aims at leveraging these experimental structures to discover the necessary starting points for the drug discovery process. It is now established that Machine Learning (ML) can strongly enhance the predictive accuracy of scoring functions for SBVS by exploiting large datasets from targets, molecules and their associations. However, with greater choice, the question of which ML-based scoring function is the most suitable for prospective use on a given target has gained importance. Here we analyse two approaches to selecting an existing scoring function for the target, along with a third approach that consists of generating a scoring function tailored to the target. These analyses required discussing the limitations of popular SBVS benchmarks, the alternatives for benchmarking scoring functions for SBVS, and how to generate or use them with freely available software.


2015 ◽  
Vol 12 (10) ◽  
pp. 11083-11127 ◽  
Author(s):  
J. E. Shortridge ◽  
S. D. Guikema ◽  
B. F. Zaitchik

In the past decade, certain methods for empirical rainfall–runoff modeling have seen extensive development and been proposed as a useful complement to physical hydrologic models, particularly in basins where data to support process-based models are limited. However, the majority of research has focused on a small number of methods, such as artificial neural networks, despite the development of multiple other approaches for non-parametric regression in recent years. Furthermore, this work has generally evaluated model performance based on predictive accuracy alone, without considering broader objectives such as model interpretability and uncertainty that are important if such methods are to be used for planning and management decisions. In this paper, we use multiple regression and machine-learning approaches to simulate monthly streamflow in five highly seasonal rivers in the highlands of Ethiopia and compare their performance in terms of predictive accuracy, error structure and bias, model interpretability, and uncertainty when faced with extreme climate conditions. While the relative predictive performance of models differed across basins, data-driven approaches were able to achieve reduced errors when compared to physical models developed for the region. Methods such as random forests and generalized additive models may have advantages in terms of visualization and interpretation of model structure, which can be useful in providing insights into physical watershed function. However, the uncertainty associated with model predictions under climate change should be carefully evaluated, since certain models (especially generalized additive models and multivariate adaptive regression splines) became highly variable when faced with high temperatures.


2016 ◽  
Author(s):  
Theo Knijnenburg ◽  
Gunnar Klau ◽  
Francesco Iorio ◽  
Mathew Garnett ◽  
Ultan McDermott ◽  
...  

Mining large datasets using machine learning approaches often leads to models that are hard to interpret and not amenable to the generation of hypotheses that can be experimentally tested. Finding 'actionable knowledge' is becoming more important, but also more challenging as datasets grow in size and complexity. We present 'Logic Optimization for Binary Input to Continuous Output' (LOBICO), a computational approach that infers small and easily interpretable logic models of binary input features that explain a binarized continuous output variable. Although the continuous output variable is binarized prior to optimization, the continuous information is retained to find the optimal logic model. Applying LOBICO to a large cancer cell line panel, we find that logic combinations of multiple mutations are more predictive of drug response than single gene predictors. Importantly, we show that the use of the continuous information leads to robust and more accurate logic models. LOBICO is formulated as an integer programming problem, which enables rapid computation on large datasets. Moreover, LOBICO implements the ability to uncover logic models around predefined operating points in terms of sensitivity and specificity. As such, it represents an important step towards practical application of interpretable logic models.


10.2196/24246 ◽  
2021 ◽  
Vol 23 (2) ◽  
pp. e24246 ◽  
Author(s):  
Siavash Bolourani ◽  
Max Brenner ◽  
Ping Wang ◽  
Thomas McGinn ◽  
Jamie S Hirsch ◽  
...  

Background: Predicting early respiratory failure due to COVID-19 can help triage patients to higher levels of care, allocate scarce resources, and reduce morbidity and mortality by appropriately monitoring and treating the patients at greatest risk for deterioration. Given the complexity of COVID-19, machine learning approaches may support clinical decision making for patients with this disease. Objective: Our objective is to derive a machine learning model that predicts respiratory failure within 48 hours of admission based on data from the emergency department. Methods: Data were collected from patients with COVID-19 who were admitted to Northwell Health acute care hospitals and were discharged, died, or spent a minimum of 48 hours in the hospital between March 1 and May 11, 2020. Of 11,525 patients, 933 (8.1%) were placed on invasive mechanical ventilation within 48 hours of admission. Variables used by the models included clinical and laboratory data commonly collected in the emergency department. We trained and validated three predictive models (two based on XGBoost and one that used logistic regression) using cross-hospital validation. We compared model performance among all three models as well as an established early warning score (Modified Early Warning Score) using receiver operating characteristic curves, precision-recall curves, and other metrics. Results: The XGBoost model had the highest mean accuracy (0.919; area under the curve=0.77), outperforming the other two models as well as the Modified Early Warning Score. Important predictor variables included the type of oxygen delivery used in the emergency department, patient age, Emergency Severity Index level, respiratory rate, serum lactate, and demographic characteristics. Conclusions: The XGBoost model had high predictive accuracy, outperforming other early warning scores. The clinical plausibility and predictive ability of XGBoost suggest that the model could be used to predict 48-hour respiratory failure in admitted patients with COVID-19.
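
The abstract does not spell out how its cross-hospital validation was set up. One common reading is a leave-one-hospital-out split, in which each hospital in turn serves as the test set while the model is trained on all the others. The sketch below illustrates that scheme under this assumption; the function name and the hospital labels are hypothetical and are not taken from the study.

```python
from collections import defaultdict


def leave_one_hospital_out(hospital_ids):
    """Yield (held_out, train_idx, test_idx), holding out each hospital once."""
    by_hospital = defaultdict(list)
    for idx, hospital in enumerate(hospital_ids):
        by_hospital[hospital].append(idx)
    for held_out in sorted(by_hospital):
        test_idx = by_hospital[held_out]
        # Train on every record that does not belong to the held-out hospital.
        train_idx = sorted(
            i for h, rows in by_hospital.items() if h != held_out for i in rows
        )
        yield held_out, train_idx, test_idx


# Hypothetical hospital labels, one per patient record.
hospitals = ["A", "A", "B", "C", "B", "C", "A"]
splits = list(leave_one_hospital_out(hospitals))
for held_out, train_idx, test_idx in splits:
    print(held_out, train_idx, test_idx)
```

Splitting by hospital rather than by patient keeps site-specific patterns out of the training data, so the score is a fairer estimate of performance at an unseen site.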


2020 ◽  
Author(s):  
Kenny F Chou ◽  
Virginia Best ◽  
H Steven Colburn ◽  
Kamal Sen

Listening in an acoustically cluttered scene remains a difficult task for both machines and hearing-impaired listeners. Normal-hearing listeners accomplish this task with relative ease by segregating the scene into its constituent sound sources, then selecting and attending to a target source. An assistive listening device that mimics the biological mechanisms underlying this behavior may provide an effective solution for those with difficulty listening in acoustically cluttered environments (e.g., a cocktail party). Here, we present a binaural sound segregation algorithm based on a hierarchical network model of the auditory system. In the algorithm, binaural sound inputs first drive populations of neurons tuned to specific spatial locations and frequencies. Lateral inhibition then sharpens the spatial response of the neurons. Finally, the spiking responses of neurons in the output layer are reconstructed into audible waveforms via a novel reconstruction method. We evaluate the performance of the algorithm with psychoacoustic measures of normal-hearing listeners. This two-microphone algorithm is shown to provide listeners with perceptual benefit similar to that of a 16-microphone acoustic beamformer in a difficult listening task. Unlike deep-learning approaches, the proposed algorithm is biologically interpretable and does not need to be trained on large datasets. This study presents a biologically based algorithm for sound source segregation as well as a method to reconstruct highly intelligible audio signals from spiking models.

Author Summary: Animals and humans can navigate complex auditory environments with relative ease, attending to certain sounds while suppressing others. Normally, various sounds originate from various spatial locations. This paper presents an algorithmic model to perform sound segregation based on how animals make use of this spatial information at various stages of the auditory pathway. We showed that this two-microphone algorithm provides as much benefit to normal-hearing listeners as a multi-microphone algorithm. Unlike mathematical and machine-learning approaches, our model is fully interpretable and does not require training with large datasets. Such an approach may benefit the design of machine hearing algorithms. To interpret the spike trains generated in the model, we designed a method to recover sounds from model spikes with high intelligibility. This method can be applied to spiking neural networks for audio-related applications, or to interpret each node within a spiking model of the auditory cortex.


Sensors ◽  
2019 ◽  
Vol 19 (9) ◽  
pp. 2063 ◽  
Author(s):  
Xiangdong Ran ◽  
Zhiguang Shan ◽  
Yufei Fang ◽  
Chuang Lin

Deep learning approaches have recently been applied to traffic prediction because of their ability to extract features from traffic data. While convolutional neural networks may improve predictive accuracy by converting traffic data into images and extracting features from those images, the convolutional results can be improved by using a global-level representation, which is a direct way to extract features. Time intervals have not been considered as aspects in convolutional neural networks for traffic prediction. When aspects are considered, an attention mechanism can adaptively select a sequence of regions and process only the selected regions to better extract features. In this paper, we propose applying an attention mechanism over the convolutional result for traffic prediction. The proposed method is based on multiple links, and the time interval is treated as the aspect of the attention mechanism. On the dataset provided by Highways England, experimental results show that the proposed method achieves better accuracy than the baseline methods.


2020 ◽  
Vol 12 (4) ◽  
pp. 1481 ◽  
Author(s):  
Xiaobo Xue Romeiko ◽  
Zhijian Guo ◽  
Yulei Pang ◽  
Eun Kyung Lee ◽  
Xuesong Zhang

Agriculture ranks as one of the top contributors to global warming and nutrient pollution. Quantifying life cycle environmental impacts from agricultural production serves as a scientific foundation for forming effective remediation strategies. However, methods capable of accurately and efficiently calculating spatially explicit life cycle global warming (GW) and eutrophication (EU) impacts at the county scale over a geographic region are lacking. The objective of this study was to determine the most efficient and accurate model for estimating spatially explicit life cycle GW and EU impacts at the county scale, with corn production in the U.S. Midwest region as a case study. This study compared the predictive accuracies and efficiencies of five distinct supervised machine learning (ML) algorithms, testing various sample sizes and feature selections. The results indicated that the gradient boosting regression tree model built with approximately 4000 records of monthly weather features yielded the highest predictive accuracy, with cross-validation (CV) values of 0.8 for the life cycle GW impacts. The gradient boosting regression tree model built with nearly 6000 records of monthly weather features showed the highest predictive accuracy, with CV values of 0.87 for the life cycle EU impacts, across all modeling scenarios. Moreover, predictive accuracy was improved at the cost of simulation time: the gradient boosting regression tree model required the longest training time. The ML algorithms proved to be one million times faster than the traditional process-based model while retaining high predictive accuracy. This indicates that ML can serve as a surrogate for process-based models to estimate life cycle environmental impacts across large geographic areas and timeframes.


Author(s):  
Brian Carnahan ◽  
Gérard Meyer ◽  
Lois-Ann Kuntz

Multivariate classification models play an increasingly important role in human factors research. In the past, these models have been based primarily on discriminant analysis and logistic regression. Models developed from machine learning research offer the human factors professional a viable alternative to these traditional statistical classification methods. To illustrate this point, two machine learning approaches, genetic programming and decision tree induction, were used to construct classification models designed to predict whether or not a student truck driver would pass his or her commercial driver license (CDL) examination. The models were developed and validated using the curriculum scores and CDL exam performances of 37 student truck drivers who had completed a 320-hr driver training course. Results indicated that the machine learning classification models were superior to discriminant analysis and logistic regression in terms of predictive accuracy. Actual or potential applications of this research include the creation of models that more accurately predict human performance outcomes.

