Use of Correlated Data for Nonparametric Prediction of a Spatial Target Variable

Mathematics ◽  
2020 ◽  
Vol 8 (11) ◽  
pp. 2077
Author(s):  
Pilar García-Soidán ◽  
Tomás R. Cotos-Yáñez

The kriging methodology can be applied to predict the value of a spatial variable at an unsampled location from the available spatial data. Furthermore, additional information from secondary variables, correlated with the target one, can be included in the resulting predictor by using cokriging techniques. The latter procedures require prior specification of the multivariate dependence structure, which is difficult to characterize appropriately in practice. To simplify this task, the current work introduces a nonparametric kernel approach for prediction, which has desirable properties such as asymptotic unbiasedness and convergence to zero of the mean squared prediction error. The selection of the bandwidth parameters involved is also addressed, as well as the estimation of the remaining unknown terms in the kernel predictor. The performance of the new methodology is illustrated through numerical studies with simulated data, carried out in different scenarios. In addition, the proposed nonparametric approach is applied to predict the concentrations of cadmium, a pollutant that represents a risk to human health, in the floodplain of the Meuse river (Netherlands), by incorporating the lead level as an auxiliary variable.
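
As a rough illustration of what a kernel-weighted spatial prediction looks like, the sketch below averages observed values with Gaussian weights that decay with distance from the target location. It is a generic Nadaraya-Watson-style smoother, not the predictor proposed in the paper: it ignores the secondary variable, and the function name `kernel_predict`, the Gaussian kernel and the single `bandwidth` parameter are assumptions made for this example.

```python
import math

def kernel_predict(coords, values, target, bandwidth):
    """Kernel-weighted average of observed values around `target` (2D).

    Hypothetical helper for illustration only; a generic smoother,
    not the authors' estimator.
    """
    weights = []
    for (x, y) in coords:
        # Euclidean distance from the sampled location to the target
        d = math.hypot(x - target[0], y - target[1])
        # Gaussian kernel: influence decays with distance / bandwidth
        weights.append(math.exp(-0.5 * (d / bandwidth) ** 2))
    total = sum(weights)
    if total == 0.0:
        raise ValueError("all weights underflowed; increase the bandwidth")
    return sum(w * v for w, v in zip(weights, values)) / total

# Two samples equidistant from the target contribute equally.
print(kernel_predict([(0.0, 0.0), (2.0, 0.0)], [1.0, 3.0], (1.0, 0.0), 1.0))
```

A smaller bandwidth makes the prediction track the nearest samples more closely, while a larger one smooths over a wider neighbourhood, which is why bandwidth selection matters for this family of methods.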

2021 ◽  
Author(s):  
Hlib Nekrasov ◽  
Alexander Vostrikov ◽  
Ekaterina Prokofeva ◽  
Nashon Adero

Abstract Background: This article discusses an approach to implementing a data-extraction project, together with a methodology for preliminary processing of the extracted data, with the aim of accumulating them centrally in a databank for collective, multipurpose use. The example considered is carbon dioxide emissions into the atmosphere by air transport over a given territory. On the basis of morphological analysis, processing, and classification of the spatial objects of the geodatabase and additional information, it is subsequently possible to organize, for example, a system of geoecological monitoring. Methods: At the fundamental level, the research used integration and process-based approaches, the method of extrapolation, expert methods of evaluation, random selection and analytical comparisons, and a set of methods of spatial analysis based on various instruments and sources. The study uses OGC open standards, web and GIS technologies, and the Internet for the formation, processing and storage of spatial data, their unambiguous geolocation, the implementation of territorial selections, and the visualization of results. Results: The data set, organized according to the proposed and defined rules, made it possible to assess the structural processing of geospatial data and to prepare a visual representation of the impact of aviation on the environmental situation over the designated geographic area. Conclusions: The transport industry was chosen as the object of research, but this solution can also be applied to other logistics and industrial areas. During the implementation of the project, the subject area was analyzed, the architecture of the future databank prototype was designed, the data accumulated from the sources were structured, and a database was selected for storing them, taking into account the need for high availability and stable operation under high loads.
For the convenience of displaying data, an interactive visualization tool with a convenient and friendly user interface has been developed.


2005 ◽  
Vol 20 (29) ◽  
pp. 6855-6857
Author(s):  
J. VAN BUREN ◽  
T. ANTONI ◽  
W. D. APEL ◽  
F. BADEA ◽  
...  

The KASCADE-Grande experiment extends the existing extensive air shower experiment KASCADE by an array of 37 detector stations spread over an area of 0.5 km². The new Grande array measures the charged component of the showers. To reconstruct the mass composition of the cosmic rays, additional information from the muon component of air showers is required. Using observables from the new Grande array together with the muon detectors of the original KASCADE array also makes it possible to reconstruct the total muon number of each air shower. The quality of the reconstruction, based on comparison with simulated data, and first results of a reconstructed muon size spectrum are presented.


Symmetry ◽  
2021 ◽  
Vol 13 (12) ◽  
pp. 2308
Author(s):  
Xiaofu Du ◽  
Qiuming Zhu ◽  
Guoru Ding ◽  
Jie Li ◽  
Qihui Wu ◽  
...  

As the number of civil aerial vehicles increases explosively, spectrum scarcity and security are becoming an increasing challenge in both the airspace and terrestrial space. To address this difficulty, this paper presents an unmanned aerial vehicle-assisted (UAV-assisted) spectrum mapping system and proposes a spectrum data reconstruction algorithm driven by spectrum data and a channel model. The reconstruction algorithm, which includes a model-driven spectrum data inference method and a spectrum data completion method with a uniformity decision mechanism, can reconstruct limited and incomplete spectrum data into a three-dimensional (3D) spectrum map. As a result, the spectrum scarcity and security problems can be addressed. Spectrum mapping is a symmetry-based digital twin technology. By employing a uniformity decision mechanism, the proposed completion method can effectively interpolate spatial data even when the collected data are unevenly distributed. The effectiveness of the proposed mapping scheme is evaluated by comparing its results with ray-tracing simulated data for a campus scenario. Simulation results show that the proposed reconstruction algorithm outperforms the classical inverse distance weighted (IDW) interpolation method and the tensor completion method by about 12.5% and 92.3%, respectively, in terms of reconstruction accuracy when the collected spectrum data are regularly missing, unevenly distributed and limited.
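
For readers unfamiliar with the IDW baseline mentioned in this abstract, the following is a minimal sketch of classical inverse distance weighted interpolation in three dimensions. It illustrates only the baseline, not the proposed reconstruction algorithm; the function name `idw_interpolate`, the default exponent of 2 and the sample values (read here as received power in dBm) are assumptions chosen for the example.

```python
import math

def idw_interpolate(points, values, target, power=2.0):
    """Classical inverse distance weighted estimate at `target`.

    `points` and `target` are coordinate tuples of equal dimension.
    Hypothetical helper for illustration only.
    """
    num = 0.0
    den = 0.0
    for p, v in zip(points, values):
        d = math.dist(p, target)  # Euclidean distance (Python 3.8+)
        if d == 0.0:
            # The target coincides with a sample: return it exactly
            return float(v)
        w = 1.0 / d ** power      # closer samples get larger weights
        num += w * v
        den += w
    return num / den

# Two measurements at equal distance from the target are averaged.
samples = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
power_dbm = [-80.0, -60.0]
print(idw_interpolate(samples, power_dbm, (1.0, 0.0, 0.0)))
```

Because the weights depend only on distance, IDW has no notion of how evenly the samples cover the region, which is the situation the abstract's uniformity decision mechanism is said to target.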


Entropy ◽  
2021 ◽  
Vol 23 (10) ◽  
pp. 1270
Author(s):  
Milan Žukovič ◽  
Dionissios T. Hristopulos

We apply the Ising model with nearest-neighbor correlations (INNC) in the problem of interpolation of spatially correlated data on regular grids. The correlations are captured by short-range interactions between “Ising spins”. The INNC algorithm can be used with label data (classification) as well as discrete and continuous real-valued data (regression). In the regression problem, INNC approximates continuous variables by means of a user-specified number of classes. INNC predicts the class identity at unmeasured points by using the Monte Carlo simulation conditioned on the observed data (partial sample). The algorithm locally respects the sample values and globally aims to minimize the deviation between an energy measure of the partial sample and that of the entire grid. INNC is non-parametric and, thus, is suitable for non-Gaussian data. The method is found to be very competitive with respect to interpolation accuracy and computational efficiency compared to some standard methods. Thus, this method provides a useful tool for filling gaps in gridded data such as satellite images.


2020 ◽  
Vol 36 (8) ◽  
pp. 2359-2364 ◽  
Author(s):  
Pasi Rastas

Abstract Motivation: Linkage mapping provides a practical way to anchor de novo genome assemblies into chromosomes and to detect chimeric or otherwise erroneous contigs. Such anchoring improves with a higher number of markers and individuals, as long as the mapping software can handle all the information. The recent software Lep-MAP3 can robustly construct linkage maps for millions of genotyped markers and thousands of individuals, providing optimal maps for genome anchoring. For such large datasets, an automated and robust genome anchoring tool is especially valuable and can significantly reduce the intensive computational and manual work involved. Results: Here, we present the software Lep-Anchor (LA) to anchor genome assemblies automatically using dense linkage maps. As its main novelty, it takes into account the uncertainty of the linkage map positions caused by low recombination regions, cross type or poor mapping data quality. Furthermore, it can automatically detect and cut chimeric contigs, and use contig–contig, single read or alternative genome assembly alignments as additional information on contig order and orientations and to collapse haplotype contigs. We demonstrate the performance of LA using real data and show that it outperforms ALLMAPS on anchoring completeness and speed. Accuracy-wise, LA and ALLMAPS are about equal, but ALLMAPS achieves this at the expense of lower completeness. The software Chromonomer was faster than the other two methods but has major limitations and is lower in accuracy. We also show that with additional information, such as contig–contig and read alignments, the anchoring completeness can be improved by up to 70% without significant loss in accuracy. Based on simulated data, we conclude that the anchoring accuracy can be improved by utilizing information about map position uncertainty. Accuracy is the rate of contigs in correct orientation and completeness is the number of contigs with inferred orientation.
Availability and implementation: Lep-Anchor is available with the source code under the GNU General Public License from http://sourceforge.net/projects/lep-anchor. All the scripts and code used to produce the reported results are included with Lep-Anchor.


2019 ◽  
pp. 177-198
Author(s):  
Geisa Bugs ◽  
Marketta Kyttä

This chapter addresses PPGIS (Public Participation Geographic Information Systems), a participatory method through which the public can produce maps and spatial data that represent their perceptions of the urban space in question. Specifically, it analyzes the data collected from an experiment in Jaguarão, Brazil. The data represent the perceptions of a small group of inhabitants about the problems and potential of the city's urban area. The procedures include an exploratory analysis and data visualization in the form of maps that make it possible to describe a variable's distribution and identify patterns. Moreover, for some issues, the authors cross the collected perception data with infrastructure data, socioeconomic data, and cadastral data to study possible associations among these different types of information layers. The results show that public perception, collected through PPGIS, forms an additional information layer that could be analyzed together with other information layers commonly used in urban planning, and thus be taken into account for designing better cities.


2014 ◽  
Vol 10 (S306) ◽  
pp. 365-368 ◽  
Author(s):  
Viviana Acquaviva ◽  
Eric Gawiser ◽  
Andrew S. Leung ◽  
Mario R. Martin

Abstract We discuss different methods to separate high- from low-redshift galaxies based on a combination of spectroscopic and photometric observations. Our baseline scenario is the Hobby-Eberly Telescope Dark Energy eXperiment (HETDEX) survey, which will observe several hundred thousand Lyman Alpha Emitting (LAE) galaxies at 1.9 < z < 3.5, and for which the main source of contamination is [OII]-emitting galaxies at z < 0.5. Additional information useful for the separation comes from empirical knowledge of LAE and [OII] luminosity functions and equivalent width distributions as a function of redshift. We consider three separation techniques: a simple cut in equivalent width, a Bayesian separation method, and machine learning algorithms, including support vector machines. These methods can be easily applied to other surveys and used on simulated data in the framework of survey planning.


2000 ◽  
Vol 12 (10) ◽  
pp. 2385-2404 ◽  
Author(s):  
G. Baudat ◽  
F. Anouar

We present a new method that we call generalized discriminant analysis (GDA) to deal with nonlinear discriminant analysis using a kernel function operator. The underlying theory is close to that of support vector machines (SVM) insofar as the GDA method provides a mapping of the input vectors into a high-dimensional feature space. In the transformed space, linear properties make it easy to extend and generalize the classical linear discriminant analysis (LDA) to nonlinear discriminant analysis. The formulation is expressed as the resolution of an eigenvalue problem. Using a different kernel, one can cover a wide class of nonlinearities. For both simulated data and alternate kernels, we give classification results, as well as the shape of the decision function. The results are confirmed using real data to perform seed classification.


2021 ◽  
Vol 20 (1) ◽  
Author(s):  
Rainer Schnell ◽  
Jonas Klingwort ◽  
James M. Farrow

Abstract Background We introduce and study a recently proposed method for privacy-preserving distance computations which has received little attention in the scientific literature so far. The method, which is based on intersecting sets of randomly labeled grid points and is henceforth denoted ISGP, allows calculating the approximate distances between masked spatial data. Coordinates are replaced by sets of hash values. The method allows the computation of distances between locations $$L$$ when the locations at different points in time $$t$$ are not known simultaneously. The distance between $$L_1$$ and $$L_2$$ could be computed even when $$L_2$$ does not exist at $$t_1$$ and $$L_1$$ has been deleted at $$t_2$$. An example would be patients from a medical data set and locations of later hospitalizations. ISGP is a new tool for privacy-preserving data handling of geo-referenced data sets in general. Furthermore, this technique can be used to include geographical identifiers as additional information for privacy-preserving record-linkage. To show that the technique can be implemented in most high-level programming languages with a few lines of code, a complete implementation within the statistical programming language R is given. The properties of the method are explored using simulations based on large-scale real-world data of hospitals ($$n=850$$) and residential locations ($$n=13,000$$). The method has already been used in a real-world application. Results ISGP yields very accurate results. Our simulation study showed that, with appropriately chosen parameters, 99% accuracy in the approximated distances is achieved. Conclusion We discussed a new method for privacy-preserving distance computations in microdata. The method is highly accurate, fast, has low computational burden, and does not require excessive storage.
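
The idea of replacing coordinates by sets of hash values can be sketched as follows. This is a loose illustration based only on the description in the abstract, not the authors' R implementation: the keyed hash (HMAC-SHA256) standing in for random labels, the circular neighbourhood, and the names `masked_set`, `overlap`, `radius`, `spacing` and `key` are all assumptions. Each location is represented by the labels of the grid points near it, and the size of the intersection of two such sets shrinks as the locations move apart, so it can serve as a proxy for distance without exchanging coordinates.

```python
import hashlib
import hmac
import math

def masked_set(x, y, radius, spacing, key):
    """Keyed hashes of all grid points within `radius` of (x, y).

    Hypothetical helper: the labeling scheme is an assumption made
    for this sketch, not the published algorithm.
    """
    labels = set()
    i_min = math.floor((x - radius) / spacing)
    i_max = math.ceil((x + radius) / spacing)
    j_min = math.floor((y - radius) / spacing)
    j_max = math.ceil((y + radius) / spacing)
    for i in range(i_min, i_max + 1):
        for j in range(j_min, j_max + 1):
            if math.hypot(i * spacing - x, j * spacing - y) <= radius:
                # Label depends only on the grid index and the secret key
                msg = f"{i},{j}".encode()
                labels.add(hmac.new(key, msg, hashlib.sha256).hexdigest())
    return labels

def overlap(a, b):
    """Number of shared labels between two masked locations."""
    return len(a & b)

key = b"shared-secret"  # assumed to be known to both data holders
home = masked_set(0.0, 0.0, 10.0, 1.0, key)
near = masked_set(5.0, 0.0, 10.0, 1.0, key)
far = masked_set(15.0, 0.0, 10.0, 1.0, key)
print(overlap(home, near) > overlap(home, far))
```

Because the two sets can be computed at different times and compared later, neither location needs to be available in clear text when the comparison is made. Turning the overlap count into an actual distance estimate would require a calibration step that is not shown here.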

