Comparison of statistical methods for downscaling daily precipitation

2012 ◽  
Vol 14 (4) ◽  
pp. 1006-1023 ◽  
Author(s):  
Getnet Y. Muluye

Several statistical downscaling methods are available for generating local-scale meteorological variables from large-scale model outputs. No single method, or group of methods, is yet clearly superior, particularly for downscaling daily precipitation. This paper compares different statistical methods for downscaling daily precipitation from numerical weather prediction model output. Three classes of methods are considered: (i) hybrids; (ii) neural networks; and (iii) nearest neighbor-based approaches. These methods are implemented in the Saguenay watershed in northeastern Canada. Suites of standard diagnostic measures are computed to evaluate and inter-compare the performance of the downscaling models. Although the results of the downscaling experiment show mixed performance, clear patterns emerge with respect to skill values and the reproduction of variation in daily precipitation. Artificial neural network-logistic regression (ANN-Logst), partial least squares (PLS) regression and recurrent multilayer perceptron (RMLP) models yield greater skill values, whereas the conditional resampling method (SDSM) and K-nearest neighbor (KNN)-based models show the potential to capture the variability in daily precipitation.

Author(s):  
Bao Bing-Kun ◽  
Yan Shuicheng

Graph-based learning provides a useful approach for modeling data in image annotation problems. In this chapter, the authors describe how to construct a region-based graph to annotate large-scale multi-label images. It is well recognized that analysis at the semantic region level may greatly improve image annotation performance compared with analysis at the whole-image level. However, the region-level approach increases the data scale by several orders of magnitude and poses new challenges for most existing algorithms. To address this, each image is first encoded as a Bag-of-Regions based on multiple image segmentations. All image regions are then assembled into a large k-nearest-neighbor graph using an efficient Locality Sensitive Hashing (LSH) method. Finally, a sparse, region-aware image-based graph is fed into the multi-label extension of the entropic graph regularized semi-supervised learning algorithm (Subramanya & Bilmes, 2009). In combination, these steps make it possible to handle large-scale datasets. Extensive experiments on the NUS-WIDE (260k images) and COREL-5k datasets validate the effectiveness and efficiency of the framework for region-aware and scalable multi-label propagation.


2021 ◽  
Vol 2021 ◽  
pp. 1-12
Author(s):  
Yaron Ogen ◽  
Michael Denk ◽  
Cornelia Glaesser ◽  
Holger Eichstaedt ◽  
Rene Kahnt ◽  
...  

Reflectance spectroscopy is a nondestructive, rapid, and easy-to-use technique that can be used to assess the composition of rocks qualitatively or quantitatively. Although it is a powerful tool, it has limitations, especially when it comes to measurements of rocks with a phaneritic texture. The external variability is reflected only in spectroscopy and not in the chemical-mineralogical measurements that are performed on crushed rock in certified laboratories. Hence, the spectral variability of the surface of an uncrushed rock will, in most cases, be higher than the internal chemical-mineralogical variability, which may impair statistical models built on field measurements. For this reason, studying ore-bearing rocks and evaluating their spectral variability at different scales is an important procedure for better understanding the factors that may influence the qualitative and quantitative analysis of the rocks. The objective is to quantify the spectral variability of three types of altered granodiorite using well-established statistical methods with an upscaling approach. With this approach, the samples were measured in the laboratory under supervised ambient conditions and in the field under semisupervised conditions. This study further aims to determine which statistical method provides the most practical and accurate classification for use in future studies. Our results showed that all statistical methods enable the separation of the rock types, although two types of rocks exhibited almost identical spectra. Furthermore, the statistical method that supplied the most significant results for classification purposes was principal component analysis combined with k-nearest neighbor, with classification accuracies for laboratory and field measurements of 68.1% and 100%, respectively.


Sensors ◽  
2020 ◽  
Vol 20 (24) ◽  
pp. 7269
Author(s):  
Ling Ruan ◽  
Ling Zhang ◽  
Tong Zhou ◽  
Yi Long

The weighted K-nearest neighbor algorithm (WKNN) is easy to implement and has been widely applied. In large-scale positioning regions, using all fingerprint data in the matching calculation leads to high computational cost, which is not conducive to real-time positioning. In addition, because of signal instability, irrelevant fingerprints reduce the positioning accuracy of the matching calculation. Selecting the appropriate fingerprint data from the database more quickly and accurately is therefore an urgent problem for improving WKNN. This paper proposes an improved Bluetooth indoor positioning method using a dynamic fingerprint window (DFW-WKNN). The dynamic fingerprint window is a spatial range used for local fingerprint searching in place of a search over the whole database; it is adjusted dynamically according to indoor pedestrian movement and always covers the maximum possible range of the next position. The method was tested and evaluated in two typical scenarios against two existing algorithms, the traditional WKNN and the improved WKNN based on local clustering (LC-WKNN). The experimental results show that the proposed DFW-WKNN algorithm significantly improved both positioning accuracy and positioning efficiency, particularly as the amount of fingerprint data increased.
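
The idea of restricting a WKNN match to a local search window can be sketched roughly as follows. This is a minimal illustration, not the authors' DFW-WKNN implementation: the function name `wknn_position`, the circular window given by `center` and `radius`, and the inverse-distance weighting are all assumptions made for the example, since the abstract does not say how the window is shaped or updated.

```python
import math


def wknn_position(rssi, fingerprints, k=3, center=None, radius=None, eps=1e-6):
    """Estimate an (x, y) position from an RSSI vector with weighted KNN.

    fingerprints: list of ((x, y), rssi_vector) reference points.
    center, radius: optional circular search window (hypothetical stand-in
    for a dynamic fingerprint window); if it contains no fingerprints,
    the whole database is searched instead.
    """
    candidates = fingerprints
    if center is not None and radius is not None:
        windowed = [fp for fp in fingerprints
                    if math.dist(fp[0], center) <= radius]
        if windowed:
            candidates = windowed

    # Rank candidates by Euclidean distance in signal space, keep the k closest.
    scored = sorted(((math.dist(rssi, vec), pos) for pos, vec in candidates),
                    key=lambda item: item[0])[:k]

    # Inverse-distance weights: closer fingerprints contribute more.
    weights = [1.0 / (d + eps) for d, _ in scored]
    total = sum(weights)
    x = sum(w * pos[0] for w, (_, pos) in zip(weights, scored)) / total
    y = sum(w * pos[1] for w, (_, pos) in zip(weights, scored)) / total
    return x, y


# Illustrative data only: three reference points with two beacons each.
fingerprints = [
    ((0.0, 0.0), [-50.0, -70.0]),
    ((10.0, 0.0), [-70.0, -50.0]),
    ((0.0, 10.0), [-60.0, -60.0]),
]
estimate = wknn_position([-50.0, -70.0], fingerprints, k=3)
```

Passing `center` and `radius` limits the match to nearby fingerprints, which is where the reduction in computation would come from; in a real system the window would be derived from the previous position and the pedestrian's possible movement.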


Author(s):  
Bingming Wang ◽  
Shi Ying ◽  
Guoli Cheng ◽  
Rui Wang ◽  
Zhe Yang ◽  
...  

Logs play an important role in the maintenance of large-scale systems. The number of logs that indicate normal operation (normal logs) differs greatly from the number of logs that indicate anomalies (abnormal logs), and the two types of logs differ in certain respects. Using the K-Nearest Neighbor (KNN) algorithm, an outlier detection method with high accuracy, to identify faults automatically is an effective way to detect anomalies from logs. However, logs are large in scale and their samples are very unevenly distributed, which affects the results of the KNN algorithm in log-based anomaly detection. We therefore propose an improved KNN-based method that uses the existing mean-shift clustering algorithm to efficiently select the training set from massive logs. We then assign different weights to samples at different distances, which reduces the negative effect of the unbalanced distribution of log samples on the accuracy of the KNN algorithm. Comparative experiments on log sets from five supercomputers show that the proposed method can be effectively applied to log-based anomaly detection, and that its accuracy, recall rate and F-measure are higher than those of the traditional keyword search method.
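
To illustrate why distance weighting helps with unbalanced classes, here is a small sketch of a distance-weighted KNN vote. It is an assumption-laden example rather than the method from the paper: the name `weighted_knn_predict`, the inverse-distance weight, and the toy two-dimensional features are all hypothetical, and the mean-shift training-set selection step is not shown.

```python
import math
from collections import defaultdict


def weighted_knn_predict(query, samples, k=3, eps=1e-9):
    """Classify `query` by a distance-weighted vote among its k nearest samples.

    samples: list of (feature_vector, label) pairs.
    Each neighbor votes with weight 1 / (distance + eps), so one very close
    minority-class sample can outweigh several distant majority-class ones.
    """
    neighbors = sorted(samples, key=lambda s: math.dist(query, s[0]))[:k]
    votes = defaultdict(float)
    for features, label in neighbors:
        votes[label] += 1.0 / (math.dist(query, features) + eps)
    return max(votes, key=votes.get)


# Illustrative data only: four "normal" samples and a single "abnormal" one.
samples = [
    ([0.0, 0.0], "normal"),
    ([0.0, 1.0], "normal"),
    ([1.0, 0.0], "normal"),
    ([1.0, 1.0], "normal"),
    ([5.0, 5.0], "abnormal"),
]
label = weighted_knn_predict([4.8, 4.9], samples, k=3)
```

With k = 3, an unweighted majority vote on this data would return "normal" (two of the three neighbors), whereas the weighted vote follows the single nearby abnormal sample.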


2010 ◽  
Vol 138 (7) ◽  
pp. 2867-2882 ◽  
Author(s):  
Martina Tudor ◽  
Piet Termonia

Limited-area models (LAMs) use higher resolutions and more advanced parameterizations of physical processes than global numerical weather prediction models, but suffer from one additional source of error: the lateral boundary condition (LBC). The large-scale model passes the information on its fields to the LAM only over the narrow coupling zone at discrete times separated by a coupling interval of several hours. The LBC temporal resolution can be lower than the time necessary for a particular meteorological feature to cross the boundary. A LAM user who depends on LBC data acquired from an independent prior analysis or parent model run can find that usual schemes for temporal interpolation of large-scale data provide LBC data of inadequate quality. The problem of a quickly moving depression that is not recognized by the operationally used gridpoint coupling scheme is examined using a simple one-dimensional model. A spectral method for nesting a LAM in a larger-scale model is implemented and tested. Results for a traditional flow-relaxation scheme combined with temporal interpolation in spectral space are also presented.


Forests ◽  
2014 ◽  
Vol 5 (7) ◽  
pp. 1635-1652 ◽  
Author(s):  
Leonhard Suchenwirth ◽  
Wolfgang Stümer ◽  
Tobias Schmidt ◽  
Michael Förster ◽  
Birgit Kleinschmit

2021 ◽  
Vol 6 (1) ◽  
pp. 96
Author(s):  
Ikhsan Romli ◽  
Shanti Prameswari R ◽  
Antika Zahrotul Kamalia

Sentiment analysis is a form of data processing that identifies the topics people talk about and their sentiments toward those topics; the topic in this study is large-scale social restrictions (PSBB). This study aims to classify negative and positive sentiments in Indonesian-language tweets about PSBB from the social media platform Twitter by applying the K-Nearest Neighbor algorithm, and to compare the accuracy of three types of distance calculation: cosine similarity, Euclidean distance, and Manhattan distance. The results show that K-Nearest Neighbor achieves an accuracy of 82% at k = 3 with cosine similarity, 81% at k = 11 with Euclidean distance, and 80% at k = 5, 7, 9, 11, and 13 with Manhattan distance. Thus, in this study, the K-Nearest Neighbor algorithm with the cosine similarity calculation achieves the highest accuracy.
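
The three distance calculations compared above can be written down compactly. The sketch below is illustrative only and assumes the tweets have already been turned into numeric term-count vectors; the function names and the toy vectors are hypothetical, and the study's preprocessing and data are not reproduced.

```python
import math
from collections import Counter


def cosine_distance(a, b):
    """1 - cosine similarity; defined as 1.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def euclidean_distance(a, b):
    """Straight-line distance between two vectors."""
    return math.dist(a, b)


def manhattan_distance(a, b):
    """Sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def knn_classify(query, samples, k, distance):
    """Majority vote among the k samples closest to `query` under `distance`."""
    neighbors = sorted(samples, key=lambda s: distance(query, s[0]))[:k]
    return Counter(label for _, label in neighbors).most_common(1)[0][0]


# Illustrative term-count vectors only.
samples = [
    ([3, 0, 1], "positive"),
    ([2, 0, 0], "positive"),
    ([0, 3, 1], "negative"),
    ([0, 2, 0], "negative"),
]
predictions = {
    fn.__name__: knn_classify([1, 0, 0], samples, k=3, distance=fn)
    for fn in (cosine_distance, euclidean_distance, manhattan_distance)
}
```

Cosine distance ignores vector length and compares only direction, which is one common reason it is tried on text where documents vary in length; whether that explains the result reported here is not stated in the abstract.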


2016 ◽  
Vol 12 (2) ◽  
Author(s):  
Lidya Andriani Sunjoyo ◽  
R. Gunawan Santosa ◽  
Kristian Adi Nugraha

Lime is a fruit that has been widely cultivated and used in Indonesia, and many products use it in their production process. Sorting the fruit is an essential early step in that process. In large-scale operations, both the quality of the result and the time required for sorting need attention. Pattern recognition is a discipline that focuses on classifying or describing an object based on its characteristics or main attributes. In this research, the authors implement the Haar wavelet transform for feature extraction based on colour and texture, and perform classification using K-Nearest Neighbor (k-NN) to detect indications of rotten lime and to examine the value of k in k-NN so that the accuracy of the result can be obtained. Based on the analysis, the Haar wavelet transform can be used to detect indications of rotten lime, and the best accuracy achieved by the system is 85 percent.
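
For readers unfamiliar with the Haar wavelet transform, a single decomposition step on a one-dimensional signal can be sketched as below. This is a generic textbook form using the orthonormal scaling, not the feature-extraction pipeline of the study; the function name `haar_step` and the sample values are assumptions for illustration.

```python
import math


def haar_step(signal):
    """One level of the orthonormal Haar transform on an even-length sequence.

    Returns (approximation, detail): scaled pairwise sums capture the coarse
    content, scaled pairwise differences capture local change (texture).
    """
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    scale = math.sqrt(2)
    approximation = [(signal[i] + signal[i + 1]) / scale
                     for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / scale
              for i in range(0, len(signal), 2)]
    return approximation, detail


# Illustrative pixel intensities along one image row.
approximation, detail = haar_step([4.0, 6.0, 10.0, 12.0])
```

For an image, the same step is applied along the rows and then along the columns, and statistics of the resulting sub-bands can serve as features for a classifier such as k-NN.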

