Estimating Query Timings in Elasticsearch

2021 ◽  
Vol 9 (2) ◽  
pp. 15-36
Author(s):  
Sikha Bagui ◽  
Evorell Fridge

In a shared Elasticsearch environment it can be useful to know how long a particular query will take to execute. This information can be used to enforce rate limiting or to distribute requests equitably among multiple clusters. Elasticsearch uses multiple Lucene instances on multiple hosts as its underlying search engine implementation, but this abstraction makes it difficult to predict execution time with previously known predictors such as the number of postings. This research investigates the ability of different pre-retrieval statistics, available through Elasticsearch, to accurately predict the execution time of queries on a typical Elasticsearch cluster. The number of terms in a query and the Total Term Frequency (TTF) from Elasticsearch's API are found to significantly predict execution time. Regression models are then built and compared to find the most accurate method for predicting query time.
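Both predictors can be collected before a query is executed. Below is a minimal sketch of the idea (not the authors' code), assuming elasticsearch-py 8.x, a hypothetical index `articles` with a text field `body`, and toy training measurements: the term vectors API is called with an artificial document so per-term TTF comes back without indexing anything, and the term count plus summed TTF feed an ordinary linear regression.

```python
# Sketch: pre-retrieval features (term count, summed TTF) -> predicted query time.
# Index name, field name, and training data below are hypothetical.
from elasticsearch import Elasticsearch
from sklearn.linear_model import LinearRegression
import numpy as np

es = Elasticsearch("http://localhost:9200")

def query_features(query: str, index: str = "articles", field: str = "body"):
    """Return (term_count, summed TTF) for the query's terms."""
    # An "artificial document" lets us ask for term statistics without
    # indexing anything; term_statistics=True includes ttf per term.
    resp = es.termvectors(
        index=index,
        doc={field: query},
        fields=[field],
        term_statistics=True,
    )
    terms = resp["term_vectors"][field]["terms"]
    ttf_sum = sum(stats.get("ttf", 0) for stats in terms.values())
    return len(terms), ttf_sum

# Hypothetical training data: queries with measured execution times (ms).
queries = ["cheap red shoes", "wireless noise cancelling headphones", "laptop"]
times_ms = [12.0, 35.0, 4.0]

X = np.array([query_features(q) for q in queries])
model = LinearRegression().fit(X, times_ms)
print("Predicted ms:", model.predict([query_features("red laptop bag")]))
```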

Author(s):  
A. Safari ◽  
H. Sohrabi

The role of forests as a carbon reservoir has prompted the need for timely and reliable estimation of aboveground carbon stocks. Since field measurement of forest aboveground carbon stocks is a destructive, costly, and time-consuming activity, aerial and satellite remote sensing techniques have attracted much attention in this field. Although aerial data have proved highly accurate for predicting aboveground carbon stocks, they come with high acquisition costs, small area coverage, and limited availability. These challenges are most critical for non-commercial forests located in low-income countries. The Landsat program provides repeated acquisition of high-resolution multispectral data, which are freely available. The aim of this study was to assess the potential of texture metrics derived from multispectral Landsat 8 Operational Land Imager (OLI) imagery for quantifying aboveground carbon stocks of coppice oak forests in the Zagros Mountains, Iran. We used four window sizes (3×3, 5×5, 7×7, and 9×9) and four offsets ([0,1], [1,1], [1,0], and [1,-1]) to derive nine texture metrics (angular second moment, contrast, correlation, dissimilarity, entropy, homogeneity, inverse difference, mean, and variance) from four bands (blue, green, red, and near-infrared). In total, 124 sample plots in two different forests were measured, and carbon was calculated using species-specific allometric models. Stepwise regression analysis was applied to estimate biomass from the derived metrics. Results showed that, in general, larger windows for deriving texture metrics yielded models with better fit. In addition, the correlation of the spectral bands used for deriving texture metrics in the regression models was ranked b4 > b3 > b2 > b5, and the best offset was [1,-1]. Among the metrics, mean and entropy entered most of the regression models. Overall, models based on the derived texture metrics were able to explain about half of the variation in aboveground carbon stocks. These results demonstrate that Landsat 8-derived texture metrics can be applied to map aboveground carbon stocks of coppice oak forests over large areas.
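The window-and-offset setup described here matches grey-level co-occurrence matrix (GLCM) texture analysis. The following sketch is an illustration rather than the authors' pipeline: it derives comparable metrics for one window of a single band with scikit-image, where the window contents are random stand-ins for real Landsat 8 OLI pixels, and entropy is computed by hand since graycoprops classically exposes only the Haralick properties listed.

```python
# Sketch: GLCM texture metrics for one window of an 8-bit band.
import numpy as np
from skimage.feature import graycomatrix, graycoprops

def texture_metrics(window: np.ndarray, levels: int = 64) -> dict:
    """GLCM metrics for one window, 4 directions, distance 1."""
    # Quantize to `levels` grey levels to keep the co-occurrence matrix small.
    q = (window.astype(float) / 256 * levels).astype(np.uint8)
    # The four standard angles loosely correspond to the paper's offsets
    # [0,1], [1,1], [1,0], and [1,-1].
    glcm = graycomatrix(q, distances=[1],
                        angles=[0, np.pi / 4, np.pi / 2, 3 * np.pi / 4],
                        levels=levels, symmetric=True, normed=True)
    metrics = {p: graycoprops(glcm, p).mean()
               for p in ("ASM", "contrast", "correlation",
                         "dissimilarity", "homogeneity")}
    p = glcm.mean(axis=(2, 3))              # average matrix over directions
    metrics["entropy"] = -np.sum(p * np.log(p + 1e-12))
    metrics["mean"] = float(window.mean())
    metrics["variance"] = float(window.var())
    return metrics

# Hypothetical 9x9 window (values 0-255) standing in for a Landsat band cut.
rng = np.random.default_rng(0)
print(texture_metrics(rng.integers(0, 256, (9, 9), dtype=np.uint8)))
```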


Information ◽  
2021 ◽  
Vol 12 (10) ◽  
pp. 425
Author(s):  
Cinthya M. França ◽  
Rodrigo S. Couto ◽  
Pedro B. Velloso

In an Internet of Things (IoT) environment, sensors collect and send data to application servers through IoT gateways. However, these data may have missing values due to networking problems or sensor malfunction, which reduces applications' reliability. This work proposes a mechanism to predict and impute missing data in IoT gateways to achieve greater autonomy at the network edge. These gateways typically have limited computing resources, so the missing data imputation methods must be simple and still provide good results. Thus, this work presents two regression models based on neural networks to impute missing data in IoT gateways. In addition to the prediction quality, we analyzed both the execution time and the amount of memory used. We validated our models using six years of weather data from Rio de Janeiro, varying the missing data percentages. The results show that the neural network regression models perform better than the other imputation methods analyzed, which are based on averages and on repeating previous values, for all missing data percentages. In addition, the neural network models have a short execution time and need less than 140 KiB of memory, which allows them to run on IoT gateways.
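As a rough illustration of such a lightweight imputation model (not the paper's actual networks), the sketch below trains a one-hidden-layer regressor to predict a reading from its six predecessors on a synthetic temperature series standing in for the Rio de Janeiro data; the window length, layer size, and data are all assumptions.

```python
# Sketch: impute a missing sensor reading from the k previous readings.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(42)
# Synthetic 10-minute temperature series (144 readings per day).
temps = 25 + 5 * np.sin(np.arange(5000) * 2 * np.pi / 144) \
        + rng.normal(0, 0.5, 5000)

k = 6  # previous readings used as input features
X = np.lib.stride_tricks.sliding_window_view(temps, k)[:-1]
y = temps[k:]

# One small hidden layer keeps the model's memory footprint tiny,
# in line with the gateways' limited resources.
model = MLPRegressor(hidden_layer_sizes=(16,), max_iter=500, random_state=0)
model.fit(X[:4000], y[:4000])

# Impute a "missing" value from its six predecessors.
print("imputed:", model.predict(X[4500:4501])[0], "actual:", y[4500])
```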


2019 ◽  
Vol 28 (04) ◽  
pp. 1950060 ◽  
Author(s):  
Isil Oz ◽  
Muhammad Khurram Bhatti ◽  
Konstantin Popov ◽  
Mats Brorsson

As multicore systems evolve by increasing the number of parallel execution units, parallel programming models have been released to exploit parallelism in applications. The task-based programming model uses task abstractions to specify parallel tasks and schedules them onto processors at runtime. To increase efficiency and obtain the highest performance, one must identify which runtime configuration is needed and how processor cores should be shared among tasks. Exploring the design space of all possible scheduling and runtime options, especially for large input data, becomes infeasible and calls for statistical modeling. Regression-based modeling determines the effects of multiple factors on a response variable and makes predictions based on statistical analysis. In this work, we propose a regression-based modeling approach to predict task-based program performance for different scheduling parameters with variable data size. We execute a set of task-based programs while varying the runtime parameters and conduct a systematic measurement of the factors influencing execution time. Our approach uses executions with different configurations for a set of input data and derives regression models to predict execution time for larger input data. Our results show that the regression models provide accurate predictions for validation inputs, with a mean error rate as low as 6.3% and 14% on average across four task-based programs.
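A minimal sketch of the general approach, not the paper's actual models: measure runs that vary the scheduling knobs and input size, fit a regression with interaction terms, and extrapolate to a larger input. The features (worker threads, task cutoff depth, input size) and the timings below are hypothetical stand-ins for the runtime parameters the study varies.

```python
# Sketch: regression over scheduling parameters to predict execution time.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Columns: worker threads, task cutoff depth, input size (millions of items).
X = np.array([[2, 4, 1.0], [4, 4, 1.0], [8, 4, 1.0],
              [2, 8, 2.0], [4, 8, 2.0], [8, 8, 2.0]])
t = np.array([8.1, 4.3, 2.5, 16.8, 8.9, 5.2])  # measured times (s)

# Degree-2 polynomial terms let the model capture interactions such as
# threads x input size, which a plain linear fit cannot.
model = make_pipeline(PolynomialFeatures(degree=2), LinearRegression())
model.fit(X, t)

# Extrapolate to a larger input than any training run used.
print("predicted s:", model.predict([[8, 8, 4.0]])[0])
```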


2018 ◽  
Vol 7 (3.12) ◽  
pp. 361 ◽  
Author(s):  
Kavita Goel ◽  
Jay Shankar Prasad ◽  
Saba Hilal

Search is a core requirement of web users, and search results depend on the crawler. Users rely on search engines to obtain desired information in various forms: text, images, sound, and video. A search engine answers queries from an indexed database, and this database is built from the URLs collected by a crawler. Some URLs lead, directly or indirectly, to the same page, so crawling and indexing URLs with similar contents wastes resources. A crawler produces such results because of a bad crawling algorithm, a poor-quality ranking algorithm, or a low-level user experience. The challenge is to remove duplicate results and to detect and eliminate near-duplicate documents in order to improve the performance of any search engine. This paper proposes a web crawler that crawls within a particular category to remove irrelevant URLs and implements URL normalization to remove duplicate URLs within that category. Results are analyzed in terms of total URLs fetched, duplicate URLs, and query execution time.
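The abstract does not spell out the normalization rules, so the sketch below applies a common set as an assumption: lowercase the scheme and host, drop default ports and fragments, strip trailing slashes, and sort query parameters, so that syntactically different URLs for the same page collapse to one canonical key.

```python
# Sketch: URL normalization for duplicate-URL detection in a crawler frontier.
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

DEFAULT_PORTS = {"http": 80, "https": 443}

def normalize(url: str) -> str:
    parts = urlsplit(url.strip())
    scheme = parts.scheme.lower()
    host = parts.hostname.lower() if parts.hostname else ""
    # Keep the port only when it is not the scheme's default.
    if parts.port and parts.port != DEFAULT_PORTS.get(scheme):
        host = f"{host}:{parts.port}"
    path = parts.path.rstrip("/") or "/"
    query = urlencode(sorted(parse_qsl(parts.query)))
    return urlunsplit((scheme, host, path, query, ""))  # fragment dropped

seen = set()
for u in ["HTTP://Example.com:80/a/?b=2&a=1#top",
          "http://example.com/a?a=1&b=2"]:
    key = normalize(u)
    if key in seen:
        print("duplicate:", u)   # both URLs share one canonical form
    seen.add(key)
```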


Author(s):  
Vedant Karmalkar, et al.

Twitter monitoring enables firms to understand their market, stay on top of what is being said about their company and competitors, and uncover emerging market trends. Twego Trending is a platform where data is viewed and structured by an automated procedure that analyzes and processes tweet data and classifies it into various hashtag statistics and visualizations. Implementing Twego forecasting analysis on Twitter data using various technologies may help businesses understand how consumers talk about their products. Twitter has more than 340 million active users, and almost 500 million tweets are posted every day. This social media platform helps companies reach a large audience and communicate with consumers without intermediaries. The aim is to build a search engine in which, when someone types in a query, it returns tweets, performs data analytics on the results, and provides visualizations.
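As a purely hypothetical illustration of the kind of hashtag statistics described (the abstract gives no implementation details), the snippet below filters tweets matching a query and counts their hashtags, the raw material for a visualization.

```python
# Sketch: query-filtered hashtag counts over a batch of tweets.
import re
from collections import Counter

tweets = [  # stand-in for tweets returned by a search backend
    "Loving the new #Pixel camera! #Google",
    "Is the #Pixel battery any better this year?",
    "Switched from #iPhone to #Pixel, no regrets",
]

def hashtag_stats(tweets, query):
    hits = [t for t in tweets if query.lower() in t.lower()]
    tags = Counter(tag.lower() for t in hits
                   for tag in re.findall(r"#\w+", t))
    return tags.most_common()

print(hashtag_stats(tweets, "pixel"))  # e.g. [('#pixel', 3), ...]
```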


2014 ◽  
Vol 7 (1) ◽  
pp. 29-34 ◽  
Author(s):  
Domingo Jesús Campo Ramos ◽  
Antonio A. Sánchez Peraza ◽  
José Juan Robles-Pérez ◽  
Pedro Montañez-Toledo ◽  
Vicente Javier Clemente-Suárez

Introduction: The shackling maneuver has been poorly studied in the specific literature despite being a technique used every day by security and police forces. The use of new technologies has reported important benefits in learning processes and could improve the learning of the shackling maneuver. Therefore, the objectives of the present research were: (i) to analyze the effect of audiovisual training on the efficiency of the shackling technique in three different stress situations, and (ii) to study the weak point in the shackling maneuver in order to improve the audiovisual training. Methods: The technical shackling procedures of 26 male soldiers were analyzed in three different situations (normal, alert, and danger) after an audiovisual session. All shackling interventions were filmed with a video camera and analyzed later. The shackling maneuver was divided into approach to the subject, subject control, placing the first shackle, limb immobilization, placing the second shackle, frisk, and transfer. Every part of the maneuver was scored with 10, 5, or 0 points. Results: Maneuver efficiency decreased with stress. There was a significant difference in the placement of the second shackle between the normal and alert situations, and in all situations the limb immobilization phase obtained the lowest values. Furthermore, the execution time was shorter in the normal situation (40.30 s) than in the alert (120.08 s) and danger (180.36 s) situations. Conclusion: A short audiovisual training session was enough for inexperienced soldiers to learn the proper shackling procedure, with a low technical level remaining in the limb immobilization phase and differences between stress situations in the placing of the second shackle. Also, the increase in stress caused an increase in maneuver time and in the number of video reproductions.


2017 ◽  
Vol 1 (01) ◽  
pp. 43
Author(s):  
Zainal Putra

The purposes of this research are: (1) to describe the condition of region own-source revenue, the general allocation fund, the special allocation fund, and regional financial performance; (2) to determine the influence of region own-source revenue, the general allocation fund, and the special allocation fund on regional financial performance; (3) to determine the influence of region own-source revenue on regional financial performance; (4) to determine the influence of the general allocation fund on regional financial performance; and (5) to determine the influence of the special allocation fund on regional financial performance. The data used in this research are secondary data obtained from the office of the BPK RI Aceh Province Representative, collected as pooled data for the period 2008-2012. The entire population, 23 regencies/cities in Aceh Province, was sampled. The data were analyzed using multiple linear regression models. The results show that: (1) the regencies/cities in Aceh Province fall into the "very poor" category of regional financial ability and the "very low" category of regional financial independence, with an "instructive" relationship pattern; (2) regional financial performance shows a decreasing trend over the 2008-2012 period; (3) the variables of region own-source revenue, the general allocation fund, and the special allocation fund together have a significant influence on regional financial performance; and (4) partially, only the general allocation fund and the special allocation fund have a significant influence on regional financial performance, whereas region own-source revenue has no significant influence.

JEL Classification: H20, H50, P50
Keywords: The General Allocation Fund, The Region Own Source Revenue, The Regional Financial Performance, The Special Allocation Fund

