scholarly journals Coresets for the Average Case Error for Finite Query Sets

Sensors ◽  
2021 ◽  
Vol 21 (19) ◽  
pp. 6689
Author(s):  
Alaa Maalouf ◽  
Ibrahim Jubran ◽  
Murad Tukan ◽  
Dan Feldman

Coreset is usually a small weighted subset of an input set of items, that provably approximates their loss function for a given set of queries (models, classifiers, hypothesis). That is, the maximum (worst-case) error over all queries is bounded. To obtain smaller coresets, we suggest a natural relaxation: coresets whose average error over the given set of queries is bounded. We provide both deterministic and randomized (generic) algorithms for computing such a coreset for any finite set of queries. Unlike most corresponding coresets for the worst-case error, the size of the coreset in this work is independent of both the input size and its Vapnik–Chervonenkis (VC) dimension. The main technique is to reduce the average-case coreset into the vector summarization problem, where the goal is to compute a weighted subset of the n input vectors which approximates their sum. We then suggest the first algorithm for computing this weighted subset in time that is linear in the input size, for n≫1/ε, where ε is the approximation error, improving, e.g., both [ICML’17] and applications for principal component analysis (PCA) [NIPS’16]. Experimental results show significant and consistent improvement also in practice. Open source code is provided.

Author(s):  
Adrijan Božinovski ◽  
George Tanev ◽  
Biljana Stojčevska ◽  
Veno Pačovski ◽  
Nevena Ackovska

This paper presents the space complexity analysis of the Binary Tree Roll algorithm. The space complexity is analyzed theoretically and the results are then confirmed empirically. The theoretical analysis consists of determining the amount of memory occupied during the execution of the algorithm and deriving functions of it, in terms of the number of nodes of the tree n, for the worst - and best-case scenarios. The empirical analysis of the space complexity consists of measuring the maximum and minimum amounts of memory occupied during the execution of the algorithm, for all binary tree topologies with the given number of nodes. The space complexity is shown, both theoretically and empirically, to be logarithmic in the best case and linear in the worst case, whereas its average case is shown to be dominantly logarithmic.


Sorting is the basic activity in the field of computer science and it is commonly used in searching for information and data. The main goal of sorting is to make reports or records easier to edit, delete and search, etc. It organizes the given data in any sequence. There are many sorting algorithms like insertion sort, bubble sort, radix sort, heap sort, and so forth. Bubble sort and insertion sort are clearly described with algorithms and examples. In this paper, the bubble sort and insertion sort performance analysis is carried out by calculating the time complexity. These algorithm time complexities have been calculated by implementing in the rust and python languages and observed the best case, average case, and worst case. The flowchart shows the complete workflow of this study. The results have been shown graphically and time complexity has been shown in a tabular form. We have compared the efficiency of bubble sort and insertion sort algorithms in the rust and python platforms. The rust language is more efficient than python for both bubble and insertion sort algorithms. However, it is observed insertion sort is more efficient than the bubble sort algorithm.


Author(s):  
Sunil Pathak

Background: The significant work has been present to identify suspects, gathering information and examining any videos from CCTV Footage. This exploration work expects to recognize suspicious exercises, i.e. object trade, passage of another individual, peeping into other's answer sheet and individual trade from the video caught by a reconnaissance camera amid examinations. This requires the procedure of face acknowledgment, hand acknowledgment and distinguishing the contact between the face and hands of a similar individual and that among various people. Methods: Segmented frames has given as input to obtain foreground image with the help of Gaussian filtering and background modeling method. Suh foreground images has given to Activity Recognition model to detect normal activity or suspicious activity. Results: Accuracy rate, Precision and Recall are calculate for activities detection, contact detection for Best Case, Average Case and Worst Case. Simulation results are compare with performance parameter such as Material Exchange, Position Exchange, and Introduction of a new person, Face and Hand Detection and Multi Person Scenario. Conclusion: In this paper, a framework is prepared for suspect detection. This framework will absolutely realize an unrest in the field of security observation in the training area.


Entropy ◽  
2021 ◽  
Vol 23 (6) ◽  
pp. 773
Author(s):  
Amichai Painsky ◽  
Meir Feder

Learning and making inference from a finite set of samples are among the fundamental problems in science. In most popular applications, the paradigmatic approach is to seek a model that best explains the data. This approach has many desirable properties when the number of samples is large. However, in many practical setups, data acquisition is costly and only a limited number of samples is available. In this work, we study an alternative approach for this challenging setup. Our framework suggests that the role of the train-set is not to provide a single estimated model, which may be inaccurate due to the limited number of samples. Instead, we define a class of “reasonable” models. Then, the worst-case performance in the class is controlled by a minimax estimator with respect to it. Further, we introduce a robust estimation scheme that provides minimax guarantees, also for the case where the true model is not a member of the model class. Our results draw important connections to universal prediction, the redundancy-capacity theorem, and channel capacity theory. We demonstrate our suggested scheme in different setups, showing a significant improvement in worst-case performance over currently known alternatives.


2021 ◽  
Vol 68 (4) ◽  
pp. 1-25
Author(s):  
Thodoris Lykouris ◽  
Sergei Vassilvitskii

Traditional online algorithms encapsulate decision making under uncertainty, and give ways to hedge against all possible future events, while guaranteeing a nearly optimal solution, as compared to an offline optimum. On the other hand, machine learning algorithms are in the business of extrapolating patterns found in the data to predict the future, and usually come with strong guarantees on the expected generalization error. In this work, we develop a framework for augmenting online algorithms with a machine learned predictor to achieve competitive ratios that provably improve upon unconditional worst-case lower bounds when the predictor has low error. Our approach treats the predictor as a complete black box and is not dependent on its inner workings or the exact distribution of its errors. We apply this framework to the traditional caching problem—creating an eviction strategy for a cache of size k . We demonstrate that naively following the oracle’s recommendations may lead to very poor performance, even when the average error is quite low. Instead, we show how to modify the Marker algorithm to take into account the predictions and prove that this combined approach achieves a competitive ratio that both (i) decreases as the predictor’s error decreases and (ii) is always capped by O (log k ), which can be achieved without any assistance from the predictor. We complement our results with an empirical evaluation of our algorithm on real-world datasets and show that it performs well empirically even when using simple off-the-shelf predictions.


2014 ◽  
Vol 2014 ◽  
pp. 1-11
Author(s):  
Wei Zhou ◽  
Zilong Tan ◽  
Shaowen Yao ◽  
Shipu Wang

Resource location in structured P2P system has a critical influence on the system performance. Existing analytical studies of Chord protocol have shown some potential improvements in performance. In this paper a splay tree-based new Chord structure called SChord is proposed to improve the efficiency of locating resources. We consider a novel implementation of the Chord finger table (routing table) based on the splay tree. This approach extends the Chord finger table with additional routing entries. Adaptive routing algorithm is proposed for implementation, and it can be shown that hop count is significantly minimized without introducing any other protocol overheads. We analyze the hop count of the adaptive routing algorithm, as compared to Chord variants, and demonstrate sharp upper and lower bounds for both worst-case and average case settings. In addition, we theoretically analyze the hop reducing in SChord and derive the fact that SChord can significantly reduce the routing hops as compared to Chord. Several simulations are presented to evaluate the performance of the algorithm and support our analytical findings. The simulation results show the efficiency of SChord.


Algorithmica ◽  
2021 ◽  
Author(s):  
Jie Zhang

AbstractApart from the principles and methodologies inherited from Economics and Game Theory, the studies in Algorithmic Mechanism Design typically employ the worst-case analysis and design of approximation schemes of Theoretical Computer Science. For instance, the approximation ratio, which is the canonical measure of evaluating how well an incentive-compatible mechanism approximately optimizes the objective, is defined in the worst-case sense. It compares the performance of the optimal mechanism against the performance of a truthful mechanism, for all possible inputs. In this paper, we take the average-case analysis approach, and tackle one of the primary motivating problems in Algorithmic Mechanism Design—the scheduling problem (Nisan and Ronen, in: Proceedings of the 31st annual ACM symposium on theory of computing (STOC), 1999). One version of this problem, which includes a verification component, is studied by Koutsoupias (Theory Comput Syst 54(3):375–387, 2014). It was shown that the problem has a tight approximation ratio bound of $$(n+1)/2$$ ( n + 1 ) / 2 for the single-task setting, where n is the number of machines. We show, however, when the costs of the machines to executing the task follow any independent and identical distribution, the average-case approximation ratio of the mechanism given by Koutsoupias (Theory Comput Syst 54(3):375–387, 2014) is upper bounded by a constant. This positive result asymptotically separates the average-case ratio from the worst-case ratio. It indicates that the optimal mechanism devised for a worst-case guarantee works well on average.


2010 ◽  
Vol 5 (1) ◽  
pp. 78-88 ◽  
Author(s):  
Marcelo Porto ◽  
André Silva ◽  
Sergo Almeida ◽  
Eduardo Da Costa ◽  
Sergio Bampi

This paper presents real time HDTV (High Definition Television) architecture for Motion Estimation (ME) using efficient adder compressors. The architecture is based on the Quarter Sub-sampled Diamond Search algorithm (QSDS) with Dynamic Iteration Control (DIC) algorithm. The main characteristic of the proposed architecture is the large amount of Processing Units (PUs) that are used to calculate the SAD (Sum of Absolute Difference) metric. The internal structures of the PUs are composed by a large number of addition operations to calculate the SADs. In this paper, efficient 4-2 and 8-2 adder compressors are used in the PUs architecture to achieve the performance to work with HDTV (High Definition Television) videos in real time at 30 frames per second. These adder compressors enable the simultaneous addition of 4 and 8 operands respectively. The PUs, using adder compressors, were applied to the ME architecture. The implemented architecture was described in VHDL and synthesized to FPGA and, with Leonardo Spectrum tool, to the TSMC 0.18μm CMOS standard cell technology. Synthesis results indicate that the new QSDS-DIC architecture reach the best performance result and enable gains of 12% in terms of processing rate. The architecture can reach real time for full HDTV (1920x1080 pixels) in the worst case processing 65 frames per second, and it can process 269 HDTV frames per second in the average case.


Sign in / Sign up

Export Citation Format

Share Document