A Computationally Efficient Algorithm for Large Scale Near-Duplicate Video Detection

Author(s):  
Dawei Liu ◽  
Zhihua Yu
2015 ◽  
Vol 75 (23) ◽  
pp. 15665-15678 ◽  
Author(s):  
Woogyoung Jun ◽  
Yillbyung Lee ◽  
Byoung-Min Jun

Author(s):  
B. Aparna ◽  
S. Madhavi ◽  
G. Mounika ◽  
P. Avinash ◽  
S. Chakravarthi

We propose a new design for large-scale multimedia content protection systems. Our design leverages cloud infrastructures to provide cost efficiency, rapid deployment, scalability, and elasticity to accommodate varying workloads. The proposed system can be used to protect different multimedia content types, including videos, images, audio clips, songs, and music clips. The system can be deployed on private and/or public clouds. Our system has two novel components: (i) method to create signatures of videos, and (ii) distributed matching engine for multimedia objects. The signature method creates robust and representative signatures of videos that capture the depth signals in these videos and it is computationally efficient to compute and compare as well as it requires small storage. The distributed matching engine achieves high scalability and it is designed to support different multimedia objects. We implemented the proposed system and deployed it on two clouds: Amazon cloud and our private cloud. Our experiments with more than 11,000 videos and 1 million images show the high accuracy and scalability of the proposed system. In addition, we compared our system to the protection system used by YouTube and our results show that the YouTube protection system fails to detect most copies of videos, while our system detects more than 98% of them.


2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Hongyi Zhang ◽  
Xiaowei Zhan ◽  
Bo Li

AbstractSimilarity in T-cell receptor (TCR) sequences implies shared antigen specificity between receptors, and could be used to discover novel therapeutic targets. However, existing methods that cluster T-cell receptor sequences by similarity are computationally inefficient, making them impractical to use on the ever-expanding datasets of the immune repertoire. Here, we developed GIANA (Geometric Isometry-based TCR AligNment Algorithm) a computationally efficient tool for this task that provides the same level of clustering specificity as TCRdist at 600 times its speed, and without sacrificing accuracy. GIANA also allows the rapid query of large reference cohorts within minutes. Using GIANA to cluster large-scale TCR datasets provides candidate disease-specific receptors, and provides a new solution to repertoire classification. Querying unseen TCR-seq samples against an existing reference differentiates samples from patients across various cohorts associated with cancer, infectious and autoimmune disease. Our results demonstrate how GIANA could be used as the basis for a TCR-based non-invasive multi-disease diagnostic platform.


Author(s):  
Mahdi Esmaily Moghadam ◽  
Yuri Bazilevs ◽  
Tain-Yen Hsia ◽  
Alison Marsden

A closed-loop lumped parameter network (LPN) coupled to a 3D domain is a powerful tool that can be used to model the global dynamics of the circulatory system. Coupling a 0D LPN to a 3D CFD domain is a numerically challenging problem, often associated with instabilities, extra computational cost, and loss of modularity. A computationally efficient finite element framework has been recently proposed that achieves numerical stability without sacrificing modularity [1]. This type of coupling introduces new challenges in the linear algebraic equation solver (LS), producing an strong coupling between flow and pressure that leads to an ill-conditioned tangent matrix. In this paper we exploit this strong coupling to obtain a novel and efficient algorithm for the linear solver (LS). We illustrate the efficiency of this method on several large-scale cardiovascular blood flow simulation problems.


2014 ◽  
Vol 26 (4) ◽  
pp. 781-817 ◽  
Author(s):  
Ching-Pei Lee ◽  
Chih-Jen Lin

Linear rankSVM is one of the widely used methods for learning to rank. Although its performance may be inferior to nonlinear methods such as kernel rankSVM and gradient boosting decision trees, linear rankSVM is useful to quickly produce a baseline model. Furthermore, following its recent development for classification, linear rankSVM may give competitive performance for large and sparse data. A great deal of works have studied linear rankSVM. The focus is on the computational efficiency when the number of preference pairs is large. In this letter, we systematically study existing works, discuss their advantages and disadvantages, and propose an efficient algorithm. We discuss different implementation issues and extensions with detailed experiments. Finally, we develop a robust linear rankSVM tool for public use.


Sign in / Sign up

Export Citation Format

Share Document