A Computationally Efficient Algorithm for Large Scale Near-Duplicate Video Detection

We propose a new design for large-scale multimedia content protection systems. Our design leverages cloud infrastructures to provide cost efficiency, rapid deployment, scalability, and elasticity to accommodate varying workloads. The proposed system can be used to protect different multimedia content types, including videos, images, audio clips, songs, and music clips, and it can be deployed on private and/or public clouds. Our system has two novel components: (i) a method to create signatures of videos, and (ii) a distributed matching engine for multimedia objects. The signature method creates robust and representative signatures of videos that capture the depth signals in these videos; the signatures are computationally efficient to compute and compare, and they require little storage. The distributed matching engine achieves high scalability and is designed to support different multimedia objects. We implemented the proposed system and deployed it on two clouds: the Amazon cloud and our private cloud. Our experiments with more than 11,000 videos and 1 million images show the high accuracy and scalability of the proposed system. In addition, we compared our system to the protection system used by YouTube, and our results show that the YouTube protection system fails to detect most copies of videos, while our system detects more than 98% of them.

GIANA allows computationally-efficient TCR clustering and multi-disease repertoire classification by isometric transformation

Nature Communications ◽

10.1038/s41467-021-25006-7 ◽

2021 ◽

Vol 12 (1) ◽

Author(s):

Hongyi Zhang ◽

Xiaowei Zhan ◽

Bo Li

Keyword(s):

T Cell ◽

T Cell Receptor ◽

Large Scale ◽

Cell Receptor ◽

Alignment Algorithm ◽

Computationally Efficient ◽

Antigen Specificity ◽

Non Invasive ◽

Isometric Transformation ◽

Specific Receptors

Similarity in T-cell receptor (TCR) sequences implies shared antigen specificity between receptors, and could be used to discover novel therapeutic targets. However, existing methods that cluster T-cell receptor sequences by similarity are computationally inefficient, making them impractical to use on the ever-expanding datasets of the immune repertoire. Here, we developed GIANA (Geometric Isometry-based TCR AligNment Algorithm), a computationally efficient tool for this task that provides the same level of clustering specificity as TCRdist at 600 times its speed, without sacrificing accuracy. GIANA also allows the rapid query of large reference cohorts within minutes. Using GIANA to cluster large-scale TCR datasets provides candidate disease-specific receptors, and provides a new solution to repertoire classification. Querying unseen TCR-seq samples against an existing reference differentiates samples from patients across various cohorts associated with cancer, infectious and autoimmune disease. Our results demonstrate how GIANA could be used as the basis for a TCR-based non-invasive multi-disease diagnostic platform.

An Efficient Preconditioner for Linear System Solution in Multi-Domain Modeling of the Circulatory System

Volume 1A: Abdominal Aortic Aneurysms; Active and Reactive Soft Matter; Atherosclerosis; BioFluid Mechanics; Education; Biotransport Phenomena; Bone, Joint and Spine Mechanics; Brain Injury; Cardiac Mechanics; Cardiovascular Devices, Fluids and Imaging; Cartilage and Disc Mechanics; Cell and Tissue Engineering; Cerebral Aneurysms; Computational Biofluid Dynamics; Device Design, Human Dynamics, and Rehabilitation; Drug Delivery and Disease Treatment; Engineered Cellular Environments ◽

10.1115/sbc2013-14392 ◽

2013 ◽

Author(s):

Mahdi Esmaily Moghadam ◽

Yuri Bazilevs ◽

Tain-Yen Hsia ◽

Alison Marsden

Keyword(s):

Strong Coupling ◽

Large Scale ◽

Computational Cost ◽

Circulatory System ◽

Flow Simulation ◽

Global Dynamics ◽

Domain Modeling ◽

Computationally Efficient ◽

Lumped Parameter ◽

System Solution

A closed-loop lumped parameter network (LPN) coupled to a 3D domain is a powerful tool that can be used to model the global dynamics of the circulatory system. Coupling a 0D LPN to a 3D CFD domain is a numerically challenging problem, often associated with instabilities, extra computational cost, and loss of modularity. A computationally efficient finite element framework has been recently proposed that achieves numerical stability without sacrificing modularity [1]. This type of coupling introduces new challenges in the linear algebraic equation solver (LS), producing a strong coupling between flow and pressure that leads to an ill-conditioned tangent matrix. In this paper we exploit this strong coupling to obtain a novel and efficient algorithm for the LS. We illustrate the efficiency of this method on several large-scale cardiovascular blood flow simulation problems.

An efficient algorithm for web service selection based on local selection in large scale

2017 IEEE 8th International Conference on Awareness Science and Technology (iCAST) ◽

10.1109/icawst.2017.8256443 ◽

2017 ◽

Author(s):

Sheng Zhang ◽

Incheon Paik

Keyword(s):

Web Service ◽

Efficient Algorithm ◽

Large Scale ◽

Service Selection ◽

Web Service Selection ◽

Local Selection

Near-Duplicate Video Detection Using Temporal Patterns of Semantic Concepts

2009 11th IEEE International Symposium on Multimedia ◽

10.1109/ism.2009.93 ◽

2009 ◽

Cited By ~ 8

Author(s):

Hyun-seok Min ◽

JaeYoung Choi ◽

Wesley De Neve ◽

Yong Man Ro

Keyword(s):

Temporal Patterns ◽

Video Detection ◽

Semantic Concepts ◽

Duplicate Video

Large-Scale Linear RankSVM

Neural Computation ◽

10.1162/neco_a_00571 ◽

2014 ◽

Vol 26 (4) ◽

pp. 781-817 ◽

Cited By ~ 48

Author(s):

Ching-Pei Lee ◽

Chih-Jen Lin

Keyword(s):

Decision Trees ◽

Computational Efficiency ◽

Efficient Algorithm ◽

Large Scale ◽

Learning To Rank ◽

Gradient Boosting ◽

Baseline Model ◽

Nonlinear Methods ◽

Advantages And Disadvantages ◽

Linear Ranksvm

Linear rankSVM is one of the most widely used methods for learning to rank. Although its performance may be inferior to nonlinear methods such as kernel rankSVM and gradient boosting decision trees, linear rankSVM is useful for quickly producing a baseline model. Furthermore, following its recent development for classification, linear rankSVM may give competitive performance for large and sparse data. Many works have studied linear rankSVM, focusing on computational efficiency when the number of preference pairs is large. In this letter, we systematically study existing works, discuss their advantages and disadvantages, and propose an efficient algorithm. We discuss different implementation issues and extensions with detailed experiments. Finally, we develop a robust linear rankSVM tool for public use.
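
To make the role of preference pairs concrete, here is a minimal sketch of the standard pairwise hinge-loss (L1-loss) objective commonly associated with linear rankSVM. It is not the algorithm proposed in the letter; the function names (`preference_pairs`, `ranksvm_objective`) and the toy data are assumptions made for illustration. The naive loop visits every pair explicitly, which is the cost that becomes prohibitive when the number of pairs is large.

```python
from itertools import combinations


def dot(w, x):
    """Inner product of two equal-length sequences."""
    return sum(wi * xi for wi, xi in zip(w, x))


def preference_pairs(relevance, query_ids):
    """Return (i, j) index pairs where i and j belong to the same query
    and item i is strictly more relevant than item j."""
    pairs = []
    for i, j in combinations(range(len(relevance)), 2):
        if query_ids[i] != query_ids[j]:
            continue  # only items within one query are compared
        if relevance[i] > relevance[j]:
            pairs.append((i, j))
        elif relevance[j] > relevance[i]:
            pairs.append((j, i))
    return pairs


def ranksvm_objective(w, X, pairs, C=1.0):
    """0.5 * ||w||^2 + C * sum over pairs of max(0, 1 - w.(x_i - x_j)).

    Naive evaluation: one hinge term per preference pair.
    """
    reg = 0.5 * dot(w, w)
    loss = 0.0
    for i, j in pairs:
        margin = dot(w, X[i]) - dot(w, X[j])
        loss += max(0.0, 1.0 - margin)
    return reg + C * loss


# Toy data (hypothetical): three items from a single query.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
relevance = [2, 1, 0]
query_ids = [0, 0, 0]

pairs = preference_pairs(relevance, query_ids)
value = ranksvm_objective([1.0, -1.0], X, pairs, C=1.0)
print(pairs, value)
```

With n items in one query and all relevance levels distinct, the number of pairs is n(n-1)/2, so the naive evaluation grows quadratically in n.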

An efficient algorithm for large scale global optimization of continuous functions

Journal of Computational and Applied Mathematics ◽

10.1016/j.cam.2006.09.006 ◽

2007 ◽

Vol 206 (2) ◽

pp. 1015-1026 ◽

Cited By ~ 8

Author(s):

Yong-Jun Wang ◽

Jiang-She Zhang

Keyword(s):

Global Optimization ◽

Efficient Algorithm ◽

Large Scale ◽

Continuous Functions
