VIDEO RETRIEVAL VIA LEARNING COLLABORATIVE SEMANTIC DISTANCE

Author(s):  
SONGHAO ZHU ◽  
ZHIWEI LIANG ◽  
XIAOYUAN JING

Graph-based semi-supervised learning approaches have proven effective and efficient at addressing the scarcity of labeled data in many real-world application areas, such as video annotation. However, the pairwise similarity metric, a significant factor in existing approaches, has not been fully investigated: these graph-based semi-supervised approaches estimate the pairwise similarity between samples mainly according to the spatial properties of video data, while the temporal property, an essential characteristic of video data, is not embedded into the similarity measure. Accordingly, a novel framework for video annotation, called Joint Spatio-Temporal Correlation Learning (JSTCL), is proposed in this paper. The framework is characterized by simultaneously taking the spatial and temporal properties of video data into account to achieve more accurate pairwise similarity values. We apply the proposed framework to video annotation and report superior performance compared to key existing approaches on the benchmark TRECVID data set.
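
As a rough illustration only (not the paper's exact formulation), a graph affinity that couples spatial (feature) distance with temporal distance between samples could be sketched as below; the descriptors, timestamps and bandwidths sigma_s and sigma_t are assumed placeholders.

```python
import numpy as np

def spatio_temporal_affinity(features, timestamps, sigma_s=1.0, sigma_t=1.0):
    """Toy pairwise similarity combining spatial (feature) and temporal closeness.

    features:   (n, d) array of per-sample visual descriptors (placeholder)
    timestamps: (n,) array of temporal positions of the samples (placeholder)
    """
    diff = features[:, None, :] - features[None, :, :]
    spatial = np.exp(-np.sum(diff ** 2, axis=-1) / (2 * sigma_s ** 2))
    dt = timestamps[:, None] - timestamps[None, :]
    temporal = np.exp(-dt ** 2 / (2 * sigma_t ** 2))
    return spatial * temporal  # joint affinity used to build the label-propagation graph

W = spatio_temporal_affinity(np.random.rand(5, 16), np.arange(5.0))
```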

2021 ◽  
Author(s):  
ElMehdi SAOUDI ◽  
Said Jai Andaloussi

Abstract With the rapid growth of video data volumes and the development of multimedia technologies, it has become necessary to browse and search accurately and quickly through the information stored in large multimedia databases. For this purpose, content-based video retrieval (CBVR) has become an active area of research over the last decade. In this paper, we propose a content-based video retrieval system that returns similar videos from a large multimedia data set given a query video. The approach uses vector motion-based signatures to describe the visual content and machine learning techniques to extract key frames for rapid browsing and efficient video indexing. We have implemented the proposed approach both on a single machine and on a real-time distributed cluster to evaluate real-time performance, especially when the number and size of videos are large. Experiments performed on several benchmark action and activity recognition data sets show the effectiveness of the proposed method in both accuracy and processing time compared to state-of-the-art methods.
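
A minimal sketch of one way to pick key frames by clustering per-frame descriptors and keeping the frame closest to each cluster centre; the descriptor source and the number of key frames are assumptions, not the authors' motion-signature pipeline.

```python
import numpy as np
from sklearn.cluster import KMeans

def select_key_frames(frame_descriptors, n_keyframes=8):
    """Keep one representative frame per cluster of frame descriptors (assumed scheme)."""
    km = KMeans(n_clusters=n_keyframes, n_init=10, random_state=0)
    labels = km.fit_predict(frame_descriptors)
    key_idx = []
    for c in range(n_keyframes):
        members = np.where(labels == c)[0]
        dists = np.linalg.norm(frame_descriptors[members] - km.cluster_centers_[c], axis=1)
        key_idx.append(int(members[np.argmin(dists)]))
    return sorted(key_idx)

frames = np.random.rand(200, 64)  # stand-in for per-frame motion-based signatures
print(select_key_frames(frames))
```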


2015 ◽  
Vol 3 (3) ◽  
pp. 1-13 ◽  
Author(s):  
Hiroki Nomiya ◽  
Atsushi Morikuni ◽  
Teruhisa Hochin

A lifelog video retrieval framework is proposed for better utilization of large amounts of lifelog video data. The proposed method retrieves emotional scenes, such as scenes in which a person in the video is smiling, on the grounds that an important event is likely to occur in most emotional scenes. Emotional scenes are detected on the basis of facial expression recognition using a wide variety of facial features. The authors adopt an unsupervised learning approach called ensemble clustering to recognize facial expressions, because supervised learning approaches require sufficient training data, which makes them quite troublesome to apply to large-scale video databases. The retrieval performance of the proposed method is evaluated through an emotional scene detection experiment from the viewpoints of accuracy and efficiency. In addition, a prototype retrieval system is implemented based on the proposed emotional scene detection method.
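
A minimal sketch of co-association ensemble clustering, assuming k-means base partitions and hierarchical clustering of the consensus matrix; the facial feature vectors and cluster counts are placeholders, and this is not necessarily the authors' exact algorithm.

```python
import numpy as np
from sklearn.cluster import KMeans
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

def ensemble_cluster(X, n_runs=20, k_range=(3, 8), n_final=4, seed=0):
    """Co-association consensus clustering sketch over several k-means runs."""
    rng = np.random.default_rng(seed)
    n = len(X)
    coassoc = np.zeros((n, n))
    for _ in range(n_runs):
        k = int(rng.integers(k_range[0], k_range[1] + 1))
        labels = KMeans(n_clusters=k, n_init=5,
                        random_state=int(rng.integers(1_000_000))).fit_predict(X)
        coassoc += labels[:, None] == labels[None, :]   # count co-clustered pairs
    dist = 1.0 - coassoc / n_runs                       # consensus distance
    np.fill_diagonal(dist, 0.0)
    Z = linkage(squareform(dist, checks=False), method="average")
    return fcluster(Z, t=n_final, criterion="maxclust")

X = np.random.rand(60, 10)  # stand-in for facial feature vectors
print(ensemble_cluster(X))
```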


2019 ◽  
Vol 2019 (1) ◽  
pp. 360-368
Author(s):  
Mekides Assefa Abebe ◽  
Jon Yngve Hardeberg

Different whiteboard image degradations greatly reduce the legibility of pen-stroke content as well as the overall quality of the images. Consequently, several researchers have addressed the problem through image enhancement techniques. Most state-of-the-art approaches apply common image processing techniques such as background-foreground segmentation, text extraction, contrast and color enhancement, and white balancing. However, such conventional enhancement methods are incapable of recovering severely degraded pen-stroke content and produce artifacts in the presence of complex pen-stroke illustrations. To surmount these problems, the authors propose a deep learning based solution. They contribute a new whiteboard image data set and adopt two deep convolutional neural network architectures for whiteboard image quality enhancement. Their evaluations of the trained models demonstrate superior performance over the conventional methods.
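
Purely as a sketch of the kind of image-to-image network involved (layer sizes and depth are invented placeholders, not the two architectures adopted in the paper), a tiny encoder-decoder enhancer could look like this:

```python
import torch
import torch.nn as nn

class TinyEnhancer(nn.Module):
    """Toy encoder-decoder for whiteboard-style image enhancement (placeholder net)."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1), nn.Sigmoid(),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))  # enhanced RGB image in [0, 1]

out = TinyEnhancer()(torch.rand(1, 3, 128, 128))
```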


2018 ◽  
Author(s):  
Peter De Wolf ◽  
Zhuangqun Huang ◽  
Bede Pittenger

Abstract Methods are available to measure conductivity, charge, surface potential, carrier density, piezoelectric and other electrical properties with nanometer-scale resolution. One of these methods, scanning microwave impedance microscopy (sMIM), has gained interest due to its capability to measure the full impedance (capacitive and resistive parts) with high sensitivity and high spatial resolution. This paper introduces a novel data-cube approach that combines sMIM imaging and sMIM point spectroscopy into an integrated and complete 3D data set. It replaces the subjective practice of guessing locations of interest (for single-point spectroscopy) with a big-data approach that yields higher-dimensional data which can be sliced along any axis or plane and is conducive to principal component analysis or other machine learning approaches to data reduction. The data-cube approach is also applicable to other AFM-based electrical characterization modes.
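
A minimal numpy sketch of what a data-cube workflow might look like, assuming a hypothetical 64 x 64 pixel scan with a 128-point sMIM spectrum per pixel; the shapes and the PCA step are illustrative, not the instrument's actual output format.

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical data cube: 64 x 64 pixels, 128-point sMIM spectrum at each pixel.
cube = np.random.rand(64, 64, 128)

image_at_bias = cube[:, :, 40]       # slice: a 2D sMIM image at one spectral point
spectrum_at_pixel = cube[10, 20, :]  # slice: the full spectrum at one (x, y) location

# Simple data reduction: PCA over the per-pixel spectra.
spectra = cube.reshape(-1, cube.shape[-1])           # (pixels, spectral points)
scores = PCA(n_components=3).fit_transform(spectra)  # (pixels, 3)
component_maps = scores.reshape(64, 64, 3)           # three spatial maps of PC scores
```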


2019 ◽  
Vol 277 ◽  
pp. 02024 ◽  
Author(s):  
Lincan Li ◽  
Tong Jia ◽  
Tianqi Meng ◽  
Yizhe Liu

In this paper, an accurate two-stage deep learning method is proposed to detect vulnerable plaques in ultrasonic images of the cardiovascular system. First, a fully convolutional network (FCN) named U-Net is used to segment the original intravascular optical coherence tomography (IVOCT) cardiovascular images, and we experiment with different threshold values to find the best threshold for removing noise and background from the original images. Second, a modified Faster R-CNN is adopted for precise detection. The modified Faster R-CNN utilizes six-scale anchors (12², 16², 32², 64², 128², 256²) instead of the conventional one-scale or three-scale approaches. We first present three problems in cardiovascular vulnerable plaque diagnosis and then demonstrate how our method solves them. The proposed method applies deep convolutional neural networks to the whole diagnostic procedure. Test results show a recall rate, precision rate, IoU (intersection-over-union) rate and total score of 0.94, 0.885, 0.913 and 0.913 respectively, higher than the first-place team of the CCCV2017 Cardiovascular OCT Vulnerable Plaque Detection Challenge. The AP of the designed Faster R-CNN is 83.4%, higher than conventional approaches that use one-scale or three-scale anchors. These results demonstrate the superior performance of the proposed method and the power of deep learning approaches in diagnosing cardiovascular vulnerable plaques.
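
A small sketch of generating base anchors at the six scales mentioned above; the aspect ratios and the centred (x1, y1, x2, y2) convention are assumptions for illustration, not the paper's exact Faster R-CNN configuration.

```python
import numpy as np

def make_anchors(scales=(12, 16, 32, 64, 128, 256), ratios=(0.5, 1.0, 2.0)):
    """Centred base anchors (x1, y1, x2, y2) for each scale/aspect-ratio pair."""
    anchors = []
    for s in scales:           # six scales instead of the usual three
        area = float(s) ** 2   # anchor areas 12^2 ... 256^2
        for r in ratios:
            w = np.sqrt(area / r)
            h = w * r
            anchors.append([-w / 2, -h / 2, w / 2, h / 2])
    return np.array(anchors)

print(make_anchors().shape)  # (18, 4): 6 scales x 3 aspect ratios
```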


2020 ◽  
Vol 70 (5) ◽  
pp. 1211-1230
Author(s):  
Abdus Saboor ◽  
Hassan S. Bakouch ◽  
Fernando A. Moala ◽  
Sheraz Hussain

Abstract In this paper, a bivariate extension of the exponentiated Fréchet distribution is introduced, namely the bivariate exponentiated Fréchet (BvEF) distribution, whose marginals are univariate exponentiated Fréchet distributions. Several properties of the proposed distribution are discussed, such as the joint survival function, joint probability density function, marginal probability density functions, conditional probability density function, moments, and marginal and bivariate moment generating functions. The proposed distribution is obtained via the Marshall-Olkin survival copula. Estimation of the parameters is investigated by maximum likelihood with the observed information matrix. In addition to the maximum likelihood estimation method, we consider Bayesian inference and least-squares estimation and compare these three methodologies for the BvEF distribution. A simulation study is carried out to compare the performance of the estimators obtained by the presented estimation methods. The proposed bivariate distribution, together with other related bivariate distributions, is fitted to a real-life paired data set. It is shown that the BvEF distribution has superior performance among the compared distributions according to several goodness-of-fit tests.
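
As a rough sketch using generic textbook forms (the paper's exact parameterization may differ), a bivariate survival function with exponentiated Fréchet marginals can be composed through the Marshall-Olkin survival copula:

```latex
\begin{align*}
\bar F_i(x) &= \bigl[\,1 - e^{-(\sigma_i/x)^{\lambda_i}}\bigr]^{\alpha_i},\quad x>0
  && \text{(exponentiated Fr\'echet marginal survival)}\\
\hat C(u,v) &= \min\bigl(u^{1-a}v,\; u\,v^{1-b}\bigr),\quad a,b\in[0,1]
  && \text{(Marshall--Olkin survival copula)}\\
\bar F(x,y) &= \hat C\bigl(\bar F_1(x),\,\bar F_2(y)\bigr)
  && \text{(joint survival function of the bivariate model)}
\end{align*}
```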


Mathematics ◽  
2021 ◽  
Vol 9 (8) ◽  
pp. 807
Author(s):  
Carlos M. Castorena ◽  
Itzel M. Abundez ◽  
Roberto Alejo ◽  
Everardo E. Granda-Gutiérrez ◽  
Eréndira Rendón ◽  
...  

The problem of gender-based violence in Mexico has increased considerably. Many social associations and governmental institutions have addressed this problem in different ways. In the context of computer science, some effort has been made to deal with this problem through machine learning approaches that strengthen strategic decision making. In this work, a deep learning neural network application to identify gender-based violence in Twitter messages is presented. A total of 1,857,450 messages (generated in Mexico) were downloaded from Twitter; 61,604 of them were manually tagged by human volunteers as negative, positive or neutral messages, to serve as training and test data sets. The results presented in this paper show the effectiveness of the deep neural network (about 80% area under the receiver operating characteristic curve) in detecting gender violence in Twitter messages. The main contribution of this investigation is that the data set was minimally pre-processed, in contrast to most state-of-the-art approaches: the original messages were converted into numerical vectors according to word frequency, and only adverbs, conjunctions and prepositions were removed, since these words occur very frequently in text and, in our view, do not contribute to discriminatory messages on Twitter. Finally, this work contributes to dealing with gender violence in Mexico, an issue that needs to be faced immediately.
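
A minimal sketch of the kind of pipeline described: word-frequency vectors with a hand-picked list of function words removed, fed to a small neural classifier. The stop-word list, toy messages, labels and network size are placeholders, not the authors' data or architecture.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.neural_network import MLPClassifier

# Placeholder list standing in for adverbs, conjunctions and prepositions to drop.
drop_words = ["y", "o", "pero", "de", "en", "con", "para", "muy", "no"]

messages = ["ejemplo de mensaje uno", "otro mensaje de ejemplo", "mensaje neutral"]
labels = ["negative", "positive", "neutral"]  # toy labels, not the real annotations

vectorizer = CountVectorizer(stop_words=drop_words)  # word-frequency vectors
X = vectorizer.fit_transform(messages)

clf = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=0)
clf.fit(X, labels)
print(clf.predict(vectorizer.transform(["mensaje de ejemplo"])))
```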


2021 ◽  
Vol 13 (1) ◽  
Author(s):  
Philipp Rentzsch ◽  
Max Schubach ◽  
Jay Shendure ◽  
Martin Kircher

Abstract Background Splicing of genomic exons into mRNAs is a critical prerequisite for the accurate synthesis of human proteins. Genetic variants impacting splicing underlie a substantial proportion of genetic disease, but are challenging to identify beyond those occurring at donor and acceptor dinucleotides. To address this, various methods aim to predict variant effects on splicing. Recently, deep neural networks (DNNs) have been shown to achieve better results in predicting splice variants than other strategies. Methods It has been unclear how best to integrate such process-specific scores into genome-wide variant effect predictors. Here, we use a recently published experimental data set to compare several machine learning methods that score variant effects on splicing. We integrate the best of those approaches into general variant effect prediction models and observe the effect on classification of known pathogenic variants. Results We integrate two specialized splicing scores into CADD (Combined Annotation Dependent Depletion; cadd.gs.washington.edu), a widely used tool for genome-wide variant effect prediction that we previously developed to weight and integrate diverse collections of genomic annotations. With this new model, CADD-Splice, we show that inclusion of splicing DNN effect scores substantially improves predictions across multiple variant categories, without compromising overall performance. Conclusions While splice effect scores show superior performance on splice variants, specialized predictors cannot compete with other variant scores in general variant interpretation, as the latter account for nonsense and missense effects that do not alter splicing. Although only shown here for splice scores, we believe that the applied approach will generalize to other specific molecular processes, providing a path for the further improvement of genome-wide variant effect prediction.
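
Purely illustrative sketch (invented features and labels, not CADD's actual training pipeline) of folding process-specific splice scores into a general variant-effect model as additional features of a linear scorer:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 1000

# Hypothetical per-variant annotations: generic features plus two splice DNN scores.
conservation = rng.random(n)
coding_consequence = rng.integers(0, 2, n)
splice_score_a = rng.random(n)   # e.g. a donor/acceptor DNN score (placeholder)
splice_score_b = rng.random(n)   # a second splice model score (placeholder)
X = np.column_stack([conservation, coding_consequence, splice_score_a, splice_score_b])
y = rng.integers(0, 2, n)        # toy proxy-deleterious labels

model = LogisticRegression(max_iter=1000).fit(X, y)  # combined annotation model
print(model.coef_)  # splice features contribute alongside the general annotations
```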


2019 ◽  
Vol 78 (5) ◽  
pp. 617-628 ◽  
Author(s):  
Erika Van Nieuwenhove ◽  
Vasiliki Lagou ◽  
Lien Van Eyck ◽  
James Dooley ◽  
Ulrich Bodenhofer ◽  
...  

Objectives Juvenile idiopathic arthritis (JIA) is the most common class of childhood rheumatic diseases, with distinct disease subsets that may have diverging pathophysiological origins. Both adaptive and innate immune processes have been proposed as primary drivers, which may account for the observed clinical heterogeneity, but few high-depth studies have been performed. Methods Here we profiled the adaptive immune system of 85 patients with JIA and 43 age-matched controls with in-depth flow cytometry and machine learning approaches. Results Immune profiling identified immunological changes in patients with JIA. This immune signature was shared across a broad spectrum of childhood inflammatory diseases. The immune signature was identified in clinically distinct subsets of JIA, but was accentuated in patients with systemic JIA and in patients with active disease. Despite the extensive overlap in the immunological spectrum exhibited by healthy children and patients with JIA, machine learning analysis of the data set proved capable of discriminating patients with JIA from healthy controls with ~90% accuracy. Conclusions These results pave the way for large-scale longitudinal immune phenotyping studies of JIA. The ability to discriminate between patients with JIA and healthy individuals provides proof of principle for the use of machine learning to identify immune signatures that are predictive of treatment response group.
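
A toy sketch of the kind of cross-validated classification involved; the immune-profile features, labels and classifier choice are placeholders, not the study's actual analysis.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
# Toy immune-profile matrix: 128 subjects x 40 flow-cytometry features (placeholders).
X = rng.random((128, 40))
y = np.r_[np.ones(85, dtype=int), np.zeros(43, dtype=int)]  # 85 JIA patients, 43 controls

clf = RandomForestClassifier(n_estimators=200, random_state=0)
scores = cross_val_score(clf, X, y, cv=5)  # cross-validated discrimination accuracy
print(scores.mean())
```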


2020 ◽  
Vol 2020 ◽  
pp. 1-8
Author(s):  
Chen Zhang ◽  
Bin Hu ◽  
Yucong Suo ◽  
Zhiqiang Zou ◽  
Yimu Ji

In this paper, we study the challenge of image-to-video retrieval, which uses a query image to search for relevant frames in a large collection of videos. A novel framework based on convolutional neural networks (CNNs) is proposed to perform large-scale video retrieval with low storage cost and high search efficiency. Our framework consists of a key-frame extraction algorithm and a feature aggregation strategy. Specifically, the key-frame extraction algorithm takes advantage of a clustering idea so that redundant information is removed from the video data and storage cost is greatly reduced. The feature aggregation strategy adopts average pooling to encode deep local convolutional features, followed by coarse-to-fine retrieval, which allows rapid retrieval in a large-scale video database. Results from extensive experiments on two publicly available datasets demonstrate that the proposed method achieves superior efficiency as well as accuracy over other state-of-the-art visual search methods.
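
A minimal sketch of the feature-aggregation idea: average-pool a deep convolutional feature map into a global descriptor and rank database key-frames by cosine similarity. The backbone features and shapes are placeholders, and the coarse-to-fine stage is reduced to a single ranking pass.

```python
import numpy as np

def global_descriptor(feature_map):
    """Average-pool a (C, H, W) conv feature map into an L2-normalised vector."""
    v = feature_map.reshape(feature_map.shape[0], -1).mean(axis=1)
    return v / (np.linalg.norm(v) + 1e-12)

# Placeholder CNN feature maps for a query image and the database key-frames.
query = global_descriptor(np.random.rand(256, 7, 7))
keyframes = np.stack([global_descriptor(np.random.rand(256, 7, 7)) for _ in range(1000)])

scores = keyframes @ query        # cosine similarity (vectors are normalised)
top10 = np.argsort(-scores)[:10]  # coarse ranking; a finer re-rank would follow
print(top10)
```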

