Cut-Based Graph Learning Networks to Discover Compositional Structure of Sequential Video Data

Kyoung-Woon On; Eun-Sol Kim; Yu-Jung Heo; Byoung-Tak Zhang

doi:10.1609/aaai.v34i04.5978

Proceedings of the AAAI Conference on Artificial Intelligence ◽

2020 ◽

Vol 34 (04) ◽

pp. 5315-5322

Author(s):

Kyoung-Woon On ◽

Eun-Sol Kim ◽

Yu-Jung Heo ◽

Byoung-Tak Zhang

Keyword(s):

Recurrent Neural Networks ◽

Message Passing ◽

Video Data ◽

Sequential Data ◽

Complex Structures ◽

Learning Networks ◽

Graph Learning ◽

First Order ◽

Compositional Structure ◽

Dependency Structures

Conventional sequential learning methods such as Recurrent Neural Networks (RNNs) focus on interactions between consecutive inputs, i.e., first-order Markovian dependency. However, most sequential data, such as videos, have complex dependency structures that imply variable-length semantic flows and their compositions, which conventional methods struggle to capture. Here, we propose Cut-Based Graph Learning Networks (CB-GLNs), which learn from video data by discovering these complex structures. CB-GLNs represent video data as a graph whose nodes correspond to video frames and whose edges correspond to the dependencies between them. CB-GLNs find compositional dependencies of the data in multilevel graph forms via a parameterized kernel with graph cuts and a message passing framework. We evaluate the proposed method on two video understanding tasks: video theme classification (YouTube-8M dataset (Abu-El-Haija et al. 2016)) and video question answering (TVQA dataset (Lei et al. 2018)). The experimental results show that our model efficiently learns the semantic compositional structure of video data. Furthermore, our model achieves the highest performance among the compared baseline methods.

Music emotion recognition using recurrent neural networks and pretrained models

Journal of Intelligent Information Systems ◽

10.1007/s10844-021-00658-5 ◽

2021 ◽

Author(s):

Jacek Grekow

Keyword(s):

Neural Networks ◽

Recurrent Neural Networks ◽

Short Term Memory ◽

Recurrent Network ◽

Sequential Data ◽

Circumplex Model ◽

Learning Networks ◽

Audio Features ◽

Svm Algorithm ◽

Audio Feature Extraction

The article presents experiments using recurrent neural networks for emotion detection in musical segments. Trained regression models were used to predict continuous emotion values on the axes of Russell’s circumplex model. The process of extracting audio features and creating sequential data for training networks with long short-term memory (LSTM) units is presented. Models were implemented using the WekaDeeplearning4j package, and a number of experiments were carried out on data with different feature sets and varying segmentation. The experiments demonstrated the usefulness of dividing the data into sequences and the value of using recurrent networks to recognize emotions in music, with results that exceeded even those of the SVM algorithm for regression. The author analyzed the effect of the network structure and the feature set on the results of regressors predicting values on the two axes of the emotion model: arousal and valence. Finally, the use of a pretrained model for processing audio features and training a recurrent network with new sequences of features is presented.

A Nonstationary Hidden Markov Model with Approximately Infinitely-Long Time-Dependencies

International Journal of Artificial Intelligence Tools ◽

10.1142/s0218213016400017 ◽

2016 ◽

Vol 25 (05) ◽

pp. 1640001 ◽

Cited By ~ 3

Author(s):

Sotirios Chatzis ◽

Dimitrios Kosmopoulos ◽

George Papadourakis

Keyword(s):

Markov Models ◽

Hidden Markov ◽

Mean Field ◽

Sequential Data ◽

First Order ◽

Order Markov Chain ◽

Inference Algorithms ◽

Unrealistic Assumption ◽

Long Time ◽

Time Dependencies

Hidden Markov models (HMMs) are a popular approach for modeling sequential data, typically based on the assumption of a first-order Markov chain. In other words, only one-step-back dependencies are modeled, which is a rather unrealistic assumption in most applications. In this paper, we propose a method for postulating HMMs with approximately infinitely-long time-dependencies. Our approach considers the whole history of model states in the postulated dependencies by making use of a recently proposed nonparametric Bayesian method for modeling label sequences with infinitely-long time dependencies, namely the sequence memoizer. By employing a mean-field-like approximation, we derive training and inference algorithms for our model with computational costs identical to those of simple first-order HMMs, despite the infinitely-long time-dependencies the model entails. The efficacy of our proposed model is experimentally demonstrated.
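
The first-order assumption described in this abstract can be written out explicitly. The following is a sketch in notation assumed here rather than taken from the paper: $z_t$ is the hidden state and $x_t$ the observation at time $t$. The first line is the standard first-order HMM factorization. The second shows the kind of full-history transition term that a model with unbounded state dependencies would use in its place; keeping the emission term unchanged is an assumption made for illustration.

```latex
\begin{align}
p(x_{1:T}, z_{1:T}) &= p(z_1) \prod_{t=2}^{T} p(z_t \mid z_{t-1}) \prod_{t=1}^{T} p(x_t \mid z_t)
  && \text{(first-order HMM)} \\
p(x_{1:T}, z_{1:T}) &= p(z_1) \prod_{t=2}^{T} p(z_t \mid z_{1:t-1}) \prod_{t=1}^{T} p(x_t \mid z_t)
  && \text{(full state history)}
\end{align}
```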

Parallel Nonnegative Matrix Factorization via Newton Iteration

Parallel Processing Letters ◽

10.1142/s0129626416500146 ◽

2016 ◽

Vol 26 (03) ◽

pp. 1650014 ◽

Cited By ~ 3

Author(s):

Markus Flatz ◽

Marián Vajteršic

Keyword(s):

Shared Memory ◽

Matrix Factorization ◽

Message Passing ◽

Nonnegative Matrix Factorization ◽

Nonnegative Matrix ◽

Newton Iteration ◽

Parallel Execution ◽

Kkt Conditions ◽

Nonnegative Matrices ◽

First Order

The goal of Nonnegative Matrix Factorization (NMF) is to represent a large nonnegative matrix in an approximate way as a product of two significantly smaller nonnegative matrices. This paper shows in detail how an NMF algorithm based on Newton iteration can be derived using the general Karush-Kuhn-Tucker (KKT) conditions for first-order optimality. This algorithm is suited for parallel execution on systems with shared memory and also with message passing. Both versions were implemented and tested, delivering satisfactory speedup results.
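
As a point of reference for the KKT conditions mentioned in this abstract, the block below states them for one common NMF formulation. The abstract does not give the cost function, so the Frobenius-norm objective and the symbols $A$, $W$, $H$, $m$, $n$, $k$ are assumptions made for illustration, and the block does not reproduce the paper's Newton iteration.

```latex
% Assumed problem: A is m x n and nonnegative, W is m x k, H is k x n.
\begin{align}
\min_{W \ge 0,\; H \ge 0} \; f(W, H) &= \tfrac{1}{2} \lVert A - WH \rVert_F^2 \\
\nabla_W f = (WH - A) H^{\top}, \qquad \nabla_H f &= W^{\top} (WH - A) \\
W \ge 0, \quad \nabla_W f \ge 0, \quad W \odot \nabla_W f &= 0 \\
H \ge 0, \quad \nabla_H f \ge 0, \quad H \odot \nabla_H f &= 0
\end{align}
```

Here $\odot$ is the elementwise product, and the inequalities hold elementwise. The last two lines are the first-order optimality (KKT) conditions for this assumed objective.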

An Edge-Sense Bidirectional Pyramid Network for Stereo Matching of VHR Remote Sensing Images

Remote Sensing ◽

10.3390/rs12244025 ◽

2020 ◽

Vol 12 (24) ◽

pp. 4025

Author(s):

Rongshu Tao ◽

Yuming Xiang ◽

Hongjian You

Keyword(s):

Remote Sensing ◽

Stereo Matching ◽

Tall Buildings ◽

Disparity Estimation ◽

Complex Structures ◽

Learning Networks ◽

Remote Sensing Images ◽

Essential Step ◽

Disparity Range ◽

The Cost

As an essential step in 3D reconstruction, stereo matching still faces significant problems due to the high resolution and complex structures of remote sensing images. Precise disparity estimation is especially difficult, yet important, in occluded areas around tall buildings and in textureless areas such as water and woods. In this paper, we develop a novel edge-sense bidirectional pyramid stereo matching network to address these problems. The cost volume is constructed from negative to positive disparities, since the disparity range in remote sensing images varies greatly and traditional deep learning networks only work well for positive disparities. Then, occlusion-aware maps based on the forward-backward consistency assumption are applied to reduce the influence of occluded areas. Moreover, we design an edge-sense smoothness loss to improve performance in textureless areas while maintaining the main structure. The proposed network is compared with two baselines, DenseMapNet and PSMNet. The experimental results show that our method outperforms both in terms of averaged endpoint error (EPE) and the fraction of erroneous pixels (D1), and that the improvements in occluded and textureless areas are significant.
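
The two metrics named in this abstract can be illustrated with a short sketch. The abstract does not define the thresholds behind D1, so the rule used here is an assumption borrowed from the common KITTI-style convention: a pixel counts as erroneous when its error exceeds both 3 pixels and 5% of the true disparity. The function names and sample values are hypothetical.

```python
def endpoint_error(pred, truth):
    """Averaged endpoint error (EPE): mean absolute disparity difference."""
    errors = [abs(p - t) for p, t in zip(pred, truth)]
    return sum(errors) / len(errors)


def d1_fraction(pred, truth, abs_thresh=3.0, rel_thresh=0.05):
    """Fraction of erroneous pixels under an ASSUMED KITTI-style rule:
    error > abs_thresh pixels AND error > rel_thresh * |true disparity|."""
    bad = 0
    for p, t in zip(pred, truth):
        err = abs(p - t)
        # abs(t) is used because disparities may be negative in this setting.
        if err > abs_thresh and err > rel_thresh * abs(t):
            bad += 1
    return bad / len(pred)


# Hypothetical flattened disparity maps (predicted vs. ground truth).
pred = [10.0, -4.0, 25.0, 0.5]
truth = [10.5, -8.0, 25.0, 0.0]
print(endpoint_error(pred, truth))
print(d1_fraction(pred, truth))
```

Taking the absolute value of the true disparity in the relative threshold is a design choice for data with both negative and positive disparities, as in the cost volume described above.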

Discovering the Compositional Structure of Vector Representations with Role Learning Networks

10.18653/v1/2020.blackboxnlp-1.23 ◽

2020 ◽

Author(s):

Paul Soulos ◽

R. Thomas McCoy ◽

Tal Linzen ◽

Paul Smolensky

Keyword(s):

Learning Networks ◽

Compositional Structure ◽

Role Learning ◽

Vector Representations

Penerapan K-Means Clustering Untuk Seleksi Frame Dominan Berbasis NTSC Pada Obyek Bergerak [Application of K-Means Clustering for NTSC-Based Dominant Frame Selection on Moving Objects]

Jurnal Teknologi Informasi dan Ilmu Komputer ◽

10.25126/jtiik.2020742184 ◽

2020 ◽

Vol 7 (4) ◽

pp. 745

Author(s):

Rizka Indah Armianti ◽

Achmad Fanany Onnilita Gaffar ◽

Arief Bramanto Wicaksono Putra

Keyword(s):

Feature Extraction ◽

Video Data ◽

Clustering Method ◽

Data Frame ◽

First Order ◽

Feature Extraction And Selection ◽

Different Shapes ◽

Frame Set ◽

Selection Of

An object is considered to be moving if the position of its dimensions changes from frame to frame. Movement causes the object's shape pattern to differ in each frame. The frame with the best pattern among all frames is called the dominant frame. This study aims to select the dominant frame from a sequence of frames by applying K-means clustering to obtain the dominant centroid (the centroid with the highest value), which serves as the basis for the selection. Dominant frame selection has four main stages: data acquisition, object pattern determination, feature extraction, and selection. The input is video data. Object patterns are determined using digital image processing operations, yielding RGB object patterns, from which NTSC-based features are extracted using a first-order statistic, the mean. Feature extraction yields 93 data frames, which are then grouped into 3 clusters using K-means. The dominant centroid lies in cluster 3, which has a centroid value of 0.0177 and contains 41 data frames. The distance from every member of cluster 3 to the centroid is then measured, and the frame closest to the centroid is the dominant frame. Among the 41 data frames, the three smallest distances are 0.0008, 0.0010, and 0.0010, belonging to the 59th, 36th, and 35th frames, respectively.
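
The selection procedure described in this abstract (cluster the per-frame features, take the cluster with the highest centroid, then pick the member closest to that centroid) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: each frame is reduced to a single scalar feature, the initial centroids are chosen deterministically from evenly spaced order statistics, and the function names and sample values are hypothetical.

```python
def kmeans_1d(values, k=3, iterations=100):
    """Plain 1-D k-means with deterministic initial centroids (an assumption)."""
    ordered = sorted(values)
    # Hypothetical initialisation: k evenly spaced order statistics.
    centroids = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assignment step: each value goes to its nearest centroid.
        new_labels = [min(range(k), key=lambda c: abs(v - centroids[c])) for v in values]
        # Update step: each centroid becomes the mean of its members
        # (an empty cluster keeps its previous centroid).
        for c in range(k):
            members = [v for v, lab in zip(values, new_labels) if lab == c]
            if members:
                centroids[c] = sum(members) / len(members)
        if new_labels == labels:
            break
        labels = new_labels
    return centroids, labels


def dominant_frame(features, k=3):
    """Return (frame index, centroid) for the frame nearest the highest centroid."""
    centroids, labels = kmeans_1d(features, k)
    # Dominant cluster: highest centroid among clusters that have members.
    dominant = max(set(labels), key=lambda c: centroids[c])
    members = [i for i, lab in enumerate(labels) if lab == dominant]
    best = min(members, key=lambda i: abs(features[i] - centroids[dominant]))
    return best, centroids[dominant]


# Hypothetical per-frame scalar features (index = frame number, 0-based).
features = [0.2, 0.21, 0.19, 0.5, 0.52, 0.9, 0.95, 0.91, 0.88]
best, centroid = dominant_frame(features)
print(best, centroid)
```

With real data, the feature list would hold one value per frame (93 in the study), and the returned index would identify the dominant frame.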

ONLINE GRAPH LEARNING FROM SEQUENTIAL DATA

2018 IEEE Data Science Workshop (DSW) ◽

10.1109/dsw.2018.8439913 ◽

2018 ◽

Cited By ~ 9

Author(s):

Stefan Vlaski ◽

Hermina P. Maretic ◽

Roula Nassif ◽

Pascal Frossard ◽

Ali H. Sayed

Keyword(s):

Sequential Data ◽

Graph Learning

MOYAL STAR PRODUCT OF μ-HOLOMORPHIC j-DIFFERENTIALS

International Journal of Geometric Methods in Modern Physics ◽

10.1142/s0219887808002837 ◽

2008 ◽

Vol 05 (03) ◽

pp. 363-373

Author(s):

M. KACHKACHI

Keyword(s):

Riemann Surface ◽

Riemann Surfaces ◽

Quantum Effect ◽

Star Product ◽

Complex Structure ◽

Complex Structures ◽

Star Products ◽

First Order ◽

Conformal Fields ◽

Conformal Covariance

It was shown in [1], only for scalar conformal fields, that the Moyal–Weyl star product can introduce the quantum effect as a phase factor to the ordinary product. In this paper we show that, even on the same complex structure, the Moyal–Weyl star product of two j-differentials (conformal fields of weights (j, 0)) does not vanish but generates the quantum effect at the first order of its perturbative series. More generally, we obtain the explicit expression of the Moyal–Weyl star product of j-differentials defined on any complex structure of a two-dimensional Riemann surface Σ. We show that the star product of two j-differentials is not a j-differential and does not preserve conformal covariance. This can shed some light on the connection between the Moyal–Weyl deformation quantization procedure and the deformation of complex structures on a Riemann surface. Hence, the star products might be related to the moduli and Teichmüller spaces of Riemann surfaces.

EXPERIMENTAL COMPARISON OF THE EFFECT OF ORDER IN RECURRENT NEURAL NETWORKS

International Journal of Pattern Recognition and Artificial Intelligence ◽

10.1142/s0218001493000431 ◽

1993 ◽

Vol 07 (04) ◽

pp. 849-872 ◽

Cited By ~ 30

Author(s):

CLIFFORD B. MILLER ◽

C. LEE GILES

Keyword(s):

Neural Networks ◽

Recurrent Neural Networks ◽

Internal State ◽

Second Order ◽

Convergence Time ◽

Experimental Comparison ◽

Grammatical Inference ◽

Neural Net ◽

First Order ◽

Finite State

There has been much interest in increasing the computational power of neural networks. In addition, there has been much interest in “designing” neural networks better suited to particular problems. Increasing the “order” of the connectivity of a neural network permits both. Though order has played a significant role in feedforward neural networks, its role in dynamically driven recurrent networks is still being understood. This work explores the effect of order in learning grammars. We present an experimental comparison of first order and second order recurrent neural networks, as applied to the task of grammatical inference. We show that for the small grammars studied these two neural net architectures have comparable learning and generalization power, and that both are reasonably capable of extracting the correct finite state automata for the language in question. However, for a larger randomly-generated ten-state grammar, second order networks significantly outperformed the first order networks, both in convergence time and generalization capability. We show that these networks learn faster the more neurons they have (our experiments used up to 10 hidden neurons), but that the solutions found by smaller networks are usually of better quality (in terms of generalization performance after training). Second order nets have the advantage that they converge more quickly to a solution and can find it more reliably than first order nets, but their solutions tend to be of poorer quality than those of first order nets if both architectures are trained to the same error tolerance. Despite this, second order nets can more successfully extract finite state machines using heuristic clustering techniques applied to the internal state representations. We speculate that this may be due to restrictions on the ability of the first order architecture to make full use of its internal state representation power, and that this may have implications for the performance of the two architectures when scaled up to larger problems.

A Hierarchical Classification of First-Order Recurrent Neural Networks

The Chinese Journal of Physiology ◽

10.4077/cjp.2010.amm037 ◽

2010 ◽

Vol 53 (6) ◽

pp. 407-416 ◽

Cited By ~ 6

Author(s):

Jérémie Cabessa

Keyword(s):

Neural Networks ◽

Recurrent Neural Networks ◽

Hierarchical Classification ◽

First Order
