Anchor voiceprint recognition in live streaming via RawNet-SA and gated recurrent unit

With the rapid growth of online live streaming platforms, some anchors seek profit and popularity by mixing inappropriate content into their live programs. After being blacklisted, some of these anchors forge their identities and move to another platform to keep streaming, which does great harm to the online environment. We therefore propose an anchor voiceprint recognition method for live streaming, based on RawNet-SA and a gated recurrent unit (GRU), to identify anchors on live platforms. First, the anchor's speech is extracted from the live stream using voice activity detection (VAD) and speech separation. Then, the self-attention network RawNet-SA generates a feature sequence of the anchor's voiceprint from the speech waveform. Finally, a GRU aggregates this feature sequence into a deep voiceprint feature vector used for anchor recognition. Experiments on the VoxCeleb, CN-Celeb, and MUSAN datasets yield competitive results, which demonstrate that our method can effectively recognize anchor voiceprints in video streaming.
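The final aggregation step can be illustrated with a minimal sketch. It assumes frame-level features are already available (RawNet-SA itself is not reproduced here), uses randomly initialised weights instead of trained ones, and omits bias terms for brevity. The names `gru_aggregate`, `random_params`, and `cosine_similarity` are hypothetical and are not taken from the paper.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def random_params(input_size, hidden_size, seed=0):
    """Untrained, illustrative GRU weights (biases omitted)."""
    rng = random.Random(seed)

    def mat():
        return [[rng.uniform(-0.5, 0.5) for _ in range(input_size + hidden_size)]
                for _ in range(hidden_size)]

    return {"Wz": mat(), "Wr": mat(), "Wh": mat()}


def gru_aggregate(frames, params):
    """Run a single-layer GRU over a feature sequence and return the final
    hidden state as one fixed-length utterance-level vector."""
    h = [0.0] * len(params["Wz"])
    for x in frames:
        xh = x + h                                           # [input, previous state]
        z = [sigmoid(a) for a in matvec(params["Wz"], xh)]   # update gate
        r = [sigmoid(a) for a in matvec(params["Wr"], xh)]   # reset gate
        xrh = x + [ri * hi for ri, hi in zip(r, h)]
        cand = [math.tanh(a) for a in matvec(params["Wh"], xrh)]
        # Blend the old state with the candidate state.
        h = [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]
    return h


def cosine_similarity(a, b):
    """Score two embeddings; assumes neither is the zero vector."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))
```

Sequences of any length map to a vector of the same size, so two utterances of different duration can be compared directly, for example with `cosine_similarity`.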


Improving Perceived Quality of Live Adaptative Video Streaming

Entropy ◽

10.3390/e23080948 ◽

2021 ◽

Vol 23 (8) ◽

pp. 948

Author(s):

Carlos Eduardo Maffini Santos ◽

Carlos Alexandre Gouvea da Silva ◽

Carlos Marcelo Pedroso

Keyword(s):

Video Streaming ◽

Short Term Memory ◽

Transmission Rate ◽

Video Quality ◽

Video On Demand ◽

Live Streaming ◽

Network Congestion ◽

Congested Networks ◽

Over The Top

Quality of service (QoS) requirements are stricter for live streaming than for video-on-demand (VoD), because live streams are more sensitive to variations in delay, jitter, and packet loss. Dynamic Adaptive Streaming over HTTP (DASH) is the most popular technology for live streaming and VoD and has been massively deployed on the Internet. DASH is an over-the-top application that uses unmanaged networks to distribute content with the best possible quality. For VoD applications, it typically relies on large reception buffers to keep playback seamless. Large buffers are not acceptable in live streaming services, however, because of the delay they induce. Hence, network congestion caused by insufficient queues could decrease the user-perceived video quality. Active Queue Management (AQM) is an alternative for controlling congestion in a router’s queue: when it detects incipient congestion, it pushes the TCP traffic sources to reduce their transmission rate. As a consequence, the DASH client tends to decrease the quality of the streamed video. In this article, we evaluate the performance of recent AQM strategies for real-time adaptive video streaming and propose a new AQM algorithm that uses Long Short-Term Memory (LSTM) neural networks to improve the user-perceived video quality. The LSTM forecasts the trend of the queue delay so that packets can be discarded earlier and network congestion avoided. The results show that the proposed method outperforms the competing AQM algorithms, mainly in scenarios with congested networks.
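The idea of discarding packets on a forecast rather than on the current queue delay can be sketched as follows. This is not the authors' algorithm: a least-squares linear extrapolation stands in for the LSTM forecaster, and `forecast_delay`, `should_drop_early`, and the `target_delay_ms` threshold are hypothetical names chosen for illustration.

```python
def forecast_delay(history, horizon=1):
    """Extrapolate the queue delay `horizon` steps ahead from recent samples.

    A least-squares line is a simple stand-in for the LSTM forecaster.
    """
    n = len(history)
    if n == 0:
        return 0.0
    if n == 1:
        return float(history[0])
    mean_t = (n - 1) / 2
    mean_d = sum(history) / n
    cov = sum((t - mean_t) * (d - mean_d) for t, d in enumerate(history))
    var = sum((t - mean_t) ** 2 for t in range(n))
    slope = cov / var
    return mean_d + slope * ((n - 1 + horizon) - mean_t)


def should_drop_early(history, target_delay_ms, horizon=1):
    """Signal congestion when the *forecast* delay exceeds the target,
    even if the most recent measured delay is still below it."""
    return forecast_delay(history, horizon) > target_delay_ms
```

With a rising history such as `[10, 20, 30, 40]` and a 45 ms target, the latest sample is still under the target but the forecast is not, so the sketch signals an early drop; a flat history at 40 ms does not.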


Multi-zone indoor temperature prediction based on Graph Attention Network and Gated Recurrent Unit

10.1109/case49439.2021.9551630 ◽

2021 ◽

Author(s):

Chunxiang Zhou ◽

Zhanbo Xu ◽

Jiang Wu ◽

Kun Liu ◽

Xiaohong Guan

Keyword(s):

Temperature Prediction ◽

Indoor Temperature ◽

Attention Network ◽

Gated Recurrent Unit


Hierarchical Annotation Event Extraction Method in Multiple Scenarios

Wireless Communications and Mobile Computing ◽

10.1155/2021/8899852 ◽

2021 ◽

Vol 2021 ◽

pp. 1-9

Author(s):

Shi Wang ◽

Zhujun Wang ◽

Yi Jiang ◽

Huayu Wang

Keyword(s):

Random Field ◽

Extraction Method ◽

Feature Vector ◽

Conditional Random Field ◽

Event Extraction ◽

Event Type ◽

Event Trigger ◽

Pipeline Model ◽

Gated Recurrent Unit ◽

Multiple Scenarios

In event extraction, a corpus may contain multiple scenarios, and an argument may play different roles under different triggers. The traditional tagging scheme can tag each word only once, so it cannot handle argument overlap. We propose a hierarchical tagging pipeline model for Chinese corpora, based on the pretrained model BERT, that obtains the relevant arguments of each event hierarchically. The model uses a pipeline structure that divides event extraction into event trigger classification and argument recognition. First, the pretrained BERT model generates the feature vector and passes it to a bidirectional gated recurrent unit + conditional random field (BiGRU+CRF) model for trigger classification. Then, the labeled event type features are spliced into the corpus as known features and passed into BiGRU+CRF for argument recognition. We evaluated our method on DuEE, combined with data augmentation and masking. Experimental results show that our method improves on the other baselines, which demonstrates the effectiveness of the model on Chinese corpora.
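The argument-overlap problem and the hierarchical remedy can be shown with a small sketch of the tagging scheme alone. It does not implement BERT or BiGRU+CRF; the English sentence, event types, roles, and the helper names `bio_tags` and `hierarchical_tags` are all hypothetical and chosen only for illustration.

```python
def bio_tags(n_tokens, spans):
    """Build one BIO tag sequence from (start, end_exclusive, role) spans."""
    tags = ["O"] * n_tokens
    for start, end, role in spans:
        tags[start] = "B-" + role
        for i in range(start + 1, end):
            tags[i] = "I-" + role
    return tags


def hierarchical_tags(n_tokens, events):
    """Produce one tag layer per event type, so the same token can carry
    a different role in each layer."""
    return {event_type: bio_tags(n_tokens, spans)
            for event_type, spans in events.items()}


# Hypothetical example: "company" is an argument of two different events.
tokens = ["The", "company", "acquired", "a", "startup",
          "and", "fired", "its", "founder"]
events = {
    "Acquire": [(1, 2, "Buyer"), (4, 5, "Target")],
    "Dismiss": [(1, 2, "Employer"), (8, 9, "Employee")],
}
layers = hierarchical_tags(len(tokens), events)
for event_type, tags in layers.items():
    print(event_type, list(zip(tokens, tags)))
```

A single flat tag sequence would have to choose between `B-Buyer` and `B-Employer` for "company"; keeping a separate layer per event type avoids that conflict.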


Analysis of Live Video Streaming Throughput for Indihome Internet Service Users at Different Screen Resolutions

Jurnal Litek : Jurnal Listrik Telekomunikasi Elektronika ◽

10.30811/litek.v17i1.1779 ◽

2020 ◽

Vol 17 (1) ◽

pp. 9

Author(s):

Tihajar Sri Mauliya ◽

Hanafi Hanafi ◽

Hanafi Hanafi

Keyword(s):

Video Streaming ◽

Live Streaming ◽

Transfer Data ◽

Minimum Bandwidth

The throughput available to internet service users affects the quality of the service for data transfer, especially for video streaming, which depends on fairly high throughput. In this study, video streaming throughput was measured on the client side of the network for one week. Measurements were taken on the Indihome service of PT. Telkom in Lhokseumawe City. The measured streams were videos at screen resolutions of 360p, 480p, 720p, and 1080p on the website lk21.org. Four clients accessed the website simultaneously during the measurements. Throughput was measured with the Wireshark software. The results show that the average throughput per client was in the good category when streaming at 360p, 480p, and 720p, because the average throughput for each of these videos remained above the minimum bandwidth requirement for its video type. At 1080p the result was in the poor category, except when only a single client accessed the stream, in which case the throughput obtained was above the minimum bandwidth requirement for 1080p video.
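The comparison described here reduces to simple arithmetic, sketched below. The function names and the example figures are hypothetical; the abstract does not state the measured throughput or the minimum bandwidth assumed for each resolution, so none are supplied here.

```python
def throughput_mbps(total_bytes, duration_s):
    """Average throughput in megabits per second over a capture interval."""
    if duration_s <= 0:
        raise ValueError("duration_s must be positive")
    return total_bytes * 8 / duration_s / 1_000_000


def rate_quality(measured_mbps, minimum_mbps):
    """Classify a stream as 'good' when the measured throughput is above
    the minimum bandwidth required for its resolution, else 'poor'."""
    return "good" if measured_mbps > minimum_mbps else "poor"


# Hypothetical capture: 7.5 MB received in 10 s, checked against a
# placeholder 5 Mbit/s requirement (not a figure from the study).
measured = throughput_mbps(7_500_000, 10)
print(measured, rate_quality(measured, 5.0))
```

In practice the byte count and duration would come from a packet capture, such as the Wireshark traces mentioned in the abstract.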


HAN-ReGRU: hierarchical attention network with residual gated recurrent unit for emotion recognition in conversation

Neural Computing and Applications ◽

10.1007/s00521-020-05063-7 ◽

2020 ◽

Author(s):

Hui Ma ◽

Jian Wang ◽

Lingfei Qian ◽

Hongfei Lin

Keyword(s):

Emotion Recognition ◽

Attention Network ◽

Gated Recurrent Unit


Examining the Dorsolateral and Ventromedial Prefrontal Cortex Involvement in the Self-Attention Network: A Randomized, Sham-Controlled, Parallel Group, Double-Blind, and Multichannel HD-tDCS Study

Frontiers in Neuroscience ◽

10.3389/fnins.2020.00683 ◽

2020 ◽

Vol 14 ◽

Author(s):

Víctor Martínez-Pérez ◽

Guillermo Campoy ◽

Lucía B. Palmero ◽

Luis J. Fuentes

Keyword(s):

Prefrontal Cortex ◽

Ventromedial Prefrontal Cortex ◽

The Self ◽

Double Blind ◽

Attention Network ◽

Parallel Group


An Optimal Feature Parameter Set Based on Gated Recurrent Unit Recurrent Neural Networks for Speech Segment Detection

Applied Sciences ◽

10.3390/app10041273 ◽

2020 ◽

Vol 10 (4) ◽

pp. 1273 ◽

Cited By ~ 5

Author(s):

Özlem BATUR DİNLER ◽

Nizamettin AYDIN

Keyword(s):

Neural Networks ◽

Recurrent Neural Networks ◽

Feature Vector ◽

Processing Parameters ◽

Speech Segmentation ◽

Hybrid Features ◽

Speech Segment ◽

Experimental Findings ◽

Gated Recurrent Unit ◽

Optimal Feature

This study investigated speech segment detection for the Kurdish language based on gated recurrent unit (GRU) recurrent neural networks. Its novelties are the use of a GRU in Kurdish speech segment detection, the creation of a unique database for the Kurdish language, and the optimization of processing parameters for Kurdish speech segmentation. This study is the first attempt to find the optimal feature parameters of the model and to form a large Kurdish vocabulary dataset for speech segment detection based on consonant, vowel, and silence (C/V/S) discrimination. For this purpose, four window sizes and three window types were used with three hybrid feature vector techniques to describe the phoneme boundaries. The phoneme boundaries were identified using a GRU recurrent neural network with six different classification algorithms for the C/V/S discrimination. We demonstrate that the GRU model achieves outstanding speech segmentation performance in characterizing Kurdish acoustic signals. The experimental findings show the significance of segment detection in speech signals when hybrid features, window sizes, window types, and classification models are used effectively for Kurdish speech.
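How window size and window type enter the feature extraction can be sketched as follows. The abstract does not say which sizes or window types were used, so the rectangular, Hamming, and Hann windows here are assumptions, as are the names `window` and `frame_signal`.

```python
import math


def window(kind, n):
    """Return n window coefficients for the given window type (n >= 2)."""
    if n < 2:
        raise ValueError("window length must be at least 2")
    if kind == "rectangular":
        return [1.0] * n
    if kind == "hamming":
        return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]
    if kind == "hann":
        return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]
    raise ValueError("unknown window type: " + kind)


def frame_signal(samples, size, hop, kind):
    """Split a signal into overlapping frames and apply the window to each."""
    coeffs = window(kind, size)
    return [[s * c for s, c in zip(samples[start:start + size], coeffs)]
            for start in range(0, len(samples) - size + 1, hop)]
```

Each windowed frame would then be turned into a feature vector and classified as consonant, vowel, or silence. Changing `size` or `kind` changes the time and frequency resolution of those features, which is why both are treated as tunable parameters.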


Relevance, valence, and the self-attention network

Cognitive Neuroscience ◽

10.1080/17588928.2015.1075489 ◽

2015 ◽

Vol 7 (1-4) ◽

pp. 27-28 ◽

Cited By ~ 1

Author(s):

Bradley D. Mattan ◽

Kimberly A. Quinn ◽

Pia Rotshtein

Keyword(s):

The Self ◽

Attention Network


System architecture for ubiquitous live video streaming in university network environment

2013 Africon ◽

10.1109/afrcon.2013.6757639 ◽

2013 ◽

Cited By ~ 3

Author(s):

Angeline G. Dludla ◽

Mncedisi J. Bembe ◽

Badamsuren Byambaakhuu ◽

Mohammed-Sani Abdulai ◽

Jae Jeung Rho

Keyword(s):

Video Streaming ◽

System Architecture ◽

Network Environment ◽

Live Video ◽

Live Video Streaming


Applying the HTB and Diffserv Methods to Improve QoS in Video Streaming Services

JURNAL FASILKOM ◽

10.37859/jf.v9i3.1665 ◽

2019 ◽

Vol 9 (3) ◽

pp. 35-40

Author(s):

Mitra Unik ◽

Soni Soni ◽

Randra Aguslan Pratama

Keyword(s):

Quality Of Service ◽

Video Streaming ◽

Packet Loss ◽

Digital Video ◽

Streaming Media ◽

Fundamental Problem ◽

Streaming Video ◽

Live Streaming ◽

Delay Jitter

One of the most popular internet services today is video streaming, either live (live streaming) or pre-recorded. Streaming video is a type of streaming media in which data from video files is transmitted continuously over the internet to remote users. Its fundamental problem is driven mainly by limited network infrastructure resources, which cause poor video quality. Digital video communication consumes considerable resources, mostly because of the bandwidth needed to send video and audio signals. Maintaining the quality of the video being played requires several things, one of which is a data connection with Quality of Service (QoS). The parameters used to measure QoS are delay, jitter, packet loss, and throughput. This study uses the PPDIO method as its workflow, with a Network Lifecycle approach. The study found that many factors influence video quality, namely network factors and hardware factors. The test results are not absolute, so later tests may differ. Encoding also affects video quality. Bandwidth is distributed according to priority when the link is fully loaded with traffic. A comparative analysis of the QoS parameters under the HTB and Diffserv methods shows that throughput, jitter, and delay do not differ greatly between clients. Keywords: Video Streaming, Diffserv, HTB, QoS
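The four QoS parameters named in the abstract can be computed from per-packet records as in the sketch below. The record layout and the function name `qos_metrics` are hypothetical, and jitter is taken here as the mean absolute difference between consecutive one-way delays, which is one common definition and not necessarily the one used in the study.

```python
def qos_metrics(sent_count, received):
    """Compute delay, jitter, packet loss, and throughput.

    `received` is a list of (send_time_s, recv_time_s, size_bytes) tuples
    for the packets that arrived, in arrival order.
    """
    if sent_count <= 0 or not received:
        raise ValueError("need at least one sent and one received packet")
    delays = [recv - send for send, recv, _ in received]
    avg_delay = sum(delays) / len(delays)
    # Jitter: mean absolute change in delay between consecutive packets.
    diffs = [abs(b - a) for a, b in zip(delays, delays[1:])]
    jitter = sum(diffs) / len(diffs) if diffs else 0.0
    loss_pct = 100.0 * (sent_count - len(received)) / sent_count
    # Throughput: received bits over the span from first send to last arrival.
    span = max(recv for _, recv, _ in received) - min(send for send, _, _ in received)
    total_bits = sum(size for _, _, size in received) * 8
    throughput_bps = total_bits / span if span > 0 else 0.0
    return {
        "delay_s": avg_delay,
        "jitter_s": jitter,
        "loss_pct": loss_pct,
        "throughput_bps": throughput_bps,
    }
```

Running the same calculation on captures taken under each queuing method gives directly comparable figures per client.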
