Anchor voiceprint recognition in live streaming via RawNet-SA and gated recurrent unit

Author(s):  
Jiacheng Yao ◽  
Jing Zhang ◽  
Jiafeng Li ◽  
Li Zhuo

Abstract: With the rapid growth of online live streaming platforms, some anchors seek profit and popularity by mixing inappropriate content into their live programs. After being blacklisted, some of these anchors forge their identities and move to other platforms to continue streaming, causing great harm to the network environment. We therefore propose an anchor voiceprint recognition method for live streaming, based on RawNet-SA and a gated recurrent unit (GRU), to identify anchors on live platforms. First, the anchor's speech is extracted from the live stream using voice activation detection (VAD) and speech separation. Then, the self-attention network RawNet-SA generates a voiceprint feature sequence from the speech waveform. Finally, a GRU aggregates the feature sequence into a deep voiceprint feature vector for anchor recognition. Experiments on the VoxCeleb, CN-Celeb, and MUSAN datasets give competitive results, demonstrating that the method can effectively recognize anchor voiceprints in video streaming.
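The final aggregation step can be illustrated with a minimal sketch of a single GRU layer that reduces a sequence of frame-level feature vectors to one fixed-length vector (the last hidden state). This is not the authors' implementation: the weights are random rather than trained, biases are omitted, and the names `gru_aggregate` and `hidden_size` are hypothetical.

```python
import math
import random

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def init(rows, cols, rng):
    """Small random weights; a trained model would load learned weights."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def gru_aggregate(frames, hidden_size, seed=0):
    """Run a bias-free GRU over `frames` and return the last hidden state."""
    rng = random.Random(seed)
    dim = len(frames[0])
    Wz, Wr, Wh = (init(hidden_size, dim, rng) for _ in range(3))
    Uz, Ur, Uh = (init(hidden_size, hidden_size, rng) for _ in range(3))
    h = [0.0] * hidden_size
    for x in frames:
        # Update gate z and reset gate r.
        z = [sigmoid(a + b) for a, b in zip(matvec(Wz, x), matvec(Uz, h))]
        r = [sigmoid(a + b) for a, b in zip(matvec(Wr, x), matvec(Ur, h))]
        # Candidate state uses the reset-gated previous state.
        rh = [ri * hi for ri, hi in zip(r, h)]
        cand = [math.tanh(a + b) for a, b in zip(matvec(Wh, x), matvec(Uh, rh))]
        # Blend previous state and candidate.
        h = [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]
    return h

# Three toy 3-dimensional "voiceprint" frames (made-up values).
frames = [[0.1, 0.2, 0.3], [0.0, -0.1, 0.4], [0.5, 0.5, 0.5]]
embedding = gru_aggregate(frames, hidden_size=4)
```

Whatever the sequence length, the output has `hidden_size` elements, which is what allows utterances of different durations to be compared as vectors.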

Entropy ◽  
2021 ◽  
Vol 23 (8) ◽  
pp. 948
Author(s):  
Carlos Eduardo Maffini Santos ◽  
Carlos Alexandre Gouvea da Silva ◽  
Carlos Marcelo Pedroso

Quality of service (QoS) requirements are stricter for live streaming than for video-on-demand (VoD), because live streaming is more sensitive to variations in delay, jitter, and packet loss. Dynamic Adaptive Streaming over HTTP (DASH) is the most popular technology for live streaming and VoD and has been massively deployed on the Internet. DASH is an over-the-top application that uses unmanaged networks to distribute content with the best possible quality. For VoD applications, it typically relies on large reception buffers to keep playback seamless. Live streaming services, however, cannot use large buffers because of the delay they induce. Hence, network congestion caused by insufficient queues could decrease the user-perceived video quality. Active Queue Management (AQM) offers an alternative for controlling congestion in a router's queue: when it detects incipient congestion, it presses the TCP traffic sources to reduce their transmission rate. As a consequence, the DASH client tends to decrease the quality of the streamed video. In this article, we evaluate the performance of recent AQM strategies for real-time adaptive video streaming and propose a new AQM algorithm that uses Long Short-Term Memory (LSTM) neural networks to improve the user-perceived video quality. The LSTM forecasts the trend of the queue delay so that packets can be discarded earlier, avoiding network congestion. The results show that the proposed method outperforms the competing AQM algorithms, mainly in congested network scenarios.
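The idea of discarding packets on a forecast of queue delay, rather than on the current delay, can be sketched as follows. This is only an illustration: a least-squares linear trend stands in for the LSTM forecaster, the drop decision is deterministic, and the names `forecast_delay`, `should_drop_early`, and `target_ms` are hypothetical.

```python
def forecast_delay(history_ms, horizon=1):
    """Extrapolate queue delay `horizon` steps ahead with a linear trend.

    Stand-in for the LSTM forecaster described in the abstract.
    """
    n = len(history_ms)
    if n < 2:
        return history_ms[-1]
    mean_x = (n - 1) / 2
    mean_y = sum(history_ms) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(history_ms))
    slope = sxy / sxx
    return mean_y + slope * ((n - 1 + horizon) - mean_x)

def should_drop_early(history_ms, target_ms, horizon=1):
    """Signal congestion if the forecast delay exceeds the target delay."""
    return forecast_delay(history_ms, horizon) > target_ms

# Delay is still below the 17 ms target, but the rising trend crosses it.
rising = [10.0, 12.0, 14.0, 16.0]
steady = [5.0, 5.0, 5.0, 5.0]
drop_rising = should_drop_early(rising, target_ms=17.0)
drop_steady = should_drop_early(steady, target_ms=17.0)
```

In the rising case the last observed delay (16 ms) is under the target, yet the forecast is not, so a forecast-based scheme reacts one step earlier than a threshold on the current delay would.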


2021 ◽  
Vol 2021 ◽  
pp. 1-9
Author(s):  
Shi Wang ◽  
Zhujun Wang ◽  
Yi Jiang ◽  
Huayu Wang

In event extraction, a corpus may contain multiple scenarios and an argument may play different roles under different triggers. A traditional tagging scheme can tag each word only once, so it cannot handle argument overlap. We propose a hierarchical tagging pipeline model for Chinese corpora, based on the pretrained model BERT, which obtains the relevant arguments of each event hierarchically. The model uses a pipeline structure that divides event extraction into event trigger classification and argument recognition. First, BERT generates feature vectors that are passed to a bidirectional gated recurrent unit + conditional random field (BiGRU+CRF) model for trigger classification. Then, the marked event type features are spliced into the corpus as known features and passed into BiGRU+CRF for argument recognition. We evaluated our method on DUEE, combined with data enhancement and mask operation. Experimental results show that our method improves on other baselines, which demonstrates the effectiveness of the model on Chinese corpora.
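The argument overlap problem can be made concrete with a small sketch that produces one BIO tag sequence per trigger instead of a single sequence per sentence. The example sentence, the role labels, and the function names `bio_tags` and `tag_per_event` are assumptions for illustration, not taken from the paper or from DUEE.

```python
def bio_tags(num_tokens, spans):
    """Build a BIO tag sequence from (start, end, role) spans, end exclusive."""
    tags = ["O"] * num_tokens
    for start, end, role in spans:
        tags[start] = f"B-{role}"
        for i in range(start + 1, end):
            tags[i] = f"I-{role}"
    return tags

def tag_per_event(tokens, events):
    """Return one tag sequence per trigger, so a token can hold several roles."""
    return {e["trigger"]: bio_tags(len(tokens), e["arguments"]) for e in events}

tokens = ["The", "attacker", "injured", "three", "people", "and", "was", "arrested"]
events = [
    # Hypothetical event types and roles.
    {"trigger": "injured", "type": "Injure",
     "arguments": [(1, 2, "Agent"), (3, 5, "Victim")]},
    {"trigger": "arrested", "type": "Arrest",
     "arguments": [(1, 2, "Person")]},
]
layers = tag_per_event(tokens, events)
```

Here "attacker" is tagged `B-Agent` in the layer for "injured" and `B-Person` in the layer for "arrested", which a single flat tag sequence could not express.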


Author(s):  
Tihajar Sri Mauliya ◽  
Hanafi Hanafi ◽  
Hanafi Hanafi

The throughput available to internet service users affects the quality of the service in terms of data transfer, especially for video streaming services, which depend on fairly high throughput. In this study, throughput for a video streaming service was measured on the client side of the network for one week. The measurements were taken on the Indihome service of PT. Telkom in Lhokseumawe City. The video streaming service measured was video at screen resolutions of 360p, 480p, 720p, and 1080p on the website lk21.org. Four clients accessed the website simultaneously during the measurements. Throughput was measured using the Wireshark software. The results show that the average throughput per client was in the good category when accessing video at 360p, 480p, and 720p, because the average throughput for each of these videos remained above the minimum bandwidth requirement of the respective video type. For 1080p, the throughput was in the poor category, except when only one client accessed the video stream, in which case the throughput obtained was above the minimum bandwidth requirement for 1080p video.
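The comparison described above, measured throughput against a minimum bandwidth per resolution, amounts to simple arithmetic. The sketch below assumes byte counts and durations taken from a packet capture; the minimum bandwidth figures are placeholders invented for illustration and do not come from the study.

```python
# Hypothetical minimum bandwidth per resolution in Mbit/s (not from the study).
ASSUMED_MIN_MBPS = {"360p": 1.0, "480p": 2.5, "720p": 5.0, "1080p": 8.0}

def throughput_mbps(captured_bytes, duration_s):
    """Average throughput in Mbit/s over a capture window."""
    return captured_bytes * 8 / duration_s / 1_000_000

def category(resolution, measured_mbps, minimums=ASSUMED_MIN_MBPS):
    """'good' if throughput is strictly above the assumed minimum, else 'poor'."""
    return "good" if measured_mbps > minimums[resolution] else "poor"

# 75 MB received in 100 s is 6 Mbit/s.
measured = throughput_mbps(75_000_000, 100)
result_720p = category("720p", measured)
result_1080p = category("1080p", measured)
```

With these placeholder thresholds, the same 6 Mbit/s is enough for 720p but not for 1080p, which mirrors the pattern the study reports when several clients share the link.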


2020 ◽  
Vol 10 (4) ◽  
pp. 1273 ◽  
Author(s):  
Özlem BATUR DİNLER ◽  
Nizamettin AYDIN

This study investigated speech segment detection for the Kurdish language based on gated recurrent unit (GRU) recurrent neural networks. The novelties of the research are the use of a GRU in Kurdish speech segment detection, the creation of a unique database for the Kurdish language, and the optimization of processing parameters for Kurdish speech segmentation. This study is the first attempt to find the optimal feature parameters of the model and to form a large Kurdish vocabulary dataset for speech segment detection based on consonant, vowel, and silence (C/V/S) discrimination. For this purpose, four window sizes and three window types with three hybrid feature vector techniques were used to describe the phoneme boundaries. Identification of the phoneme boundaries using a GRU recurrent neural network was performed with six different classification algorithms for the C/V/S discrimination. We have demonstrated that the GRU model achieves outstanding speech segmentation performance for characterizing Kurdish acoustic signals. The experimental findings show the significance of segment detection of speech signals by effectively using hybrid features, window sizes, window types, and classification models for Kurdish speech.
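How window size and window type enter a framing step can be sketched as below. The abstract does not say which window types or features were used, so the rectangular, Hamming, and Hann windows and the short-time energy feature are assumptions chosen for illustration, as are the function names.

```python
import math

def window(kind, size):
    """Return window coefficients; `size` must be at least 2."""
    if kind == "rectangular":
        return [1.0] * size
    if kind == "hamming":
        return [0.54 - 0.46 * math.cos(2 * math.pi * n / (size - 1))
                for n in range(size)]
    if kind == "hann":
        return [0.5 - 0.5 * math.cos(2 * math.pi * n / (size - 1))
                for n in range(size)]
    raise ValueError(f"unknown window type: {kind}")

def frame_energies(samples, size, hop, kind="hamming"):
    """Short-time energy of each windowed frame (incomplete last frame dropped)."""
    coeffs = window(kind, size)
    energies = []
    for start in range(0, len(samples) - size + 1, hop):
        frame = [s * c for s, c in zip(samples[start:start + size], coeffs)]
        energies.append(sum(v * v for v in frame) / size)
    return energies

# Toy signal: silence followed by a constant-amplitude segment.
samples = [0.0] * 8 + [1.0] * 8
rect = frame_energies(samples, size=4, hop=4, kind="rectangular")
```

The jump in `rect` between the second and third frame marks the silence-to-sound boundary; a smaller `hop` or a different window changes how sharply such a boundary is localized.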


2015 ◽  
Vol 7 (1-4) ◽  
pp. 27-28 ◽  
Author(s):  
Bradley D. Mattan ◽  
Kimberly A. Quinn ◽  
Pia Rotshtein
Keyword(s):  
The Self ◽  

Author(s):  
Angeline G. Dludla ◽  
Mncedisi J. Bembe ◽  
Badamsuren Byambaakhuu ◽  
Mohammed-Sani Abdulai ◽  
Jae Jeung Rho

2019 ◽  
Vol 9 (3) ◽  
pp. 35-40
Author(s):  
Mitra Unik ◽  
Soni Soni ◽  
Randra Aguslan Pratama

Abstract: One of the most popular internet services in use today is video streaming, either live (live streaming) or pre-recorded. Streaming video is a type of streaming media in which data from video files is continuously transmitted over the internet to remote users. A fundamental problem is poor video quality, whose biggest cause is the limited infrastructure of network resources. Digital video communication is known to consume considerable resources because of the bandwidth required to send video and audio signals. To maintain the quality of the video being played, several instruments are needed, one of which is a data connection with Quality of Service (QoS). The parameters used to measure QoS are delay, jitter, packet loss, and throughput. This study uses the PPDIO method as a workflow with a Network Lifecycle approach. The study finds that many factors influence video quality, namely network factors and hardware factors. The test results obtained are not absolute, so subsequent tests may differ. Encoding also affects the quality of the video. Bandwidth is equalized according to priority when traffic for all packets is full. Based on a comparative analysis of QoS parameter calculations using the HTB and Diffserv methods, throughput, jitter, and delay do not differ greatly between clients. Keywords: Video Streaming, Diffserv, HTB, QoS
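The four QoS parameters named in the abstract can be computed from per-packet timestamps, as in the sketch below. The definitions are common simplifications and are assumptions here, not the study's procedure: jitter is the mean absolute difference between consecutive one-way delays, and throughput is measured from the first send to the last receive. The record layout and the name `qos_metrics` are hypothetical.

```python
from statistics import mean

def qos_metrics(sent_count, received):
    """Compute delay, jitter, packet loss, and throughput.

    `received` is a list of (send_time_s, recv_time_s, size_bytes) tuples
    in arrival order; `sent_count` is the number of packets transmitted.
    """
    delays = [recv - send for send, recv, _ in received]
    if len(delays) > 1:
        jitter = mean(abs(b - a) for a, b in zip(delays, delays[1:]))
    else:
        jitter = 0.0
    duration = received[-1][1] - received[0][0]
    total_bits = 8 * sum(size for _, _, size in received)
    return {
        "delay_s": mean(delays),
        "jitter_s": jitter,
        "packet_loss_pct": 100.0 * (sent_count - len(received)) / sent_count,
        "throughput_bps": total_bits / duration,
    }

# Five packets sent, four arrived (made-up timestamps).
packets = [(0.0, 0.10, 1000), (1.0, 1.12, 1000),
           (2.0, 2.10, 1000), (3.0, 3.14, 1000)]
metrics = qos_metrics(sent_count=5, received=packets)
```

Running the same function on captures taken under two queueing setups gives directly comparable numbers per client.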

