Korean Prosody Phrase Boundary Prediction Model for Speech Synthesis Service in Smart Healthcare

Speech processing technology has great potential in the medical field to provide beneficial solutions for both patients and doctors. Speech interfaces, represented by speech synthesis and speech recognition, can be used to transcribe medical documents, control medical devices, correct speech and hearing impairments, and assist the visually impaired. However, it is essential to predict prosody phrase boundaries for accurate natural speech synthesis. This study proposes a method to build a reliable learning corpus to train prosody boundary prediction models based on deep learning. In addition, we offer a way to generate a rule-based model that can predict the prosody boundary from the constructed corpus and use the result to train a deep learning-based model. As a result, we have built a coherent corpus, even though many workers have participated in its development. The estimated pairwise agreement of corpus annotations is between 0.7477 and 0.7916 and kappa coefficient (K) between 0.7057 and 0.7569. In addition, the deep learning-based model based on the rules obtained from the corpus showed a prediction accuracy of 78.57% for the three-level prosody phrase boundary, 87.33% for the two-level prosody phrase boundary.

Download Full-text

Integrating Articulatory Information in Deep Learning-Based Text-to-Speech Synthesis

10.21437/interspeech.2017-1762 ◽

2017 ◽

Cited By ~ 1

Author(s):

Beiming Cao ◽

Myungjong Kim ◽

Jan van Santen ◽

Ted Mau ◽

Jun Wang

Keyword(s):

Deep Learning ◽

Speech Synthesis ◽

Text To Speech ◽

Text To Speech Synthesis

Download Full-text

A Unified Framework for the Generation of Glottal Signals in Deep Learning-based Parametric Speech Synthesis Systems

10.21437/interspeech.2018-1590 ◽

2018 ◽

Cited By ~ 2

Author(s):

Min-Jae Hwang ◽

Eunwoo Song ◽

Jin-Seob Kim ◽

Hong-Goo Kang

Keyword(s):

Deep Learning ◽

Speech Synthesis ◽

Unified Framework ◽

Parametric Speech Synthesis

Download Full-text

Applying Deep Learning to Audit Procedures: An Illustrative Framework

Accounting Horizons ◽

10.2308/acch-52455 ◽

2019 ◽

Vol 33 (3) ◽

pp. 89-109 ◽

Cited By ~ 9

Author(s):

Ting (Sophia) Sun

Keyword(s):

Decision Making ◽

Deep Learning ◽

Data Warehouse ◽

Visual Recognition ◽

Prediction Models ◽

Historical Data ◽

Structured Data ◽

Text Understanding ◽

Audit Data ◽

Learning Functions

SYNOPSIS This paper aims to promote the application of deep learning to audit procedures by illustrating how the capabilities of deep learning for text understanding, speech recognition, visual recognition, and structured data analysis fit into the audit environment. Based on these four capabilities, deep learning serves two major functions in supporting audit decision making: information identification and judgment support. The paper proposes a framework for applying these two deep learning functions to a variety of audit procedures in different audit phases. An audit data warehouse of historical data can be used to construct prediction models, providing suggested actions for various audit procedures. The data warehouse will be updated and enriched with new data instances through the application of deep learning and a human auditor's corrections. Finally, the paper discusses the challenges faced by the accounting profession, regulators, and educators when it comes to applying deep learning.

Download Full-text

Exploiting Data Analytics and Deep Learning Systems to Support Pavement Maintenance Decisions

Applied Sciences ◽

10.3390/app11062458 ◽

2021 ◽

Vol 11 (6) ◽

pp. 2458

Author(s):

Ronald Roberts ◽

Laura Inzerillo ◽

Gaetano Di Mino

Keyword(s):

Deep Learning ◽

Management Practices ◽

Prediction Models ◽

Road Networks ◽

Pavement Management ◽

Efficient Operation ◽

Pavement Maintenance ◽

Goods And Services ◽

Maintenance Decisions ◽

Computational Systems

Road networks are critical infrastructures within any region and it is imperative to maintain their conditions for safe and effective movement of goods and services. Road Management, therefore, plays a key role to ensure consistent efficient operation. However, significant resources are required to perform necessary maintenance activities to achieve and maintain high levels of service. Pavement maintenance can typically be very expensive and decisions are needed concerning planning and prioritizing interventions. Data are key towards enabling adequate maintenance planning but in many instances, there is limited available information especially in small or under-resourced urban road authorities. This study develops a roadmap to help these authorities by using flexible data analysis and deep learning computational systems to highlight important factors within road networks, which are used to construct models that can help predict future intervention timelines. A case study in Palermo, Italy was successfully developed to demonstrate how the techniques could be applied to perform appropriate feature selection and prediction models based on limited data sources. The workflow provides a pathway towards more effective pavement maintenance management practices using techniques that can be readily adapted based on different environments. This takes another step towards automating these practices within the pavement management system.

Download Full-text

Audio fingerprint analysis for speech processing using deep learning method

International Journal of Speech Technology ◽

10.1007/s10772-021-09827-x ◽

2021 ◽

Author(s):

Ali Altalbe

Keyword(s):

Deep Learning ◽

Speech Processing ◽

Learning Method ◽

Fingerprint Analysis

Download Full-text

A Novel Load Forecasting Approach Based on Smart Meter Data Using Advance Preprocessing and Hybrid Deep Learning

Applied Sciences ◽

10.3390/app11062742 ◽

2021 ◽

Vol 11 (6) ◽

pp. 2742

Author(s):

Fatih Ünal ◽

Abdulaziz Almalaq ◽

Sami Ekici

Keyword(s):

Deep Learning ◽

Energy Consumption ◽

Prediction Models ◽

Short Term Memory ◽

Critical Role ◽

Load Forecasting ◽

Unified Framework ◽

Short Term ◽

Planning And Scheduling ◽

Short Term Load Forecasting

Short-term load forecasting models play a critical role in distribution companies in making effective decisions in their planning and scheduling for production and load balancing. Unlike aggregated load forecasting at the distribution level or substations, forecasting load profiles of many end-users at the customer-level, thanks to smart meters, is a complicated problem due to the high variability and uncertainty of load consumptions as well as customer privacy issues. In terms of customers’ short-term load forecasting, these models include a high level of nonlinearity between input data and output predictions, demanding more robustness, higher prediction accuracy, and generalizability. In this paper, we develop an advanced preprocessing technique coupled with a hybrid sequential learning-based energy forecasting model that employs a convolution neural network (CNN) and bidirectional long short-term memory (BLSTM) within a unified framework for accurate energy consumption prediction. The energy consumption outliers and feature clustering are extracted at the advanced preprocessing stage. The novel hybrid deep learning approach based on data features coding and decoding is implemented in the prediction stage. The proposed approach is tested and validated using real-world datasets in Turkey, and the results outperformed the traditional prediction models compared in this paper.

Download Full-text

Financial Information Asymmetry: Using Deep Learning Algorithms to Predict Financial Distress

Symmetry ◽

10.3390/sym13030443 ◽

2021 ◽

Vol 13 (3) ◽

pp. 443

Author(s):

Chyan-long Jan

Keyword(s):

Neural Networks ◽

Deep Learning ◽

Information Asymmetry ◽

Error Rate ◽

Financial Distress ◽

Prediction Models ◽

High Accuracy ◽

Financial Information ◽

Financial Distress Prediction ◽

Distress Prediction

Because of the financial information asymmetry, the stakeholders usually do not know a company’s real financial condition until financial distress occurs. Financial distress not only influences a company’s operational sustainability and damages the rights and interests of its stakeholders, it may also harm the national economy and society; hence, it is very important to build high-accuracy financial distress prediction models. The purpose of this study is to build high-accuracy and effective financial distress prediction models by two representative deep learning algorithms: Deep neural networks (DNN) and convolutional neural networks (CNN). In addition, important variables are selected by the chi-squared automatic interaction detector (CHAID). In this study, the data of Taiwan’s listed and OTC sample companies are taken from the Taiwan Economic Journal (TEJ) database during the period from 2000 to 2019, including 86 companies in financial distress and 258 not in financial distress, for a total of 344 companies. According to the empirical results, with the important variables selected by CHAID and modeling by CNN, the CHAID-CNN model has the highest financial distress prediction accuracy rate of 94.23%, and the lowest type I error rate and type II error rate, which are 0.96% and 4.81%, respectively.

Download Full-text

Synthetic speech detection through short-term and long-term prediction traces

EURASIP Journal on Information Security ◽

10.1186/s13635-021-00116-3 ◽

2021 ◽

Vol 2021 (1) ◽

Author(s):

Clara Borrelli ◽

Paolo Bestagini ◽

Fabio Antonacci ◽

Augusto Sarti ◽

Stefano Tubaro

Keyword(s):

Deep Learning ◽

Speech Processing ◽

Synthetic Speech ◽

Opinion Formation ◽

Closed Set ◽

Speech Detection ◽

Open Set ◽

Technological Advances ◽

Speech Generation ◽

Long Term Prediction

AbstractSeveral methods for synthetic audio speech generation have been developed in the literature through the years. With the great technological advances brought by deep learning, many novel synthetic speech techniques achieving incredible realistic results have been recently proposed. As these methods generate convincing fake human voices, they can be used in a malicious way to negatively impact on today’s society (e.g., people impersonation, fake news spreading, opinion formation). For this reason, the ability of detecting whether a speech recording is synthetic or pristine is becoming an urgent necessity. In this work, we develop a synthetic speech detector. This takes as input an audio recording, extracts a series of hand-crafted features motivated by the speech-processing literature, and classify them in either closed-set or open-set. The proposed detector is validated on a publicly available dataset consisting of 17 synthetic speech generation algorithms ranging from old fashioned vocoders to modern deep learning solutions. Results show that the proposed method outperforms recently proposed detectors in the forensics literature.

Download Full-text

Enabling deeper learning on big data for materials informatics applications

Scientific Reports ◽

10.1038/s41598-021-83193-1 ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Dipendra Jha ◽

Vishu Gupta ◽

Logan Ward ◽

Zijiang Yang ◽

Christopher Wolverton ◽

...

Keyword(s):

Neural Networks ◽

Big Data ◽

Deep Learning ◽

Deep Neural Networks ◽

Materials Science ◽

Prediction Models ◽

Model Performance ◽

Materials Informatics ◽

Learning Framework ◽

Significant Attention

AbstractThe application of machine learning (ML) techniques in materials science has attracted significant attention in recent years, due to their impressive ability to efficiently extract data-driven linkages from various input materials representations to their output properties. While the application of traditional ML techniques has become quite ubiquitous, there have been limited applications of more advanced deep learning (DL) techniques, primarily because big materials datasets are relatively rare. Given the demonstrated potential and advantages of DL and the increasing availability of big materials datasets, it is attractive to go for deeper neural networks in a bid to boost model performance, but in reality, it leads to performance degradation due to the vanishing gradient problem. In this paper, we address the question of how to enable deeper learning for cases where big materials data is available. Here, we present a general deep learning framework based on Individual Residual learning (IRNet) composed of very deep neural networks that can work with any vector-based materials representation as input to build accurate property prediction models. We find that the proposed IRNet models can not only successfully alleviate the vanishing gradient problem and enable deeper learning, but also lead to significantly (up to 47%) better model accuracy as compared to plain deep neural networks and traditional ML techniques for a given input materials representation in the presence of big data.

Download Full-text

Beyond Deep Learning: An Econometric Example

International Journal of Uncertainty Fuzziness and Knowledge-Based Systems ◽

10.1142/s0218488520400036 ◽

2020 ◽

Vol 28 (Supp01) ◽

pp. 31-38 ◽

Cited By ~ 1

Author(s):

Ruofan Liao ◽

Paravee Maneejuk ◽

Songsak Sriboonchitta

Keyword(s):

Neural Networks ◽

Deep Learning ◽

Prediction Models ◽

Original Data ◽

Parametric Model ◽

Parametric Models ◽

The Past ◽

Currency Exchange ◽

Currency Exchange Rate ◽

Linear And Nonlinear

In the past, in many areas, the best prediction models were linear and nonlinear parametric models. In the last decade, in many application areas, deep learning has shown to lead to more accurate predictions than the parametric models. Deep learning-based predictions are reasonably accurate, but not perfect. How can we achieve better accuracy? To achieve this objective, we propose to combine neural networks with parametric model: namely, to train neural networks not on the original data, but on the differences between the actual data and the predictions of the parametric model. On the example of predicting currency exchange rate, we show that this idea indeed leads to more accurate predictions.

Download Full-text