Automatic discrimination between front and back ensemble locations in HRTF-convolved binaural recordings of music

Author(s):  
Sławomir K. Zieliński ◽  
Paweł Antoniuk ◽  
Hyunkook Lee ◽  
Dale Johnson

Abstract One of the greatest challenges in the development of binaural machine audition systems is the disambiguation between front and back audio sources, particularly in complex spatial audio scenes. The goal of this work was to develop a method for discriminating between front and back located ensembles in binaural recordings of music. To this end, 22,496 binaural excerpts, representing either front or back located ensembles, were synthesized by convolving multi-track music recordings with 74 sets of head-related transfer functions (HRTF). The discrimination method was developed based on the traditional approach, involving hand-engineering of features, as well as using a deep learning technique incorporating the convolutional neural network (CNN). According to the results obtained under HRTF-dependent test conditions, CNN showed a very high discrimination accuracy (99.4%), slightly outperforming the traditional method. However, under the HRTF-independent test scenario, CNN performed worse than the traditional algorithm, highlighting the importance of testing the algorithms under HRTF-independent conditions and indicating that the traditional method might be more generalizable than CNN. A minimum of 20 HRTFs are required to achieve a satisfactory generalization performance for the traditional algorithm and 30 HRTFs for CNN. The minimum duration of audio excerpts required by both the traditional and CNN-based methods was assessed as 3 s. Feature importance analysis, based on a gradient attribution mapping technique, revealed that for both the traditional and the deep learning methods, a frequency band between 5 and 6 kHz is particularly important in terms of the discrimination between front and back ensemble locations. Linear-frequency cepstral coefficients, interaural level differences, and audio bandwidth were identified as the key descriptors facilitating the discrimination process using the traditional approach.
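The synthesis step and one of the descriptors named in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: `convolve`, `binauralize`, and `ild_db` are hypothetical helper names, the impulse responses are toy values rather than measured HRTFs, and the interaural level difference is computed as a single broadband value, whereas practical systems typically compute it per frequency band.

```python
import math


def convolve(signal, impulse_response):
    """Direct-form linear convolution of two sequences of floats."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


def binauralize(mono, hrir_left, hrir_right):
    """Render a mono track to two ear signals with a pair of impulse responses.

    hrir_left and hrir_right stand in for the time-domain counterpart of one
    HRTF set (hypothetical toy values in the example below).
    """
    return convolve(mono, hrir_left), convolve(mono, hrir_right)


def ild_db(left, right, eps=1e-12):
    """Broadband interaural level difference in dB (left relative to right)."""
    energy_left = sum(x * x for x in left)
    energy_right = sum(x * x for x in right)
    # eps guards against log of zero for silent channels
    return 10.0 * math.log10((energy_left + eps) / (energy_right + eps))


# Toy example: the right ear receives the same signal at half the amplitude.
left_ear, right_ear = binauralize([1.0, 0.5], [1.0], [0.5])
ild = ild_db(left_ear, right_ear)
```

A multi-track excerpt would be rendered by applying `binauralize` to each track with the impulse responses for its assumed direction and summing the per-ear results.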

Attendance management under unconstrained video using face recognition technology departs substantially from the traditional method of marking attendance. This attendance management system was developed within the domain of deep learning, using face recognition. It automatically marks attendance by detecting faces end to end in frames obtained from the live video stream of a surveillance camera placed in the center of the classroom. The recognized faces are compared with images stored in a database; an attendance report is then generated, and reports are also provided to the parents of absent students.


2021 ◽  
Author(s):  
Hon-Yi Shi ◽  
King-The Lee ◽  
Chong-Chi Chiu ◽  
Jhi-Joung Wang ◽  
Ding-Ping Sun ◽  
...  

Abstract Background: Risk of hepatocellular carcinoma (HCC) recurrence after surgical resection is unknown. Therefore, the aim of this study was 5-year recurrence prediction after HCC resection using deep learning and Cox regression models. Methods: This study recruited 520 HCC patients who had undergone surgical resection at three medical centers in southern Taiwan between April, 2011, and December, 2015. Two popular deep learning algorithms, a deep neural network (DNN) model and a recurrent neural network (RNN) model, and a Cox proportional hazard (CPH) regression model were designed to solve both classification problems and regression problems in predicting HCC recurrence. A feature importance analysis was also performed to identify confounding factors in the prediction of HCC recurrence in patients who had undergone resection. Results: All performance indices for the DNN model were significantly higher than those for the RNN model and the traditional CPH model (p<0.001). The most important confounding factor in 5-year recurrence after HCC resection was surgeon volume followed by, in order of importance, hospital volume, preoperative Beck Depression Scale score, preoperative Beck Anxiety Scale score, co-residence with family, tumor stage, and tumor size. Conclusions: The DNN model is useful for early baseline prediction of 5-year recurrence after HCC resection. Its prediction accuracy can be improved by further training with temporal data collected from treated patients. The feature importance analysis performed in this study to investigate model interpretability provided important insights into the potential use of deep learning models for predicting recurrence after HCC resection and for identifying predictors of recurrence.


2021 ◽  
Vol 7 (2) ◽  
pp. 66-70
Author(s):  
Homnath Khatiwada ◽  
Ajay Neupane ◽  
Lal B Thapa ◽  
...  

Oral Collection and documentation of indigenous knowledge of local people have an important role in scientific research, biodiversity conservation, and the drug development process. A study was carried out to document the medicinal plants that have been used by the local folk healer to treat Shingles in Ilam district, Eastern Nepal. A renowned folk healer who was involved in curing Shingles for decades and 30 key informants were selected for the interview to know the methods of curing Shingles. Altogether six plants viz: Oroxylum indicum, Cynodon dactylon, Centella asiatica, Drymaria cordata, Sesamum indicum, and Lygodium japonicum were found to be used against the disease. The traditional method of preparing medicine from these plants was found to be highly effective. The finding provides a clue for further extensive lab-based research to isolate the specific compounds that are effective against the disease.


Author(s):  
Yun Jiang ◽  
Junyu Zhuo ◽  
Juan Zhang ◽  
Xiao Xiao

With deep learning attracting extensive research attention, the convolutional restricted Boltzmann machine (CRBM), a model based on the restricted Boltzmann machine (RBM), is widely used in image recognition, speech recognition, and related tasks. However, time-consuming training remains a significant issue. To address this problem, this paper uses an optimized parallel CRBM based on Spark and proposes a parallel contrastive divergence algorithm, also based on Spark, which is used to train the CRBM model and improve the training speed. The experiments show that the method is faster than the traditional sequential algorithm. We train the CRBM with the method and apply it to breast X-ray image classification. The experiments show that it improves both the precision and the training speed compared with the traditional algorithm.


2021 ◽  
Author(s):  
Benjamin Clavié ◽  
Marc Alphonsus

We aim to highlight an interesting trend to contribute to the ongoing debate around advances within legal Natural Language Processing. Recently, the focus for most legal text classification tasks has shifted towards large pre-trained deep learning models such as BERT. In this paper, we show that a more traditional approach based on Support Vector Machine classifiers reaches competitive performance with deep learning models. We also highlight that error reduction obtained by using specialised BERT-based models over baselines is noticeably smaller in the legal domain when compared to general language tasks. We discuss some hypotheses for these results to support future discussions.


2021 ◽  
Vol 2066 (1) ◽  
pp. 012049
Author(s):  
Jianfeng Zhong

Abstract As a value-added service that improves the efficiency of online customer service, customer service robots have been well received by sellers in recent years, because they free customer service staff from heavy consulting workloads, thereby reducing the seller’s operating costs and improving the quality of online services. The purpose of this article is to study scene understanding technology for intelligent customer service robots based on deep learning. It introduces some commonly used deep learning models and training methods, as well as the application fields of deep learning. It analyzes the problems of the traditional Encoder-Decoder framework and, based on these problems, introduces the chat model designed in this paper: the intelligent chat robot model (T-DLLModel), obtained by combining the neural network topic model and the deep learning language model. Two experiments were conducted on dialogues between online shopping customer service staff and customers: an independent question understanding experiment based on question retelling, and a question understanding experiment combined with contextual information. The experimental results show that the method achieves better results when the similarity threshold is 0.4, reaching an F value of 0.5. The semantic similarity calculation method proposed in this paper outperforms the traditional method based on keywords and semantic information; in particular, as the similarity threshold increases, its recall is significantly better than that of the traditional method. On real customer service dialogue data, the method also ranks answers slightly better than the LDA-based method.
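The relationship between a similarity threshold and the reported F value can be illustrated with a short sketch. It assumes that the "F value" is the balanced F1 score (the abstract does not say which variant is used); the function name `precision_recall_f1` and the scores and labels are hypothetical toy values, not data from the paper.

```python
def precision_recall_f1(scores, labels, threshold):
    """Threshold similarity scores and score the resulting decisions.

    scores: similarity values, one per candidate question pair
    labels: True where the pair is a genuine match
    """
    predicted = [s >= threshold for s in scores]
    tp = sum(1 for p, y in zip(predicted, labels) if p and y)
    fp = sum(1 for p, y in zip(predicted, labels) if p and not y)
    fn = sum(1 for p, y in zip(predicted, labels) if not p and y)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    # F1 is the harmonic mean of precision and recall (assumed variant)
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy data: raising the threshold typically trades recall for precision.
toy_scores = [0.9, 0.5, 0.3, 0.2]
toy_labels = [True, True, True, False]
precision, recall, f1 = precision_recall_f1(toy_scores, toy_labels, threshold=0.4)
```

Sweeping `threshold` over a range of values and recording recall at each step is one way to reproduce the kind of threshold comparison the abstract describes.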


2021 ◽  
Author(s):  
Hamidullah Binol ◽  
M. Khalid Khan Niazi ◽  
Charles Elmaraghy ◽  
Aaron C Moberly ◽  
Metin N Gurcan

Background: The lack of an objective method to evaluate the eardrum is a critical barrier to an accurate diagnosis. Eardrum images are classified into normal or abnormal categories with machine learning techniques. If the input is an otoscopy video, a traditional approach requires great effort and expertise to manually determine the representative frame(s). Methods: In this paper, we propose a novel deep learning-based method, called OtoXNet, which automatically learns features for eardrum classification from otoscope video clips. We utilized multiple composite image generation methods to construct a highly representative version of otoscopy videos to diagnose three major eardrum diseases, i.e., otitis media with effusion, eardrum perforation, and tympanosclerosis versus normal (healthy). We compared the performance of OtoXNet against methods that use either a single composite image or a keyframe selected by an experienced human. Our dataset consists of 394 otoscopy videos from 312 patients and 765 composite images before augmentation. Results: OtoXNet with multiple composite images achieved 84.8% class-weighted accuracy with 3.8% standard deviation, whereas with the human-selected keyframes and single composite images, the accuracies were, respectively, 81.8% ± 5.0% and 80.1% ± 4.8% on the multi-class eardrum video classification task using an 8-fold cross-validation scheme. A paired t-test shows that there is a statistically significant difference (p-value of 1.3 × 10⁻²) between the performance values of OtoXNet (multiple composite images) and the human-selected keyframes. In contrast, the difference in means of keyframe and single composites was not significant (p = 5.49 × 10⁻¹). OtoXNet surpasses the baseline approaches in qualitative results. Conclusion: The use of multiple composite images in analyzing eardrum abnormalities is advantageous compared to using single composite images or manual keyframe selection.
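The paired comparison across cross-validation folds mentioned in this abstract can be sketched as follows. This is an illustration only: `paired_t_statistic` is a hypothetical helper, the per-fold accuracies below are made-up placeholders (the abstract reports only means and standard deviations), and the p-value lookup is left to a statistics package.

```python
import math
import statistics


def paired_t_statistic(scores_a, scores_b):
    """Return (t, degrees_of_freedom) for a paired t-test on matched scores."""
    if len(scores_a) != len(scores_b):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, n - 1


# Placeholder accuracies for 8 folds; NOT values from the paper.
method_a = [0.86, 0.82, 0.88, 0.81, 0.85, 0.90, 0.79, 0.87]
method_b = [0.83, 0.80, 0.84, 0.80, 0.81, 0.88, 0.76, 0.82]
t_value, dof = paired_t_statistic(method_a, method_b)
```

The test is paired because both methods are evaluated on the same folds, so fold-to-fold difficulty cancels out in the differences.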


The fashion industry has developed in many fields, and its growth is giving an enormous boost to clothing companies and e-commerce businesses. The difficult task for the IT industry in this field is designing a predictive data mining system to model it. E-commerce uses electronic communication and information technology in transactions to create, transform, or redefine the relationships between individuals and organizations. Put simply, it means buying and selling products, services, and information through a computer network. It is completely changing the traditional approach to business. The main change is noticeable business growth, and it has many significant effects on the environment as well. This is why it is so widely preferred in business today. The key part of the proposed system is rating individual fashion outfits, considering appearance as well as metadata. First, our approach implements a system that encodes visual characteristics with a deep convolutional network to handle complicated content, because it is not possible to list or label every attribute of an image. Second, we propose a multi-modal deep learning framework for the rich contexts of fashion outfits. We propose a system that makes recommendations with review comments, suggests which product to purchase, and displays a rating of the product.


2021 ◽  
Vol 2021 ◽  
pp. 1-8
Author(s):  
Lifang Sun ◽  
Xi Hu ◽  
Yutao Liu ◽  
Hengyu Cai

This study explored the effect of a deep learning-based convolutional neural network (CNN) algorithm on magnetic resonance imaging (MRI) images of brain tumor patients and evaluated the practical value of deep learning-based MRI image features in the clinical diagnosis and nursing of malignant tumors. A brain tumor MRI image model based on the CNN algorithm was constructed, and 80 patients with brain tumors were selected as the research subjects. They were divided into an experimental group (CNN algorithm) and a control group (traditional algorithm). The patients were nursed throughout the whole process. The macroscopic characteristics and imaging indices of the MRI images and the anxiety of patients in the two groups were compared and analyzed. In addition, the image quality after nursing was checked. The results revealed that the MRI characteristics of brain tumors based on the CNN algorithm were clearer and more accurate in fluid-attenuated inversion recovery (FLAIR), T1, T1c, and T2 MRI; the mean accuracy, sensitivity, and specificity were 0.83, 0.84, and 0.83, respectively, showing obvious advantages over the traditional algorithm (P < 0.05). The patients in the nursing group showed lower depression scores and better MRI images than the control group (P < 0.05). Therefore, the deep learning algorithm, building on conventional algorithms, can analyze the MRI image characteristics of brain tumor patients more accurately, showing high sensitivity and specificity, which improves the application value of MRI image characteristics in the diagnosis of malignant tumors. In addition, effective nursing for patients undergoing analysis and diagnosis based on brain tumor MRI image characteristics can alleviate the patient’s anxiety and ensure that high-quality MRI images are obtained after the examination.


2021 ◽  
Author(s):  
Janghoon Ahn ◽  
Thong Phi Nguyen ◽  
Yoon-Ji Kim ◽  
Taeyong Kim ◽  
Jonghun Yoon

Abstract Analysing cephalometric X-rays, which is mostly performed by orthodontists or dentists, is an indispensable procedure for diagnosis and treatment planning with orthodontic patients. Artificial intelligence, especially deep-learning techniques for analysing image data, shows great potential for medical and dental image analysis and diagnosis. To explore the feasibility of automating measurement of 13 geometric parameters from three-dimensional cone beam computed tomography (CBCT) images taken in a natural head position, we here describe a smart system that combines a facial profile analysis algorithm with deep-learning models. Using multiple views extracted from the CBCT data as the dataset, our proposed method partitions and detects regions of interest by extracting the facial profile and applying Mask-RCNN, a trained decentralized convolutional neural network (CNN) that positions the key parameters. All the techniques are integrated into a software application with a graphical user interface designed for user convenience. To demonstrate the system’s ability to replace human experts, we validated the performance of the proposed method by comparing it with measurements made by two orthodontists and one advanced general dentist using a commercial dental program. The time savings compared with the traditional approach was remarkable, reducing the processing time from about 30 minutes to about 30 seconds.

