Diagnosis of Depression Based on Transfer Learning Model Using Audio data of Interview-type

Author(s): A-Hyeon Jo, Keun-Chang Kwak

Electronics, 2021, Vol 10 (15), pp. 1807
Author(s): Sascha Grollmisch, Estefanía Cano

Including unlabeled data in the training process of neural networks using Semi-Supervised Learning (SSL) has shown impressive results in the image domain, where state-of-the-art results were obtained with only a fraction of the labeled data. What recent SSL methods have in common is that they rely strongly on the augmentation of unannotated data. This reliance on augmentation remains largely unexplored for audio data. In this work, SSL using the state-of-the-art FixMatch approach is evaluated on three audio classification tasks covering music, industrial sounds, and acoustic scenes. The performance of FixMatch is compared to Convolutional Neural Networks (CNNs) trained from scratch, Transfer Learning, and SSL using the Mean Teacher approach. Additionally, a simple yet effective approach for selecting suitable augmentation methods for FixMatch is introduced. FixMatch with the proposed modifications always outperformed Mean Teacher and the CNNs trained from scratch. For the industrial sounds and music datasets, the CNN baseline performance using the full dataset was reached with less than 5% of the initial training data, demonstrating the potential of recent SSL methods for audio data. Transfer Learning outperformed FixMatch only for the most challenging dataset from acoustic scene classification, showing that there is still room for improvement.
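The core idea of FixMatch referred to in this abstract can be sketched in a few lines. The following is a minimal illustration, not the authors' implementation: it assumes the model's class probabilities for weakly and strongly augmented versions of the same unlabeled clips are already available as plain lists, and the function name `fixmatch_unlabeled_loss` and the 0.95 confidence threshold are assumptions chosen for the example.

```python
import math


def fixmatch_unlabeled_loss(weak_probs, strong_probs, threshold=0.95):
    """Unlabeled-data loss in the style of FixMatch (illustrative sketch).

    weak_probs / strong_probs: one list of class probabilities per unlabeled
    example, predicted for its weakly / strongly augmented version.
    """
    if not weak_probs:
        return 0.0
    total = 0.0
    for weak, strong in zip(weak_probs, strong_probs):
        confidence = max(weak)
        # Only confident predictions on the weak view become pseudo-labels.
        if confidence < threshold:
            continue
        pseudo_label = weak.index(confidence)
        # Cross-entropy of the strong-view prediction against the pseudo-label.
        total += -math.log(max(strong[pseudo_label], 1e-12))
    # Average over the whole batch, so masked-out examples contribute zero.
    return total / len(weak_probs)


if __name__ == "__main__":
    weak = [[0.97, 0.03], [0.60, 0.40]]    # second example is not confident
    strong = [[0.50, 0.50], [0.10, 0.90]]
    print(fixmatch_unlabeled_loss(weak, strong))
```

In a full training loop this term would be added, with a weighting factor, to the ordinary supervised loss on the labeled examples; the choice of weak and strong augmentations is exactly the part the abstract says needs care for audio.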


2021, Vol 10 (3), pp. 137
Author(s): Youngok Kang, Nahye Cho, Jiyoung Yoon, Soyeon Park, Jiyeon Kim

Recently, as computer vision and image processing technologies have rapidly advanced in the artificial intelligence (AI) field, deep learning technologies have been applied in the field of urban and regional study through transfer learning. In the tourism field, studies are emerging that analyze the tourists’ urban image by identifying the visual content of photos. However, previous studies have limitations in properly reflecting the unique landscape, cultural characteristics, and traditional elements of a region that are prominent in tourism. To go beyond these limitations, we crawled 168,216 Flickr photos, created a tourist photo classification of 75 scenes and 13 categories by analyzing the characteristics of photos posted by tourists, and developed a deep learning model by continuously re-training the Inception-v3 model. The final model shows high accuracy: 85.77% for the Top 1 and 95.69% for the Top 5. The final model was applied to the entire dataset to analyze the regions of attraction and the tourists’ urban image in Seoul. We found that tourists feel attracted to Seoul, where modern features such as skyscrapers and uniquely designed architecture are mixed with traditional features such as palaces and cultural elements. This work demonstrates a tourist photo classification suited to local characteristics and the process of re-training a deep learning model to effectively classify a large volume of tourists’ photos.
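The Top 1 and Top 5 figures quoted in this abstract are top-k accuracies: a prediction counts as correct when the true class is among the k highest-scoring classes. A minimal sketch of that metric follows; the function name `top_k_accuracy` and the toy scores are illustrative assumptions, not data from the study.

```python
def top_k_accuracy(scores, labels, k):
    """Fraction of examples whose true label is among the k best-scoring classes.

    scores: one list of per-class scores per example.
    labels: the true class index for each example.
    """
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices ordered from highest to lowest score, truncated to k.
        top_k = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        if label in top_k:
            hits += 1
    return hits / len(labels)


if __name__ == "__main__":
    # Toy scores for three photos over three hypothetical scene classes.
    scores = [[0.1, 0.7, 0.2], [0.5, 0.3, 0.2], [0.2, 0.3, 0.5]]
    labels = [1, 1, 0]
    print(top_k_accuracy(scores, labels, k=1))
    print(top_k_accuracy(scores, labels, k=2))
```

With many fine-grained scene classes, top-5 accuracy is typically much higher than top-1 because visually similar scenes often rank just behind the correct one.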


2011, Vol 4 (2), pp. 88
Author(s): Peter Baggetta

The Teaching Games for Understanding (TGfU) model was first developed by Bunker and Thorpe in 1982 as a model for coaches to help players become more skillful. Since then, other versions of the model have been developed, such as the tactical decision-learning model (Grehaigne, Godbout, & Bouthier, 2001) in France and the game-sense approach (Australian Sports Commission, 1991) in Australia and New Zealand. The key aspect of all the models is the design of well-structured conditioned and modified games that require players to make decisions to develop their game understanding and tactical awareness. However, both novice and experienced coaches often struggle to connect theory to practice, especially when creating and developing contextualized games that actually transfer learning from training to performance in games. To create and use games that transfer learning effectively, coaches can use a Principles-Based approach to develop games. The Principles-Based approach removes the dichotomy of traditional drills versus games and instead combines the drills approach with a games-context approach that links principles to skills, allowing for increased individual and team expertise development. This presentation will first describe a model for developing and connecting principles, policies, tactics, and skills for team play. It will then describe how to use the principles to create contextualized games that connect practices with performance and progress novice players toward becoming more competent performers.


2021, Vol 27
Author(s): Qi Zhou, Wenjie Zhu, Fuchen Li, Mingqing Yuan, Linfeng Zheng, ...

Objective: To verify the ability of deep learning models to identify five subtypes of intracranial hemorrhage and normal images in non-contrast-enhanced CT. Method: A total of 351 patients (39 patients in the normal group, 312 patients in the intracranial hemorrhage group) who underwent non-contrast-enhanced CT for intracranial hemorrhage were selected, with 2768 images in total (514 images for the normal group, 398 images for the epidural hemorrhage group, 501 images for the subdural hemorrhage group, 497 images for the intraventricular hemorrhage group, 415 images for the cerebral parenchymal hemorrhage group, and 443 images for the subarachnoid hemorrhage group). Based on the diagnostic reports of two radiologists with more than 10 years of experience, the ResNet-18 and DenseNet-121 deep learning models were selected. Transfer learning was used. 80% of the data was used for training the models, 10% was used for validating model performance against overfitting, and the last 10% was used for the final evaluation of the models. Assessment indicators included accuracy, sensitivity, specificity, and AUC values. Results: The overall accuracies of the ResNet-18 and DenseNet-121 models were 89.64% and 82.5%, respectively. The sensitivity and specificity of identifying the five subtypes and normal images were above 0.80. The sensitivity of the DenseNet-121 model in recognizing intraventricular hemorrhage and cerebral parenchymal hemorrhage was lower than 0.80 (0.73 and 0.76, respectively). The AUC values of the two deep learning models were above 0.9. Conclusion: The deep learning models can accurately identify the five subtypes of intracranial hemorrhage and normal images, and they can be used as a new tool for clinical diagnosis in the future.
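The 80/10/10 split and the per-class sensitivity and specificity described in this abstract can be illustrated as follows. This is a hedged sketch: the abstract does not say whether the split was made per image or per patient, so the random per-image split here is an assumption, as are the function names and the toy labels.

```python
import random


def split_80_10_10(items, seed=0):
    """Shuffle items and split them into train / validation / test parts."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = int(0.8 * len(shuffled))
    n_val = int(0.1 * len(shuffled))
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # remainder, roughly the last 10%
    return train, val, test


def sensitivity_specificity(y_true, y_pred, positive):
    """One-vs-rest sensitivity and specificity for the class `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


if __name__ == "__main__":
    train, val, test = split_80_10_10(range(2768))
    print(len(train), len(val), len(test))

    # Toy labels only; not results from the study.
    y_true = ["epidural", "epidural", "subdural", "normal"]
    y_pred = ["epidural", "subdural", "subdural", "normal"]
    print(sensitivity_specificity(y_true, y_pred, "epidural"))
```

When several images come from the same patient, splitting by patient rather than by image avoids near-duplicate slices appearing in both training and test sets.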


2021
Author(s): Gaurav Chachra, Qingkai Kong, Jim Huang, Srujay Korlakunta, Jennifer Grannen, ...

After significant earthquakes, individuals and media agencies post images on social media platforms, owing to the mass usage of smartphones these days. These images can be used to provide information about the shaking damage in the earthquake region to both the public and the research community, and potentially to guide rescue work. This paper presents an automated way to extract images of damaged buildings from social media platforms such as Twitter after earthquakes and thus identify the particular user posts containing such images. Using transfer learning and ~6500 manually labelled images, we trained a deep learning model to recognize images with damaged buildings in the scene. The trained model achieved good performance when tested on newly acquired images of earthquakes at different locations and ran in near real-time on the Twitter feed after the 2020 M7.0 earthquake in Turkey. Furthermore, to better understand how the model makes decisions, we also implemented the Grad-CAM method to visualize the locations in the images that are important to the decision.
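The Grad-CAM step mentioned in this abstract reduces to a small computation once a network's last convolutional feature maps and the gradients of the target class score with respect to them are available. The sketch below shows only that final step on plain nested lists; obtaining the activations and gradients from a real model is framework-specific and not shown, and the function name `grad_cam` and the toy values are assumptions for illustration.

```python
def grad_cam(activations, gradients):
    """Compute a Grad-CAM heatmap from feature maps and their gradients.

    activations: K feature maps, each an H x W nested list.
    gradients:   gradients of the class score w.r.t. each feature map,
                 with the same K x H x W shape.
    """
    height = len(activations[0])
    width = len(activations[0][0])
    heatmap = [[0.0] * width for _ in range(height)]
    for fmap, grad in zip(activations, gradients):
        # Channel weight: global average of the gradient over all positions.
        alpha = sum(sum(row) for row in grad) / (height * width)
        for i in range(height):
            for j in range(width):
                heatmap[i][j] += alpha * fmap[i][j]
    # ReLU keeps only the locations that support the target class.
    return [[max(0.0, value) for value in row] for row in heatmap]


if __name__ == "__main__":
    # Two toy 2x2 feature maps with constant gradients of +1 and -1.
    activations = [[[1, 0], [0, 1]], [[0, 2], [2, 0]]]
    gradients = [[[1, 1], [1, 1]], [[-1, -1], [-1, -1]]]
    for row in grad_cam(activations, gradients):
        print(row)
```

In practice the heatmap is then upsampled to the input image size and overlaid on the photo, which is how the highlighted regions in damaged-building images would be inspected.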

