A Survey of the Techniques for The Identification and Classification of Human Actions from Visual Data

Sensors ◽  
2018 ◽  
Vol 18 (11) ◽  
pp. 3979 ◽  
Author(s):  
Shahela Saif ◽  
Samabia Tehseen ◽  
Sumaira Kausar

Recognition of human actions from videos has been an active area of research because it has applications in various domains. The results of work in this field are used in video surveillance, automatic video labeling and human-computer interaction, among others. Any advancements in this field are tied to advances in the interrelated fields of object recognition, spatio-temporal video analysis and semantic segmentation. Activity recognition is a challenging task since it faces many problems such as occlusion, viewpoint variation, background differences, clutter and illumination variations. Scientific achievements in the field have been numerous and rapid as the applications are far reaching. In this survey, we cover the growth of the field from the earliest solutions, where handcrafted features were used, to later deep learning approaches that use millions of images and videos to learn features automatically. Through this discussion, we intend to highlight the major breakthroughs and the directions future research might take while benefiting from the state-of-the-art methods.

Author(s):  
F. Matrone ◽  
A. Lingua ◽  
R. Pierdicca ◽  
E. S. Malinverni ◽  
M. Paolanti ◽  
...  

Abstract. The lack of benchmarking data for the semantic segmentation of digital heritage scenarios is hampering the development of automatic classification solutions in this field. Heritage 3D data feature complex structures and uncommon classes that prevent the simple deployment of available methods developed in other fields and for other types of data. The semantic classification of heritage 3D data would support the community in better understanding and analysing digital twins, facilitate restoration and conservation work, etc. In this paper, we present the first benchmark with millions of manually labelled 3D points belonging to heritage scenarios, realised to facilitate the development, training, testing and evaluation of machine and deep learning methods and algorithms in the heritage field. The proposed benchmark, available at http://archdataset.polito.it/, comprises datasets and classification results for better comparisons and insights into the strengths and weaknesses of different machine and deep learning approaches for heritage point cloud semantic segmentation, in addition to promoting a form of crowdsourcing to enrich the already annotated database.


2020 ◽  
Vol 13 (2) ◽  
Author(s):  
Viviane Clay* ◽  
Johannes Schrumpf* ◽  
Yannick Tessenow* ◽  
Helmut Leder ◽  
Ulrich Ansorge ◽  
...  

Classifying artists and their work as distinct art styles has been an important task of scholars in the field of art history. Due to its subjectivity, scholars often contradict one another. Our project investigated differences in aesthetic qualities of seven art styles through quantitative means. This was achieved with state-of-the-art deep-learning paradigms to generate new images resembling the style of an artist or entire era. We conducted psychological experiments to measure the behavior of subjects when viewing these new art images. Two different experiments were used: In an eye-tracking study, subjects viewed art-style-specific generated images. Eye movements were recorded and then compared between art styles. In a visual singleton search study, subjects had to locate a style-outlier image among three images of an alternative style. Reaction time and accuracy were measured and analyzed. These experiments show that there are measurable differences in behavior when viewing images of varying art styles. From these differences, we constructed hierarchical clusterings relating art styles based on the different behaviors of subjects viewing the samples. Our study reveals a novel perspective on the classification of artworks into stylistic eras and motivates future research in the domain of empirical aesthetics through quantitative means.


2021 ◽  
Vol 22 (15) ◽  
pp. 7911
Author(s):  
Eugene Lin ◽  
Chieh-Hsin Lin ◽  
Hsien-Yuan Lane

A growing body of evidence suggests that deep learning approaches can serve as an essential cornerstone for the diagnosis and prediction of Alzheimer’s disease (AD). In light of the latest advancements in neuroimaging and genomics, numerous deep learning models are being used in recent research studies to distinguish AD from normal controls and/or from mild cognitive impairment. In this review, we focus on the latest developments in AD prediction using deep learning techniques combined with the principles of neuroimaging and genomics. First, we describe various investigations that use deep learning algorithms to predict AD from genomics or neuroimaging data. In particular, we delineate relevant integrative neuroimaging genomics investigations that leverage deep learning methods to forecast AD by incorporating both neuroimaging and genomics data. Moreover, we outline the limitations of recent deep learning investigations of AD with neuroimaging and genomics. Finally, we discuss challenges and directions for future research. The main novelty of this work is that we summarize the major points of these investigations and scrutinize the similarities and differences among them.


Energies ◽  
2021 ◽  
Vol 14 (13) ◽  
pp. 3800
Author(s):  
Sebastian Krapf ◽  
Nils Kemmerzell ◽  
Syed Khawaja Haseeb Uddin ◽  
Manuel Hack Vázquez ◽  
Fabian Netzler ◽  
...  

Roof-mounted photovoltaic systems play a critical role in the global transition to renewable energy generation. An analysis of roof photovoltaic potential is an important tool for supporting decision-making and for accelerating new installations. State-of-the-art approaches use 3D data to conduct potential analyses with high spatial resolution, limiting the study area to places with available 3D data. Recent advances in deep learning allow the required roof information to be extracted from aerial images. Furthermore, most publications consider the technical photovoltaic potential, and only a few publications determine the economic photovoltaic potential. Therefore, this paper extends the state of the art by proposing and applying a methodology for scalable economic photovoltaic potential analysis using aerial images and deep learning. Two convolutional neural networks are trained for semantic segmentation of roof segments and superstructures and achieve Intersection over Union values of 0.84 and 0.64, respectively. We calculated the internal rate of return of each roof segment for 71 buildings in a small study area. A comparison of this paper’s methodology with a 3D-based analysis discusses its benefits and disadvantages. The proposed methodology uses only publicly available data and is potentially scalable to the global level. However, this poses a variety of research challenges and opportunities, which are summarized with a focus on the application of deep learning, economic photovoltaic potential analysis, and energy system analysis.
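
For readers unfamiliar with the Intersection over Union metric quoted above, the following is a minimal sketch of how it is commonly computed for a single binary segmentation mask. It is not the authors' evaluation code: the function name, the nested-list mask representation, the toy masks, and the convention for two empty masks are all assumptions made for illustration.

```python
def intersection_over_union(pred, target):
    """IoU of two binary masks given as equal-sized nested lists of 0/1."""
    intersection = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                intersection += 1  # pixel labelled positive in both masks
            if p or t:
                union += 1         # pixel labelled positive in at least one mask
    # Assumed convention: two entirely empty masks count as a perfect match.
    return intersection / union if union else 1.0


# Hypothetical 3x3 masks: 2 pixels overlap, 6 pixels are covered in total.
predicted = [[1, 1, 0],
             [1, 1, 0],
             [0, 0, 0]]
reference = [[0, 1, 1],
             [0, 1, 1],
             [0, 0, 0]]

iou = intersection_over_union(predicted, reference)
print(f"IoU = {iou:.2f}")
```

A value of 1 means the predicted and reference regions coincide exactly, and 0 means they do not overlap, so the reported 0.84 and 0.64 indicate closer agreement for roof segments than for superstructures.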


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Dominik Jens Elias Waibel ◽  
Sayedali Shetab Boushehri ◽  
Carsten Marr

Background: Deep learning contributes to uncovering molecular and cellular processes with highly performant algorithms. Convolutional neural networks have become the state-of-the-art tool to provide accurate and fast image data processing. However, published algorithms mostly solve only one specific problem and they typically require a considerable coding effort and machine learning background for their application. Results: We have thus developed InstantDL, a deep learning pipeline for four common image processing tasks: semantic segmentation, instance segmentation, pixel-wise regression and classification. InstantDL enables researchers with a basic computational background to apply debugged and benchmarked state-of-the-art deep learning algorithms to their own data with minimal effort. To make the pipeline robust, we have automated and standardized workflows and extensively tested it in different scenarios. Moreover, it allows assessing the uncertainty of predictions. We have benchmarked InstantDL on seven publicly available datasets achieving competitive performance without any parameter tuning. For customization of the pipeline to specific tasks, all code is easily accessible and well documented. Conclusions: With InstantDL, we hope to empower biomedical researchers to conduct reproducible image processing with a convenient and easy-to-use pipeline.


2021 ◽  
Vol 198 ◽  
pp. 110683
Author(s):  
David B. Menasche ◽  
Paul A. Shade ◽  
S. Safriet ◽  
Peter Kenesei ◽  
Jun-Sang Park ◽  
...  

Author(s):  
Jwalin Bhatt ◽  
Khurram Azeem Hashmi ◽  
Muhammad Zeshan Afzal ◽  
Didier Stricker

In any document, graphical elements like tables, figures, and formulas contain essential information. The processing and interpretation of such information require specialized algorithms. Off-the-shelf OCR components cannot process this information reliably. Therefore, an essential step in document analysis pipelines is to detect these graphical components. This detection leads to a high-level conceptual understanding of the documents that makes their digitization viable. Since the advent of deep learning, the performance of deep learning-based object detection has improved manyfold. In this work, we outline and summarize the deep learning approaches for detecting graphical page objects in document images. We discuss the most relevant deep learning-based approaches and the state of the art in graphical page object detection. This work provides a comprehensive understanding of the current state of the art and related challenges. Furthermore, we discuss leading datasets along with the quantitative evaluation. Finally, we briefly discuss promising directions for further improvements.


Author(s):  
Gabriel Sen ◽  
Albert Adeboye ◽  
Oluwole Alagbe

The paper was a pilot study that examined the learning approaches of architecture students; the variability of approaches by university type and gender; and the influence of architecture students’ learning approaches on their academic performance. The sample was 349 architecture students from two universities. Descriptive and statistical analyses were used. Results revealed predominant use of deep learning approaches by students. Furthermore, learning approaches did not differ significantly by university type or gender. Regression analysis revealed that demographic factors accounted for 2.9% of variation in academic performance (F(2, 346) = 6.2, p = 0.002, R² = 0.029, f² = 0.029), and when learning approaches were also entered the model accounted for 4.4% of variation in academic performance (F(14, 334) = 2.2, p = 0.009, R² = 0.044, f² = 0.044). Deep learning approaches significantly and positively influenced variation in academic performance, while surface learning approaches significantly and negatively influenced academic performance. This implies that architectural educators should use instructional methods that encourage deep approaches. Future research needs to use larger and more heterogeneous samples for confirmation of results.


2021 ◽  
Author(s):  
Noor Ahmad ◽  
Muhammad Aminu ◽  
Mohd Halim Mohd Noor

Deep learning approaches have attracted a lot of attention in the automatic detection of Covid-19, and transfer learning is the most common approach. However, the majority of pre-trained models are trained on color images, which can cause inefficiencies when fine-tuning the models on Covid-19 images, which are often grayscale. To address this issue, we propose a deep learning architecture called CovidNet, which requires a relatively small number of parameters. CovidNet accepts grayscale images as inputs and is suitable for training with a limited training dataset. Experimental results show that CovidNet outperforms other state-of-the-art deep learning models for Covid-19 detection.

