scholarly journals Critical Aspects of Person Counting and Density Estimation

2021 ◽  
Vol 7 (2) ◽  
pp. 21
Author(s):  
Roland Perko ◽  
Manfred Klopschitz ◽  
Alexander Almer ◽  
Peter M. Roth

Many scientific studies deal with person counting and density estimation from single images. Recently, convolutional neural networks (CNNs) have been applied for these tasks. Even though often better results are reported, it is often not clear where the improvements are resulting from, and if the proposed approaches would generalize. Thus, the main goal of this paper was to identify the critical aspects of these tasks and to show how these limit state-of-the-art approaches. Based on these findings, we show how to mitigate these limitations. To this end, we implemented a CNN-based baseline approach, which we extended to deal with identified problems. These include the discovery of bias in the reference data sets, ambiguity in ground truth generation, and mismatching of evaluation metrics w.r.t. the training loss function. The experimental results show that our modifications allow for significantly outperforming the baseline in terms of the accuracy of person counts and density estimation. In this way, we get a deeper understanding of CNN-based person density estimation beyond the network architecture. Furthermore, our insights would allow to advance the field of person density estimation in general by highlighting current limitations in the evaluation protocols.

Author(s):  
N. Soyama ◽  
K. Muramatsu ◽  
M. Daigo ◽  
F. Ochiai ◽  
N. Fujiwara

Validating the accuracy of land cover products using a reliable reference dataset is an important task. A reliable reference dataset is produced with information derived from ground truth data. Recently, the amount of ground truth data derived from information collected by volunteers has been increasing globally. The acquisition of volunteer-based reference data demonstrates great potential. However information given by volunteers is limited useful vegetation information to produce a complete reference dataset based on the plant functional type (PFT) with five specialized forest classes. In this study, we examined the availability and applicability of FLUXNET information to produce reference data with higher levels of reliability. FLUXNET information was useful especially for forest classes for interpretation in comparison with the reference dataset using information given by volunteers.


Author(s):  
Ding Li ◽  
Scott Dick

AbstractGraph-based algorithms are known to be effective approaches to semi-supervised learning. However, there has been relatively little work on extending these algorithms to the multi-label classification case. We derive an extension of the Manifold Regularization algorithm to multi-label classification, which is significantly simpler than the general Vector Manifold Regularization approach. We then augment our algorithm with a weighting strategy to allow differential influence on a model between instances having ground-truth vs. induced labels. Experiments on four benchmark multi-label data sets show that the resulting algorithm performs better overall compared to the existing semi-supervised multi-label classification algorithms at various levels of label sparsity. Comparisons with state-of-the-art supervised multi-label approaches (which of course are fully labeled) also show that our algorithm outperforms all of them even with a substantial number of unlabeled examples.


2020 ◽  
Vol 2020 (16) ◽  
pp. 200-1-200-7
Author(s):  
Florian Groh ◽  
Dominik Schörkhuber ◽  
Margrit Gelautz

We have developed a semi-automatic annotation tool – “CVL Annotator” – for bounding box ground truth generation in videos. Our research is particularly motivated by the need for reference annotations of challenging nighttime traffic scenes with highly dynamic lighting conditions due to reflections, headlights and halos from oncoming traffic. Our tool incorporates a suite of different state-of-the-art tracking algorithms in order to minimize the amount of human input necessary to generate high-quality ground truth data. We focus our user interface on the premise of minimizing user interaction and visualizing all information relevant to the user at a glance. We perform a preliminary user study to measure the amount of time and clicks necessary to produce ground truth annotations of video traffic scenes and evaluate the accuracy of the final annotation results.


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Madalina Ciortan ◽  
Matthieu Defrance

Abstract Background Single-cell RNA sequencing (scRNA-seq) has emerged has a main strategy to study transcriptional activity at the cellular level. Clustering analysis is routinely performed on scRNA-seq data to explore, recognize or discover underlying cell identities. The high dimensionality of scRNA-seq data and its significant sparsity accentuated by frequent dropout events, introducing false zero count observations, make the clustering analysis computationally challenging. Even though multiple scRNA-seq clustering techniques have been proposed, there is no consensus on the best performing approach. On a parallel research track, self-supervised contrastive learning recently achieved state-of-the-art results on images clustering and, subsequently, image classification. Results We propose contrastive-sc, a new unsupervised learning method for scRNA-seq data that perform cell clustering. The method consists of two consecutive phases: first, an artificial neural network learns an embedding for each cell through a representation training phase. The embedding is then clustered in the second phase with a general clustering algorithm (i.e. KMeans or Leiden community detection). The proposed representation training phase is a new adaptation of the self-supervised contrastive learning framework, initially proposed for image processing, to scRNA-seq data. contrastive-sc has been compared with ten state-of-the-art techniques. A broad experimental study has been conducted on both simulated and real-world datasets, assessing multiple external and internal clustering performance metrics (i.e. ARI, NMI, Silhouette, Calinski scores). Our experimental analysis shows that constastive-sc compares favorably with state-of-the-art methods on both simulated and real-world datasets. Conclusion On average, our method identifies well-defined clusters in close agreement with ground truth annotations. Our method is computationally efficient, being fast to train and having a limited memory footprint. contrastive-sc maintains good performance when only a fraction of input cells is provided and is robust to changes in hyperparameters or network architecture. The decoupling between the creation of the embedding and the clustering phase allows the flexibility to choose a suitable clustering algorithm (i.e. KMeans when the number of expected clusters is known, Leiden otherwise) or to integrate the embedding with other existing techniques.


2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Oliver T. Unke ◽  
Stefan Chmiela ◽  
Michael Gastegger ◽  
Kristof T. Schütt ◽  
Huziel E. Sauceda ◽  
...  

AbstractMachine-learned force fields combine the accuracy of ab initio methods with the efficiency of conventional force fields. However, current machine-learned force fields typically ignore electronic degrees of freedom, such as the total charge or spin state, and assume chemical locality, which is problematic when molecules have inconsistent electronic states, or when nonlocal effects play a significant role. This work introduces SpookyNet, a deep neural network for constructing machine-learned force fields with explicit treatment of electronic degrees of freedom and nonlocality, modeled via self-attention in a transformer architecture. Chemically meaningful inductive biases and analytical corrections built into the network architecture allow it to properly model physical limits. SpookyNet improves upon the current state-of-the-art (or achieves similar performance) on popular quantum chemistry data sets. Notably, it is able to generalize across chemical and conformational space and can leverage the learned chemical insights, e.g. by predicting unknown spin states, thus helping to close a further important remaining gap for today’s machine learning models in quantum chemistry.


Author(s):  
Rodolfo Quispe ◽  
Darwin Ttito ◽  
Adín Rivera ◽  
Helio Pedrini

Crowd scene analysis has received a lot of attention recently due to a wide variety of applications, e.g., forensic science, urban planning, surveillance and security. In this context, a challenging task is known as crowd counting [1–6], whose main purpose is to estimate the number of people present in a single image. A multi-stream convolutional neural network is developed and evaluated in this paper, which receives an image as input and produces a density map that represents the spatial distribution of people in an end-to-end fashion. In order to address complex crowd counting issues, such as extremely unconstrained scale and perspective changes, the network architecture utilizes receptive fields with different size filters for each stream. In addition, we investigate the influence of the two most common fashions on the generation of ground truths and propose a hybrid method based on tiny face detection and scale interpolation. Experiments conducted on two challenging datasets, UCF-CC-50 and ShanghaiTech, demonstrate that the use of our ground truth generation methods achieves superior results.


Sensors ◽  
2020 ◽  
Vol 20 (6) ◽  
pp. 1546 ◽  
Author(s):  
Dariusz Kucharski ◽  
Pawel Kleczek ◽  
Joanna Jaworek-Korjakowska ◽  
Grzegorz Dyduch ◽  
Marek Gorgon

In this research, we present a semi-supervised segmentation solution using convolutional autoencoders to solve the problem of segmentation tasks having a small number of ground-truth images. We evaluate the proposed deep network architecture for the detection of nests of nevus cells in histopathological images of skin specimens is an important step in dermatopathology. The diagnostic criteria based on the degree of uniformity and symmetry of border irregularities are particularly vital in dermatopathology, in order to distinguish between benign and malignant skin lesions. However, to the best of our knowledge, it is the first described method to segment the nests region. The novelty of our approach is not only the area of research, but, furthermore, we address a problem with a small ground-truth dataset. We propose an effective computer-vision based deep learning tool that can perform the nests segmentation based on an autoencoder architecture with two learning steps. Experimental results verified the effectiveness of the proposed approach and its ability to segment nests areas with Dice similarity coefficient 0.81, sensitivity 0.76, and specificity 0.94, which is a state-of-the-art result.


2019 ◽  
Vol 32 (14) ◽  
pp. 10705-10717 ◽  
Author(s):  
Joost van der Putten ◽  
Fons van der Sommen ◽  
Jeroen de Groof ◽  
Maarten Struyvenberg ◽  
Svitlana Zinger ◽  
...  

AbstractIn medical imaging, a proper gold-standard ground truth as, e.g., annotated segmentations by assessors or experts is lacking or only scarcely available and suffers from large intervariability in those segmentations. Most state-of-the-art segmentation models do not take inter-observer variability into account and are fully deterministic in nature. In this work, we propose hypersphere encoder–decoder networks in combination with dynamic leaky ReLUs, as a new method to explicitly incorporate inter-observer variability into a segmentation model. With this model, we can then generate multiple proposals based on the inter-observer agreement. As a result, the output segmentations of the proposed model can be tuned to typical margins inherent to the ambiguity in the data. For experimental validation, we provide a proof of concept on a toy data set as well as show improved segmentation results on two medical data sets. The proposed method has several advantages over current state-of-the-art segmentation models such as interpretability in the uncertainty of segmentation borders. Experiments with a medical localization problem show that it offers improved biopsy localizations, which are on average 12% closer to the optimal biopsy location.


Author(s):  
N. Soyama ◽  
K. Muramatsu ◽  
M. Daigo ◽  
F. Ochiai ◽  
N. Fujiwara

Validating the accuracy of land cover products using a reliable reference dataset is an important task. A reliable reference dataset is produced with information derived from ground truth data. Recently, the amount of ground truth data derived from information collected by volunteers has been increasing globally. The acquisition of volunteer-based reference data demonstrates great potential. However information given by volunteers is limited useful vegetation information to produce a complete reference dataset based on the plant functional type (PFT) with five specialized forest classes. In this study, we examined the availability and applicability of FLUXNET information to produce reference data with higher levels of reliability. FLUXNET information was useful especially for forest classes for interpretation in comparison with the reference dataset using information given by volunteers.


Author(s):  
K Sobha Rani

Collaborative filtering suffers from the problems of data sparsity and cold start, which dramatically degrade recommendation performance. To help resolve these issues, we propose TrustSVD, a trust-based matrix factorization technique. By analyzing the social trust data from four real-world data sets, we conclude that not only the explicit but also the implicit influence of both ratings and trust should be taken into consideration in a recommendation model. Hence, we build on top of a state-of-the-art recommendation algorithm SVD++ which inherently involves the explicit and implicit influence of rated items, by further incorporating both the explicit and implicit influence of trusted users on the prediction of items for an active user. To our knowledge, the work reported is the first to extend SVD++ with social trust information. Experimental results on the four data sets demonstrate that our approach TrustSVD achieves better accuracy than other ten counterparts, and can better handle the concerned issues.


Sign in / Sign up

Export Citation Format

Share Document