A Master Map: An Alternative Approach to Explore Human’s Eye Fixation for Generating Ground Truth Based on Various State-of-the-Art Techniques

Author(s):  
Ravi Kant Kumar ◽  
Jogendra Garain ◽  
Dakshina Ranjan Kisku ◽  
Goutam Sanyal
2021 ◽  
Vol 22 (1) ◽  
Author(s):  
João Lobo ◽  
Rui Henriques ◽  
Sara C. Madeira

Background: Three-way data have started to gain popularity due to their increasing capacity to describe inherently multivariate and temporal events, such as biological responses, social interactions over time, urban dynamics, or complex geophysical phenomena. Triclustering, the subspace clustering of three-way data, enables the discovery of patterns corresponding to data subspaces (triclusters) with values correlated across the three dimensions (observations × features × contexts). With an increasing number of algorithms being proposed, effectively comparing them with state-of-the-art algorithms is paramount. These comparisons are usually performed using real data without a known ground truth, which limits the assessments. In this context, we propose a synthetic data generator, G-Tric, that allows the creation of synthetic datasets with configurable properties and the possibility to plant triclusters. The generator is prepared to create datasets resembling real three-way data from the biomedical and social data domains, with the additional advantage of providing the ground truth (triclustering solution) as output. Results: G-Tric can replicate real-world datasets and create new ones that match researchers' needs across several properties, including data type (numeric or symbolic), dimensions, and background distribution. Users can tune the patterns and structure that characterize the planted triclusters (subspaces) and how they interact (overlapping). Data quality can also be controlled by defining the amount of missing values, noise, or errors. Furthermore, a benchmark of datasets resembling real data is made available, together with the corresponding triclustering solutions (planted triclusters) and generating parameters. Conclusions: Triclustering evaluation using G-Tric makes it possible to combine intrinsic and extrinsic metrics when comparing solutions, yielding more reliable analyses. A set of predefined datasets, mimicking widely used three-way data and exploring crucial properties, was generated and made available, highlighting G-Tric's potential to advance the triclustering state of the art by easing the process of evaluating the quality of new triclustering approaches.
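
The idea of planting a tricluster in a synthetic three-way dataset, and returning its location as ground truth, can be illustrated with a minimal sketch. This is not G-Tric's API: the function name `plant_tricluster`, its parameters, the uniform background, and the constant-valued pattern are all assumptions chosen for illustration.

```python
import random


def plant_tricluster(n_obs, n_feat, n_ctx, shape, value=5.0, seed=0):
    """Build an observations x features x contexts dataset with one planted tricluster.

    Hypothetical helper for illustration only; it does not reproduce G-Tric.
    The background is uniform noise in [0, 1) and the planted subspace is
    filled with a constant `value` (assumed to lie outside the background range).
    """
    rng = random.Random(seed)

    # Background data: nested lists indexed as data[observation][feature][context].
    data = [
        [[rng.random() for _ in range(n_ctx)] for _ in range(n_feat)]
        for _ in range(n_obs)
    ]

    # Choose the subset of indices along each dimension that forms the tricluster.
    rows = sorted(rng.sample(range(n_obs), shape[0]))
    cols = sorted(rng.sample(range(n_feat), shape[1]))
    ctxs = sorted(rng.sample(range(n_ctx), shape[2]))

    # Overwrite the selected subspace with the constant pattern.
    for i in rows:
        for j in cols:
            for k in ctxs:
                data[i][j][k] = value

    # The index sets are the ground truth an evaluation could compare against.
    return data, (rows, cols, ctxs)


if __name__ == "__main__":
    data, truth = plant_tricluster(10, 8, 4, shape=(3, 2, 2))
    print("planted rows, features, contexts:", truth)
```

Because the planted index sets are returned alongside the data, a triclustering algorithm's output can be scored against them with extrinsic metrics, which is the kind of evaluation the abstract describes.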


2021 ◽  
Vol 9 (1) ◽  
Author(s):  
Aysen Degerli ◽  
Mete Ahishali ◽  
Mehmet Yamac ◽  
Serkan Kiranyaz ◽  
Muhammad E. H. Chowdhury ◽  
...  

Computer-aided diagnosis has become a necessity for accurate and immediate coronavirus disease 2019 (COVID-19) detection to aid treatment and prevent the spread of the virus. Numerous studies have proposed using Deep Learning techniques for COVID-19 diagnosis. However, they have used very limited chest X-ray (CXR) image repositories for evaluation, with only a few hundred COVID-19 samples. Moreover, these methods can neither localize nor grade the severity of COVID-19 infection. For this purpose, recent studies proposed to explore the activation maps of deep networks. However, they remain inaccurate for localizing the actual infestation, making them unreliable for clinical use. This study proposes a novel method for the joint localization, severity grading, and detection of COVID-19 from CXR images by generating the so-called infection maps. To accomplish this, we have compiled the largest dataset with 119,316 CXR images including 2951 COVID-19 samples, where the annotation of the ground-truth segmentation masks is performed on CXRs by a novel collaborative human–machine approach. Furthermore, we publicly release the first CXR dataset with the ground-truth segmentation masks of the COVID-19 infected regions. A detailed set of experiments shows that state-of-the-art segmentation networks can learn to localize COVID-19 infection with an F1-score of 83.20%, which is significantly superior to the activation maps created by the previous methods. Finally, the proposed approach achieved a COVID-19 detection performance with 94.96% sensitivity and 99.88% specificity.


Energies ◽  
2014 ◽  
Vol 7 (8) ◽  
pp. 4757-4780 ◽  
Author(s):  
Alistair McCay ◽  
Thomas Harley ◽  
Paul Younger ◽  
David Sanderson ◽  
Alan Cresswell

Sensors ◽  
2021 ◽  
Vol 21 (12) ◽  
pp. 4233
Author(s):  
Bogdan Mocanu ◽  
Ruxandra Tapu ◽  
Titus Zaharia

Emotion is a form of high-level paralinguistic information that is intrinsically conveyed by human speech. Automatic speech emotion recognition is an essential challenge for various applications, including mental disease diagnosis, audio surveillance, human behavior understanding, e-learning, and human–machine/robot interaction. In this paper, we introduce a novel speech emotion recognition method based on the Squeeze and Excitation ResNet (SE-ResNet) model and fed with spectrogram inputs. To overcome the limitations of state-of-the-art techniques, which fail to provide a robust feature representation at the utterance level, the CNN architecture is extended with a trainable discriminative GhostVLAD clustering layer that aggregates the audio features into a compact, single-utterance vector representation. In addition, an end-to-end neural embedding approach is introduced, based on an emotionally constrained triplet loss function. The loss function integrates the relations between the various emotional patterns and thus improves the latent space data representation. The proposed methodology achieves 83.35% and 64.92% global accuracy rates on the publicly available RAVDESS and CREMA-D datasets, respectively. When compared with the results provided by human observers, the gains in global accuracy exceed 24%. Finally, the objective comparative evaluation with state-of-the-art techniques demonstrates accuracy gains of more than 3%.
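
For readers unfamiliar with triplet losses, the following is a minimal sketch of the standard margin-based formulation on plain Python vectors. It is not the emotionally constrained variant proposed in the paper, whose details are not given in the abstract; the function names, the Euclidean distance, and the margin value are assumptions for illustration.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Standard triplet loss: max(0, d(anchor, positive) - d(anchor, negative) + margin).

    The loss is zero once the negative sample is farther from the anchor than
    the positive sample by at least `margin` (an assumed hyperparameter).
    """
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


if __name__ == "__main__":
    # Hypothetical 2-D embeddings: the positive shares the anchor's emotion class,
    # the negative belongs to a different class.
    anchor, positive, negative = [0.0, 0.0], [0.0, 1.0], [0.0, 3.0]
    print(triplet_loss(anchor, positive, negative))
```

Minimizing this quantity pulls same-class embeddings together and pushes different-class embeddings apart, which is the general mechanism by which a triplet loss shapes a latent space.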


IEEE Access ◽  
2021 ◽  
Vol 9 ◽  
pp. 118584-118605
Author(s):  
Munyaradzi Munochiveyi ◽  
Arjun Chakravarthi Pogaku ◽  
Dinh-Thuan Do ◽  
Anh-Tu Le ◽  
Miroslav Voznak ◽  
...  

2021 ◽  
Vol 7 (2) ◽  
pp. 21
Author(s):  
Roland Perko ◽  
Manfred Klopschitz ◽  
Alexander Almer ◽  
Peter M. Roth

Many scientific studies deal with person counting and density estimation from single images. Recently, convolutional neural networks (CNNs) have been applied to these tasks. Although better results are often reported, it is frequently unclear where the improvements come from and whether the proposed approaches would generalize. Thus, the main goal of this paper was to identify the critical aspects of these tasks and to show how they limit state-of-the-art approaches. Based on these findings, we show how to mitigate these limitations. To this end, we implemented a CNN-based baseline approach, which we extended to deal with the identified problems. These include bias in the reference data sets, ambiguity in ground truth generation, and a mismatch between the evaluation metrics and the training loss function. The experimental results show that our modifications significantly outperform the baseline in terms of the accuracy of person counts and density estimation. In this way, we gain a deeper understanding of CNN-based person density estimation beyond the network architecture. Furthermore, our insights could advance the field of person density estimation in general by highlighting current limitations in the evaluation protocols.


2020 ◽  
Vol 31 (10) ◽  
pp. 2591-2602
Author(s):  
Yi-Di Chen ◽  
Feiyu Liu ◽  
Nan-Qi Ren ◽  
Shih-Hsin Ho

IEEE Access ◽  
2016 ◽  
Vol 4 ◽  
pp. 2670-2704 ◽  
Author(s):  
Arezou Soltani Panah ◽  
Ron Van Schyndel ◽  
Timos Sellis ◽  
Elisa Bertino

2021 ◽  
Vol 180 (4) ◽  
pp. 351-373
Author(s):  
Denis Kuperberg ◽  
Laureline Pinault ◽  
Damien Pous

We propose a new algorithm for checking language equivalence of non-deterministic Büchi automata. We start from a construction proposed by Calbrix, Nivat and Podelski, which makes it possible to reduce the problem to that of checking equivalence of automata on finite words. Although this construction generates large and highly non-deterministic automata, we show how to exploit their specific structure and apply state-of-the-art techniques based on coinduction to reduce the state space that has to be explored. In doing so, we obtain algorithms that do not require full determinisation or complementation.


2022 ◽  
Vol 29 (2) ◽  
pp. 1-33
Author(s):  
Nigel Bosch ◽  
Sidney K. D'Mello

The ability to identify whether a user is “zoning out” (mind wandering) from video has many HCI applications (e.g., distance learning, high-stakes vigilance tasks). However, it remains unknown how well humans can perform this task, how they compare to automatic computerized approaches, and how a fusion of the two might improve accuracy. We analyzed videos of users’ faces and upper bodies recorded 10 s prior to self-reported mind wandering (i.e., ground truth) while they engaged in a computerized reading task. We found that a state-of-the-art machine learning model had comparable accuracy to aggregated judgments of nine untrained human observers (area under receiver operating characteristic curve [AUC] = .598 versus .589). A fusion of the two (AUC = .644) outperformed each, presumably because each focused on complementary cues. Furthermore, adding more humans beyond 3–4 observers yielded diminishing returns. We discuss implications of human–computer fusion as a means to improve accuracy in complex tasks.

