representation space
Recently Published Documents


TOTAL DOCUMENTS

132
(FIVE YEARS 33)

H-INDEX

12
(FIVE YEARS 2)

Author(s):  
Hervé Goëau ◽  
Pierre Bonnet ◽  
Alexis Joly

Automated plant identification has recently improved significantly due to advances in deep learning and the availability of large amounts of field photos. As an illustration, the classification accuracy of 10K species measured in the LifeCLEF challenge (Goëau et al. 2018) reached 90%, very close to that of human experts. However, the profusion of field images only concerns a few tens of thousands of species, mainly located in North America and Western Europe. Conversely, the richest regions in terms of biodiversity, such as tropical countries, suffer from a shortage of training data (Pitman 2021). Consequently, the identification performance of the most advanced models on the flora of these regions is much lower (Goëau et al. 2019). Nevertheless, for several centuries, botanists have systematically collected, catalogued, and stored plant specimens in herbaria. Considerable recent efforts by the biodiversity informatics community, such as DiSSCo (Addink et al. 2018) and iDigBio (Matsunaga et al. 2013), have made millions of digitized specimens from these collections available online. A key question is therefore whether these digitized specimens could be used to improve the identification performance of species for which we have very few (if any) photos. However, this is a very difficult problem from a machine learning point of view. The visual appearance of a herbarium specimen is actually very different from a field photograph because the specimens are dried and crushed on a herbarium sheet before being digitized (Fig. 1). To advance research on this topic, we built a large dataset that we shared as one of the challenges of the LifeCLEF 2020 (Goëau et al. 2020) and 2021 evaluation campaigns (Goëau et al. 2021). It includes more than 320K herbarium specimens collected mostly from the Guiana Shield and the Northern Amazon Rainforest, focusing on about 1K plant species of the French Guiana flora. 
A valuable asset of this collection is that some of the specimens are accompanied by a few photos of the same specimen, allowing for more precise machine learning. In addition to this training data, we also built a test set for model evaluation, composed of 3,186 field photos collected by two of the best experts on Guyanese flora. Based on this dataset, about ten research teams have developed deep learning methods to address the challenge (including the authors of this abstract as the organizing team). A detailed description of these methods can be found in the technical notes written by the participating teams (Goëau et al. 2020, Goëau et al. 2021). The methods can be divided into two categories: those based on classical convolutional neural networks (CNN) trained simply by mixing digitized specimens and photos, and those based on advanced domain adaptation techniques with the objective of learning a joint representation space for field photos and herbarium specimens. The domain adaptation methods themselves were of two types: those based on adversarial regularization (Motiian et al. 2017) to force herbarium specimens and photos to have the same representations, and those based on metric learning to maximize inter-species distances and minimize intra-species distances in the representation space. In Table 1, we report the results achieved by the different methods evaluated during the 2020 edition of the challenge. 
The evaluation metric used is the mean reciprocal rank (MRR), i.e., the average of the inverse of the rank of the correct species in the list of the predicted species. In addition to this main score, a second MRR score is computed on a subset of the test set composed of the most difficult species, i.e., the ones that are the least frequently photographed in the field. Three main outcomes can be derived from these results. First, classical deep learning models fail to identify plant photos from digitized herbarium specimens. The best classical CNN trained on the provided data resulted in a very low MRR score (0.011). Even with the use of additional training data (e.g., photos and digitized herbarium specimens from GBIF), the MRR score remains very low (0.039). Second, domain adaptation methods provide significant improvement, but the task remains challenging. The best MRR score (0.180) was achieved by using adversarial regularization (FSDA; Motiian et al. 2017). This is much better than the classical CNN models, but there is still a lot of progress to be made to reach the performance of a truly functional identification system (the MRR score on classical plant identification tasks can be up to 0.9). Third, no method fits all. As shown in Table 1, the metric learning method has a significantly better MRR score on the most difficult species (0.107). However, the performance of this method on the species with more photos is much lower than that of the adversarial technique. In 2021, the challenge was run again but with additional information provided to train the models, i.e., species traits (plant life form, woodiness and plant growth form). The use of the species traits allowed a slight performance improvement of the best adversarial adaptation method (with an MRR equal to 0.198). In conclusion, the results of the experiments conducted are promising and demonstrate the potential interest of digitized herbarium data for automated plant identification. 
However, progress is still needed before integrating this type of approach into production applications.
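
The mean reciprocal rank described in the abstract above can be illustrated with a minimal Python sketch. It is not the challenge's official scoring code: the function names, the toy species lists, and the convention of scoring 0 when the correct species is missing from the predicted list are assumptions made here for illustration.

```python
from typing import Sequence


def reciprocal_rank(ranked_species: Sequence[str], true_species: str) -> float:
    """Return 1/rank of the correct species (ranks start at 1).

    Assumption: a query whose correct species is absent from the
    predicted list contributes 0.0 to the average.
    """
    for rank, species in enumerate(ranked_species, start=1):
        if species == true_species:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(
    predictions: Sequence[Sequence[str]], ground_truth: Sequence[str]
) -> float:
    """Average the reciprocal ranks over all test queries."""
    if len(predictions) != len(ground_truth):
        raise ValueError("predictions and ground_truth must have the same length")
    if not predictions:
        raise ValueError("at least one query is required")
    total = sum(
        reciprocal_rank(ranked, truth)
        for ranked, truth in zip(predictions, ground_truth)
    )
    return total / len(predictions)


# Hypothetical toy example: three test photos, each with a ranked species list.
toy_predictions = [
    ["species_a", "species_b", "species_c"],  # correct species at rank 1
    ["species_b", "species_a", "species_c"],  # correct species at rank 2
    ["species_b", "species_c", "species_d"],  # correct species not predicted
]
toy_truth = ["species_a", "species_a", "species_a"]

toy_mrr = mean_reciprocal_rank(toy_predictions, toy_truth)
print(f"MRR = {toy_mrr:.3f}")
```

In this toy case the reciprocal ranks are 1, 1/2 and 0, so the average is 0.5. Under the same definition, an MRR of 0.180 means the correct species typically sits well below the top of the predicted list.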


2021 ◽  
Vol 19 (2) ◽  
pp. 101-106
Author(s):  
Felipe Freddo Breunig ◽  
Douglas Meyer Oliveira ◽  
Alex Branco Fraga

Set piece: social representations of the goalkeeper figure in Brazilian football post-Barbosa
BACKGROUND: This essay addresses the social representations surrounding the role of the goalkeeper in the context of Brazilian football. The time frame is the 1950 World Cup, chosen for the historical and sociocultural impact that the defeat of the Brazilian team in the final at Maracanã Stadium had on the national scene at the time. OBJECTIVE: It analyzes data and information from several authors about that year's World Cup, especially about the error by goalkeeper Barbosa, who was blamed for the Brazilian defeat in the final. METHODS: Interpretative analysis of discourses about goalkeepers from the perspective of the Football Representation Space Theory. CONCLUSION: It concludes that in Brazil, since the Barbosa episode, a racist-tinged representation regarding the reliability of Black goalkeepers has been constructed, along with an exaggerated appreciation of foreign goalkeepers. It discusses possible paths to be followed in future research on this topic.


Author(s):  
Amirhossein Ahmadian ◽  
Fredrik Lindsten

The likelihood of generative models has traditionally been used as a score to detect atypical (Out-of-Distribution, OOD) inputs. However, several recent studies have found this approach to be highly unreliable, even with invertible generative models, where computing the likelihood is feasible. In this paper, we present a different framework for generative-model-based OOD detection that employs the model in constructing a new representation space, instead of using it directly in computing typicality scores. The framework emphasizes that the score function should be interpretable as the similarity between the input and the training data in the new space. In practice, with a focus on invertible models, we propose to extract low-dimensional features (statistics) based on the model encoder and the complexity of input images, and then use a One-Class SVM to score the data. Contrary to recently proposed OOD detection methods for generative models, our method does not require computing likelihood values. Consequently, it is much faster when using invertible models with iteratively approximated likelihood (e.g., iResNet), while its performance remains competitive with other related methods.


2021 ◽  
Author(s):  
Liu Yang ◽  
Fanqi Meng ◽  
Xiao Liu ◽  
Ming-Kuang Daniel Wu ◽  
Vicent Ying ◽  
...  

Author(s):  
Bao-Yu Liu ◽  
Ling Huang ◽  
Chang-Dong Wang ◽  
Jian-Huang Lai ◽  
Philip S. Yu
