A computer vision pipeline for automatic large-scale inventory tracking

Author(s):  
Stephen Gregory ◽  
Utkarsh Singh ◽  
Jeff Gray ◽  
Jon Hobbs
Keyword(s):  
Technologies ◽  
2020 ◽  
Vol 9 (1) ◽  
pp. 2
Author(s):  
Ashish Jaiswal ◽  
Ashwin Ramesh Babu ◽  
Mohammad Zaki Zadeh ◽  
Debapriya Banerjee ◽  
Fillia Makedon

Self-supervised learning has gained popularity because it avoids the cost of annotating large-scale datasets. It can adopt self-defined pseudo-labels as supervision and use the learned representations for several downstream tasks. Specifically, contrastive learning has recently become a dominant component of self-supervised learning for computer vision, natural language processing (NLP), and other domains. It aims to embed augmented versions of the same sample close to each other while pushing apart embeddings from different samples. This paper provides an extensive review of self-supervised methods that follow the contrastive approach. The work explains commonly used pretext tasks in a contrastive learning setup, followed by the different architectures that have been proposed so far. Next, we present a performance comparison of different methods on multiple downstream tasks such as image classification, object detection, and action recognition. Finally, we conclude with the limitations of the current methods and the need for further techniques and future directions to make meaningful progress.
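The contrastive objective described in this abstract (pull augmented views of the same sample together, push other samples apart) can be illustrated with a small sketch. The code below is one common formulation, an InfoNCE-style loss over cosine similarities; it is not taken from the paper, and the function names and the `temperature` default are assumptions chosen for illustration.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce_loss(anchor, positive, negatives, temperature=0.5):
    """InfoNCE-style loss for a single anchor embedding (illustrative).

    The loss is small when the anchor is more similar to its positive
    (an augmented view of the same sample) than to any negative
    (embeddings of other samples).
    """
    pos_logit = cosine_similarity(anchor, positive) / temperature
    logits = [pos_logit] + [
        cosine_similarity(anchor, n) / temperature for n in negatives
    ]
    # Numerically stable log-sum-exp over the positive and all negatives.
    max_logit = max(logits)
    log_denominator = max_logit + math.log(
        sum(math.exp(x - max_logit) for x in logits)
    )
    # Negative log of the softmax probability assigned to the positive.
    return log_denominator - pos_logit


# Toy 2-D embeddings: the positive lies close to the anchor.
anchor = [1.0, 0.0]
aligned = info_nce_loss(anchor, [0.9, 0.1], [[0.0, 1.0], [-1.0, 0.0]])
misaligned = info_nce_loss(anchor, [0.0, 1.0], [[0.9, 0.1], [-1.0, 0.0]])
```

In this toy setup `aligned` comes out smaller than `misaligned`, which is the behavior a contrastive objective is meant to reward. Real systems compute the same quantity in batches on learned embeddings.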


Author(s):  
Steven McDonagh ◽  
Cigdem Beyan ◽  
Phoenix X Huang ◽  
Robert B Fisher
Keyword(s):  

2020 ◽  
Vol 26 (1) ◽  
pp. 143-166
Author(s):  
Yilang Peng

Previous research on the success of politicians’ messages on social media has focused on a limited number of platforms, especially Facebook and Twitter, and has predominantly studied the effects of textual content. The research reported here applies computer vision analysis to a total of 59,020 image posts published by 172 Instagram accounts of U.S. politicians, both candidates and office holders, and examines how visual attributes influence audience engagement such as likes and comments. In particular, this study introduces an unsupervised approach that combines transfer learning and clustering techniques to discover hidden categories in large-scale visual data. The results reveal that different self-personalization strategies in visual media, for example, images featuring politicians in private, nonpolitical settings, showing faces, and displaying emotions, generally increase audience engagement. Yet a significant portion of politicians’ Instagram posts still fell into the traditional, “politics-as-usual” type of political communication, showing professional settings and activities. The analysis explains how self-personalization is embodied in specific visual portrayals and how different self-presentation strategies affect audience engagement on a popular but less studied social media platform.
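The unsupervised step mentioned above (transfer learning followed by clustering) can be sketched as follows. The abstract does not name the clustering algorithm, so k-means is used here purely as an assumption, and the feature vectors are hypothetical stand-ins for embeddings that a pretrained network would produce for each image.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def k_means(features, k, iterations=20):
    """Plain k-means with deterministic farthest-first initialization.

    `features` is a list of equal-length vectors, assumed here to be
    image embeddings extracted beforehand by a pretrained model.
    Returns (labels, centroids).
    """
    # Farthest-first initialization: start from the first point, then
    # repeatedly add the point farthest from all chosen centroids.
    centroids = [list(features[0])]
    while len(centroids) < k:
        farthest = max(
            features,
            key=lambda f: min(squared_distance(f, c) for c in centroids),
        )
        centroids.append(list(farthest))

    labels = [0] * len(features)
    for _ in range(iterations):
        # Assignment step: each vector joins its nearest centroid.
        labels = [
            min(range(k), key=lambda j: squared_distance(f, centroids[j]))
            for f in features
        ]
        # Update step: each centroid moves to the mean of its members.
        for j in range(k):
            members = [f for f, lab in zip(features, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical 2-D "embeddings" forming two visually distinct groups.
features = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
            [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]]
labels, centroids = k_means(features, k=2)
```

Each resulting cluster is a candidate "hidden category" that a researcher would then inspect and name by looking at its member images.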


Author(s):  
Harsh Agrawal ◽  
Clint Solomon Mathialagan ◽  
Yash Goyal ◽  
Neelima Chavali ◽  
Prakriti Banik ◽  
...  

Entropy ◽  
2020 ◽  
Vol 22 (10) ◽  
pp. 1174
Author(s):  
Ashish Kumar Gupta ◽  
Ayan Seal ◽  
Mukesh Prasad ◽  
Pritee Khanna

Detection and localization of image regions that attract immediate human visual attention is currently an intensive area of research in computer vision. The ability to automatically identify and segment such salient image regions has immediate consequences for applications in the fields of computer vision, computer graphics, and multimedia. A large number of salient object detection (SOD) methods have been devised to effectively mimic the capability of the human visual system to detect the salient regions in images. These methods can be broadly divided into two categories based on their feature engineering mechanism: conventional or deep learning-based. In this survey, most of the influential advances in image-based SOD from both the conventional and the deep learning-based categories are reviewed in detail. Relevant saliency modeling trends, with key issues, core techniques, and the scope for future research work, are discussed in the context of difficulties often faced in salient object detection. Results are presented for various challenging cases on some large-scale public datasets. Different metrics considered for assessing the performance of state-of-the-art salient object detection models are also covered. Some future directions for SOD are presented towards the end.
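The abstract refers to metrics for assessing SOD models without naming them. As an illustration only, the sketch below computes two measures that are widely used in the saliency literature, mean absolute error (MAE) and the F-measure with the conventional weighting of beta squared equal to 0.3. Whether the survey covers exactly these is an assumption, and the fixed binarization threshold is a simplification.

```python
def mean_absolute_error(saliency, ground_truth):
    """Mean absolute per-pixel difference between two maps in [0, 1]."""
    total = 0.0
    count = 0
    for row_s, row_g in zip(saliency, ground_truth):
        for s, g in zip(row_s, row_g):
            total += abs(s - g)
            count += 1
    return total / count


def f_measure(saliency, ground_truth, threshold=0.5, beta_squared=0.3):
    """Weighted F-measure after binarizing the saliency map.

    beta_squared=0.3 weights precision more heavily than recall,
    following common practice in SOD evaluation.
    """
    true_pos = false_pos = false_neg = 0
    for row_s, row_g in zip(saliency, ground_truth):
        for s, g in zip(row_s, row_g):
            predicted = s >= threshold
            actual = g >= 0.5
            if predicted and actual:
                true_pos += 1
            elif predicted:
                false_pos += 1
            elif actual:
                false_neg += 1
    if true_pos == 0:
        return 0.0
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return ((1 + beta_squared) * precision * recall
            / (beta_squared * precision + recall))


# Tiny 2x2 example: predicted saliency map versus a binary mask.
predicted_map = [[0.9, 0.1], [0.8, 0.2]]
mask = [[1.0, 0.0], [1.0, 0.0]]
mae = mean_absolute_error(predicted_map, mask)
f_score = f_measure(predicted_map, mask)
```

Lower MAE and higher F-measure indicate better agreement with the annotated mask.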


Author(s):  
Yue Qi ◽  
Ruqing Zhong ◽  
Benjamin Kaiser ◽  
Long Nguyen ◽  
Hans Jakob Wagner ◽  
...  

This paper presents and investigates a cyber-physical fabrication workflow that uses vision augmentation to respond in real time to deviations between the built and the designed form. We apply this method to large-scale structures built from natural bamboo poles. Raw bamboo poles have evolutionarily optimized fibrous layouts that are ideally suited to lightweight and sustainable building construction. Nevertheless, their intrinsically imprecise geometries pose a challenge for reliable, automated construction processes. Despite recent digital advancements, building with bamboo poles is still a labor-intensive task and is restricted to building typologies where accuracy is of minor importance. The integration of structural bamboo poles with other building layers is often limited by tolerance issues at the interfaces, especially for large-scale structures where deviations accumulate incrementally. To address these challenges, an adaptive fabrication process is developed in which existing deviations are compensated by changing the geometry of subsequent joints to iteratively correct the pose of further elements. A vision-based sensing system is employed to scan the bamboo elements in three dimensions before and during construction. Computer vision algorithms are used to process and interpret the sensor data. The updated conditions are streamed to the computational model, which computes tailor-made, bending-stiff joint geometries that can then be fabricated directly on the fly. In this paper, we contextualize our research and investigate the performance domains of the proposed workflow through initial fabrication tests. Several application scenarios are further proposed for full-scale vision-augmented bamboo construction systems.
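The compensation idea in this abstract, where each new joint absorbs the deviation measured so far so that errors do not accumulate, can be shown with a deliberately simplified one-dimensional sketch. This is not the authors' workflow: the function name, the chain model, and the error values are all hypothetical, and the real system works with scanned three-dimensional poses.

```python
def build_chain(designed_lengths, length_errors, compensate):
    """Simulate placing elements end to end along one axis.

    designed_lengths: intended length of each element.
    length_errors: hypothetical deviation of each real element.
    compensate: if True, each joint is shifted by the deviation
        measured before placement (standing in for the vision scan).
    Returns the end-point deviation after each element.
    """
    designed_end = 0.0
    built_end = 0.0
    deviations = []
    for designed, error in zip(designed_lengths, length_errors):
        # Deviation "scanned" at the current end of the structure.
        measured = built_end - designed_end
        # A tailor-made joint cancels the measured deviation.
        joint_offset = -measured if compensate else 0.0
        start = built_end + joint_offset
        built_end = start + designed + error
        designed_end += designed
        deviations.append(built_end - designed_end)
    return deviations


designed = [1.0, 1.0, 1.0]
errors = [0.02, 0.03, 0.01]
uncorrected = build_chain(designed, errors, compensate=False)
corrected = build_chain(designed, errors, compensate=True)
```

Without compensation the final deviation is the sum of all element errors (about 0.06 here); with compensation it stays at the error of the last element alone (about 0.01).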


2021 ◽  
pp. 242-251
Author(s):  
Waishan Qiu ◽  
Wenjing Li ◽  
Xun Liu ◽  
Xiaokai Huang

Recently, many studies have applied computer vision (CV) to street view imagery (SVI) datasets to objectively extract the view indices of various streetscape features, such as trees, as proxies for urban scene qualities. However, human perceptions (e.g., imageability) have a subtle relationship to visual elements that cannot be fully captured using view indices. Conversely, subjective measures using survey and interview data explain more human behaviors. However, the effectiveness of integrating subjective measures with SVI datasets has been less discussed. To address this, we integrated crowdsourcing, CV, and machine learning (ML) to subjectively measure four important perceptions suggested by classical urban design theory. We first collected experts’ ratings of sample SVIs on the four qualities, which became the training labels. CV segmentation was applied to the SVI samples to extract streetscape view indices as the explanatory variables. We then trained ML models and achieved high accuracy in predicting the scores. We found a strong correlation between the predicted complexity score and the density of urban amenity and service points of interest (POIs), which validates the effectiveness of subjective measures. In addition, to test the generalizability of the proposed framework as well as to inform urban renewal strategies, we compared the measured qualities in Pudong to those of five other renowned urban cores worldwide. Rather than predicting perceptual scores directly from generic image features using a convolutional neural network, our approach follows what urban design theory suggests and confirms that various streetscape features affect multi-dimensional human perceptions. Therefore, its results provide more interpretable and actionable implications for policymakers and city planners.
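A view index, as used above, is commonly understood as the share of image pixels that a segmentation model assigns to a given streetscape class. The sketch below computes such shares from a label map. The class IDs, class names, and the tiny label map are hypothetical, and the abstract does not specify the authors' exact definition.

```python
from collections import Counter


def view_indices(label_map, class_names):
    """Fraction of pixels per class in a segmentation label map.

    label_map: 2-D list of integer class IDs, one per pixel.
    class_names: mapping from class ID to a readable name.
    Returns {name: fraction of all pixels with that class ID}.
    """
    counts = Counter(label for row in label_map for label in row)
    total = sum(counts.values())
    return {
        name: counts.get(class_id, 0) / total
        for class_id, name in class_names.items()
    }


# Hypothetical 2x4 segmentation output for one street view image.
label_map = [
    [0, 0, 1, 2],
    [0, 1, 1, 2],
]
class_names = {0: "sky", 1: "tree", 2: "building"}
indices = view_indices(label_map, class_names)
```

In a pipeline like the one described, one such dictionary per image would form the explanatory variables, with the experts' ratings as the regression targets.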


Author(s):  
William Gray Roncal ◽  
Michael Pekala ◽  
Verena Kaynig-Fittkau ◽  
Dean M Kleissas ◽  
Joshua T Vogelstein ◽  
...  
