A computer vision pipeline for automatic large-scale inventory tracking

Self-supervised learning has gained popularity because it avoids the cost of annotating large-scale datasets. It can adopt self-defined pseudolabels as supervision and use the learned representations for several downstream tasks. In particular, contrastive learning has recently become a dominant component of self-supervised learning for computer vision, natural language processing (NLP), and other domains. It aims to embed augmented versions of the same sample close to each other while pushing away embeddings from different samples. This paper provides an extensive review of self-supervised methods that follow the contrastive approach. The work explains commonly used pretext tasks in a contrastive learning setup, followed by the different architectures that have been proposed so far. Next, we present a performance comparison of different methods for multiple downstream tasks such as image classification, object detection, and action recognition. Finally, we conclude with the limitations of the current methods and the need for further techniques and future directions to make meaningful progress.
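
The objective described in this abstract, pulling augmented views of the same sample together and pushing views of other samples away, is often written as an InfoNCE-style loss. The following is a minimal sketch in plain Python under stated assumptions, not the formulation of any particular method covered by the review: the function names, the temperature value, and the toy vectors are all illustrative.

```python
import math


def l2_normalize(vector):
    """Scale a non-zero vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def info_nce_loss(anchors, positives, temperature=0.1):
    """Mean InfoNCE-style loss over a batch.

    anchors[i] and positives[i] are assumed to be embeddings of two augmented
    views of sample i; every positives[j] with j != i acts as a negative.
    The temperature of 0.1 is an illustrative default, not a recommended value.
    """
    a = [l2_normalize(v) for v in anchors]
    p = [l2_normalize(v) for v in positives]
    total = 0.0
    for i, anchor in enumerate(a):
        # Temperature-scaled cosine similarity of anchor i with every candidate view.
        logits = [sum(x * y for x, y in zip(anchor, cand)) / temperature for cand in p]
        # Log-sum-exp with the maximum subtracted for numerical stability.
        top = max(logits)
        log_denominator = top + math.log(sum(math.exp(l - top) for l in logits))
        # Negative log-probability of picking the matching view among all candidates.
        total += log_denominator - logits[i]
    return total / len(a)


# Toy embeddings (hypothetical values): each anchor is paired with a nearby view.
anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned_views = [[1.0, 0.1], [0.1, 1.0]]
swapped_views = [[0.1, 1.0], [1.0, 0.1]]

aligned_loss = info_nce_loss(anchors, aligned_views)
swapped_loss = info_nce_loss(anchors, swapped_views)
```

In this toy setup the loss is lower when each anchor is paired with its own view (`aligned_views`) than when the pairs are swapped, which is the behaviour the contrastive objective rewards.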

Towards large-scale evaluation of mental stress and biomechanical strain in manufacturing environments using 3D-referenced gaze and wearable-based analytics

Electronic Imaging ◽ 10.2352/issn.2470-1173.2021.6.iriacv-310 ◽ 2021

Keyword(s): Computer Vision ◽ Mental Stress ◽ Large Scale ◽ Fast Track ◽ Industrial Applications ◽ Scale Evaluation ◽ Electronic Imaging ◽ Manufacturing Environments ◽ Intelligent Robotics ◽ Biomechanical Strain

Fast track article for IS&T International Symposium on Electronic Imaging 2021: Intelligent Robotics and Industrial Applications using Computer Vision 2021 proceedings.

Applying semi-synchronised task farming to large-scale computer vision problems

The International Journal of High Performance Computing Applications ◽ 10.1177/1094342014532965 ◽ 2014 ◽ Vol 29 (4) ◽ pp. 437-460

Author(s): Steven McDonagh ◽ Cigdem Beyan ◽ Phoenix X Huang ◽ Robert B Fisher

Keyword(s): Computer Vision ◽ Large Scale

What Makes Politicians’ Instagram Posts Popular? Analyzing Social Media Strategies of Candidates and Office Holders with Computer Vision

The International Journal of Press/Politics ◽ 10.1177/1940161220964769 ◽ 2020 ◽ Vol 26 (1) ◽ pp. 143-166

Author(s): Yilang Peng

Keyword(s): Computer Vision ◽ Social Media ◽ Political Communication ◽ Large Scale ◽ Visual Media ◽ Audience Engagement ◽ Self Presentation ◽ Media Platform ◽ Unsupervised Approach ◽ Textual Content

Previous research on the success of politicians’ messages on social media has so far focused on a limited number of platforms, especially Facebook and Twitter, and predominantly studied the effects of textual content. The research reported here applies computer vision analysis to a total of 59,020 image posts published by 172 Instagram accounts of U.S. politicians, both candidates and office holders, and examines how visual attributes influence audience engagement such as likes and comments. In particular, this study introduces an unsupervised approach that combines transfer learning and clustering techniques to discover hidden categories from large-scale visual data. The results reveal that different self-personalization strategies in visual media, for example, images featuring politicians in private, nonpolitical settings, showing faces, and displaying emotions, generally increase audience engagement. Yet, a significant portion of politicians’ Instagram posts still fell into the traditional, “politics-as-usual” type of political communication, showing professional settings and activities. The analysis explains how self-personalization is embodied in specific visual portrayals and how different self-presentation strategies affect audience engagement on a popular but less studied social media platform.

CloudCV: Large-Scale Distributed Computer Vision as a Cloud Service

Mobile Cloud Visual Media Computing ◽ 10.1007/978-3-319-24702-1_11 ◽ 2015 ◽ pp. 265-290 ◽ Cited By ~ 15

Author(s): Harsh Agrawal ◽ Clint Solomon Mathialagan ◽ Yash Goyal ◽ Neelima Chavali ◽ Prakriti Banik ◽ ...

Keyword(s): Computer Vision ◽ Large Scale ◽ Cloud Service

Salient Object Detection Techniques in Computer Vision—A Survey

Entropy ◽ 10.3390/e22101174 ◽ 2020 ◽ Vol 22 (10) ◽ pp. 1174

Author(s): Ashish Kumar Gupta ◽ Ayan Seal ◽ Mukesh Prasad ◽ Pritee Khanna

Keyword(s): Computer Vision ◽ Deep Learning ◽ Object Detection ◽ Large Scale ◽ Research Work ◽ Salient Object Detection ◽ Future Research ◽ Automatic Identification ◽ Salient Object ◽ Detection Techniques

Detection and localization of image regions that attract immediate human visual attention is currently an intensive area of research in computer vision. The capability of automatic identification and segmentation of such salient image regions has immediate consequences for applications in the fields of computer vision, computer graphics, and multimedia. A large number of salient object detection (SOD) methods have been devised to effectively mimic the capability of the human visual system to detect the salient regions in images. These methods can be broadly divided into two categories based on their feature engineering mechanism: conventional or deep learning-based. In this survey, most of the influential advances in image-based SOD from both the conventional and the deep learning-based categories have been reviewed in detail. Relevant saliency modeling trends with key issues, core techniques, and the scope for future research work have been discussed in the context of difficulties often faced in salient object detection. Results are presented for various challenging cases for some large-scale public datasets. Different metrics considered for assessing the performance of state-of-the-art salient object detection models are also covered. Some future directions for SOD are presented towards the end.

Analysis Of Low-Level Computer Vision Algorithms For Implementation On A Very Large Scale Integrated (VLSI) Processor Array

10.1117/12.934096 ◽ 1983 ◽ Cited By ~ 2

Author(s): Michael R. Lowry ◽ Allan Miller

Keyword(s): Computer Vision ◽ Large Scale ◽ Processor Array ◽ Low Level

Working with Uncertainties: An Adaptive Fabrication Workflow for Bamboo Structures

Proceedings of the 2020 DigitalFUTURES ◽ 10.1007/978-981-33-4400-6_25 ◽ 2021 ◽ pp. 265-279

Author(s): Yue Qi ◽ Ruqing Zhong ◽ Benjamin Kaiser ◽ Long Nguyen ◽ Hans Jakob Wagner ◽ ...

Keyword(s): Computer Vision ◽ Large Scale ◽ Full Scale ◽ Minor Importance ◽ Sustainable Building ◽ Large Scale Structures ◽ Sensory Data ◽ Scale Structures ◽ Performance Domains ◽ Stiff Joint

This paper presents and investigates a cyber-physical fabrication workflow that uses vision augmentation to respond in real time to deviations between the built and the designed form. We apply this method to large-scale structures built from natural bamboo poles. Raw bamboo poles have evolutionarily optimized fibrous layouts that are ideally suited to lightweight and sustainable building construction. Nevertheless, their intrinsically imprecise geometries pose a challenge for reliable, automated construction processes. Despite recent digital advancements, building with bamboo poles is still a labor-intensive task and is restricted to building typologies where accuracy is of minor importance. The integration of structural bamboo poles with other building layers is often limited by tolerance issues at the interfaces, especially for large-scale structures where deviations accumulate incrementally. To address these challenges, an adaptive fabrication process is developed in which existing deviations are compensated by changing the geometry of subsequent joints to iteratively correct the pose of further elements. A vision-based sensing system is employed to scan the bamboo elements in three dimensions before and during construction. Computer vision algorithms are used to process and interpret the sensory data. The updated conditions are streamed to the computational model, which computes tailor-made, bending-stiff joint geometries that can then be fabricated directly on the fly. In this paper, we contextualize our research and investigate the performance domains of the proposed workflow through initial fabrication tests. Several application scenarios are further proposed for full-scale vision-augmented bamboo construction systems.

Subjectively Measured Streetscape Qualities for Shanghai with Large-Scale Application of Computer Vision and Machine Learning

10.1007/978-981-16-5983-6_23 ◽ 2021 ◽ pp. 242-251

Author(s): Waishan Qiu ◽ Wenjing Li ◽ Xun Liu ◽ Xiaokai Huang

Keyword(s): Machine Learning ◽ Computer Vision ◽ Urban Design ◽ Design Theory ◽ Large Scale ◽ Image Features ◽ Subjective Measures ◽ Visual Elements ◽ Urban Amenities ◽ Human Perceptions

Recently, many studies have applied computer vision (CV) to street view imagery (SVI) datasets to objectively extract the view indices of various streetscape features, such as trees, as proxies for urban scene qualities. However, human perceptions (e.g., imageability) have a subtle relationship to visual elements that cannot be fully captured using view indices. Conversely, subjective measures using survey and interview data explain more human behaviors. However, the effectiveness of integrating subjective measures with SVI datasets has been less discussed. To address this, we integrated crowdsourcing, CV, and machine learning (ML) to subjectively measure four important perceptions suggested by classical urban design theory. We first collected experts’ ratings of sample SVIs on the four qualities, which became the training labels. CV segmentation was applied to the SVI samples to extract streetscape view indices as the explanatory variables. We then trained ML models and achieved high accuracy in predicting the scores. We found a strong correlation between the predicted complexity score and the density of urban amenities and services points of interest (POIs), which validates the effectiveness of subjective measures. In addition, to test the generalizability of the proposed framework and to inform urban renewal strategies, we compared the measured qualities in Pudong with those of five other renowned urban cores worldwide. Rather than predicting perceptual scores directly from generic image features using convolutional neural networks, our approach follows what urban design theory suggests and confirms that various streetscape features affect multi-dimensional human perceptions. Its results therefore provide more interpretable and actionable implications for policymakers and city planners.
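
The abstract does not define how a view index is computed. A common reading, assumed here, is the share of image pixels that a segmentation model assigns to each streetscape class. The sketch below illustrates that assumption in plain Python; the function name `view_indices`, the class ids, and the tiny mask are hypothetical and are not taken from the paper.

```python
from collections import Counter


def view_indices(mask, class_names):
    """Return the fraction of pixels per class in a 2D segmentation mask.

    mask is a list of rows of integer class ids (assumed to come from some
    segmentation model); class_names maps class id -> readable name.
    """
    counts = Counter(label for row in mask for label in row)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("mask contains no pixels")
    # Classes absent from the mask get an index of 0.0.
    return {name: counts.get(class_id, 0) / total
            for class_id, name in class_names.items()}


# Hypothetical 2x3 mask: 0 = sky, 1 = tree, 2 = road.
example_mask = [
    [0, 0, 1],
    [2, 1, 1],
]
example_classes = {0: "sky", 1: "tree", 2: "road"}
indices = view_indices(example_mask, example_classes)
```

Under this assumption, the per-image dictionary of indices would play the role of the explanatory variables that the abstract says are passed to the ML models.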

VESICLE: Volumetric Evaluation of Synaptic Interfaces using Computer Vision at Large Scale

Proceedings of the British Machine Vision Conference 2015 ◽ 10.5244/c.29.81 ◽ 2015 ◽ Cited By ~ 7

Author(s): William Gray Roncal ◽ Michael Pekala ◽ Verena Kaynig-Fittkau ◽ Dean M Kleissas ◽ Joshua T Vogelstein ◽ ...

Keyword(s): Computer Vision ◽ Large Scale
