Journal of Computational Vision and Imaging Systems

InnovFaceNet: Deep Face Recognition for Industrial Environments

Journal of Computational Vision and Imaging Systems ◽

10.15353/jcvis.v6i1.3553 ◽

2021 ◽

Vol 6 (1) ◽

pp. 1-4

Author(s):

Nagarjun Gururaj ◽

Kanika Batra

Keyword(s):

Face Recognition ◽

Criminal Justice ◽

Real Time ◽

Intelligent Systems ◽

Facial Recognition ◽

Industrial Plant ◽

The World ◽

Industrial Environments ◽

Or Organization ◽

Security Surveillance

In recent times, the use of intelligent systems has paved the way for many applications to become robust and self-reliant. One such popular and fast-growing technology is face recognition. Facial recognition technology is used in security, surveillance, criminal justice systems and many other multimedia platforms. This work proposes a real-time facial recognition technology that can be used in any industrial setup, eliminating manual supervision and ensuring that only authorized personnel gain access to the plant. Due to the COVID-19 pandemic around the world, wearing masks has become a necessity. Our proposed facial recognition technology identifies a person’s face, with or without a mask, in real time at a speed of 20 FPS on a CPU and with an F1-score of 95.07%. This makes our algorithm fast, secure, robust and deployable on a simple personal computer or any edge device at any industrial plant or organization.

Image Scale Estimation Using Surface Textures for Quantitative Visual Inspection

Journal of Computational Vision and Imaging Systems ◽

10.15353/jcvis.v6i1.3541 ◽

2021 ◽

Vol 6 (1) ◽

pp. 1-3

Author(s):

Juan Park ◽

Chul Min Yeum ◽

Trevor Hrynyk

Keyword(s):

Neural Network ◽

Regression Model ◽

Convolutional Neural Network ◽

Surface Texture ◽

Visual Inspection ◽

Estimation Error ◽

Estimation Technique ◽

Surface Textures ◽

Scale Estimation ◽

Using Data

In this study, a learning-based scale estimation technique is proposed to enable quantitative evaluation of inspection regions. The underlying idea is that the surface texture of structures (e.g., bridges or buildings) captured in images contains the scale information of the corresponding images, which is expressed in pixels per unit of physical dimension (e.g., mm, inch). This allows training a regression model that relates surface textures in images to their corresponding scales. A deep convolutional neural network is used to extract scale-related features from the texture patches and estimate their scales. The trained model can then be used to estimate scales for all images captured from structure surfaces that have similar textures. The capability of the proposed technique is demonstrated using data collected from surface textures of three different structures, achieving an overall average scale estimation error of less than 15%.
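To illustrate how an estimated scale might be used and scored, the following minimal sketch converts a pixel measurement into millimetres and computes an average scale estimation error. The function names and all numbers are hypothetical, and the error definition (mean absolute relative error) is an assumption; the abstract does not state the exact metric used by the authors.

```python
def pixels_to_mm(length_px: float, scale_px_per_mm: float) -> float:
    """Convert a length measured in pixels to millimetres using an image scale."""
    if scale_px_per_mm <= 0:
        raise ValueError("scale must be positive")
    return length_px / scale_px_per_mm


def mean_relative_error(estimated, reference) -> float:
    """Mean absolute relative error between estimated and reference scales."""
    if len(estimated) != len(reference) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(e - r) / r for e, r in zip(estimated, reference)) / len(reference)


# Hypothetical values for illustration only (not results from the paper).
reference_scales = [10.0, 20.0, 40.0]  # ground-truth scales in px/mm
estimated_scales = [11.0, 18.0, 42.0]  # scales predicted by some model, px/mm

# A feature spanning 55 px in the first image, converted with the estimated scale.
feature_width_mm = pixels_to_mm(55.0, estimated_scales[0])
average_error = mean_relative_error(estimated_scales, reference_scales)

print(f"feature width: {feature_width_mm:.2f} mm")
print(f"average scale estimation error: {100 * average_error:.1f}%")
```

Because the physical size is the pixel length divided by the scale, a relative error in the estimated scale carries over to every measurement taken from that image.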

Improved Deep Convolutional Neural Network with Age Augmentation for Facial Emotion Recognition in Social Companion Robotics

Journal of Computational Vision and Imaging Systems ◽

10.15353/jcvis.v6i1.3549 ◽

2021 ◽

Vol 6 (1) ◽

pp. 1-5

Author(s):

Steven Lawrence ◽

Taif Anjum ◽

Amir Shabani

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

Emotion Recognition ◽

Affective Computing ◽

Age Groups ◽

Deep Convolutional Neural Network ◽

Facial Emotion Recognition ◽

Facial Emotion ◽

Term Care ◽

Social Companion

Facial emotion recognition (FER) is a critical component of affective computing in social companion robotics. Current FER datasets are not sufficiently age-diversified: they consist predominantly of adults and exclude seniors above fifty years of age, who are the target group in long-term care facilities. Data collection from this age group is more challenging due to privacy concerns and to restrictions under pandemic situations such as COVID-19. We address this issue by using age augmentation, which could also act as a regularizer and reduce overfitting of the classifier. Our comprehensive experiments show that augmenting a typical Deep Convolutional Neural Network (CNN) architecture with facial age augmentation improves both the accuracy and the standard deviation of the classifier when predicting emotions of diverse age groups, including seniors. The proposed framework is a promising step towards improving a participant’s experience and interactions with social companion robots with affective computing.

Time-Series Causality with Missing Data

Journal of Computational Vision and Imaging Systems ◽

10.15353/jcvis.v6i1.3552 ◽

2021 ◽

Vol 6 (1) ◽

pp. 1-4

Author(s):

Bo Yuan Chang ◽

Mohamed A. Naiel ◽

Steven Wardell ◽

Stan Kleinikkink ◽

John S. Zelek

Keyword(s):

Time Series ◽

Missing Data ◽

Missing Values ◽

Time Series Data ◽

Multivariate Time Series ◽

Gaussian Process Regression ◽

Series Data ◽

Causal Relationships ◽

Sampled Data ◽

The Past

Over the past years, researchers have proposed various methods to discover causal relationships among time-series data, as well as algorithms to fill in missing entries in time-series data. Little to no work has been done on combining the two strategies for the purpose of learning causal relationships from unevenly sampled multivariate time-series data. In this paper, we examine how the causal parameters learnt from unevenly sampled data (with missing entries) deviate from the parameters learnt from evenly sampled data (without missing entries). However, obtaining the causal relationships from a given time series requires evenly sampled data, which suggests filling in the missing values before obtaining the causal parameters. Therefore, the proposed method applies a Gaussian Process Regression (GPR) model for missing data recovery, followed by several pairwise Granger causality equations in Vector Autoregressive form that are fitted to the recovered data to obtain the causal parameters. Experimental results show that the causal parameters generated using GPR data filling offer a much lower RMSE than those of the dummy model (fill with the last seen entry) at all percentages of missing values. This suggests that GPR data filling better preserves the causal relationships than dummy data filling, and thus should be considered when learning causality from unevenly sampled time series.
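The pipeline described above (fill the gaps, then fit causal parameters) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the GPR uses a fixed RBF kernel with hand-picked hyperparameters, the causal model is a single lag-1 pairwise equation, only one series has missing entries, and all function names and simulated data are hypothetical.

```python
import numpy as np


def forward_fill(y):
    """Dummy model: replace each NaN with the last seen entry."""
    out = y.copy()
    for i in range(1, len(out)):
        if np.isnan(out[i]):
            out[i] = out[i - 1]
    return out


def gpr_fill(t, y, length_scale=5.0, noise=1e-2):
    """Replace NaNs with the posterior mean of a GP (RBF kernel, assumed hyperparameters)."""
    obs = ~np.isnan(y)
    mean = y[obs].mean()

    def rbf(a, b):
        d = a[:, None] - b[None, :]
        return np.exp(-0.5 * (d / length_scale) ** 2)

    K = rbf(t[obs], t[obs]) + noise * np.eye(obs.sum())
    alpha = np.linalg.solve(K, y[obs] - mean)
    out = y.copy()
    out[~obs] = rbf(t[~obs], t[obs]) @ alpha + mean
    return out


def var1_coefficients(x, y):
    """Least-squares fit of y[k] = a*y[k-1] + b*x[k-1]; b is the x -> y causal parameter."""
    A = np.column_stack([y[:-1], x[:-1]])
    coef, *_ = np.linalg.lstsq(A, y[1:], rcond=None)
    return coef


# Simulated data (illustration only): x drives y with true parameters a=0.5, b=0.8.
rng = np.random.default_rng(0)
n = 200
t = np.arange(n, dtype=float)
x = np.sin(2 * np.pi * t / 40) + 0.1 * rng.standard_normal(n)
y = np.zeros(n)
for k in range(1, n):
    y[k] = 0.5 * y[k - 1] + 0.8 * x[k - 1] + 0.05 * rng.standard_normal()

# Remove roughly 20% of x at random; keep the first sample so forward fill is defined.
missing = rng.random(n) < 0.2
missing[0] = False
x_obs = np.where(missing, np.nan, x)

coef_full = var1_coefficients(x, y)
coef_ffill = var1_coefficients(forward_fill(x_obs), y)
coef_gpr = var1_coefficients(gpr_fill(t, x_obs), y)

print("evenly sampled:", coef_full)
print("last-seen fill:", coef_ffill)
print("GPR fill:      ", coef_gpr)
```

Comparing `coef_ffill` and `coef_gpr` against `coef_full` over many random masks would give an RMSE of the kind the abstract reports; the single run here only shows the mechanics.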

Evaluation of Solving Methods for the Fundamental Matrix Computation

Journal of Computational Vision and Imaging Systems ◽

10.15353/jcvis.v6i1.3563 ◽

2021 ◽

Vol 6 (1) ◽

pp. 1-1

Author(s):

Katherine Arnold ◽

Mohamed A. Naiel ◽

Mark Lamm ◽

Paul Fieguth

Keyword(s):

3D Reconstruction ◽

Camera Calibration ◽

Fundamental Matrix ◽

Synthetic Data ◽

Ground Truth ◽

Matrix Computation ◽

Linear Solvers ◽

Non Linear ◽

Measurements Errors ◽

Image Calibration

Solving for the fundamental matrix is a key step in many image calibration and 3D reconstruction systems. The goal of this paper is to study the performance of nonlinear solvers for estimating the fundamental matrix in projector-camera calibration. To prevent measurement errors from distorting our understanding, synthetic data are created from ground-truth camera and projector parameters and then used to assess four nonlinear solving strategies.
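The idea of assessing a solver on synthetic data with known ground truth can be sketched as follows. All camera and projector parameters below are hypothetical, and the solver shown is the standard normalised eight-point algorithm, a linear baseline swapped in for illustration; it is not one of the four nonlinear strategies studied in the paper, which the abstract does not name.

```python
import numpy as np


def skew(t):
    """Cross-product matrix [t]_x, so that skew(t) @ v equals np.cross(t, v)."""
    return np.array([[0, -t[2], t[1]], [t[2], 0, -t[0]], [-t[1], t[0], 0]])


def project(K, R, t, X):
    """Project N x 3 points to homogeneous pixel coordinates (N x 3, last column 1)."""
    x = (K @ (R @ X.T + t[:, None])).T
    return x / x[:, 2:3]


def normalise(pts):
    """Hartley normalisation: zero centroid and mean distance sqrt(2)."""
    c = pts[:, :2].mean(axis=0)
    s = np.sqrt(2) / np.linalg.norm(pts[:, :2] - c, axis=1).mean()
    T = np.array([[s, 0, -s * c[0]], [0, s, -s * c[1]], [0, 0, 1]])
    return (T @ pts.T).T, T


def eight_point(x1, x2):
    """Linear estimate of F with x2^T F x1 = 0, followed by rank-2 enforcement."""
    n1, T1 = normalise(x1)
    n2, T2 = normalise(x2)
    A = np.stack([np.kron(b, a) for a, b in zip(n1, n2)])
    F = np.linalg.svd(A)[2][-1].reshape(3, 3)  # null vector of A, row-major
    U, S, Vt = np.linalg.svd(F)
    F = U @ np.diag([S[0], S[1], 0.0]) @ Vt    # force the smallest singular value to 0
    F = T2.T @ F @ T1                          # undo the normalisation
    return F / np.linalg.norm(F)


def epipolar_residuals(F, x1, x2):
    """Algebraic residual |x2^T F x1| for each correspondence."""
    return np.abs(np.einsum("ni,ij,nj->n", x2, F, x1))


# Hypothetical ground-truth parameters (camera at the origin, projector displaced).
K_cam = np.array([[800.0, 0, 320], [0, 800, 240], [0, 0, 1]])
K_proj = np.array([[1000.0, 0, 512], [0, 1000, 384], [0, 0, 1]])
angle = np.deg2rad(10)
R = np.array([[np.cos(angle), 0, np.sin(angle)],
              [0, 1, 0],
              [-np.sin(angle), 0, np.cos(angle)]])
t = np.array([-0.5, 0.05, 0.1])

# Noise-free synthetic correspondences from random 3D points in front of both devices.
rng = np.random.default_rng(0)
X = np.column_stack([rng.uniform(-1, 1, 50), rng.uniform(-1, 1, 50), rng.uniform(4, 8, 50)])
x_cam = project(K_cam, np.eye(3), np.zeros(3), X)
x_proj = project(K_proj, R, t, X)

# Ground-truth fundamental matrix, scaled to unit Frobenius norm.
F_gt = np.linalg.inv(K_proj).T @ skew(t) @ R @ np.linalg.inv(K_cam)
F_gt /= np.linalg.norm(F_gt)

F_est = eight_point(x_cam, x_proj)
# F is defined only up to scale and sign, so compare against both signs.
error = min(np.linalg.norm(F_est - F_gt), np.linalg.norm(F_est + F_gt))
print("max ground-truth residual:", epipolar_residuals(F_gt, x_cam, x_proj).max())
print("estimation error:", error)
```

Adding controlled noise to `x_cam` and `x_proj` before calling the solver would show how each strategy degrades, with the ground-truth matrix serving as the reference.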

Seeing the Forest from the Trees: A Novel Deep Learning-Driven Aggregate Embedding for Group-Level Analysis of Public Health Data

Journal of Computational Vision and Imaging Systems ◽

10.15353/jcvis.v6i1.3550 ◽

2021 ◽

Vol 6 (1) ◽

pp. 1-3

Author(s):

Alexander MacLean ◽

Yang Yang ◽

Helen Chen ◽

Alexander Wong

Keyword(s):

High School Students ◽

Deep Learning ◽

Decision Makers ◽

School Level ◽

Group Level ◽

School Students ◽

Policy Makers ◽

Specific Data ◽

Data Points ◽

Survey Responses

In the years since the COMPASS dataset initiative began, many important research questions have been investigated using its large amount of health information pertaining to high school students across Canada, with findings guiding many decisions made by policy makers [1]. However, to use traditional statistical methods, researchers must select specific data points to include in the analysis, so unexpected relationships and connections across the study's 280 data points may be missed. In addition, most analysis is done on a per-student basis, while policies are often implemented at the school level, so understanding behaviours across a school's population can make it easier for school decision makers to interpret findings. Motivated by these goals, this study introduces a novel deep learning-driven aggregate embedding method to determine group-level representations for individual schools from student-level survey responses, based on the architecture introduced in Variational Autoencoders [2]. This study aims to produce a method that allows new patterns to be identified in the COMPASS data and the resulting embedded representations to be applied in future analysis.

COVIDNet-CT: Detection of COVID-19 from Chest CT Images using a Tailored Deep Convolutional Neural Network Architecture

Journal of Computational Vision and Imaging Systems ◽

10.15353/jcvis.v6i1.3547 ◽

2021 ◽

Vol 6 (1) ◽

pp. 1-3

Author(s):

Hayden Gunraj ◽

Linda Wang ◽

Alexander Wong

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

Network Architecture ◽

Screening Method ◽

Ct Images ◽

Chest Ct ◽

Deep Convolutional Neural Network ◽

Screening Tools ◽

Neural Network Architecture ◽

Tremendous Impact

The COVID-19 pandemic continues to have a tremendous impact on patients and healthcare systems around the world. To combat this disease, there is a need for effective screening tools to identify patients infected with COVID-19, and to this end CT imaging has been proposed as a key screening method to complement RT-PCR testing. Early studies have reported abnormalities in chest CT images which are characteristic of COVID-19 infection, but these abnormalities may be difficult to distinguish from abnormalities caused by other lung conditions. Motivated by this, we introduce COVIDNet-CT, a deep convolutional neural network architecture tailored for detection of COVID-19 cases from chest CT images. We also introduce COVIDx-CT, a CT image dataset comprising 104,009 images across 1,489 patient cases. Finally, we leverage explainability to investigate the decision-making behaviour of COVIDNet-CT and ensure that COVIDNet-CT makes predictions based on relevant indicators in CT images.

Temporally Consistent Edge-Informed Video Super-Resolution (Edge-VSR)

Journal of Computational Vision and Imaging Systems ◽

10.15353/jcvis.v6i1.3555 ◽

2021 ◽

Vol 6 (1) ◽

pp. 1-6

Author(s):

Ayush Singh ◽

Mehran Ebrahimi

Keyword(s):

Video Sequence ◽

Structural Information ◽

Resolution Enhancement ◽

Super Resolution ◽

Single Image ◽

Resolution Method ◽

Edge Information ◽

Image Super Resolution ◽

Resolution Algorithm ◽

Single Image Super Resolution

Resolution enhancement of a given video sequence is known as video super-resolution. We propose an end-to-end trainable video super-resolution method as an extension of the recently developed edge-informed single image super-resolution algorithm. A two-stage adversarial-based convolutional neural network that incorporates temporal information along with the current frame's structural information will be used. The edge information in each frame, along with an optical flow technique for motion estimation among frames, will be applied. Promising results on validation datasets will be presented.

Challenges of Deep Learning-based Text Detection in the Wild

Journal of Computational Vision and Imaging Systems ◽

10.15353/jcvis.v6i1.3543 ◽

2021 ◽

Vol 6 (1) ◽

pp. 1-5

Author(s):

Zobeir Raisi ◽

Mohamed A. Naiel ◽

Paul Fieguth ◽

Steven Wardell ◽

John Zelek

Keyword(s):

Deep Learning ◽

State Of The Art ◽

Text Detection ◽

Detection Methods ◽

Learning Approaches ◽

Plane Rotation ◽

The Past ◽

Perspective Distortion ◽

Benchmark Datasets ◽

In The Wild

The reported accuracy of recent state-of-the-art text detection methods, mostly deep learning approaches, is on the order of 80% to 90% on standard benchmark datasets. These methods have relaxed some of the restrictions on text structure and environment (i.e., they operate "in the wild") that are usually required for classical OCR to function properly. Even with this relaxation, there are still circumstances where these state-of-the-art methods fail. Several remaining challenges in wild images, such as in-plane rotation, illumination reflection, partial occlusion, complex font styles, and perspective distortion, cause existing methods to perform poorly. In order to evaluate current approaches in a formal way, we standardize the datasets and metrics used for comparison, the lack of which has made comparison between these methods difficult in the past. We use three benchmark datasets for our evaluations: ICDAR13, ICDAR15, and COCO-Text V2.0. The objective of the paper is to quantify the current shortcomings and to identify the challenges for future text detection research.
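As an illustration of what a standardized detection metric can look like, the sketch below scores detections against ground truth using intersection over union (IoU) and reports precision, recall and F-measure. It is a simplified assumption rather than the paper's protocol: boxes are axis-aligned, matching is greedy and one-to-one, the 0.5 threshold is a commonly used convention, and the function names and boxes are hypothetical.

```python
def iou(a, b):
    """IoU of two axis-aligned boxes given as (x_min, y_min, x_max, y_max)."""
    inter_w = min(a[2], b[2]) - max(a[0], b[0])
    inter_h = min(a[3], b[3]) - max(a[1], b[1])
    if inter_w <= 0 or inter_h <= 0:
        return 0.0
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union


def detection_scores(detections, ground_truth, threshold=0.5):
    """Greedy one-to-one matching, then precision, recall and F-measure."""
    unmatched = list(ground_truth)
    true_positives = 0
    for det in detections:
        best = max(unmatched, key=lambda gt: iou(det, gt), default=None)
        if best is not None and iou(det, best) >= threshold:
            unmatched.remove(best)  # each ground-truth box can be matched only once
            true_positives += 1
    precision = true_positives / len(detections) if detections else 0.0
    recall = true_positives / len(ground_truth) if ground_truth else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


# Hypothetical boxes for illustration only.
ground_truth = [(0, 0, 10, 10), (20, 20, 30, 30)]
detections = [(1, 1, 10, 10), (50, 50, 60, 60), (20, 20, 30, 29)]

precision, recall, f_measure = detection_scores(detections, ground_truth)
print(f"precision={precision:.3f} recall={recall:.3f} F={f_measure:.3f}")
```

Fixing the threshold and matching rule across datasets is what makes scores comparable between methods; rotated or quadrilateral annotations would need a polygon-based IoU instead of the axis-aligned one used here.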

Where do Clinical Language Models Break Down? A Critical Behavioural Exploration of the ClinicalBERT Deep Transformer Model

Journal of Computational Vision and Imaging Systems ◽

10.15353/jcvis.v6i1.3548 ◽

2021 ◽

Vol 6 (1) ◽

pp. 1-4

Author(s):

Alexander MacLean ◽

Alexander Wong

Keyword(s):

Natural Language ◽

Language Processing ◽

State Of The Art ◽

Language Model ◽

Language Models ◽

Clinical Knowledge ◽

Language Understanding ◽

Improved Performance ◽

Transformer Model ◽

Clinical Domain

The introduction of Bidirectional Encoder Representations from Transformers (BERT) was a major breakthrough for transfer learning in natural language processing, enabling state-of-the-art performance across a large variety of complex language understanding tasks. In the realm of clinical language modeling, the advent of BERT led to the creation of ClinicalBERT, a state-of-the-art deep transformer model pretrained on a wealth of patient clinical notes to facilitate downstream predictive tasks in the clinical domain. While ClinicalBERT has been widely leveraged by the research community as the foundation for building clinical domain-specific predictive models, given its overall improved performance in the Medical Natural Language Inference (MedNLI) challenge compared to the seminal BERT model, the fine-grained behaviour and intricacies of this popular clinical language model have not been well studied. Without this deeper understanding, it is very challenging to understand where ClinicalBERT does well given its additional exposure to clinical knowledge, where it doesn't, and where it can be improved in a meaningful manner. Motivated to garner a deeper understanding, this study presents a critical behavioural exploration of the ClinicalBERT deep transformer model using the MedNLI challenge dataset to better understand the following intricacies: 1) decision-making similarities between ClinicalBERT and BERT (leveraging a new metric we introduce called Model Alignment), 2) where ClinicalBERT holds advantages over BERT given its clinical knowledge exposure, and 3) where ClinicalBERT struggles when compared to BERT. The insights gained about the behaviour of ClinicalBERT will help guide new directions for designing and training clinical language models in a way that not only addresses the remaining gaps and facilitates further improvements in clinical language understanding performance, but also highlights the limitations and boundaries of use for such models.

Published By University Of Waterloo
