Machine Perception
Recently Published Documents

TOTAL DOCUMENTS: 139 (five years: 41)
H-INDEX: 9 (five years: 2)

Author(s):  
Chandan Kumar

Abstract: Computer vision is the field concerned with how images and videos are stored and manipulated, and with retrieving information from them; it is a branch of Artificial Intelligence. Computer vision plays a major role in autonomous cars, object detection, robotics, object tracking, and related applications. OpenCV (Open Source Computer Vision Library) is an open-source computer vision and machine learning software library. OpenCV was built to provide a common infrastructure for computer vision applications and to accelerate the use of machine perception in commercial products. It comes with a substantially improved deep learning (dnn) module, which supports a number of deep learning frameworks, including Caffe, TensorFlow, and Torch/PyTorch. This allows us to take models trained with dedicated deep learning libraries and tools and run them efficiently directly inside our OpenCV scripts. MediaPipe is a framework mainly used for building pipelines that process audio, video, or other time-series data. With the help of the MediaPipe framework, we can build very effective pipelines for media processing functions such as multi-hand tracking, face detection, and object detection and tracking.
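As a minimal illustration of the workflow this abstract describes, the sketch below loads a Caffe-trained network through OpenCV's dnn module and runs a MediaPipe hand-tracking pipeline. The model and image file names are placeholders, not files shipped with either library.

```python
# Sketch: run an externally trained model inside OpenCV, then a MediaPipe pipeline.
# "deploy.prototxt", "model.caffemodel", and "input.jpg" are placeholder file names.
import cv2
import mediapipe as mp

# Load a network trained in an external framework (here: Caffe).
net = cv2.dnn.readNetFromCaffe("deploy.prototxt", "model.caffemodel")

# Read an image and convert it to the 4-D blob the network expects.
image = cv2.imread("input.jpg")
blob = cv2.dnn.blobFromImage(image, scalefactor=1.0, size=(300, 300),
                             mean=(104.0, 177.0, 123.0))

# Run inference directly inside the OpenCV script.
net.setInput(blob)
detections = net.forward()
print(detections.shape)

# MediaPipe multi-hand tracking on the same image (expects RGB input).
hands = mp.solutions.hands.Hands(static_image_mode=True, max_num_hands=2)
results = hands.process(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
print(results.multi_hand_landmarks)
```

Models from the other supported frameworks are loaded the same way, e.g. with cv2.dnn.readNetFromTensorflow or cv2.dnn.readNetFromTorch.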


2022 ◽  
Vol 18 (1) ◽  
pp. e1009739
Author(s):  
Nathan C. L. Kong ◽  
Eshed Margalit ◽  
Justin L. Gardner ◽  
Anthony M. Norcia

Task-optimized convolutional neural networks (CNNs) show striking similarities to the ventral visual stream. However, human-imperceptible image perturbations can cause a CNN to make incorrect predictions. Here we provide insight into this brittleness by investigating the representations of models that are either robust or not robust to image perturbations. Theory suggests that the robustness of a system to these perturbations could be related to the power law exponent of the eigenspectrum of its set of neural responses, where power law exponents closer to and larger than one would indicate a system that is less susceptible to input perturbations. We show that neural responses in mouse and macaque primary visual cortex (V1) obey the predictions of this theory, where their eigenspectra have power law exponents of at least one. We also find that the eigenspectra of model representations decay slowly relative to those observed in neurophysiology and that robust models have eigenspectra that decay slightly faster and have higher power law exponents than those of non-robust models. The slow decay of the eigenspectra suggests that substantial variance in the model responses is related to the encoding of fine stimulus features. We therefore investigated the spatial frequency tuning of artificial neurons and found that a large proportion of them preferred high spatial frequencies and that robust models had preferred spatial frequency distributions more aligned with the measured spatial frequency distribution of macaque V1 cells. Furthermore, robust models were quantitatively better models of V1 than non-robust models. Our results are consistent with other findings that there is a misalignment between human and machine perception. They also suggest that it may be useful to penalize slow-decaying eigenspectra or to bias models to extract features of lower spatial frequencies during task-optimization in order to improve robustness and V1 neural response predictivity.
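The abstract above turns on estimating the power-law exponent of a representation's eigenspectrum. The authors' analysis code is not reproduced here; a minimal sketch of one common way to estimate such an exponent, assuming a stimulus-by-unit response matrix, is to take the eigenvalues of the response covariance and fit a line to the log-log spectrum:

```python
# Sketch: estimate the power-law exponent of a representation's eigenspectrum.
# This is an illustrative approximation, not the authors' analysis pipeline.
import numpy as np

def eigenspectrum_exponent(responses: np.ndarray) -> float:
    """responses: (n_stimuli, n_units) matrix of model or neural responses."""
    centered = responses - responses.mean(axis=0, keepdims=True)
    cov = centered.T @ centered / (centered.shape[0] - 1)
    eigvals = np.linalg.eigvalsh(cov)[::-1]          # variance spectrum, descending
    eigvals = eigvals[eigvals > 0]
    ranks = np.arange(1, eigvals.size + 1)
    # Fit log(eigenvalue) = -alpha * log(rank) + c; alpha is the power-law exponent.
    slope, _ = np.polyfit(np.log(ranks), np.log(eigvals), 1)
    return -slope

rng = np.random.default_rng(0)
print(eigenspectrum_exponent(rng.standard_normal((500, 128))))
```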


Symmetry ◽  
2021 ◽  
Vol 13 (10) ◽  
pp. 1850
Author(s):  
Krishnan Balasubramanian

Symmetry forms the foundation of combinatorial theories and algorithms of enumeration such as Möbius inversion, Euler totient functions, and Pólya's celebrated theory of enumeration under the symmetric group action. As machine learning and artificial intelligence techniques play increasingly important roles in tasks central to many disciplines, from the machine perception of music to image processing, combinatorics, graph theory, and symmetry act as powerful bridges to the development of algorithms for such varied applications. In this review, we bring together the confluence of music theory and spectroscopy as two primary disciplines to outline several interconnections of combinatorial and symmetry techniques in the development of algorithms for the machine generation of musical patterns of the East and West and a variety of spectroscopic signatures of molecules. Combinatorial techniques in conjunction with group theory can be harnessed to generate musical scales, intensity patterns in ESR spectra, multiple quantum NMR spectra, nuclear spin statistics of both fermions and bosons, colorings of the hyperplanes of hypercubes, enumeration of chiral isomers, and vibrational modes of complex systems including supergiant fullerenes, as exemplified by our work on the golden fullerene C150,000. Combinatorial techniques are shown to yield algorithms for the enumeration and construction of musical chords and scales, called ragas in music theory, as we exemplify with the machine construction of ragas and the machine perception of musical patterns. We also outline the applications of Hadamard matrices and magic squares in the development of algorithms for the generation of balanced-pitch chords. The machine perception of musical, spectroscopic, and symmetry patterns is considered.
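As a concrete instance of the combinatorial machinery the review describes (our illustration, not the author's code), Burnside's lemma together with Euler's totient function counts the k-note scales of an n-tone system up to transposition, i.e. binary necklaces:

```python
# Sketch: count k-note scales in an n-tone system up to transposition
# (binary necklaces) via Burnside's lemma and Euler's totient function.
# Illustrative of the combinatorial tools discussed, not the paper's code.
from math import comb, gcd

def totient(m: int) -> int:
    """Euler's totient: the number of integers in 1..m coprime to m."""
    return sum(1 for i in range(1, m + 1) if gcd(i, m) == 1)

def scales_up_to_transposition(n: int, k: int) -> int:
    """Number of k-element subsets of Z_n up to rotation (transposition)."""
    g = gcd(n, k)
    total = sum(totient(d) * comb(n // d, k // d)
                for d in range(1, g + 1) if g % d == 0)
    return total // n

# 66 distinct seven-note scales exist in 12-tone equal temperament.
print(scales_up_to_transposition(12, 7))
```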


2021 ◽  
Vol 3 ◽  
Author(s):  
Usman Mahmood ◽  
Robik Shrestha ◽  
David D. B. Bates ◽  
Lorenzo Mannelli ◽  
Giuseppe Corrias ◽  
...  

Artificial intelligence (AI) has been successful at solving numerous problems in machine perception. In radiology, AI systems are rapidly evolving and show progress in guiding treatment decisions, diagnosing and localizing disease on medical images, and improving radiologists' efficiency. A critical component to deploying AI in radiology is to gain confidence in a developed system's efficacy and safety. The current gold standard approach is to conduct an analytical validation of performance on a generalization dataset from one or more institutions, followed by a clinical validation study of the system's efficacy during deployment. Clinical validation studies are time-consuming, and best practices dictate limited re-use of analytical validation data, so it is ideal to know ahead of time if a system is likely to fail analytical or clinical validation. In this paper, we describe a series of sanity tests to identify when a system performs well on development data for the wrong reasons. We illustrate the sanity tests' value by designing a deep learning system to classify pancreatic cancer seen in computed tomography scans.
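The paper's specific sanity tests are not reproduced here, but a hypothetical example of the kind of check it describes is a region-occlusion test: if a classifier still performs well after the anatomy it is supposed to rely on has been masked out, it is probably exploiting confounds. All names and the model.predict interface below are assumptions for illustration.

```python
# Hypothetical sanity test: mask the anatomical region of interest and
# re-evaluate. High accuracy on masked scans suggests the model relies on
# confounds rather than the region itself. Names below are placeholders.
import numpy as np

def occlusion_sanity_test(model, scans, labels, roi_masks, threshold=0.6):
    """model.predict maps a batch of scans to class probabilities (assumed API)."""
    occluded = scans * (1 - roi_masks)              # zero out the region of interest
    preds = np.argmax(model.predict(occluded), axis=1)
    acc = float((preds == labels).mean())
    # Accuracy far above chance without the ROI is a red flag.
    return {"occluded_accuracy": acc, "suspicious": acc > threshold}
```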


2021 ◽  
Author(s):  
Yigong Hu ◽  
Shengzhong Liu ◽  
Tarek Abdelzaher ◽  
Maggie Wigness ◽  
Philip David

2021 ◽  
pp. 239-259
Author(s):  
Alaa Alahmadi ◽  
Alan Davies ◽  
Markel Vigo ◽  
Katherine Dempsey ◽  
Caroline Jay

Electrocardiograms (ECGs), which capture the electrical activity of the human heart, are widely used in clinical practice, and notoriously difficult to interpret. Whilst there have been attempts to automate their interpretation for several decades, human reading of the data presented visually remains the ‘gold standard’. We demonstrate how a visualisation technique that significantly improves human interpretation of ECG data can be used as a basis for an automated interpretation algorithm that is more accurate than current signal processing techniques, and has the benefit of the human and machine sharing the same representation of the data. We discuss the potential of the approach, in terms of its accuracy and acceptability in clinical practice.


2021 ◽  
pp. 171-196
Author(s):  
José Hernández-Orallo ◽  
Cèsar Ferri

Machine intelligence differs significantly from human intelligence. While human perception has similarities to the way machine perception works, human learning is mostly a directed process, guided by other people: parents, teachers, and so on. The area of machine teaching is becoming increasingly popular as a different paradigm for making machines learn. In this chapter, we start from recent results in machine teaching that show the relevance of prior alignment between humans and machines. From there, we focus on the scenario in which a machine has to teach humans, a situation that will become more and more common in the future. Specifically, we analyse how machine teaching relates to explainable artificial intelligence, and how simplicity priors play a role beyond intelligibility. We illustrate this with a general teaching protocol and a few examples in several representation languages, including feature-value vectors and sequences. Some straightforward experiments with humans indicate when a strong simplicity prior is, and is not, sufficient.
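To make the teaching setting concrete (our illustration, not the chapter's protocol), consider teaching a threshold concept over the integers to a learner that always picks the smallest consistent threshold: two examples straddling the boundary are enough, however large the domain.

```python
# Illustrative machine-teaching sketch (not the chapter's protocol): teach a
# threshold concept to a learner that picks the smallest consistent threshold.
def learner(examples):
    """Return the smallest threshold t such that label(x) = (x >= t) fits all examples."""
    for t in range(0, max(x for x, _ in examples) + 2):
        if all((x >= t) == y for x, y in examples):
            return t
    return None

def teacher(true_t):
    # Minimal teaching set: one negative just below the boundary, one positive at it.
    return [(true_t - 1, False), (true_t, True)]

true_t = 7
print(learner(teacher(true_t)) == true_t)   # True: two examples identify the concept
```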


2021 ◽  
Vol 32 (4) ◽  
Author(s):  
Wahengbam Kanan Kumar ◽  
Ningthoujam Johny Singh ◽  
Aheibam Dinamani Singh ◽  
Kishorjit Nongmeikapam

2021 ◽  
Vol 13 (8) ◽  
pp. 1523
Author(s):  
Yang Shao ◽  
Austin J. Cooner ◽  
Stephen J. Walsh

High-spatial-resolution satellite imagery has been widely applied for detailed urban mapping. Recently, deep convolutional neural networks (DCNNs) have shown promise in certain remote sensing applications, but they are still relatively new techniques for general urban mapping. This study examines the use of two DCNNs (U-Net and VGG16) to provide an automatic schema to support high-resolution mapping of buildings, road/open built-up, and vegetation cover. Using WorldView-2 imagery as input, we first applied an established OBIA method to characterize major urban land cover classes. An OBIA-derived urban map was then divided into a training and testing region to evaluate the DCNNs’ performance. For U-Net mapping, we were particularly interested in how sample size or the number of image tiles affect mapping accuracy. U-Net generated cross-validation accuracies ranging from 40.5% to 95.2% for training sample sizes from 32 to 4096 image tiles (each tile was 256 by 256 pixels). A per-pixel accuracy assessment led to 87.8% overall accuracy for the testing region, suggesting U-Net’s good generalization capabilities. For the VGG16 mapping, we proposed an object-based framing paradigm that retains spatial information and assists machine perception through Gaussian blurring. Gaussian blurring was used as a pre-processing step to enhance the contrast between objects of interest and background (contextual) information. Combined with the pre-trained VGG16 and transfer learning, this analytical approach generated a 77.3% overall accuracy for per-object assessment. The mapping accuracy could be further improved given more robust segmentation algorithms and better quantity/quality of training samples. Our study shows significant promise for DCNN implementation for urban mapping and our approach can transfer to a number of other remote sensing applications.
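A hedged sketch of the object-based framing idea described above, assuming the object of interest is kept sharp while its surroundings are Gaussian-blurred and a pre-trained VGG16 is used as a fixed feature extractor; the paper's exact framing scheme, kernel size, and fine-tuning setup may differ.

```python
# Sketch: object-based framing with Gaussian blurring plus a pre-trained VGG16
# feature extractor. Parameter values and the framing scheme are assumptions.
import cv2
import numpy as np
from tensorflow.keras.applications import VGG16
from tensorflow.keras.applications.vgg16 import preprocess_input

def frame_object(tile_bgr, object_mask, ksize=15):
    """Blur the context around an object while keeping the object itself sharp."""
    blurred = cv2.GaussianBlur(tile_bgr, (ksize, ksize), 0)
    mask3 = np.repeat(object_mask[..., None], 3, axis=2).astype(bool)
    return np.where(mask3, tile_bgr, blurred)

# Pre-trained VGG16 as a fixed feature extractor (transfer learning).
backbone = VGG16(weights="imagenet", include_top=False, pooling="avg",
                 input_shape=(224, 224, 3))

def extract_features(tile_bgr, object_mask):
    framed = frame_object(tile_bgr, object_mask)
    resized = cv2.resize(framed, (224, 224))
    rgb = cv2.cvtColor(resized, cv2.COLOR_BGR2RGB).astype("float32")
    batch = preprocess_input(rgb[None, ...])
    return backbone.predict(batch)          # (1, 512) feature vector per object
```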

