Complementary computing for visual tasks: Meshing computer vision with human visual processing

Author(s):  
Ashish Kapoor ◽  
Desney Tan ◽  
Pradeep Shenoy ◽  
Eric Horvitz
Sensors ◽  
2021 ◽  
Vol 21 (11) ◽  
pp. 3691
Author(s):  
Ciprian Orhei ◽  
Silviu Vert ◽  
Muguras Mocofan ◽  
Radu Vasiu

Computer Vision is a cross-disciplinary research field whose main purpose is to understand the surrounding environment as closely as possible to human perception. Image processing systems are continuously growing and expanding into more complex systems, usually tailored to the specific needs or applications they serve. To better serve this purpose, research on the architecture and design of such systems is also important. We present the End-to-End Computer Vision Framework (EECVF), an open-source solution that aims to support researchers and teachers within the vast field of image processing. The framework incorporates Computer Vision features and Machine Learning models that researchers can use. Given the continuous need to add new Computer Vision algorithms in day-to-day research activity, our proposed framework has the advantage of a configurable and scalable architecture. Although the main focus of the framework is the Computer Vision processing pipeline, it also offers solutions for incorporating more complex activities, such as training Machine Learning models. EECVF aims to become a useful tool for learning activities in the Computer Vision field, as it allows the learner and the teacher to handle only the topics at hand, rather than the interconnections necessary for the visual processing flow.
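The configurable pipeline idea described in this abstract can be illustrated with a toy stage-based design. The stage functions and `run_pipeline` below are hypothetical names chosen for illustration only; this is not the actual EECVF API.

```python
import numpy as np

# Toy stage-based pipeline in the spirit of a configurable CV framework.
# Each stage is a plain function image -> image, so stages can be added,
# removed, or reordered by editing a list.

def to_gray(img):
    # Collapse a color image (H, W, C) to grayscale by channel averaging.
    return img.mean(axis=-1) if img.ndim == 3 else img

def normalize(img):
    # Rescale intensities to the [0, 1] range.
    span = img.max() - img.min()
    return (img - img.min()) / span if span else img

def threshold(img, t=0.5):
    # Binarize: 1.0 where intensity exceeds t, else 0.0.
    return (img > t).astype(float)

def run_pipeline(img, stages):
    # Apply each stage in order, feeding the output of one into the next.
    for stage in stages:
        img = stage(img)
    return img

out = run_pipeline(np.random.rand(4, 4, 3), [to_gray, normalize, threshold])
```

Swapping in a new algorithm means appending one function to the stage list, which is the kind of extensibility the abstract attributes to the framework.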


Perception ◽  
1997 ◽  
Vol 26 (1_suppl) ◽  
pp. 132-132
Author(s):  
S Edelman ◽  
S Duvdevani-Bar

To recognise a previously seen object, the visual system must overcome the variability in the object's appearance caused by factors such as illumination and pose. It is possible to counter the influence of these factors by learning to interpolate between stored views of the target object, taken under representative combinations of viewing conditions. Routine visual tasks, however, typically require not so much recognition as categorisation, that is, making sense of objects not seen before. Despite persistent practical difficulties, theorists in computer vision and visual perception have traditionally favoured the structural route to categorisation, according to which forming a description of a novel shape in terms of its parts and their spatial relationships is a prerequisite to the ability to categorise it. In contrast, we demonstrate that knowledge of instances of each of several representative categories can provide the necessary computational substrate for the categorisation of their new instances, as well as for the representation and processing of radically novel shapes belonging to none of the familiar categories. The representational scheme underlying this approach, according to which objects are encoded by their similarities to entire reference shapes (S Edelman, 1997 Behavioral and Brain Sciences, in press), is computationally viable, and is readily mapped onto the mechanisms of biological vision revealed by recent psychophysical and physiological studies.


2020 ◽  
Vol 10 (12) ◽  
pp. 4415 ◽  
Author(s):  
Cheng Li ◽  
Baolong Guo ◽  
Geng Wang ◽  
Yan Zheng ◽  
Yang Liu ◽  
...  

Superpixels intuitively over-segment an image into small, compact, homogeneous regions. Owing to their outstanding performance in region description, superpixels have been widely used in various computer vision tasks as a substitute for pixels. Efficient algorithms for generating superpixels therefore remain important for advanced visual tasks. In this work, two strategies are presented on the conventional simple non-iterative clustering (SNIC) framework, aiming to improve computational efficiency as well as segmentation performance. Firstly, inter-pixel correlation is introduced to eliminate the redundant inspection of neighboring elements. In addition, it strengthens color identity in complicated texture regions, thus providing a desirable trade-off between runtime and accuracy. As a result, superpixel centroids evolve more efficiently and accurately. To further accelerate the framework, a recursive batch processing strategy is proposed to eliminate unnecessary sorting operations, so that a large number of neighboring elements can be assigned directly. Finally, the two strategies result in a novel synergetic non-iterative clustering with efficiency (NICE) method based on SNIC. Experimental results verify that it runs 40% faster than the conventional framework while generating comparable, and on several quantitative metrics sometimes even better, superpixels.
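SNIC-style methods measure pixel-to-seed distance jointly in color and position, with seeds initialized on a regular grid. The numpy sketch below performs a single-pass assignment in that joint space to convey the idea; `grid_superpixels` and its `compactness` parameter are illustrative names, and this is not the authors' NICE implementation (which grows regions from a priority queue with batch assignment).

```python
import numpy as np

def grid_superpixels(image, n_side=4, compactness=10.0):
    """Assign each pixel of an (H, W, C) image to the nearest of
    n_side*n_side grid-seeded centroids, using a SNIC-style joint
    (color, position) distance. A one-pass toy sketch."""
    h, w = image.shape[:2]
    # Seed positions: interior points of a regular grid.
    ys = np.linspace(0, h - 1, n_side + 2)[1:-1]
    xs = np.linspace(0, w - 1, n_side + 2)[1:-1]
    cy, cx = np.meshgrid(ys, xs, indexing="ij")
    seeds_pos = np.stack([cy.ravel(), cx.ravel()], axis=1)            # (K, 2)
    seeds_col = image[cy.astype(int).ravel(), cx.astype(int).ravel()]  # (K, C)

    yy, xx = np.mgrid[0:h, 0:w]
    pos = np.stack([yy, xx], axis=-1).reshape(-1, 2).astype(float)     # (N, 2)
    col = image.reshape(-1, image.shape[-1]).astype(float)             # (N, C)

    # Joint distance: squared color distance plus weighted squared
    # spatial distance; larger compactness favors regular regions.
    d_col = ((col[:, None, :] - seeds_col[None, :, :]) ** 2).sum(-1)
    d_pos = ((pos[:, None, :] - seeds_pos[None, :, :]) ** 2).sum(-1)
    s = max(h, w) / n_side  # expected superpixel spacing
    labels = (d_col + (compactness / s) ** 2 * d_pos).argmin(axis=1)
    return labels.reshape(h, w)

labels = grid_superpixels(np.random.rand(32, 32, 3))
```

With a large `compactness`, the spatial term dominates and the labels reduce to a Voronoi partition around the grid seeds; with a small one, color similarity drives the assignment.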


Author(s):  
Zhanshen Feng

With the progress of multimedia image processing technology and the rapid growth of image data, efficiently extracting interesting and valuable information from huge volumes of image data while filtering out redundant data has become an urgent problem in the fields of image processing and computer vision. In recent years, image detection, one of the important branches of computer vision, has assisted and improved a series of visual processing tasks. It has been widely used in many fields, such as scene classification, visual tracking, object redirection, and semantic segmentation. Intelligent algorithms have strong non-linear mapping capability, data processing capacity, and generalization ability. Using the structural risk minimization principle, a support vector machine (SVM) constructs the optimal classification hyperplane in the attribute space so that the classifier reaches the global optimum and its expected risk satisfies a certain upper bound with a certain probability over the entire sample space. This paper combines SVM with the artificial fish swarm algorithm (AFSA) for parameter optimization, building an AFSA-SVM classification model to achieve intelligent identification of image features and provide reliable technological means to accelerate sensing technology. Experimental results show that AFSA-SVM achieves better classification accuracy, indicating that the proposed algorithm can effectively realize intelligent identification of image features.
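As a rough sketch of the SVM half of this approach, the snippet below fits an RBF-kernel SVM and tunes (C, gamma) with a simple random search standing in for AFSA: each candidate pair plays the role of a "fish" scored by held-out accuracy. This is an assumption-laden illustration on synthetic data, not the paper's model or dataset.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic stand-in for extracted image features.
X, y = make_classification(n_samples=300, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Random search over (C, gamma) as a crude stand-in for AFSA's
# swarm-based exploration of the SVM parameter space.
best_acc, best_params = 0.0, None
for _ in range(20):
    C = 10 ** rng.uniform(-2, 2)
    gamma = 10 ** rng.uniform(-3, 1)
    acc = SVC(C=C, gamma=gamma).fit(X_tr, y_tr).score(X_te, y_te)
    if acc > best_acc:
        best_acc, best_params = acc, (C, gamma)

print(best_acc, best_params)
```

AFSA would replace the independent random draws with fish that move toward better-scoring neighbors, but the objective (validation accuracy of the resulting SVM) is the same.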


1999 ◽  
Vol 11 (2) ◽  
pp. 87-87
Author(s):  
Shunichiro Oe

The widely used term Computer Vision refers to the substitution of computers for human visual information processing. As real-world objects, except for characters, symbols, figures, and photographs created by people, are three-dimensional (3-D), their two-dimensional (2-D) images obtained by camera are produced by compressing 3-D information into 2-D. Many methods of 2-D image processing and pattern recognition have been developed and widely applied to industrial and medical processing, among others. Research enabling computers to recognize 3-D objects from 3-D information extracted from 2-D images has been carried out in artificial intelligence and robotics. Many techniques have been developed, and some are applied practically in scene analysis or 3-D measurement. These practical applications are based on image sensing, image processing, pattern recognition, image measurement, extraction of 3-D information, and image understanding. New techniques are constantly appearing. The title of this special issue is Vision, and it features 8 papers ranging from basic computer vision theory to industrial applications. These papers include the following: Kohji Kamejima proposes a method to detect self-similarity in random image fields, the basis of human visual processing. Akio Nagasaka et al. developed a way to identify a real scene in real time using run-length encoding of video feature sequences; this technique will become a basis for active video recording and new robotic machine vision. Toshifumi Honda presents a method for visual inspection of solder joints by 3-D image analysis, a very important issue in the inspection of printed circuit boards. Saburo Okada et al. contribute a new technique for simultaneous measurement of shape and normal vector for specular objects. These methods are all useful for obtaining 3-D information. Masato Nakajima presents a human face identification method for security monitoring using 3-D gray-level information. Kenji Terada et al. propose a method for automatically counting passing people using image sensing. These two technologies are very useful in access control. Yoji Ogawa presents a new image processing method for automatic welding in turbid water under a non-preparatory environment. Liu Wei et al. develop a method for detection and management of cutting-tool wear using visual sensors. We are certain that all of these papers will contribute greatly to the development of vision systems in robotics and mechatronics.


1997 ◽  
Vol 352 (1358) ◽  
pp. 1241-1248 ◽  
Author(s):  
Michael Brady

This paper describes a number of computer vision systems that we have constructed, which are firmly based on knowledge of diverse sorts. However, that knowledge is often represented in a way that is accessible only to a limited set of processes that make limited use of it; and though the knowledge is amenable to change, in practice it can be changed only in rather simple ways. The rest of the paper addresses the questions: (i) what knowledge is mobilized in the furtherance of a perceptual task? (ii) how is that knowledge represented? and (iii) how is that knowledge mobilized? First we review some cases of early visual processing where the mobilization of knowledge seems to be a key contributor to success, yet where the knowledge is deliberately represented in a quite inflexible way. After considering the knowledge involved in overcoming the projective nature of images, we move the discussion to the knowledge required in programs to match, register, and recognize shapes in a range of applications. Finally, we discuss the current state of process architectures for knowledge mobilization.


Author(s):  
Tom Baden ◽  
Timm Schubert ◽  
Philipp Berens ◽  
Thomas Euler

Visual processing begins in the retina, a thin, multilayered neuronal tissue lining the back of the vertebrate eye. The retina does not merely read out the constant stream of photons impinging on its dense array of photoreceptor cells. Instead, it performs a first, extensive analysis of the visual scene, while constantly adapting its sensitivity range to the input statistics, such as the brightness or contrast distribution. The functional organization of the retina abides by several key organizational principles. These include overlapping and repeating instances of both divergence and convergence, constant and dynamic range-adjustments, and (perhaps most importantly) decomposition of image information into parallel channels. This is often referred to as "parallel processing." To support this, the retina features a large diversity of neurons organized in functionally overlapping microcircuits that typically uniformly sample the retinal surface in a regular mosaic. Ultimately, each circuit drives spike trains in the retina's output neurons, the retinal ganglion cells. Their axons form the optic nerve to convey multiple, distinctive, and often already heavily processed views of the world to higher visual centers in the brain. From an experimental point of view, the retina is a neuroscientist's dream. While part of the central nervous system, the retina is largely self-contained and, depending on the species, receives little feedback from downstream stages. This means that the tissue can be disconnected from the rest of the brain and studied in a dish for many hours without losing its functional integrity, all while retaining excellent experimental control over the exclusive natural network input: the visual stimulus. Once removed from the eyecup, the retina can be flattened, so that its neurons are easily accessed optically or with visually guided electrodes. 
Retinal tiling means that function studied at any one place can usually be considered representative of the entire tissue. At the same time, species-dependent specializations offer the opportunity to study circuits adapted to different visual tasks: for example, in the case of our fovea, high-acuity vision. Taken together, today the retina is amongst the best understood complex neuronal tissues of the vertebrate brain.


2020 ◽  
pp. 1-15 ◽  
Author(s):  
Grace W. Lindsay

Convolutional neural networks (CNNs) were inspired by early findings in the study of biological vision. They have since become successful tools in computer vision and state-of-the-art models of both neural activity and behavior on visual tasks. This review highlights what, in the context of CNNs, it means to be a good model in computational neuroscience and the various ways models can provide insight. Specifically, it covers the origins of CNNs and the methods by which we validate them as models of biological vision. It then goes on to elaborate on what we can learn about biological vision by understanding and experimenting on CNNs and discusses emerging opportunities for the use of CNNs in vision research beyond basic object recognition.


Author(s):  
Natália Alcazar de Matos ◽  
Valdeni Soliani Franco ◽  
Mariana Moran

Abstract: Based on a master's research project, this work aims to analyze the techniques mobilized by 18 students in the 2nd year of a Mathematics degree course when solving visual mathematical tasks, classifying them according to their level of Mathematical Visuality and identifying, where present, the visualization skills of Interpretation of Figurative Information (IFI) and Visual Processing (VP) described by Alan J. Bishop. The data were collected through the written records of the participants' resolutions of two proposed mathematical tasks. Data analysis revealed that of the 15 resolutions presented, 7 were classified as partially visual techniques, 5 as non-visual, and 3 as visual. All showed evidence of the IFI skill, and 10 of VP. Keywords: Mathematical visualization, Mathematical tasks, Visual skills.


Author(s):  
Frances Egan

Vision is the most studied sense. It is our richest source of information about the external world, providing us with knowledge of the shape, size, distance, colour and luminosity of objects around us. Vision is fast, automatic and achieved without conscious effort; however, the apparent ease with which we see is deceptive. Ever since Kepler characterized the formation of the retinal image in the early seventeenth century, vision theorists have known that the image on the retina does not correspond in an obvious manner to the way things look. The retinal image is two-dimensional, yet we see three dimensions; the size and shape of the image that an object casts on the retina varies with the distance and perspective of the observer, yet we experience objects as having constant size and shape. The primary task of a theory of vision is to explain how useful information about the external world is recovered from the changing retinal image. Theories of vision fall roughly into two classes. Indirect theories characterize the processes underlying visual perception in psychological terms, as, for example, inference from prior data or construction of complex percepts from basic sensory components. Direct theories tend to stress the richness of the information available in the retinal image, but, more importantly, they deny that visual processes can be given any correct psychological or mental characterization. Direct theorists, while not denying that the processing underlying vision may be very complex, claim that the complexity is to be explicated merely by reference to nonpsychological, neural processes implemented in the brain. The most influential recent work in vision treats it as an information-processing task, hence as indirect. Many computational models characterize visual processing as the production and decoding of a series of increasingly useful internal representations of the distal scene. 
These operations are described in computational accounts by precise algorithms. Computer implementations of possible strategies employed by the visual system contribute to our understanding of the problems inherent in complex visual tasks such as edge detection and shape recognition, and make possible the rigorous testing of proposed solutions.
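Edge detection, mentioned above as a canonical complex visual task, can be made concrete with a small numpy sketch of the classic Sobel gradient operator; this is a generic illustration, not tied to any particular theory discussed in the entry.

```python
import numpy as np

def sobel_edges(img):
    """Gradient magnitude from the 3x3 Sobel kernels, a classic first
    step in detecting intensity edges in a 2-D grayscale image."""
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], float)  # horizontal gradient
    ky = kx.T                                                    # vertical gradient
    h, w = img.shape
    gx = np.zeros((h - 2, w - 2))
    gy = np.zeros((h - 2, w - 2))
    # Accumulate the cross-correlation one kernel tap at a time
    # (valid region only, so the output shrinks by 2 in each dimension).
    for i in range(3):
        for j in range(3):
            patch = img[i:i + h - 2, j:j + w - 2]
            gx += kx[i, j] * patch
            gy += ky[i, j] * patch
    return np.hypot(gx, gy)

img = np.zeros((8, 8))
img[:, 4:] = 1.0              # vertical step edge between columns 3 and 4
mag = sobel_edges(img)
```

The gradient magnitude peaks exactly at the step edge and is zero in the flat regions, which is the informal sense in which such operators "recover useful information" from the image.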

