Statistics of Visual Responses to Image Object Stimuli from Primate AIT Neurons to DNN Neurons

2018
Vol 30 (2)
pp. 447-476
Author(s):
Qiulei Dong
Hong Wang
Zhanyi Hu

Under the goal-driven paradigm, Yamins et al. (2014; Yamins & DiCarlo, 2016) have shown that by optimizing only the final eight-way categorization performance of a four-layer hierarchical network, not only can its top output layer quantitatively predict IT neuron responses, but its penultimate layer can also automatically predict V4 neuron responses. Deep neural networks (DNNs) in the field of computer vision have now reached image object categorization performance comparable to that of human beings on ImageNet, a data set that contains 1.3 million training images of 1000 categories. We explore whether DNN neurons (units in DNNs) possess image object representational statistics similar to those of monkey IT neurons, particularly as the network becomes deeper and the number of image categories becomes larger, using VGG19, a typical and widely used 19-layer deep network in the computer vision field. Following Lehky, Kiani, Esteky, and Tanaka (2011, 2014), where the response statistics of 674 IT neurons to 806 image stimuli are analyzed using three measures (kurtosis, Pareto tail index, and intrinsic dimensionality), we investigate three issues in this letter using the same three measures: (1) the similarities and differences of the neural response statistics between VGG19 and primate IT cortex, (2) the variation trends of the response statistics of VGG19 neurons at different layers from low to high, and (3) the variation trends of the response statistics of VGG19 neurons as the numbers of stimuli and neurons increase. We find that the response statistics on both single-neuron selectivity and population sparseness of VGG19 neurons are fundamentally different from those of IT neurons in most cases; that, as the number of neurons in different layers and the number of stimuli increase, the response statistics of neurons at different layers from low to high do not change substantially; and that the estimated intrinsic dimensionality values at the low convolutional layers of VGG19 are considerably larger than the value of approximately 100 reported for IT neurons in Lehky et al. (2014), whereas those at the high fully connected layers are close to or lower than 100. To the best of our knowledge, this work is the first attempt to analyze the response statistics of DNN neurons with respect to primate IT neurons in image object representation.
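As a rough illustration of the three measures named above, the sketch below computes excess kurtosis, a Hill-type Pareto tail index from the upper tail of responses, and a PCA-based proxy for intrinsic dimensionality on a hypothetical neurons-by-stimuli response matrix; the matrix contents, the tail fraction, and the variance threshold are assumptions for illustration, not the estimators or values used in the paper.

```python
import numpy as np
from scipy.stats import kurtosis

# Hypothetical response matrix: rows = neurons (units), columns = stimuli.
rng = np.random.default_rng(0)
responses = rng.gamma(shape=1.5, scale=1.0, size=(674, 806))

# Single-neuron selectivity: excess kurtosis of each neuron's response profile.
selectivity_kurtosis = kurtosis(responses, axis=1, fisher=True)

# Population sparseness: excess kurtosis across neurons for each stimulus.
population_kurtosis = kurtosis(responses, axis=0, fisher=True)

def pareto_tail_index(x, tail_fraction=0.1):
    """Hill-type estimate of the Pareto tail index from the largest responses."""
    x = np.sort(x)[::-1]
    k = max(int(len(x) * tail_fraction), 2)
    tail = x[:k]
    tail = tail[tail > 0]
    return 1.0 / np.mean(np.log(tail / tail[-1])) if len(tail) > 2 else np.nan

tail_indices = np.array([pareto_tail_index(r) for r in responses])

def pca_intrinsic_dimensionality(r, variance_kept=0.95):
    """Number of principal components needed to keep a given variance fraction
    (a simple proxy, not the estimator used by Lehky et al., 2014)."""
    centered = r - r.mean(axis=1, keepdims=True)
    eigvals = np.linalg.svd(centered, compute_uv=False) ** 2
    cumulative = np.cumsum(eigvals) / eigvals.sum()
    return int(np.searchsorted(cumulative, variance_kept) + 1)

print(selectivity_kurtosis.mean(), population_kurtosis.mean(),
      np.nanmean(tail_indices), pca_intrinsic_dimensionality(responses))
```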

Author(s):
Sebastian Hoppe Nesgaard Jensen
Mads Emil Brix Doest
Henrik Aanæs
Alessio Del Bue

Abstract Non-rigid structure from motion (NRSfM) is a long-standing and central problem in computer vision, and its solution is necessary for obtaining 3D information from multiple images when the scene is dynamic. A main issue hindering further development of this important computer vision topic is the lack of high-quality data sets. We address this issue by presenting a data set created for this purpose, which is made publicly available and is considerably larger than the previous state of the art. To validate the applicability of this data set, and to provide an investigation into the state of the art of NRSfM, including potential directions forward, we present a benchmark and a scrupulous evaluation using this data set. This benchmark evaluates 18 different methods with available code that reasonably span the state of the art in sparse NRSfM. This new public data set and evaluation protocol will provide benchmark tools for further development in this challenging field.
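Benchmarks of this kind typically score a reconstruction by its per-frame 3D error after aligning the estimate to ground truth; the sketch below shows one such aligned-RMSE metric in Python, where the array layout and the alignment choice (Kabsch rotation plus uniform scale) are assumptions for illustration rather than the protocol of this specific benchmark.

```python
import numpy as np

def aligned_rmse(estimate, ground_truth):
    """RMSE between two point sets (N x 3) after optimal similarity alignment."""
    est = estimate - estimate.mean(axis=0)
    gt = ground_truth - ground_truth.mean(axis=0)
    # Kabsch: rotation minimizing squared error between the centered point sets.
    u, s, vt = np.linalg.svd(est.T @ gt)
    d = np.sign(np.linalg.det(vt.T @ u.T))
    rotation = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    scale = (s[:2].sum() + d * s[2]) / (est ** 2).sum()
    aligned = scale * (rotation @ est.T).T
    return np.sqrt(np.mean(np.sum((aligned - gt) ** 2, axis=1)))

def sequence_error(estimated_frames, ground_truth_frames):
    """Mean per-frame aligned RMSE over a dynamic sequence (F frames, N points)."""
    return np.mean([aligned_rmse(e, g)
                    for e, g in zip(estimated_frames, ground_truth_frames)])

# Example: a noisy copy of a random non-rigid sequence should score near zero.
gt = np.random.default_rng(0).normal(size=(5, 40, 3))     # 5 frames, 40 points
est = gt + 0.01 * np.random.default_rng(1).normal(size=gt.shape)
print(sequence_error(est, gt))
```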


2019
Vol 8 (2S8)
pp. 1311-1313

With the increasing awareness of environmental protection, people are paying more and more attention to the protection of wild animals, whose survival is closely related to that of human beings. As progress in object detection has achieved unprecedented success in computer vision, we can detect animals more easily. Animal detection based on computer vision is an important branch of object recognition and is applied to intelligent monitoring, smart driving, and environmental protection. At present, many animal detection methods have been proposed. However, animal detection remains a challenge due to the complexity of the background, the diversity of animal poses, and occlusion between objects, so an accurate algorithm is needed. In this paper, the Faster Region-based Convolutional Neural Network (Faster R-CNN) is used. The proposed method was tested on the CAMERA_TRAP DATASET. The results show that the proposed animal detection method based on Faster R-CNN performs better in terms of detection accuracy than conventional schemes.
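For readers who want to try a comparable detector, the sketch below runs a pretrained Faster R-CNN from torchvision on a single image; the COCO weights, the score threshold, and the image path are assumptions for illustration and are not the model or data used in the paper.

```python
import torch
import torchvision
from torchvision.transforms.functional import to_tensor
from PIL import Image

# Pretrained Faster R-CNN with a ResNet-50 FPN backbone (COCO weights).
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

image = Image.open("camera_trap_frame.jpg").convert("RGB")  # hypothetical image path
with torch.no_grad():
    predictions = model([to_tensor(image)])[0]

# Keep detections above a confidence threshold (0.5 is an arbitrary choice).
keep = predictions["scores"] > 0.5
for box, label, score in zip(predictions["boxes"][keep],
                             predictions["labels"][keep],
                             predictions["scores"][keep]):
    print(label.item(), score.item(), box.tolist())
```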


Author(s):
Jinling Li
Yuhao Liu
Ahmed Tageldin
Mohamed H. Zaki
Greg Mori
...

An approach for vehicle conflict analysis based on three-dimensional (3-D) vehicle detection is presented. Techniques for quantitative conflict measurement often use a point-trajectory representation for vehicles; more accurate conflict measurement can be facilitated with a region-based vehicle representation instead. This paper describes a computer vision approach for extracting vehicle trajectories from video sequences. The method relies on a fusion of background subtraction and feature-based tracking to provide a 3-D cuboid representation of the vehicle. Standard conflict measures, including time to collision and postencroachment time, were computed with the use of the 3-D cuboid vehicle representations. The use of these conflict measures was demonstrated on a challenging data set of video footage. Results showed that the region-based representation could provide more precise calculation of traffic conflict indicators compared with approaches based on a point representation.
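As a hedged illustration of the two conflict indicators named above, the sketch below computes time to collision under a constant-velocity assumption and postencroachment time from timestamped occupancy of a conflict point; the point-mass simplification (rather than the paper's 3-D cuboids) and the example numbers are assumptions.

```python
import numpy as np

def time_to_collision(p1, v1, p2, v2, collision_distance=2.0):
    """TTC for two road users assuming constant velocity (point-mass simplification).

    p1, p2: current positions (x, y) in metres; v1, v2: velocities in m/s.
    Returns np.inf if the gap never closes to collision_distance.
    """
    dp = np.asarray(p2, float) - np.asarray(p1, float)
    dv = np.asarray(v2, float) - np.asarray(v1, float)
    a = dv @ dv
    if a == 0.0:
        return np.inf
    b = 2.0 * (dp @ dv)
    c = dp @ dp - collision_distance ** 2
    disc = b * b - 4.0 * a * c
    if disc < 0.0:
        return np.inf
    t = (-b - np.sqrt(disc)) / (2.0 * a)   # first time the gap reaches the threshold
    return t if t >= 0.0 else np.inf

def post_encroachment_time(t_leader_exits, t_follower_enters):
    """PET: time between the first road user leaving the conflict area
    and the second road user entering it."""
    return t_follower_enters - t_leader_exits

print(time_to_collision((0.0, 0.0), (10.0, 0.0), (50.0, 0.0), (0.0, 0.0)))  # 4.8 s
print(post_encroachment_time(t_leader_exits=12.4, t_follower_enters=13.1))  # 0.7 s
```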


Author(s):
Lin Liu
Xuguang Wang
John Eck
Jun Liang

This chapter presents an innovative approach for simulating crime events and crime patterns. The theoretical basis of the crime simulation model is routine activities (RA) theory. Offenders, targets, and crime places, the three basic elements of routine activities, are modeled as individual agents. The properties and behaviors of these agents change in space and time. The interactions of these three types of agents are modeled in a cellular automaton (CA). Tension, measuring the psychological impact of crime events on human beings, is the state variable of the CA. The model, after being calibrated with a real crime data set from Cincinnati, is able to generate crime patterns similar to real patterns, and results from experimental runs of the model conform to known criminology theories. This type of RA/CA simulation model has the potential to be used to test new criminology theories and hypotheses.
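To make the RA/CA coupling concrete, here is a deliberately simplified sketch: offenders and targets move randomly on a grid, a crime occurs when they converge on a cell without a guardian, and a scalar tension field rises at crime locations and decays over time. The grid size, agent counts, and decay rate are invented for illustration and do not reproduce the calibrated Cincinnati model.

```python
import numpy as np

rng = np.random.default_rng(42)
GRID = 50                      # cells per side (illustrative)
STEPS = 200
DECAY = 0.95                   # tension decay per step (assumed)
tension = np.zeros((GRID, GRID))

# Agents are (row, col) positions; the counts are arbitrary choices.
offenders = rng.integers(0, GRID, size=(30, 2))
targets = rng.integers(0, GRID, size=(60, 2))
guardians = rng.integers(0, GRID, size=(20, 2))

def random_walk(agents):
    """Move each agent one step in a random direction, staying on the grid."""
    steps = rng.integers(-1, 2, size=agents.shape)
    return np.clip(agents + steps, 0, GRID - 1)

for _ in range(STEPS):
    offenders = random_walk(offenders)
    targets = random_walk(targets)
    guardians = random_walk(guardians)

    guarded = {tuple(g) for g in guardians}
    target_cells = {tuple(t) for t in targets}
    for cell in map(tuple, offenders):
        # Routine activities: a crime needs a motivated offender and a suitable
        # target converging in space-time without a capable guardian.
        if cell in target_cells and cell not in guarded:
            tension[cell] += 1.0
    tension *= DECAY            # psychological impact fades over time

print("cells with elevated tension:", int((tension > 0.5).sum()))
```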


Author(s):
Rui Xu
Donald C. Wunsch II

Classifying objects based on their features and characteristics is one of the most important and most primitive activities of human beings. The task becomes even more challenging when no ground truth is available. Cluster analysis opens new opportunities for exploring the unknown nature of data through its aim of separating a finite data set, with little or no prior information, into a finite and discrete set of "natural," hidden data structures. Here, the authors introduce and discuss clustering algorithms that are related to machine learning and computational intelligence, particularly those based on neural networks. Neural networks are well known for their good learning capabilities, adaptation, ease of implementation, parallelization, speed, and flexibility, and they have demonstrated many successful applications in cluster analysis. The applications of cluster analysis to real-world problems are also illustrated. Portions of the chapter are taken from Xu and Wunsch (2008).
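A common neural-network approach to clustering is competitive learning, as used in self-organizing maps; the minimal sketch below implements plain winner-take-all competitive learning in NumPy on synthetic 2-D data, with the number of prototypes, learning rate, and epochs chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic 2-D data drawn from three clusters (illustrative only).
data = np.vstack([rng.normal(center, 0.3, size=(100, 2))
                  for center in [(0, 0), (3, 0), (1.5, 2.5)]])

n_prototypes, learning_rate, epochs = 3, 0.1, 20
prototypes = data[rng.choice(len(data), n_prototypes, replace=False)].copy()

for _ in range(epochs):
    for x in rng.permutation(data):
        # Competition: the prototype nearest to the input wins.
        winner = np.argmin(np.linalg.norm(prototypes - x, axis=1))
        # Adaptation: only the winner moves toward the input (winner-take-all).
        prototypes[winner] += learning_rate * (x - prototypes[winner])

labels = np.argmin(np.linalg.norm(data[:, None, :] - prototypes[None, :, :],
                                  axis=2), axis=1)
print("cluster sizes:", np.bincount(labels))
```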


BMJ Open
2020
Vol 10 (7)
pp. e037161
Author(s):
Hyunmin Ahn

Objectives: We investigated the usefulness of machine learning artificial intelligence (AI) in classifying the severity of ophthalmic emergencies for timely hospital visits.
Study design: This retrospective study analysed the patients who first visited the Armed Forces Daegu Hospital between May and December 2019. General patient information, events and symptoms were input variables; events, symptoms, diagnoses and treatments were output variables. The output variables were classified into four classes (red, orange, yellow and green, indicating immediate to no emergency cases). About 200 cases of the class-balanced validation data set were randomly selected before all training procedures. An ensemble AI model using combinations of fully connected neural networks with the synthetic minority oversampling technique (SMOTE) algorithm was adopted.
Participants: A total of 1681 patients were included.
Major outcomes: Model performance was evaluated using accuracy, precision, recall and F1 scores.
Results: The accuracy of the model was 99.05%. The precision of each class (red, orange, yellow and green) was 100%, 98.10%, 92.73% and 100%, respectively. The recalls of each class were 100%, 100%, 98.08% and 95.33%. The F1 scores of each class were 100%, 99.04%, 95.33% and 96.00%.
Conclusions: We provided support for an AI method to classify ophthalmic emergency severity based on symptoms.
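A rough sketch of this kind of pipeline, using scikit-learn and imbalanced-learn rather than the authors' own implementation: SMOTE rebalances the four triage classes on the training split and a small ensemble of fully connected networks soft-votes on the label. The feature matrix, class proportions, and network sizes are placeholders, not the study's variables.

```python
import numpy as np
from imblearn.over_sampling import SMOTE
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
# Placeholder encoded features (symptoms/events) and 4 severity classes (0=red .. 3=green).
X = rng.normal(size=(1681, 20))
y = rng.choice(4, size=1681, p=[0.1, 0.2, 0.3, 0.4])   # imbalanced classes (assumed)

X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=0)

# Oversample minority classes in the training split only.
X_res, y_res = SMOTE(random_state=0).fit_resample(X_train, y_train)

# Small ensemble of fully connected networks with different hidden layouts.
ensemble = [MLPClassifier(hidden_layer_sizes=h, max_iter=500, random_state=i)
            for i, h in enumerate([(64, 32), (128,), (64, 64, 32)])]
for model in ensemble:
    model.fit(X_res, y_res)

# Soft voting: average predicted class probabilities across the ensemble.
proba = np.mean([m.predict_proba(X_test) for m in ensemble], axis=0)
predictions = proba.argmax(axis=1)
print("accuracy:", (predictions == y_test).mean())
```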


2020
Vol 10 (11)
pp. 4010
Author(s):
Kwang-il Kim
Keon Myung Lee

Marine resources are valuable assets to be protected from illegal, unreported, and unregulated (IUU) fishing and overfishing. Detecting IUU fishing and overfishing requires identifying the fishing gear of fishing ships in operation. This paper is concerned with automatically identifying fishing gear from AIS (automatic identification system)-based trajectory data of fishing ships. It proposes a deep learning-based fishing gear type identification method in which six fishing gear type groups are identified from AIS-based ship movement data and environmental data. The proposed method conducts preprocessing to handle different lengths of messaging intervals, missing messages, and contaminated messages in the trajectory data. To capture complicated dynamic patterns in the trajectories of fishing gear types, a sliding window-based data slicing method is used to generate the training data set. The proposed method uses a CNN (convolutional neural network)-based deep neural network model which consists of a feature extraction module and a prediction module. The feature extraction module contains two CNN submodules followed by a fully connected network. The prediction module is a fully connected network which suggests a putative fishing gear type for the features extracted by the feature extraction module from input trajectory data. The proposed CNN-based model has been trained and tested with a real trajectory data set of 1380 fishing ships collected over a year. A new performance index, DPI (the total performance of the day-wise performance index), is proposed to compare the performance of gear type identification techniques. To compare against the proposed model, SVM (support vector machine)-based models have also been developed. In the experiments, the trained CNN-based model showed 0.963 DPI, while the SVM models showed 0.814 DPI on average for the 24-hour window. The high DPI value indicates that the trained model is good at identifying the types of fishing gear.
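The sliding-window slicing and the CNN-plus-fully-connected layout described above can be sketched as follows in PyTorch; the window length, feature count, channel sizes, and the single-branch architecture are assumptions for illustration and do not reproduce the authors' exact model.

```python
import numpy as np
import torch
import torch.nn as nn

def slice_trajectory(track, window=144, stride=12):
    """Slice a (time_steps, features) trajectory into overlapping windows."""
    return np.stack([track[i:i + window]
                     for i in range(0, len(track) - window + 1, stride)])

class GearTypeCNN(nn.Module):
    """Conv1d feature extractor followed by a fully connected prediction head."""
    def __init__(self, n_features=6, n_classes=6):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(n_features, 32, kernel_size=5, padding=2), nn.ReLU(),
            nn.MaxPool1d(2),
            nn.Conv1d(32, 64, kernel_size=5, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(), nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, n_classes),
        )

    def forward(self, x):          # x: (batch, window, features)
        x = x.transpose(1, 2)      # Conv1d expects (batch, channels, length)
        return self.classifier(self.features(x))

# Hypothetical track: 10-minute AIS samples over one week, 6 features per sample.
track = np.random.randn(144 * 7, 6).astype(np.float32)
windows = torch.from_numpy(slice_trajectory(track))
logits = GearTypeCNN()(windows)
print(logits.shape)                # (n_windows, 6 gear type groups)
```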


Author(s):
Osama Alfarraj
Amr Tolba

Abstract The computer vision (CV) paradigm is introduced to improve computational and processing system efficiencies through visual inputs. These visual inputs are processed using sophisticated techniques to improve the reliability of human-machine interactions (HMIs). The processing of visual inputs requires multi-level data computation to achieve application-specific reliability. Therefore, in this paper, a two-level visual information processing (2LVIP) method is introduced to meet the reliability requirements of HMI applications. The 2LVIP method handles both structured and unstructured data through classification learning to extract the maximum gain from the inputs. The introduced method identifies gain-related features at its first level and optimizes these features to improve information gain. At the second level, the error is reduced through a regression process to stabilize the precision and meet HMI application demands. The two levels are interoperable and fully connected to achieve better gain and precision through a reduction in information-processing errors. The analysis results show that the proposed method achieves 9.42% higher information gain and a 6.51% smaller error under different classification instances compared with conventional methods.
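The two-level idea (a classification stage that maximizes information gain, followed by a regression stage that reduces residual error) could be prototyped roughly as below with scikit-learn; the feature selector, the specific estimators, and the synthetic data are all assumptions, since the paper does not tie 2LVIP to these particular components.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

# Synthetic stand-in for encoded visual inputs (structured + unstructured features).
X, y = make_classification(n_samples=2000, n_features=40, n_informative=10,
                           n_classes=3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Level 1: keep the features carrying the most information gain, then classify.
selector = SelectKBest(mutual_info_classif, k=10).fit(X_train, y_train)
level1 = DecisionTreeClassifier(criterion="entropy", random_state=0)
level1.fit(selector.transform(X_train), y_train)
level1_scores = level1.predict_proba(selector.transform(X_train))

# Level 2: regress the residual between level-1 confidence and the true label
# indicator, as a crude error-reduction / precision-stabilization stage.
onehot = np.eye(3)[y_train]
level2 = Ridge(alpha=1.0).fit(level1_scores, onehot - level1_scores)

test_scores = level1.predict_proba(selector.transform(X_test))
corrected = test_scores + level2.predict(test_scores)
print("level-1 accuracy:", (test_scores.argmax(1) == y_test).mean())
print("two-level accuracy:", (corrected.argmax(1) == y_test).mean())
```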


2019
Vol 97 (Supplement_3)
pp. 475-476
Author(s):
Arthur Francisco Araujo Fernandes
João R R Dorea
Robert Fitzgerald
William O Herring

Abstract Computer vision systems (CVS) have many applications in livestock; for example, they allow measuring traits of interest without the need to handle the animals directly, avoiding unnecessary animal stress. The objective of the current study was to devise an automated CVS for the extraction of variables such as body measurements and shape descriptors in pigs using depth images. These features were then tested as potential predictors of live body weight (BW) using 5-fold cross validation (CV) with different modeling approaches: traditional multiple linear regression (LR), partial least squares (PLS), elastic net (EN), and artificial neural networks (ANN). The devised CVS could analyze and extract features from video fed at a rate of 12 frames per second. This resulted in a data set with more than 32 thousand frames from 655 pigs; however, only the 580 pigs with more than 5 frames recorded were used for the development of the predictive models. Among the body measures extracted from the images, body volume, area, and length presented the highest correlations with BW, while widths and heights were highly correlated with each other (Figure 1). The cross validation of the models developed to predict BW from a selected set of the most significant variables yielded mean absolute errors (MAE) of 3.92, 3.78, 3.72, and 2.57 for PLS, LR, EN, and ANN, respectively (Table 1). In conclusion, the CVS developed can automatically extract relevant variables from 3D images, and a fully connected ANN with 6 hidden layers presented the overall best predictive results for BW.
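The model comparison described above maps naturally onto scikit-learn; the sketch below runs 5-fold cross-validation of the four model families on placeholder image-derived features, with the feature values, hidden-layer widths, and data entirely invented for illustration (only the 6-hidden-layer ANN depth is taken from the abstract).

```python
import numpy as np
from sklearn.linear_model import LinearRegression, ElasticNet
from sklearn.cross_decomposition import PLSRegression
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
# Placeholder depth-image features: volume, area, length, widths, heights, ...
X = rng.normal(size=(580, 8))
bw = 30 + (X @ rng.normal(size=8)) * 5 + rng.normal(scale=2.5, size=580)  # fake BW

models = {
    "LR": LinearRegression(),
    "PLS": PLSRegression(n_components=4),
    "EN": make_pipeline(StandardScaler(), ElasticNet(alpha=0.1)),
    "ANN": make_pipeline(StandardScaler(),
                         MLPRegressor(hidden_layer_sizes=(64,) * 6,
                                      max_iter=2000, random_state=0)),
}

for name, model in models.items():
    mae = -cross_val_score(model, X, bw, cv=5,
                           scoring="neg_mean_absolute_error").mean()
    print(f"{name}: MAE = {mae:.2f}")
```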


Blood
2004
Vol 104 (11)
pp. 2271-2271
Author(s):
Carsten Schwaenen
Swen Wessendorf
Andreas Viardot
Sandra Ruf
Martina Enz
...

Abstract Follicular lymphoma (FL), one of the most frequent lymphoma entities in the Western world, is characterized by a highly variable clinical course, ranging from rapid progression with fatal outcome to cases with long-term survival. In a recent study applying chromosomal comparative genomic hybridization (CGH) to FL, genomic aberrations were detectable in 70% of the cases, and a loss of genomic material on chromosomal bands 6q25-q27 was the strongest predictor of short overall survival. However, the limitations of CGH as a screening method are a genomic resolution restricted to 3–10 Mbp and demanding, non-automated evaluation procedures. Thus, high-throughput analysis of genomic alterations for risk-adapted patient stratification and monitoring within treatment trials should rely on efficient and automated diagnostic techniques. In this study, we applied array CGH to a novel generation of DNA chips containing 2800 genomic DNA probes. Target clones comprised i) contigs mapping to genomic regions of possible pathogenetic relevance in lymphoma (n=610 target clones mapping to e.g. 1p, 2p, 3q, 7q, 9p, 11q, 12q, 13q, 17p, 18q, X); ii) selected oncogenes and tumor suppressor genes (n=686) potentially relevant in hematologic neoplasms; and iii) a large genome-wide cluster of 1502 target DNA clones covering the genome at a distance of approximately 2 Mbp (part of the golden path clone set). This chip represents a median genomic resolution of approximately 1.5 Mbp. In total, DNAs from 70 FL samples were analyzed, and the results were compared to data from chromosomal CGH experiments and clinical data sets. The sensitivity of array CGH was considerably higher than that of chromosomal CGH (aberrations in 95% of cases vs. 70% of cases). The most frequent aberrations were gains mapping to chromosome arms 2p (21%), 7p (24%), 7q (30%), 12p (17%), 12q (21%), 18p (21%) and 18q (34%), as well as losses mapping to chromosome arms 1p (19%), 6q (23%), 7p (20%), 11q (26%) and 17p (20%). In addition, several genomic aberrations were identified which have not been described before in FL. Currently, these aberrations are being characterized in more detail, and the results will be correlated with the clinical data set. Moreover, three recurrent sites of genomic polymorphism in human beings, affecting chromosomes 5q, 14q and 15q, were identified. In conclusion, these data underline the potential of array CGH for the sensitive detection of genomic imbalances in FL.
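Array CGH calls gains and losses from the log2 ratio of tumor to reference signal per probe; the short sketch below shows that basic thresholding step on made-up data, with the ±0.3 cutoffs and the simulated probe intensities being assumptions for illustration, not the analysis parameters used in this study.

```python
import numpy as np

rng = np.random.default_rng(1)
# Made-up per-probe intensities for a tumor sample and a normal reference.
n_probes = 2800
tumor = rng.lognormal(mean=0.0, sigma=0.2, size=n_probes)
reference = rng.lognormal(mean=0.0, sigma=0.2, size=n_probes)
tumor[100:160] *= 1.5      # simulate a gained region
tumor[900:960] *= 0.5      # simulate a lost region

log2_ratio = np.log2(tumor / reference)

GAIN_CUTOFF, LOSS_CUTOFF = 0.3, -0.3    # illustrative thresholds
calls = np.where(log2_ratio > GAIN_CUTOFF, "gain",
                 np.where(log2_ratio < LOSS_CUTOFF, "loss", "normal"))

print("probes called gain:", int((calls == "gain").sum()))
print("probes called loss:", int((calls == "loss").sum()))
```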

