scholarly journals Scalable Multi-grained Cross-modal Similarity Query with Interpretability

Author(s):  
Mingdong Zhu ◽  
Derong Shen ◽  
Lixin Xu ◽  
Xianfang Wang

AbstractCross-modal similarity query has become a highlighted research topic for managing multimodal datasets such as images and texts. Existing researches generally focus on query accuracy by designing complex deep neural network models and hardly consider query efficiency and interpretability simultaneously, which are vital properties of cross-modal semantic query processing system on large-scale datasets. In this work, we investigate multi-grained common semantic embedding representations of images and texts and integrate interpretable query index into the deep neural network by developing a novel Multi-grained Cross-modal Query with Interpretability (MCQI) framework. The main contributions are as follows: (1) By integrating coarse-grained and fine-grained semantic learning models, a multi-grained cross-modal query processing architecture is proposed to ensure the adaptability and generality of query processing. (2) In order to capture the latent semantic relation between images and texts, the framework combines LSTM and attention mode, which enhances query accuracy for the cross-modal query and constructs the foundation for interpretable query processing. (3) Index structure and corresponding nearest neighbor query algorithm are proposed to boost the efficiency of interpretable queries. (4) A distributed query algorithm is proposed to improve the scalability of our framework. Comparing with state-of-the-art methods on widely used cross-modal datasets, the experimental results show the effectiveness of our MCQI approach.

2021 ◽  
Author(s):  
Colin Conwell ◽  
David Mayo ◽  
Boris Katz ◽  
Michael A. Buice ◽  
George A. Alvarez ◽  
...  

How well do deep neural networks fare as models of mouse visual cortex? A majority of research to date suggests results far more mixed than those produced in the modeling of primate visual cortex. Here, we perform a large-scale benchmarking of dozens of deep neural network models in mouse visual cortex with multiple methods of comparison and multiple modes of verification. Using the Allen Brain Observatory's 2-photon calcium-imaging dataset of activity in over 59,000 rodent visual cortical neurons recorded in response to natural scenes, we replicate previous findings and resolve previous discrepancies, ultimately demonstrating that modern neural networks can in fact be used to explain activity in the mouse visual cortex to a more reasonable degree than previously suggested. Using our benchmark as an atlas, we offer preliminary answers to overarching questions about levels of analysis (e.g. do models that better predict the representations of individual neurons also predict representational geometry across neural populations?); questions about the properties of models that best predict the visual system overall (e.g. does training task or architecture matter more for augmenting predictive power?); and questions about the mapping between biological and artificial representations (e.g. are there differences in the kinds of deep feature spaces that predict neurons from primary versus posteromedial visual cortex?). Along the way, we introduce a novel, highly optimized neural regression method that achieves SOTA scores (with gains of up to 34%) on the publicly available benchmarks of primate BrainScore. Simultaneously, we benchmark a number of models (including vision transformers, MLP-Mixers, normalization free networks and Taskonomy encoders) outside the traditional circuit of convolutional object recognition. Taken together, our results provide a reference point for future ventures in the deep neural network modeling of mouse visual cortex, hinting at novel combinations of method, architecture, and task to more fully characterize the computational motifs of visual representation in a species so indispensable to neuroscience.


Author(s):  
Jingxian Li ◽  
Lixin Han ◽  
Xiaoshuang Li ◽  
Jun Zhu ◽  
Baohua Yuan ◽  
...  

Author(s):  
Jing Bi ◽  
Yongze Lin ◽  
Quanxi Dong ◽  
Haitao Yuan ◽  
MengChu Zhou

Electronics ◽  
2021 ◽  
Vol 10 (13) ◽  
pp. 1514
Author(s):  
Seung-Ho Lim ◽  
WoonSik William Suh ◽  
Jin-Young Kim ◽  
Sang-Young Cho

The optimization for hardware processor and system for performing deep learning operations such as Convolutional Neural Networks (CNN) in resource limited embedded devices are recent active research area. In order to perform an optimized deep neural network model using the limited computational unit and memory of an embedded device, it is necessary to quickly apply various configurations of hardware modules to various deep neural network models and find the optimal combination. The Electronic System Level (ESL) Simulator based on SystemC is very useful for rapid hardware modeling and verification. In this paper, we designed and implemented a Deep Learning Accelerator (DLA) that performs Deep Neural Network (DNN) operation based on the RISC-V Virtual Platform implemented in SystemC in order to enable rapid and diverse analysis of deep learning operations in an embedded device based on the RISC-V processor, which is a recently emerging embedded processor. The developed RISC-V based DLA prototype can analyze the hardware requirements according to the CNN data set through the configuration of the CNN DLA architecture, and it is possible to run RISC-V compiled software on the platform, can perform a real neural network model like Darknet. We performed the Darknet CNN model on the developed DLA prototype, and confirmed that computational overhead and inference errors can be analyzed with the DLA prototype developed by analyzing the DLA architecture for various data sets.


Author(s):  
Yuheng Hu ◽  
Yili Hong

Residents often rely on newspapers and television to gather hyperlocal news for community awareness and engagement. More recently, social media have emerged as an increasingly important source of hyperlocal news. Thus far, the literature on using social media to create desirable societal benefits, such as civic awareness and engagement, is still in its infancy. One key challenge in this research stream is to timely and accurately distill information from noisy social media data streams to community members. In this work, we develop SHEDR (social media–based hyperlocal event detection and recommendation), an end-to-end neural event detection and recommendation framework with a particular use case for Twitter to facilitate residents’ information seeking of hyperlocal events. The key model innovation in SHEDR lies in the design of the hyperlocal event detector and the event recommender. First, we harness the power of two popular deep neural network models, the convolutional neural network (CNN) and long short-term memory (LSTM), in a novel joint CNN-LSTM model to characterize spatiotemporal dependencies for capturing unusualness in a region of interest, which is classified as a hyperlocal event. Next, we develop a neural pairwise ranking algorithm for recommending detected hyperlocal events to residents based on their interests. To alleviate the sparsity issue and improve personalization, our algorithm incorporates several types of contextual information covering topic, social, and geographical proximities. We perform comprehensive evaluations based on two large-scale data sets comprising geotagged tweets covering Seattle and Chicago. We demonstrate the effectiveness of our framework in comparison with several state-of-the-art approaches. We show that our hyperlocal event detection and recommendation models consistently and significantly outperform other approaches in terms of precision, recall, and F-1 scores. Summary of Contribution: In this paper, we focus on a novel and important, yet largely underexplored application of computing—how to improve civic engagement in local neighborhoods via local news sharing and consumption based on social media feeds. To address this question, we propose two new computational and data-driven methods: (1) a deep learning–based hyperlocal event detection algorithm that scans spatially and temporally to detect hyperlocal events from geotagged Twitter feeds; and (2) A personalized deep learning–based hyperlocal event recommender system that systematically integrates several contextual cues such as topical, geographical, and social proximity to recommend the detected hyperlocal events to potential users. We conduct a series of experiments to examine our proposed models. The outcomes demonstrate that our algorithms are significantly better than the state-of-the-art models and can provide users with more relevant information about the local neighborhoods that they live in, which in turn may boost their community engagement.


ChemMedChem ◽  
2021 ◽  
Author(s):  
Christoph Grebner ◽  
Hans Matter ◽  
Daniel Kofink ◽  
Jan Wenzel ◽  
Friedemann Schmidt ◽  
...  

1997 ◽  
pp. 931-935 ◽  
Author(s):  
Anders Lansner ◽  
Örjan Ekeberg ◽  
Erik Fransén ◽  
Per Hammarlund ◽  
Tomas Wilhelmsson

2021 ◽  
Vol 10 (9) ◽  
pp. 25394-25398
Author(s):  
Chitra Desai

Deep learning models have demonstrated improved efficacy in image classification since the ImageNet Large Scale Visual Recognition Challenge started since 2010. Classification of images has further augmented in the field of computer vision with the dawn of transfer learning. To train a model on huge dataset demands huge computational resources and add a lot of cost to learning. Transfer learning allows to reduce on cost of learning and also help avoid reinventing the wheel. There are several pretrained models like VGG16, VGG19, ResNet50, Inceptionv3, EfficientNet etc which are widely used.   This paper demonstrates image classification using pretrained deep neural network model VGG16 which is trained on images from ImageNet dataset. After obtaining the convolutional base model, a new deep neural network model is built on top of it for image classification based on fully connected network. This classifier will use features extracted from the convolutional base model.


2013 ◽  
Vol 441 ◽  
pp. 691-694
Author(s):  
Yi Qun Zeng ◽  
Jing Bin Wang

With the rapid development of information technology, data grows explosionly, how to deal with the large scale data become more and more important. Based on the characteristics of RDF data, we propose to compress RDF data. We construct an index structure called PAR-Tree Index, then base on the MapReduce parallel computing framework and the PAR-Tree Index to execute the query. Experimental results show that the algorithm can improve the efficiency of large data query.


Sign in / Sign up

Export Citation Format

Share Document