DeepSoft: a vision for a deep model of software

Neural networks with deep architectures have demonstrated significant performance improvements in computer vision, speech recognition, and natural language processing. The challenges in information retrieval (IR), however, are different from those in these other application areas. A common form of IR involves ranking documents, or short passages, in response to keyword-based queries. Effective IR systems must deal with the query-document vocabulary mismatch problem by modeling relationships between different query and document terms and how they indicate relevance. Models should also consider lexical matches when the query contains rare terms not seen during training, such as a person's name or a product model number, and should avoid retrieving semantically related but irrelevant results. In many real-life IR tasks, retrieval involves extremely large collections, such as the document index of a commercial Web search engine, containing billions of documents. Efficient IR methods should take advantage of specialized IR data structures, such as the inverted index, to retrieve quickly from large collections. Given an information need, the IR system also mediates how much exposure an information artifact receives by deciding whether it should be displayed, and where it should be positioned, among other results. Exposure-aware IR systems may optimize for additional objectives besides relevance, such as parity of exposure for retrieved items and content publishers. In this thesis, we present novel neural architectures and methods motivated by the specific needs and challenges of IR tasks. We ground our contributions with a detailed survey of the growing body of neural IR literature [Mitra and Craswell, 2018].
Our key contribution towards improving the effectiveness of deep ranking models is the Duet principle [Mitra et al., 2017], which emphasizes the importance of incorporating evidence based on both patterns of exact term matches and similarities between learned latent representations of query and document. To efficiently retrieve from large collections, we develop a framework to incorporate query term independence [Mitra et al., 2019] into any arbitrary deep model, which enables large-scale precomputation and the use of an inverted index for fast retrieval. In the context of stochastic ranking, we further develop optimization strategies for exposure-based objectives [Diaz et al., 2020]. Finally, this dissertation also summarizes our contributions towards benchmarking neural IR models in the presence of large training datasets [Craswell et al., 2019] and explores the application of neural methods to other IR tasks, such as query auto-completion.
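
The query term independence idea mentioned in this abstract can be illustrated with a minimal sketch. It assumes that a document's score for a query is the sum of per-term scores, so those scores can be computed offline and stored in an inverted index. The function `term_doc_score` is a hypothetical stand-in for a deep model (here a simple saturating term-frequency score), and `build_index` and `rank` are illustrative names, not taken from the thesis.

```python
from collections import defaultdict


def term_doc_score(term, doc_tokens):
    """Hypothetical stand-in for a deep model scoring one term against one document."""
    tf = doc_tokens.count(term)
    return tf / (tf + 1.0)  # saturating term-frequency score in [0, 1)


def build_index(docs, vocabulary):
    """Precompute per-term document scores and store them as posting lists."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        tokens = text.lower().split()
        for term in vocabulary:
            score = term_doc_score(term, tokens)
            if score > 0.0:
                index[term][doc_id] = score
    return index


def rank(query, index):
    """Score documents by summing the precomputed scores of each query term."""
    totals = defaultdict(float)
    for term in query.lower().split():
        for doc_id, score in index.get(term, {}).items():
            totals[doc_id] += score
    # Highest total first; ties broken by document id for determinism.
    return sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))


docs = {
    "d1": "neural ranking models",
    "d2": "neural neural networks",
    "d3": "inverted index",
}
vocabulary = {tok for text in docs.values() for tok in text.lower().split()}
index = build_index(docs, vocabulary)
ranking = rank("neural ranking", index)
```

At query time only the posting lists of the query terms are touched, which is why this decomposition allows the expensive model to run offline.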

T^2-Net: A Semi-Supervised Deep Model for Turbulence Forecasting

2020 IEEE International Conference on Data Mining (ICDM) ◽

10.1109/icdm50108.2020.00182 ◽

2020 ◽

Author(s):

Denghui Zhang ◽

Yanchi Liu ◽

Wei Cheng ◽

Bo Zong ◽

Jingchao Ni ◽

...

Keyword(s):

Deep Model

Context-aware Deep Model for Joint Mobility and Time Prediction

Proceedings of the 13th International Conference on Web Search and Data Mining ◽

10.1145/3336191.3371837 ◽

2020 ◽

Cited By ~ 1

Author(s):

Yile Chen ◽

Cheng Long ◽

Gao Cong ◽

Chenliang Li

Keyword(s):

Joint Mobility ◽

Context Aware ◽

Time Prediction ◽

Deep Model

Chain-Net: Learning Deep Model for Modulation Classification Under Synthetic Channel Impairment

GLOBECOM 2020 - 2020 IEEE Global Communications Conference ◽

10.1109/globecom42002.2020.9322394 ◽

2020 ◽

Author(s):

Thien Huynh-The ◽

Van-Sang Doan ◽

Cam-Hao Hua ◽

Quoc-Viet Pham ◽

Dong-Seong Kim

Keyword(s):

Modulation Classification ◽

Deep Model ◽

Channel Impairment

Explainable Models of Disease Progression in ALS: Learning from Longitudinal Clinical Data with Recurrent Neural Networks and Deep Model Explanation

Computer Methods and Programs in Biomedicine Update ◽

10.1016/j.cmpbup.2021.100018 ◽

2021 ◽

pp. 100018

Author(s):

Marcel Müller ◽

Marta Gromicho ◽

Mamede de Carvalho ◽

Sara C. Madeira

Keyword(s):

Neural Networks ◽

Disease Progression ◽

Clinical Data ◽

Recurrent Neural Networks ◽

Deep Model ◽

Models Of Disease ◽

Model Explanation

lncLocator 2.0: a cell-line-specific subcellular localization predictor for long non-coding RNAs with interpretable deep learning

Bioinformatics ◽

10.1093/bioinformatics/btab127 ◽

2021 ◽

Author(s):

Yang Lin ◽

Xiaoyong Pan ◽

Hong-Bin Shen

Keyword(s):

Subcellular Localization ◽

Cell Line ◽

Cell Lines ◽

Short Term Memory ◽

Computational Method ◽

Language Models ◽

Supplementary Information ◽

Deep Model ◽

A Cell ◽

Non Coding RNAs

Abstract. Motivation: Long non-coding RNAs (lncRNAs) are generally expressed in a tissue-specific way, and the subcellular localizations of lncRNAs depend on the tissues or cell lines in which they are expressed. Previous computational methods for predicting subcellular localizations of lncRNAs do not take this characteristic into account; they train a unified machine learning model for pooled lncRNAs from all available cell lines. It is important to develop a cell-line-specific computational method to predict lncRNA locations in different cell lines. Results: In this study, we present an updated cell-line-specific predictor, lncLocator 2.0, which trains an end-to-end deep model per cell line to predict lncRNA subcellular localization from sequences. We first construct benchmark datasets of lncRNA subcellular localizations for 15 cell lines. We then learn word embeddings using natural language models, and these learned embeddings are fed into a convolutional neural network, a long short-term memory network, and a multilayer perceptron to classify subcellular localizations. lncLocator 2.0 achieves varying effectiveness for different cell lines and demonstrates the necessity of training cell-line-specific models. Furthermore, we adopt Integrated Gradients to explain the proposed model in lncLocator 2.0 and find some potential patterns that determine the subcellular localizations of lncRNAs, suggesting that the subcellular localization of lncRNAs is linked to some specific nucleotides. Availability: lncLocator 2.0 is available at www.csbio.sjtu.edu.cn/bioinf/lncLocator2 and the source code can be found at https://github.com/Yang-J-LIN/lncLocator2. Supplementary information: Supplementary data are available at Bioinformatics online.
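
The Integrated Gradients attribution method named in this abstract can be sketched on a toy model. The example below is not the lncLocator 2.0 network: it assumes a small logistic model with made-up weights `w` and bias `b`, and approximates the path integral from a baseline input to the actual input with a midpoint Riemann sum. All names and values are illustrative.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def model(x, w, b):
    """Toy logistic model standing in for a trained deep network (assumption)."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def model_grad(x, w, b):
    """Analytic gradient of the toy model's output with respect to its input."""
    p = model(x, w, b)
    return [p * (1.0 - p) * wi for wi in w]


def integrated_gradients(x, baseline, w, b, steps=200):
    """Approximate Integrated Gradients with a midpoint Riemann sum."""
    avg_grad = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps
        # Point on the straight line from the baseline to the input.
        point = [b0 + alpha * (xi - b0) for xi, b0 in zip(x, baseline)]
        grad = model_grad(point, w, b)
        avg_grad = [a + g / steps for a, g in zip(avg_grad, grad)]
    # Scale the averaged gradient by each feature's distance from the baseline.
    return [(xi - b0) * a for xi, b0, a in zip(x, baseline, avg_grad)]


w, b = [2.0, -1.0, 0.5], 0.1        # illustrative parameters
x, baseline = [1.0, 1.0, 0.0], [0.0, 0.0, 0.0]
attributions = integrated_gradients(x, baseline, w, b)
```

A useful sanity check is the completeness property: the attributions should sum to approximately `model(x, w, b) - model(baseline, w, b)`, and a feature equal to its baseline value receives zero attribution.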

Pedestrian detection algorithm in traffic scene based on weakly supervised hierarchical deep model

International Journal of Advanced Robotic Systems ◽

10.1177/1729881417692311 ◽

2016 ◽

Vol 14 (1) ◽

pp. 172988141769231 ◽

Cited By ~ 2

Author(s):

Yingfeng Cai ◽

Youguo He ◽

Hai Wang ◽

Xiaoqiang Sun ◽

Long Chen ◽

...

Keyword(s):

Deep Learning ◽

Pedestrian Detection ◽

Recognition Rate ◽

Detection Algorithm ◽

Training Methods ◽

Two Dimensional ◽

Data Set ◽

Deep Model ◽

Unsupervised Training ◽

Weakly Supervised

The emergence and development of deep learning theory in the machine learning field provide a new method for vision-based pedestrian recognition technology. To achieve better performance in this application, an improved weakly supervised hierarchical deep learning pedestrian recognition algorithm with two-dimensional deep belief networks is proposed. The improvements address weaknesses in the structure and training methods of existing classifiers. First, the traditional one-dimensional deep belief network is expanded to two dimensions, which allows the image matrix to be loaded directly and preserves more information about the sample space. Then, a determination regularization term with a small weight is added to the traditional unsupervised training objective function. This modification transforms the original unsupervised training into weakly supervised training, which in turn gives the extracted features discriminative ability. Multiple sets of comparative experiments show that the proposed algorithm achieves a better recognition rate than other deep learning algorithms and outperforms most existing state-of-the-art methods on the non-occlusion pedestrian data set, while performing fairly on the weakly and heavily occluded data sets.

CBAM: A Contextual Model for Network Anomaly Detection

Computers ◽

10.3390/computers10060079 ◽

2021 ◽

Vol 10 (6) ◽

pp. 79

Author(s):

Henry Clausen ◽

Gudmund Grov ◽

David Aspinall

Keyword(s):

Intrusion Detection ◽

Network Flows ◽

Concept Drift ◽

False Positive Rate ◽

Real Life ◽

Remote Access ◽

Detection Methods ◽

Short Term ◽

Deep Model ◽

Network Intrusion

Anomaly-based intrusion detection methods aim to combat the increasing rate of zero-day attacks; however, their success is currently restricted to the detection of high-volume attacks using aggregated traffic features. Recent evaluations show that current anomaly-based network intrusion detection methods fail to reliably detect remote access attacks. These are smaller in volume and often only stand out when compared to their surroundings. Currently, anomaly methods try to detect access attack events mainly as point anomalies and neglect the context they appear in. We present and examine a contextual bidirectional anomaly model (CBAM) based on deep LSTM networks that is specifically designed to detect such attacks as contextual network anomalies. The model efficiently learns short-term sequential patterns in network flows as conditional event probabilities. Access attacks frequently break these patterns when exploiting vulnerabilities, and can thus be detected as contextual anomalies. We evaluated CBAM on an assembly of three datasets that provide representative network access attacks, real-life traffic over a long timespan, and traffic from a real-world red-team attack. We contend that this assembly is closer to a potential deployment environment than current NIDS benchmark datasets. We show that, by building a deep model, we are able to reduce the false positive rate to 0.16%, which is significantly lower than the operational range of other methods, while effectively detecting six out of seven access attacks. We further demonstrate that short-term flow structures remain stable over long periods of time, making CBAM robust against concept drift.
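
The idea of scoring events by their conditional probability given context can be sketched with a much simpler model than the one in the abstract. The example below swaps the deep bidirectional LSTM for a smoothed bigram count model, purely as an illustration: `BigramEventModel`, the event names, and the smoothing constant are all assumptions, not part of CBAM.

```python
import math
from collections import Counter, defaultdict


class BigramEventModel:
    """Count-based stand-in for a learned sequence model (illustrative only)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                   # Laplace smoothing constant
        self.counts = defaultdict(Counter)   # counts[prev][cur]
        self.vocab = set()

    def fit(self, sequences):
        for seq in sequences:
            self.vocab.update(seq)
            for prev, cur in zip(seq, seq[1:]):
                self.counts[prev][cur] += 1
        return self

    def prob(self, prev, cur):
        """Smoothed estimate of P(cur | prev)."""
        ctx = self.counts.get(prev, Counter())
        total = sum(ctx.values())
        size = len(self.vocab) + 1           # one extra slot for unseen events
        return (ctx[cur] + self.alpha) / (total + self.alpha * size)

    def anomaly_score(self, seq):
        """Mean negative log-likelihood of each event given its predecessor."""
        if len(seq) < 2:
            return 0.0
        nll = [-math.log(self.prob(p, c)) for p, c in zip(seq, seq[1:])]
        return sum(nll) / len(nll)


# Hypothetical flow-event sequences used as benign training traffic.
benign = [["dns", "http", "http", "close"]] * 20
detector = BigramEventModel().fit(benign)
normal_score = detector.anomaly_score(["dns", "http", "close"])
odd_score = detector.anomaly_score(["dns", "close", "http"])
```

A sequence whose events appear in an unusual order receives a higher score even though every individual event is common, which is the sense in which the anomaly is contextual rather than a point anomaly.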
