Learning Improved Semantic Representations with Tree-Structured LSTM for Hashtag Recommendation: An Experimental Study

A hashtag is a type of metadata tag used on social networks, such as Twitter and other microblogging services. Hashtags indicate the core idea of a microblog post and can help people to search for specific themes or content. However, not everyone tags their posts themselves. Therefore, the task of hashtag recommendation has received significant attention in recent years. A key problem in solving this task is how to represent the text of a microblog post effectively so that the representation can be used for hashtag recommendation. We study two major kinds of text representation methods for hashtag recommendation: shallow textual features and deep textual features learned by deep neural models. Most existing work uses deep neural networks to learn microblog post representations based on the semantic combination of words. In this paper, we propose to adopt Tree-LSTM to improve the representation by combining the syntactic structure and the semantic information of words. We conduct extensive experiments on two real-world datasets. The experimental results show that deep neural models generally perform better than traditional methods. In particular, Tree-LSTM achieves significantly better results on hashtag recommendation than standard LSTM, with a 30% increase in F1-score, which indicates that it is promising to utilize syntactic structure in the task of hashtag recommendation.


Effective Deep Memory Networks for Distant Supervised Relation Extraction

Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2017/559 ◽

2017 ◽

Cited By ~ 11

Author(s):

Xiaocheng Feng ◽

Jiang Guo ◽

Bing Qin ◽

Ting Liu ◽

Yongjie Liu

Keyword(s):

Relation Extraction ◽

Training Data ◽

Context Word ◽

Neural Models ◽

Attention Model ◽

Memory Experiment ◽

Feature Based ◽

Real World Datasets ◽

Major Attention ◽

Better Than

Distant supervised relation extraction (RE) has been an effective way of finding novel relational facts from text without labeled training data. Typically it can be formalized as a multi-instance multi-label problem. In this paper, we introduce a novel neural approach for distant supervised RE with a specific focus on attention mechanisms. Unlike the feature-based logistic regression model and compositional neural models such as CNN, our approach includes two major attention-based memory components, which are capable of explicitly capturing the importance of each context word for modeling the representation of the entity pair, as well as the intrinsic dependencies between relations. Such importance degrees and dependency relationships are calculated with multiple computational layers, each of which is a neural attention model over an external memory. Experiments on real-world datasets show that our approach performs significantly and consistently better than various baselines.
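
The idea of stacking attention layers over an external memory can be illustrated with a minimal sketch. This is not the authors' model: the dot-product scoring, the additive state update, and the names `attention_hop` and `multi_hop` are assumptions made purely for illustration. Each hop scores every memory vector (for example, a context-word embedding) against the current query state, normalizes the scores into importance weights, and folds the weighted summary back into the state.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_hop(query, memory):
    """One attention layer: weight each memory vector by its dot product with the query."""
    scores = [sum(q * m for q, m in zip(query, vec)) for vec in memory]
    weights = softmax(scores)
    summary = [
        sum(w * vec[d] for w, vec in zip(weights, memory))
        for d in range(len(query))
    ]
    return weights, summary


def multi_hop(query, memory, hops=2):
    """Stack several hops; the state update (state + summary) is an assumed, simplified choice."""
    state = list(query)
    weights = []
    for _ in range(hops):
        weights, summary = attention_hop(state, memory)
        state = [s + c for s, c in zip(state, summary)]
    return weights, state
```

The weights returned by the last hop play the role of the per-word importance degrees mentioned in the abstract; a trained model would learn the embeddings and scoring function rather than use raw dot products.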


TransNFCM: Translation-Based Neural Fashion Compatibility Modeling

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v33i01.3301403 ◽

2019 ◽

Vol 33 ◽

pp. 403-410 ◽

Cited By ~ 8

Author(s):

Xun Yang ◽

Yunshan Ma ◽

Lizi Liao ◽

Meng Wang ◽

Tat-Seng Chua

Keyword(s):

Deep Neural Networks ◽

Representation Learning ◽

Fashion Products ◽

The Arts ◽

Urgent Task ◽

Textual Features ◽

The Rich ◽

Real World Datasets ◽

Specific Pair ◽

Mix And Match

Identifying mix-and-match relationships between fashion items is an urgent task in a fashion e-commerce recommender system. It will significantly enhance user experience and satisfaction. However, due to the challenges of inferring the rich yet complicated set of compatibility patterns in a large e-commerce corpus of fashion items, this task is still underexplored. Inspired by the recent advances in multirelational knowledge representation learning and deep neural networks, this paper proposes a novel Translation-based Neural Fashion Compatibility Modeling (TransNFCM) framework, which jointly optimizes fashion item embeddings and category-specific complementary relations in a unified space in an end-to-end manner. TransNFCM places items in a unified embedding space where a category-specific relation (category-comp-category) is modeled as a vector translation operating on the embeddings of compatible items from the corresponding categories. In this way, we not only capture the specific notion of compatibility conditioned on a specific pair of complementary categories, but also preserve the global notion of compatibility. We also design a deep fashion item encoder which exploits the complementary characteristics of visual and textual features to represent the fashion products. To the best of our knowledge, this is the first work that uses category-specific complementary relations to model the category-aware compatibility between items in a translation-based embedding space. Extensive experiments demonstrate the effectiveness of TransNFCM over the state of the art on two real-world datasets.
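
A translation-based compatibility score can be sketched as follows. The abstract only states that a category-specific relation acts as a vector translation between the embeddings of compatible items; the Euclidean distance, the margin-based ranking loss, and the function names below are assumptions borrowed from common translation-embedding practice, not details confirmed by the source.

```python
import math


def translation_distance(item_a, relation, item_b):
    """Distance between (item_a + relation) and item_b; smaller means more compatible."""
    return math.sqrt(
        sum((a + r - b) ** 2 for a, r, b in zip(item_a, relation, item_b))
    )


def margin_loss(pos_dist, neg_dist, margin=1.0):
    """Assumed hinge-style ranking loss: push compatible pairs closer than incompatible ones."""
    return max(0.0, margin + pos_dist - neg_dist)


# Hypothetical 2-d embeddings for illustration only.
top = [0.2, 0.1]
top_to_bottom = [0.3, 0.4]      # relation vector for the (top, bottom) category pair
good_bottom = [0.5, 0.5]
bad_bottom = [-1.0, 2.0]

d_good = translation_distance(top, top_to_bottom, good_bottom)
d_bad = translation_distance(top, top_to_bottom, bad_bottom)
loss = margin_loss(d_good, d_bad)
```

Because each category pair has its own relation vector while all items share one embedding space, the same item embedding can be compatible with different items under different relations.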


Improving adversarial robustness of deep neural networks by using semantic information

Knowledge-Based Systems ◽

10.1016/j.knosys.2021.107141 ◽

2021 ◽

pp. 107141

Author(s):

Lina Wang ◽

Xingshu Chen ◽

Rui Tang ◽

Yawei Yue ◽

Yi Zhu ◽

...

Keyword(s):

Neural Networks ◽

Deep Neural Networks ◽

Semantic Information


Enabling deeper learning on big data for materials informatics applications

Scientific Reports ◽

10.1038/s41598-021-83193-1 ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Dipendra Jha ◽

Vishu Gupta ◽

Logan Ward ◽

Zijiang Yang ◽

Christopher Wolverton ◽

...

Keyword(s):

Neural Networks ◽

Big Data ◽

Deep Learning ◽

Deep Neural Networks ◽

Materials Science ◽

Prediction Models ◽

Model Performance ◽

Materials Informatics ◽

Learning Framework ◽

Significant Attention

The application of machine learning (ML) techniques in materials science has attracted significant attention in recent years, due to their impressive ability to efficiently extract data-driven linkages from various input materials representations to their output properties. While the application of traditional ML techniques has become quite ubiquitous, there have been limited applications of more advanced deep learning (DL) techniques, primarily because big materials datasets are relatively rare. Given the demonstrated potential and advantages of DL and the increasing availability of big materials datasets, it is attractive to go for deeper neural networks in a bid to boost model performance, but in reality, it leads to performance degradation due to the vanishing gradient problem. In this paper, we address the question of how to enable deeper learning for cases where big materials data is available. Here, we present a general deep learning framework based on Individual Residual learning (IRNet) composed of very deep neural networks that can work with any vector-based materials representation as input to build accurate property prediction models. We find that the proposed IRNet models can not only successfully alleviate the vanishing gradient problem and enable deeper learning, but also lead to significantly (up to 47%) better model accuracy as compared to plain deep neural networks and traditional ML techniques for a given input materials representation in the presence of big data.
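
The effect of placing a shortcut connection around each individual layer can be shown with a small sketch. It assumes fully connected ReLU layers whose input and output widths match, and the names `dense_relu`, `residual_layer`, and `forward` are illustrative; the actual IRNet architecture may include further components that the abstract does not describe.

```python
def dense_relu(x, weights, bias):
    """Plain fully connected layer followed by ReLU."""
    return [
        max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(weights, bias)
    ]


def residual_layer(x, weights, bias):
    """Same layer with an identity shortcut: output = layer(x) + x."""
    return [h + xi for h, xi in zip(dense_relu(x, weights, bias), x)]


def forward(x, layers, residual=True):
    """Run x through a stack of layers, with or without per-layer shortcuts."""
    for weights, bias in layers:
        if residual:
            x = residual_layer(x, weights, bias)
        else:
            x = dense_relu(x, weights, bias)
    return x


# Fifty layers whose weights are all zero: an extreme case where a layer passes nothing on.
zero_layer = ([[0.0, 0.0], [0.0, 0.0]], [0.0, 0.0])
deep_stack = [zero_layer] * 50

with_shortcuts = forward([1.0, 2.0], deep_stack, residual=True)
without_shortcuts = forward([1.0, 2.0], deep_stack, residual=False)
```

In this extreme case the plain stack loses the input entirely, while the shortcut version still carries it to the output. The same identity path gives gradients a direct route back to early layers, which is the usual explanation for why residual connections ease the vanishing gradient problem.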


Data-Driven Structural Health Monitoring and Damage Detection through Deep Learning: State-of-the-Art Review

Sensors ◽

10.3390/s20102778 ◽

2020 ◽

Vol 20 (10) ◽

pp. 2778 ◽

Cited By ~ 12

Author(s):

Mohsen Azimi ◽

Armin Eslamlou ◽

Gokhan Pekcan

Keyword(s):

Deep Learning ◽

Structural Health Monitoring ◽

Health Monitoring ◽

High Speed ◽

Deep Neural Networks ◽

State Of The Art ◽

Data Driven ◽

Structural Health ◽

Promising Tool ◽

Significant Attention

Data-driven methods in structural health monitoring (SHM) are gaining popularity due to recent technological advancements in sensors, as well as high-speed internet and cloud-based computation. Since the introduction of deep learning (DL) in civil engineering, particularly in SHM, this emerging and promising tool has attracted significant attention among researchers. The main goal of this paper is to review the latest publications in SHM using emerging DL-based methods and provide readers with an overall understanding of various SHM applications. After a brief introduction, an overview of various DL methods (e.g., deep neural networks, transfer learning, etc.) is presented. The procedures and applications of vibration-based and vision-based monitoring, along with some of the recent technologies used for SHM, such as sensors and unmanned aerial vehicles (UAVs), are discussed. The review concludes with prospects and potential limitations of DL-based methods in SHM applications.


Early emergence of syntactic awareness and cross-linguistic influence in bilingual children’s judgments

International Journal of Bilingualism ◽

10.1177/1367006911425818 ◽

2011 ◽

Vol 15 (4) ◽

pp. 521-534 ◽

Cited By ~ 12

Author(s):

Cassandra Foursha-Stevenson ◽

Elena Nicoladis

Keyword(s):

Syntactic Structure ◽

Language Choice ◽

Preschool Age ◽

Bilingual Children ◽

Metalinguistic Awareness ◽

Syntactic Awareness ◽

Grammaticality Judgment ◽

English Bilingual ◽

French And English ◽

Better Than

Bilingual children sometimes perform better than same-aged monolingual children on metalinguistic awareness tasks, such as a grammaticality judgment. Some of these differences can be attributed to bilinguals having to learn to control attention to language choice. This study tested the hypothesis that bilingual children, as young as preschool age, would score overall higher than monolingual children on a grammaticality judgment test. French–English bilingual preschoolers judged the acceptability of three constructions in French and English (i.e. adjective–noun ordering, obligatoriness of a determiner, and object pronoun placement). Their performance was compared with that of a group of age-matched English monolinguals. The results showed that the bilingual children scored higher than the monolingual children. These results demonstrate that syntactic awareness develops quite early for bilinguals. Additionally, the bilingual children demonstrated cross-linguistic influence of core syntactic structure in French, as their judgments were affected by English acceptability.


Channel state information–based multi-level fingerprinting for indoor localization with deep learning

International Journal of Distributed Sensor Networks ◽

10.1177/1550147718806719 ◽

2018 ◽

Vol 14 (10) ◽

pp. 155014771880671 ◽

Cited By ~ 2

Author(s):

Tao Li ◽

Hai Wang ◽

Yuan Shao ◽

Qiang Niu

Keyword(s):

Deep Learning ◽

Channel State Information ◽

Deep Neural Networks ◽

Training Phase ◽

Channel State ◽

Positioning Accuracy ◽

State Information ◽

Filtering Method ◽

Multi Level ◽

Better Than

With the rapid growth of demand for device-free indoor positioning and the convenience of channel state information acquisition, research on indoor fingerprint positioning based on channel state information is increasingly valued. In this article, a multi-level fingerprinting approach is proposed, which is composed of two levels: the first layer is achieved by deep learning and the second layer is implemented by the optimal subcarriers filtering method. This method using channel state information is termed multi-level fingerprinting with deep learning. Deep neural networks are applied in the first layer of multi-level fingerprinting with deep learning, which includes two phases: an offline training phase and an online localization phase. In the offline training phase, deep neural networks are used to train the optimal weights. In the online localization phase, the top five closest positions to the location position are obtained through forward propagation. The second layer optimizes the results of the first layer through the optimal subcarriers filtering method. Under the accuracy of 0.6 m, the positioning accuracy in two common environments has reached 96% and 93.9%, respectively. The evaluation results show that the positioning accuracy of this method is better than that of the method based on received signal strength and that of the support vector machine method, and is also slightly improved compared with the deep learning method.
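
The first-layer online step, selecting the five most likely reference positions from a network's forward pass, can be sketched as below. The softmax output layer, the `top_k_positions` name, and the sample coordinates and logits are assumptions for illustration; the second-layer optimal subcarriers filtering is not specified in the abstract and is deliberately left out.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_positions(logits, positions, k=5):
    """Return the k reference positions with the highest predicted probability."""
    probs = softmax(logits)
    ranked = sorted(zip(probs, positions), key=lambda pair: pair[0], reverse=True)
    return [(pos, prob) for prob, pos in ranked[:k]]


# Hypothetical reference grid (metres) and network outputs for one CSI measurement.
grid = [(0, 0), (0, 1), (1, 0), (1, 1), (2, 0), (2, 1)]
outputs = [0.1, 2.0, 0.5, 1.5, -1.0, 0.9]

candidates = top_k_positions(outputs, grid, k=5)
```

The resulting candidate list is what a second-stage filter would then refine into a single position estimate.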


Sentence Generation for Entity Description with Content-Plan Attention

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i05.6439 ◽

2020 ◽

Vol 34 (05) ◽

pp. 9057-9064

Author(s):

Bayu Trisedya ◽

Jianzhong Qi ◽

Rui Zhang

Keyword(s):

State Of The Art ◽

Neural Models ◽

Time Step ◽

Two Stage ◽

Sentence Generation ◽

Neural Data ◽

Attention Model ◽

Linear Sequence ◽

Proper Order ◽

Real World Datasets

We study neural data-to-text generation. Specifically, we consider a target entity that is associated with a set of attributes. We aim to generate a sentence to describe the target entity. Previous studies use encoder-decoder frameworks where the encoder treats the input as a linear sequence and uses LSTM to encode the sequence. However, linearizing a set of attributes may not yield the proper order of the attributes, and hence leads the encoder to produce an improper context to generate a description. To handle disordered input, recent studies propose two-stage neural models that use pointer networks to generate a content-plan (i.e., content-planner) and use the content-plan as input for an encoder-decoder model (i.e., text generator). However, in two-stage models, the content-planner may yield an incomplete content-plan, due to missing one or more salient attributes in the generated content-plan. This will in turn cause the text generator to generate an incomplete description. To address these problems, we propose a novel attention model that exploits content-plan to highlight salient attributes in a proper order. The challenge of integrating a content-plan in the attention model of an encoder-decoder framework is to align the content-plan and the generated description. We handle this problem by devising a coverage mechanism to track the extent to which the content-plan is exposed in the previous decoding time-step, and hence it helps our proposed attention model select the attributes to be mentioned in the description in a proper order. Experimental results show that our model outperforms state-of-the-art baselines by up to 3% and 5% in terms of BLEU score on two real-world datasets, respectively.


A Self-Supervised Representation Learning of Sentence Structure for Authorship Attribution

ACM Transactions on Knowledge Discovery from Data ◽

10.1145/3491203 ◽

2022 ◽

Vol 16 (4) ◽

pp. 1-16

Author(s):

Fereshteh Jafariakinabad ◽

Kien A. Hua

Keyword(s):

Structural Information ◽

Syntactic Structure ◽

Representation Learning ◽

Authorship Attribution ◽

Sentence Structure ◽

Vector Representation ◽

Writing Style ◽

Neural Models ◽

Syntactic Information ◽

Classification Tasks

The syntactic structure of sentences in a document is substantially informative about its author's writing style. Sentence representation learning has been widely explored in recent years and it has been shown that it improves the generalization of different downstream tasks across many domains. Even though utilizing probing methods in several studies suggests that these learned contextual representations implicitly encode some amount of syntax, explicit syntactic information further improves the performance of deep neural models in the domain of authorship attribution. These observations have motivated us to investigate the explicit representation learning of syntactic structure of sentences. In this article, we propose a self-supervised framework for learning structural representations of sentences. The self-supervised network contains two components: a lexical sub-network and a syntactic sub-network, which take the sequence of words and their corresponding structural labels as the input, respectively. Due to the n-to-1 mapping of words to their structural labels, each word will be embedded into a vector representation which mainly carries structural information. We evaluate the learned structural representations of sentences using different probing tasks, and subsequently utilize them in the authorship attribution task. Our experimental results indicate that the structural embeddings significantly improve the classification tasks when concatenated with the existing pre-trained word embeddings.


Network Embedding via a Bi-Mode and Deep Neural Network Model

10.20944/preprints201712.0156.v1 ◽

2017 ◽

Author(s):

Yang Fang ◽

Xiang Zhao ◽

Zhen Tan

Keyword(s):

Neural Network ◽

Deep Neural Network ◽

Semantic Information ◽

Dimensional Space ◽

Relation Extraction ◽

Network Embedding ◽

Structure Information ◽

Second Mode ◽

Real World Datasets ◽

Low Dimensional

Network Embedding (NE) is an important method to learn the representations of a network via a low-dimensional space. Conventional NE models focus on capturing the structure information and semantic information of vertices while neglecting such information for edges. In this work, we propose a novel NE model named BimoNet to capture both the structure and semantic information of edges. BimoNet is composed of two parts, i.e., the bi-mode embedding part and the deep neural network part. For the bi-mode embedding part, the first mode, named add-mode, is used to express the entity-shared features of edges and the second mode, named subtract-mode, is employed to represent the entity-specific features of edges. These features actually reflect the semantic information. For the deep neural network part, we first regard the edges in a network as nodes, and the vertices as links, which will not change the overall structure of the whole network. Then we take the nodes' adjacency matrix as the input of the deep neural network, as it can obtain similar representations for nodes with similar structure. Afterwards, by jointly optimizing the objective function of these two parts, BimoNet can preserve both the semantic and structure information of edges. In experiments, we evaluate BimoNet on three real-world datasets and the task of relation extraction, and BimoNet is demonstrated to outperform state-of-the-art baseline models consistently and significantly.
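
The step of regarding edges as nodes and vertices as links corresponds to building what graph theory calls a line graph. The sketch below assumes an undirected network and a hypothetical helper name `edge_adjacency`; how BimoNet handles edge direction or edge types is not stated in the abstract.

```python
def edge_adjacency(edges):
    """Adjacency matrix over edges: two edges are linked when they share a vertex."""
    n = len(edges)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if set(edges[i]) & set(edges[j]):
                adj[i][j] = 1
                adj[j][i] = 1
    return adj


# A small path network a - b - c - d, listed as three edges.
network = [("a", "b"), ("b", "c"), ("c", "d")]
matrix = edge_adjacency(network)
```

Each row of `matrix` describes one edge's neighbourhood, so edges with similar rows occupy structurally similar places in the network. Rows of this kind are what the abstract describes as the input to the deep neural network.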
