Hierarchical Concept-Driven Language Model

For guiding natural language generation, many semantic-driven methods have been proposed. While clearly improving the performance of end-to-end training, these existing methods still have clear limitations: (i) they use only shallow semantic signals (e.g., from topic models) with a single stochastic hidden layer in their data generation process, which makes them sensitive to noise (especially on short texts) and hard to interpret; (ii) they ignore sentence order and document context, treating each document as a bag of sentences, and so fail to capture the long-distance dependencies and global semantic meaning of a document. To overcome these problems, we propose a novel semantic-driven language modeling framework that learns a Hierarchical Language Model and a Recurrent Conceptualization-enhanced Gamma Belief Network simultaneously. For scalable inference, we develop auto-encoding Variational Recurrent Inference, which allows efficient end-to-end training while capturing global semantics from a text corpus. In particular, this article introduces concept information derived from the high-quality lexical knowledge graph Probase, which gives the proposed model strong interpretability and anti-noise capability. Moreover, the proposed model captures not only intra-sentence word dependencies but also temporal transitions between sentences and inter-sentence concept dependencies. Experiments on several NLP tasks validate the superiority of the proposed approach, which can effectively infer a meaningful hierarchical concept structure of documents and hierarchical multi-scale structures of sequences, even when compared with the latest state-of-the-art Transformer-based models.


Relevance-Promoting Language Model for Short-Text Conversation

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i05.6340 ◽

2020 ◽

Vol 34 (05) ◽

pp. 8253-8260 ◽

Cited By ~ 1

Author(s):

Xin Li ◽

Piji Li ◽

Wei Bi ◽

Xiaojiang Liu ◽

Wai Lam

Keyword(s):

Language Model ◽

Language Modeling ◽

The Self ◽

Experimental Results ◽

Training Data ◽

Short Text ◽

Training Strategy ◽

Proposed Model ◽

Multiple References ◽

Diversity Metrics

Despite the effectiveness of the sequence-to-sequence framework on the task of Short-Text Conversation (STC), the under-exploitation of training data (i.e., the supervision signals from the query text are ignored) remains unresolved. In addition, the maximization-based decoding strategies usually adopted tend to generate generic or repetitive responses and are therefore unsuited to the STC task. In this paper, we formulate the STC task as a language modeling problem and tailor a training strategy to adapt a language model for response generation. To enhance generation performance, we design a relevance-promoting transformer language model, which performs additional supervised source attention after the self-attention to increase the importance of informative query tokens in calculating the token-level representation. The model further refines the query representation with relevance clues inferred from its multiple references during training. In testing, we adopt a randomization-over-maximization strategy to reduce the generation of generic responses. Experimental results on a large Chinese STC dataset demonstrate the superiority of the proposed model on relevance metrics and diversity metrics.
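
The contrast between maximization-based and randomized decoding can be illustrated with a minimal sketch. The abstract does not specify the paper's randomization-over-maximization procedure, so the example below uses plain top-k sampling as a stand-in; the function names and the toy logits are hypothetical.

```python
import math
import random

def softmax(logits):
    """Numerically stable softmax over a list of scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def greedy_pick(logits):
    """Maximization-based decoding: always take the highest-scoring token."""
    return max(range(len(logits)), key=lambda i: logits[i])

def top_k_sample(logits, k, rng):
    """Randomized decoding (assumed stand-in): sample among the k best tokens."""
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    probs = softmax([logits[i] for i in top])
    return rng.choices(top, weights=probs, k=1)[0]

# Toy next-token scores: three near-tied candidates and one poor one.
logits = [2.0, 1.9, 1.8, -3.0]
rng = random.Random(0)

greedy = greedy_pick(logits)
samples = [top_k_sample(logits, k=3, rng=rng) for _ in range(200)]
```

Greedy decoding returns the same token on every call, whereas sampling spreads its choices over the near-tied candidates while never selecting the low-scoring one.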


Component-based machine learning paradigm for discovering rate-dependent and pressure-sensitive level-set plasticity models

Journal of Applied Mechanics ◽

10.1115/1.4052684 ◽

2021 ◽

pp. 1-13

Author(s):

Nikolaos Napoleon Vlassis ◽

Waiching Sun

Keyword(s):

Neural Network ◽

Machine Learning ◽

Path Dependence ◽

Time History ◽

Supervised Machine Learning ◽

Generation Process ◽

Data Generation ◽

Modeling Framework ◽

The Neural Network ◽

Path Dependent

Conventionally, neural network constitutive laws for path-dependent elasto-plastic solids are trained via supervised learning performed on recurrent neural networks, with the time history of strain as input and the stress as output. However, training a neural network to replicate path-dependent constitutive responses requires significantly more data because of the path dependence. This demand for diverse and abundant accurate data, as well as the lack of interpretability to guide the data generation process, could become a major roadblock for engineering applications. In this work, we attempt to simplify these training processes and improve the interpretability of the trained models by breaking down the training of material models into multiple supervised machine learning programs for elasticity, initial yielding and hardening laws that can be conducted sequentially. To predict pressure sensitivity and rate dependence of the plastic responses, we reformulate the Hamilton-Jacobi equation such that the yield function is parametrized in the product space spanned by the principal stress, the accumulated plastic strain and time. To test the versatility of the neural network meta-modeling framework, we conduct multiple numerical experiments where neural networks are trained and validated against (1) data generated from known benchmark models, (2) data obtained from physical experiments and (3) data inferred from homogenizing sub-scale direct numerical simulations of microstructures. The neural network model is also incorporated into an offline FFT-FEM model to improve the efficiency of the multiscale calculations.


Integrating Community Interest and Neighbor Semantic for Microblog Recommendation

International Journal of Web Services Research ◽

10.4018/ijwsr.2021040104 ◽

2021 ◽

Vol 18 (2) ◽

pp. 54-75

Author(s):

Mingxin Gan ◽

Xiongtao Zhang

Keyword(s):

Language Model ◽

Real Data ◽

Typical Characteristic ◽

Sina Weibo ◽

Short Text ◽

Community Interest ◽

Proposed Model ◽

Text Length ◽

Performance Results ◽

State Of Art

Short text length, a typical characteristic of microblog information, makes microblog recommendation hard for new users. Moreover, user cold start makes it difficult to accurately explore the interests of microblog users. Therefore, the authors proposed a microblog recommendation model that integrates both the users' interests from their communities and the semantics of their neighbors' microblogs. Based on the Kullback-Leibler (KL) language model, the proposed model estimated an interest-based language model and a microblog-based language model. Specifically, the interest-based language model was estimated from both the user's own word set of interest and that of their community. Meanwhile, the microblog-based language model was estimated by combining the word set of a microblog, the neighbor semantic, and the microblog set. Real data crawled from Sina Weibo were used to evaluate recommendation performance. Results showed that the proposed model significantly outperforms state-of-the-art models.
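
As a rough illustration of ranking by KL divergence between language models, the sketch below scores candidate posts against a user's interest model. It is not the authors' estimator: the additive smoothing, the `unigram_lm` and `kl_divergence` helpers, and the toy word sets are all assumptions, and the community and neighbor components described above are omitted.

```python
import math
from collections import Counter

def unigram_lm(tokens, vocab, alpha=1.0):
    """Additively smoothed unigram language model over a fixed vocabulary."""
    counts = Counter(tokens)
    total = len(tokens) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}

def kl_divergence(p, q):
    """KL(p || q) for two distributions defined over the same vocabulary."""
    return sum(p[w] * math.log(p[w] / q[w]) for w in p)

# Hypothetical interest words for one user and two candidate microblogs.
user_words = "football match goal football league".split()
posts = {
    "sports": "goal in the football match".split(),
    "cooking": "recipe for the pasta sauce".split(),
}

vocab = sorted(set(user_words) | {w for toks in posts.values() for w in toks})
user_lm = unigram_lm(user_words, vocab)

# A lower divergence means the post's language model is closer to the user's.
ranked = sorted(
    posts,
    key=lambda name: kl_divergence(user_lm, unigram_lm(posts[name], vocab)),
)
```

Smoothing keeps every probability strictly positive, so the logarithm is always defined even when a post shares no words with the user's interest set.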


Internal Language Model Estimation for Domain-Adaptive End-to-End Speech Recognition

2021 IEEE Spoken Language Technology Workshop (SLT) ◽

10.1109/slt48900.2021.9383515 ◽

2021 ◽

Author(s):

Zhong Meng ◽

Sarangarajan Parthasarathy ◽

Eric Sun ◽

Yashesh Gaur ◽

Naoyuki Kanda ◽

...

Keyword(s):

Speech Recognition ◽

Language Model ◽

Model Estimation ◽

End To End


Leveraging Pre-Trained Language Model for Summary Generation on Short Text

IEEE Access ◽

10.1109/access.2020.3045748 ◽

2020 ◽

Vol 8 ◽

pp. 228798-228803

Author(s):

Shuai Zhao ◽

Fucheng You ◽

Zeng Yuan Liu

Keyword(s):

Language Model ◽

Short Text


Electric Vehicle Charger Placement Optimization in Michigan Considering Monthly Traffic Demand and Battery Performance Variations

Transportation Research Record Journal of the Transportation Research Board ◽

10.1177/0361198120981958 ◽

2021 ◽

pp. 036119812098195

Author(s):

Fatemeh Fakhrmoosavi ◽

MohammadReza Kavianipour ◽

MohammadHossein (Sam) Shojaei ◽

Ali Zockaie ◽

Mehrnaz Ghamami ◽

...

Keyword(s):

Optimal Allocation ◽

Critical Factor ◽

Cold Weather ◽

Contributing Factors ◽

Modeling Framework ◽

Long Distance ◽

Charging Infrastructure ◽

Traffic Demand ◽

Adverse Weather ◽

Performance Variations

Limited charging infrastructure for electric vehicles (EVs) is one of the main barriers to adoption of these vehicles. In conjunction with limited battery range, the lack of charging infrastructure leads to range anxiety, which may discourage many potential users. This problem is especially important for long-distance or intercity trips. Monthly traffic patterns and battery performance variations are two main contributing factors in defining the infrastructure needs of EV users, particularly in states with adverse weather conditions. Knowing this, the current study focuses on Michigan and its future needs to support the intercity trips of EVs across the state in two target years, 2020 and 2030, considering monthly traffic demand and battery performance variations. This study incorporates a recently developed modeling framework to suggest the optimal locations of fast EV chargers to be implemented in Michigan. Accounting for demand and battery performance variations is the major contribution of the current study to the modeling framework previously proposed by the same authors. Furthermore, many stakeholders in Michigan are engaged to estimate the input parameters. Therefore, the study can be used by authorities as an applied model for optimal allocation of resources to place EV fast chargers. The results show that for charger placement, the reduced battery performance in cold weather is a more critical factor than the increased demand in warm seasons. To support foreseeable annual EV trips in Michigan in 2030, this study suggests 36 charging stations with 490 chargers and an investment cost of $23 million.


BLAINDER—A Blender AI Add-On for Generation of Semantically Labeled Depth-Sensing Data

Sensors ◽

10.3390/s21062144 ◽

2021 ◽

Vol 21 (6) ◽

pp. 2144

Author(s):

Stefan Reitmann ◽

Lorenzo Neumann ◽

Bernhard Jung

Keyword(s):

Point Clouds ◽

Training Data ◽

Sensor Data ◽

Generation Process ◽

Data Generation ◽

Depth Sensor ◽

Depth Sensing ◽

Depth Sensors ◽

3D Point Clouds ◽

Wide Range

Common Machine-Learning (ML) approaches for scene classification require a large amount of training data. However, for classification of depth sensor data, in contrast to image data, relatively few databases are publicly available and manual generation of semantically labeled 3D point clouds is an even more time-consuming task. To simplify the training data generation process for a wide range of domains, we have developed the BLAINDER add-on package for the open-source 3D modeling software Blender, which enables a largely automated generation of semantically annotated point-cloud data in virtual 3D environments. In this paper, we focus on classical depth-sensing techniques Light Detection and Ranging (LiDAR) and Sound Navigation and Ranging (Sonar). Within the BLAINDER add-on, different depth sensors can be loaded from presets, customized sensors can be implemented and different environmental conditions (e.g., influence of rain, dust) can be simulated. The semantically labeled data can be exported to various 2D and 3D formats and are thus optimized for different ML applications and visualizations. In addition, semantically labeled images can be exported using the rendering functionalities of Blender.


Allocation and Scheduling of Handling Resources in the Railway Container Terminal Based on Crossing Crane Area

Sustainability ◽

10.3390/su13031190 ◽

2021 ◽

Vol 13 (3) ◽

pp. 1190

Author(s):

Gang Ren ◽

Xiaohan Wang ◽

Jiaxin Cai ◽

Shujuan Guo

Keyword(s):

Programming Model ◽

Container Terminal ◽

Simultaneous Optimization ◽

Mixed Integer ◽

Gantry Crane ◽

Optimization Scheme ◽

Long Distance ◽

Proposed Model ◽

Hybrid Heuristic Algorithm ◽

Searching Ability

The integrated allocation and scheduling of handling resources is a crucial problem in the railway container terminal (RCT). We investigate the integrated optimization problem for the handling resources of the crane area, the dual-gantry crane (GC), and internal trucks (ITs). A creative handling scheme is proposed to reduce the long-distance, full-loaded movement of GCs by making use of the advantages of ITs. Based on this scheme, we propose a flexible crossing crane area to balance the workload of the dual-GC. Decomposing the integrated problem into four sub-problems, a multi-objective mixed-integer programming (MIP) model is developed. By analyzing the characteristics of the integrated problem, a three-layer hybrid heuristic algorithm (TLHHA) incorporating a heuristic rule (HR), an elite co-evolution genetic algorithm (ECEGA), a greedy rule (GR), and simulated annealing (SA) is designed for solving the problem. Numerical experiments were conducted to verify the effectiveness of the proposed model and algorithm. The results show that the proposed algorithm has excellent searching ability, and the simultaneous optimization scheme could ensure the requirements for efficiency, effectiveness, and energy-saving, as well as the balance rate of the dual-GC.
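
Of the components named above, simulated annealing is the easiest to show in isolation. The sketch below is a generic textbook version applied to a toy workload-balancing problem between two cranes; it is not the TLHHA, and the job durations, cooling schedule, and function names are hypothetical.

```python
import math
import random

def imbalance(assign, durations):
    """Absolute workload difference between crane 0 and crane 1."""
    load = [0.0, 0.0]
    for crane, d in zip(assign, durations):
        load[crane] += d
    return abs(load[0] - load[1])

def anneal(durations, steps=2000, t0=10.0, cooling=0.995, seed=0):
    """Generic simulated annealing over job-to-crane assignments."""
    rng = random.Random(seed)
    current = [0] * len(durations)  # start with every job on crane 0
    cur_cost = imbalance(current, durations)
    best, best_cost = current[:], cur_cost
    t = t0
    for _ in range(steps):
        # Neighbor move: reassign one randomly chosen job to the other crane.
        cand = current[:]
        i = rng.randrange(len(cand))
        cand[i] = 1 - cand[i]
        cost = imbalance(cand, durations)
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if cost <= cur_cost or rng.random() < math.exp((cur_cost - cost) / t):
            current, cur_cost = cand, cost
            if cur_cost < best_cost:
                best, best_cost = current[:], cur_cost
        t *= cooling  # geometric cooling
    return best, best_cost

durations = [4, 7, 3, 8, 5, 6, 2, 9]  # hypothetical handling times per job
initial_cost = imbalance([0] * len(durations), durations)
best, best_cost = anneal(durations)
```

Accepting occasional worse moves at high temperature lets the search escape assignments that a purely greedy rule would get stuck in; the best solution seen is tracked separately so that it is never lost.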


Unsupervised learning by competing hidden units

Proceedings of the National Academy of Sciences ◽

10.1073/pnas.1820458116 ◽

2019 ◽

Vol 116 (16) ◽

pp. 7723-7731 ◽

Cited By ~ 16

Author(s):

Dmitry Krotov ◽

John J. Hopfield

Keyword(s):

Learning Algorithm ◽

Lower Layer ◽

Learning Rule ◽

Backpropagation Algorithm ◽

Feedforward Networks ◽

Feature Detectors ◽

End To End ◽

Hidden Layer ◽

Full Network ◽

Global Inhibition

It is widely believed that end-to-end training with the backpropagation algorithm is essential for learning good feature detectors in early layers of artificial neural networks, so that these detectors are useful for the task performed by the higher layers of that neural network. At the same time, the traditional form of backpropagation is biologically implausible. In the present paper we propose an unusual learning rule, which has a degree of biological plausibility and which is motivated by Hebb’s idea that change of the synapse strength should be local—i.e., should depend only on the activities of the pre- and postsynaptic neurons. We design a learning algorithm that utilizes global inhibition in the hidden layer and is capable of learning early feature detectors in a completely unsupervised way. These learned lower-layer feature detectors can be used to train higher-layer weights in a usual supervised way so that the performance of the full network is comparable to the performance of standard feedforward networks trained end-to-end with a backpropagation algorithm on simple tasks.
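
The idea of a local update combined with competition among hidden units can be sketched with classic winner-take-all competitive learning. This is a deliberately simplified stand-in, not the learning rule proposed in the paper: the hard winner selection plays the role of global inhibition, and the data, learning rate, and function names are assumptions.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def train_competitive(data, n_units, lr=0.1, epochs=20):
    """Unsupervised winner-take-all learning with a purely local update."""
    # Initialize each hidden unit's weights from the first few inputs.
    weights = [list(data[i]) for i in range(n_units)]
    for _ in range(epochs):
        for x in data:
            # Hard competition: only the best-matching unit stays active.
            winner = min(range(n_units),
                         key=lambda j: squared_distance(weights[j], x))
            # Local rule: the update uses only the input and the winner's weights.
            w = weights[winner]
            for d in range(len(w)):
                w[d] += lr * (x[d] - w[d])
    return weights

# Two well-separated toy clusters; no labels are used during training.
data = [(0, 0), (0, 1), (1, 0), (1, 1),
        (10, 10), (10, 11), (11, 10), (11, 11)]
weights = sorted(train_competitive(data, n_units=2))
```

After training, each unit's weight vector sits near one cluster, so the units act as simple feature detectors that a supervised layer could then be trained on.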


Data Ownership and Secure Medical Data Transmission using Optimal Multiple Key-Based Homomorphic Encryption with Hyperledger Blockchain

International Journal of Image and Graphics ◽

10.1142/s0219467822400034 ◽

2021 ◽

Author(s):

Naresh Sammeta ◽

Latha Parthiban

Keyword(s):

Data Transmission ◽

Homomorphic Encryption ◽

Medical Data ◽

Healthcare Sector ◽

Transmission Model ◽

Generation Process ◽

Suggested Model ◽

Blockchain Technology ◽

Proposed Model ◽

Data Ownership

Recent healthcare systems are highly complex and expensive, but this complexity and cost can be reduced through enhanced electronic health record (EHR) management using blockchain technology. The healthcare sector in today’s world needs to address two major issues, namely data ownership and data security. Therefore, blockchain technology is employed to access and distribute the EHRs. With this motivation, this paper presents a novel data ownership and secure medical data transmission model using optimal multiple key-based homomorphic encryption (MHE) with Hyperledger blockchain (OMHE-HBC). The presented OMHE-HBC model enables patients to access their own data, grant permission to hospital authorities, revoke permission from hospital authorities, and permit emergency contacts. The proposed model uses the MHE technique to securely transmit the data to the cloud and prevent unauthorized access to it. In addition, the optimal key generation process in the MHE technique is carried out using a hosted cuckoo optimization (HCO) algorithm. The proposed model also enables sharing of EHRs through a multi-channel HBC, which uses one blockchain to save patient visits and another for the medical institutions to record links that point to EHRs stored in external systems. A complete set of experiments was carried out to validate the performance of the suggested model, and the results were analyzed under many aspects. A comprehensive comparison of the results reveals that the suggested model outperforms the other techniques.
