A TWO-CHANNEL MODEL FOR REPRESENTATION LEARNING IN VIETNAMESE SENTIMENT CLASSIFICATION PROBLEM

2020 ◽  
Vol 36 (4) ◽  
pp. 305-323
Author(s):  
Quan Hoang Nguyen ◽  
Ly Vu ◽  
Quang Uy Nguyen

Sentiment classification (SC) aims to determine whether a document conveys a positive or negative opinion. Due to the rapid development of the digital world, SC has become an important research topic that affects many aspects of our lives. In SC based on machine learning, the representation of the document strongly influences classification accuracy. Word embedding (WE)-based techniques, i.e., Word2vec techniques, have proved beneficial for the SC problem. However, Word2vec alone is often not enough to represent the semantics of Vietnamese documents with complex sentences. In this paper, we propose a new representation learning model, called the two-channel vector, to learn a higher-level feature of a document in SC. Our model uses two neural networks to learn the semantic feature, i.e., Word2vec, and the syntactic feature, i.e., the part-of-speech (POS) tag. The two features are then combined and input to a Softmax function to make the final classification. We carry out intensive experiments on 4 recent Vietnamese sentiment datasets to evaluate the performance of the proposed architecture. The experimental results demonstrate that the proposed model can significantly enhance the accuracy of SC compared to two single models and a state-of-the-art ensemble method.
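The final step described above, combining the two channel outputs and feeding them to a Softmax function, can be illustrated with a minimal sketch. It assumes the combination is a plain concatenation followed by one linear layer; the function names, the 2-dimensional channel vectors, and the weights are all hypothetical and are not taken from the paper.

```python
import math

def softmax(scores):
    # Subtract the maximum for numerical stability before exponentiating.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def classify(semantic_vec, syntactic_vec, weights, biases):
    # Assumption: the two channels are combined by simple concatenation.
    features = semantic_vec + syntactic_vec
    # One linear score per class, then softmax to get class probabilities.
    scores = [sum(w * x for w, x in zip(row, features)) + b
              for row, b in zip(weights, biases)]
    return softmax(scores)

# Hypothetical channel outputs and hand-picked weights for two classes
# (index 0 = negative, index 1 = positive); none of these values are from the paper.
semantic = [0.8, -0.1]
syntactic = [0.3, 0.5]
W = [[-1.0, 0.2, -0.5, 0.1],
     [1.0, -0.2, 0.5, -0.1]]
b = [0.0, 0.0]
probs = classify(semantic, syntactic, W, b)
```

In a trained model the weights would be learned jointly with the two channel networks; here they are fixed only so that the data flow is visible.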

Semantic Web ◽  
2022 ◽  
pp. 1-16
Author(s):  
Hu Zhang ◽  
Jingjing Zhou ◽  
Ru Li ◽  
Yue Fan

With the rapid development of neural networks, much attention has been focused on network embedding for complex network data, which aims to learn low-dimensional embeddings of nodes in the network and to effectively apply the learned network representations to various graph-based analytical tasks. Two typical kinds of models exist, namely shallow random walk network representation methods and deep learning models such as graph convolutional networks (GCNs). The former can capture the linear structure of the network using depth-first search (DFS) and breadth-first search (BFS), whereas Hierarchical GCN (HGCN) is an unsupervised graph embedding method that can describe the global nonlinear structure of the network by aggregating node information. However, the two existing kinds of models cannot simultaneously capture the nonlinear and linear structure information of nodes. Thus, the nodal characteristics of nonlinear and linear structures are explored in this paper, and an unsupervised representation method based on HGCN that jointly learns shallow and deep models is proposed. Experiments on node classification and dimension reduction visualization are carried out on citation, language, and traffic networks. The results show that, compared with the existing shallow network representation model and deep network model, the proposed model achieves better performance in terms of micro-F1, macro-F1, and accuracy scores.


Author(s):  
Yuan Zhang ◽  
Hongshen Chen ◽  
Yihong Zhao ◽  
Qun Liu ◽  
Dawei Yin

Sequence tagging is the basis for multiple applications in natural language processing. Despite successes in learning long-term token sequence dependencies with neural networks, tag dependencies have rarely been considered. Sequence tagging actually possesses complex dependencies and interactions among the input tokens and the output tags. We propose a novel multi-channel model, which handles different ranges of token-tag dependencies and their interactions simultaneously. A tag LSTM is added to manage the output tag dependencies and word-tag interactions, while three mechanisms are presented to efficiently incorporate token context representation and tag dependency. Extensive experiments on part-of-speech tagging and named entity recognition tasks show that the proposed model outperforms the BiLSTM-CRF baseline by effectively incorporating the tag dependency feature.


2020 ◽  
Vol 15 (2) ◽  
Author(s):  
Alih Aji Nugroho

The world, including Indonesia, is entering a new phase of the digital era. One sign of this is the unification of the real world and cyberspace, in which conditions in each can influence the other (Hyung Jun, 2018). Patterns of behavior and public relations in the virtual world have given rise to new social interactions known as the Digital Society. One part of Global Megatrends has also influenced public policy in Indonesia in recent years. Critical mass, previously mobilized by conventional means, is now a virtual movement. Hashtag wars, petitions, and digital community comments are new tools and strategies for influencing policy. This paper attempts to analyze the extent of digital society's influence on public policy in Indonesia, as well as which public policy models are needed. The methodology used in this analysis is qualitative and descriptive. Data were collected through literature studies on the recognition of digital critical mass in Indonesia, with the aim of finding a relationship between political participation through social media and democracy. By processing the pro and contra views regarding the selection of social media as a level of participation, this paper finds that there are overlapping interests that have the potential to distort the articulation of freedom of opinion and participation, which is characteristic of a democratic state. The result is that the rapid development of digital society greatly influences the public policy process. Digital society envisions being able to participate formally in influencing policy in Indonesia. The democracy that has developed in the digital society is cyberdemocracy. Public space in the digital world must be guaranteed both security and an impact on the policies that will be determined. The recommendation given to the government is that a cyber data analyst is needed to oversee the issues that are developing in the digital world. Regulations related to the security of digital public spaces must be maximized. The government should also maximize cooperation with related stakeholders.
Keywords: Digital Society; Democracy; Public policy; Political Participation


2021 ◽  
Vol 25 (3) ◽  
pp. 711-738
Author(s):  
Phu Pham ◽  
Phuc Do

Link prediction on heterogeneous information networks (HINs) is considered a challenging problem due to the complexity and diversity of node and link types. Currently, challenges remain in meta-path-based link prediction in HINs. Previous work on link prediction in HINs via network embedding has mainly focused on exploiting node features rather than existing relations between nodes in the form of meta-paths. In fact, predicting the existence of new links between non-linked nodes without considering such relations is unconvincing. Moreover, recent HIN-based embedding models also lack thorough evaluation of the topic similarity between text-based nodes along given meta-paths. To tackle these challenges, in this paper we propose a novel topic-driven, multiple meta-path-based HIN representation learning framework, namely W-MMP2Vec. Our model improves the quality of node representations by combining multiple meta-paths and calculating a topic similarity weight for each meta-path during network embedding learning in content-based HINs. To validate our approach, we apply the W-MMP2Vec model to several link prediction tasks in both content-based and non-content-based HINs (DBLP, IMDB and BlogCatalog). The experimental outputs demonstrate the effectiveness of the proposed model, which outperforms recent state-of-the-art HIN representation learning models.


2020 ◽  
pp. 1-17
Author(s):  
Dongqi Yang ◽  
Wenyu Zhang ◽  
Xin Wu ◽  
Jose H. Ablanedo-Rosas ◽  
Lingxiao Yang ◽  
...  

With the rapid development of commercial credit mechanisms, credit funds have become fundamental in promoting the development of manufacturing corporations. However, large-scale, imbalanced credit application information poses a challenge to accurate bankruptcy predictions. A novel multi-stage ensemble model with fuzzy clustering and optimized classifier composition is proposed herein by combining the fuzzy clustering-based classifier selection method, the random subspace (RS)-based classifier composition method, and the genetic algorithm (GA)-based classifier compositional optimization method to achieve accuracy in predicting bankruptcy among corporations. To overcome the inherent inflexibility of traditional hard clustering methods, a new fuzzy clustering-based classifier selection method is proposed based on the mini-batch k-means algorithm to obtain the best performing base classifiers for generating classifier compositions. The RS-based classifier composition method is applied to enhance the robustness of candidate classifier compositions by randomly selecting several subspaces in the original feature space. The GA-based classifier compositional optimization method is applied to optimize the parameters of the promising classifier composition through the iterative mechanism of the GA. Finally, six datasets collected from the real world were tested with four evaluation indicators to assess the performance of the proposed model. The experimental results showed that the proposed model outperformed the benchmark models with higher predictive accuracy and efficiency.
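The random subspace step mentioned above, which randomly selects several subspaces of the original feature space, can be sketched roughly as follows. This is only an illustration of the general RS idea: the function names, the number of features, the subspace size, and the seed are assumptions, not details from the paper.

```python
import random

def random_subspaces(n_features, n_subspaces, subspace_size, seed=0):
    # A seeded generator keeps the illustration reproducible.
    rng = random.Random(seed)
    # Each subspace is a sorted list of distinct feature indices.
    return [sorted(rng.sample(range(n_features), subspace_size))
            for _ in range(n_subspaces)]

def project(row, subspace):
    # Keep only the feature values whose indices belong to the subspace.
    return [row[i] for i in subspace]

# Hypothetical setting: 10 features, 3 subspaces of 4 features each.
subspaces = random_subspaces(n_features=10, n_subspaces=3, subspace_size=4)
sample = [float(i) for i in range(10)]
views = [project(sample, s) for s in subspaces]
```

Each base classifier in a candidate composition would then be trained on one projected view, so that no single feature subset dominates the ensemble.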


Author(s):  
Joonas Kokkoniemi ◽  
Janne Lehtomäki ◽  
Markku Juntti

This paper documents a simple parametric polynomial line-of-sight channel model for the 100–450 GHz band. The band comprises two popular beyond fifth generation (B5G) frequency bands, namely, the D band (110–170 GHz) and the low-THz band (around 275–325 GHz). The main focus herein is to derive a simple, compact, and accurate molecular absorption loss model for the 100–450 GHz band. The derived model relies on simple absorption line shape functions that are fitted to the actual response given by a complex but exact database approach. The model is also reducible for particular sub-bands within the full range of 100–450 GHz, further simplifying the absorption loss estimate. The proposed model is shown to be very accurate by benchmarking it against the exact response and similar models given by the International Telecommunication Union Radiocommunication Sector. The loss is shown to be within ±2 dB of the exact response for a one-kilometer link in a highly humid environment. Its accuracy is therefore even better for the shorter-range links usually considered for future B5G wireless systems.
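The remark about shorter links follows from the Beer-Lambert law, under which molecular absorption loss in dB grows linearly with distance. The sketch below assumes that the model error comes entirely from the absorption coefficient and therefore scales with distance in the same way; the function names and the example distance are illustrative and not taken from the paper.

```python
import math

def absorption_loss_db(kappa_per_m, distance_m):
    # Beer-Lambert: transmittance = exp(-kappa * d), so the loss in dB is
    # 10 * log10(e) * kappa * d, which is linear in distance.
    return 10.0 * math.log10(math.e) * kappa_per_m * distance_m

def scaled_error_db(error_db_at_ref, ref_distance_m, distance_m):
    # Assumption: the dB error scales linearly with link distance.
    return error_db_at_ref * distance_m / ref_distance_m

# The 2 dB bound at 1 km is the figure quoted in the abstract;
# the 100 m link is a hypothetical example.
err_100m = scaled_error_db(2.0, 1000.0, 100.0)
```

Under that assumption, a bound of 2 dB at one kilometer corresponds to about 0.2 dB at 100 m.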


Author(s):  
Junshu Wang ◽  
Guoming Zhang ◽  
Wei Wang ◽  
Ka Zhang ◽  
Yehua Sheng

With the rapid development of hospital informatization and Internet medical services in recent years, most hospitals have launched online appointment registration systems to remove patient queues and improve the efficiency of medical services. However, most patients lack professional medical knowledge and do not know which department to choose when registering. To guide patients in seeking medical care and registering effectively, we propose CIDRS, an intelligent self-diagnosis and department recommendation framework based on Chinese medical Bidirectional Encoder Representations from Transformers (BERT) in the cloud computing environment. We also establish a Chinese BERT model (CHMBERT) trained on a large-scale Chinese medical text corpus. This model is used to optimize self-diagnosis and department recommendation tasks. To address the limited computing power of terminals, we deploy the proposed framework in a cloud computing environment based on container and micro-service technologies. Real-world medical datasets from hospitals were used in the experiments, and the results show that the proposed model is superior to traditional deep learning models and other pre-trained language models in terms of performance.


2017 ◽  
Vol 2017 ◽  
pp. 1-15 ◽  
Author(s):  
Jianwen Ding ◽  
Lei Zhang ◽  
Jingya Yang ◽  
Bin Sun ◽  
Jiying Huang

The rapid development of high-speed railway (HSR) and of train-ground communications with high reliability, safety, and capacity promotes the evolution of railway dedicated mobile communication systems from Global System for Mobile Communications-Railway (GSM-R) to Long Term Evolution-Railway (LTE-R). The main challenges for LTE-R network planning are the rapidly time-varying channel and high mobility, because HSR lines pass through a variety of complex terrains, especially composite scenarios where tunnels, cuttings, and viaducts are connected within a short distance. Existing research mainly focuses on the path loss and delay spread for individual HSR scenarios. In this paper, broadband measurements are performed using a channel sounder at 950 MHz and 2150 MHz in a typical HSR composite scenario. Based on the measurements, the pivotal characteristics are analyzed in terms of path loss exponent, power delay profile, and tap delay line model. Then, a deterministic channel model, in which a 3D ray-tracing algorithm is applied to the composite scenario, is presented and validated against the measurement data. Based on the ray-tracing simulations, a statistical analysis of channel characteristics in the delay and Doppler domains is carried out for the HSR composite scenario. The research results can be useful for radio interface design and optimization of the LTE-R system.
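As an illustration of how a path loss exponent can be extracted from measurements, the sketch below fits the standard log-distance model PL(d) = PL(d0) + 10 n log10(d / d0) by ordinary least squares. The abstract does not state which fitting procedure the authors used, so this is an assumption, and the data points are synthetic values generated with n = 3 rather than measured values.

```python
import math

def fit_path_loss_exponent(distances_m, path_loss_db, d0_m=1.0):
    # Regress path loss (dB) on 10*log10(d/d0); the slope is the exponent n
    # and the intercept is the path loss at the reference distance d0.
    xs = [10.0 * math.log10(d / d0_m) for d in distances_m]
    n_pts = len(xs)
    mean_x = sum(xs) / n_pts
    mean_y = sum(path_loss_db) / n_pts
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, path_loss_db))
    exponent = sxy / sxx
    intercept_db = mean_y - exponent * mean_x
    return exponent, intercept_db

# Synthetic, noise-free data (not from the paper): PL(1 m) = 40 dB, n = 3.
distances = [10.0, 20.0, 50.0, 100.0, 200.0]
synthetic = [40.0 + 10.0 * 3.0 * math.log10(d) for d in distances]
n_hat, pl0_hat = fit_path_loss_exponent(distances, synthetic)
```

With real sounder data the fit would also leave a residual, usually modelled as log-normal shadow fading.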


2017 ◽  
Vol 2017 ◽  
pp. 1-10 ◽  
Author(s):  
Wen-Jun Li ◽  
Qiang Dong ◽  
Yan Fu

With the rapid development of mobile Internet and smart devices, more and more online content providers have begun to collect the preferences of their customers through various apps on mobile devices. These preferences are largely reflected in the ratings of online items with explicit scores. Both positive and negative ratings help recommender systems provide relevant items to a target user. Based on an empirical analysis of three real-world movie-rating data sets, we observe that users’ rating criteria change over time, and that past positive and negative ratings have different influences on users’ future preferences. Given this, we propose a recommendation model on a session-based temporal graph, which considers the difference between long- and short-term preferences and the different temporal effects of positive and negative ratings. Extensive experimental results validate the significant accuracy improvement of our proposed model compared with state-of-the-art methods.


2017 ◽  
Vol 2017 ◽  
pp. 1-12 ◽  
Author(s):  
Xin Chen ◽  
Yong Fang ◽  
Weidong Xiang ◽  
Liang Zhou

In this paper, an extension of the spatial channel model (SCM) for the vehicle-to-vehicle (V2V) communication channel in a roadside scattering environment is investigated for the first time, both theoretically and by simulations. To efficiently describe the roadside scattering environment and reflect the nonstationary properties of V2V channels, the proposed SCM V2V model divides the scattering objects into three categories of clusters according to the location of effective scatterers by introducing a critical distance. We derive general expressions for the most important statistical properties of V2V channels under the proposed model, such as the channel impulse response, power spectral density, angular power density, autocorrelation function, and Doppler spread. The impact of vehicle speed, traffic density, angle of departure, angle of arrival, and other statistical characteristics on the V2V channel model is thoroughly discussed. Numerical simulation results are presented to validate the accuracy and effectiveness of the proposed model.

