Selected articles from the Fourth International Workshop on Semantics-Powered Data Mining and Analytics (SEPDA 2019)

2020 ◽  
Vol 20 (S4) ◽  
Author(s):  
Zhe He ◽  
Cui Tao ◽  
Jiang Bian ◽  
Rui Zhang

Abstract In this introduction, we first summarize the Fourth International Workshop on Semantics-Powered Data Mining and Analytics (SEPDA 2019), held on October 26, 2019 in conjunction with the 18th International Semantic Web Conference (ISWC 2019) in Auckland, New Zealand, and then briefly introduce the seven research articles included in this supplement issue, which cover the topics of Knowledge Graph, Ontology-Powered Analytics, and Deep Learning.

2021 ◽  
pp. 097215092098485
Author(s):  
Sonika Gupta ◽  
Sushil Kumar Mehta

Data mining techniques have proven quite effective not only in detecting financial statement fraud but also in discovering other financial crimes, such as credit card fraud, loan and security fraud, corporate fraud, and bank and insurance fraud. In recent years, classification using data mining techniques has been accepted as one of the most credible methodologies for detecting symptoms of financial statement fraud by scanning the published financial statements of companies. The retrieved literature that has used data mining classification techniques can be broadly categorized, by the type of technique applied, into statistical techniques and machine learning techniques. The biggest challenge in executing the classification process using data mining techniques lies in collecting the data sample of fraudulent companies and mapping that sample against non-fraudulent companies. In this article, a systematic literature review (SLR) of studies in the area of financial statement fraud detection has been conducted. The review considered research articles published between 1995 and 2020. Further, a meta-analysis was performed to establish the effect that mapping the sample of fraudulent companies against non-fraudulent companies has on the classification methods, by comparing the overall classification accuracy reported in the literature. The retrieved literature indicates that a fraudulent sample can either be paired equally with a non-fraudulent sample (1:1 data mapping) or be mapped unequally using a 1:many ratio to increase the sample size proportionally. Based on the meta-analysis of the research articles, it can be concluded that machine learning approaches, in comparison with statistical approaches, can achieve better classification accuracy, particularly when little sample data is available. High classification accuracy can be obtained even with a 1:1 mapped data set when machine learning classification approaches are used.
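
To make the difference between 1:1 and 1:many data mapping concrete, the following is a minimal sketch in Python. It is an illustration only and is not taken from the reviewed studies: the function name `build_matched_sample`, the firm identifiers, and the use of simple random matching are all assumptions. Published studies typically match on criteria such as industry, size, and year, which this sketch leaves out.

```python
import random

def build_matched_sample(fraud_ids, non_fraud_ids, ratio=1, seed=0):
    """Pair each fraudulent firm with `ratio` distinct non-fraudulent firms.

    Hypothetical helper: matching here is purely random, whereas real
    studies usually match on firm characteristics.
    """
    needed = len(fraud_ids) * ratio
    if needed > len(non_fraud_ids):
        raise ValueError("not enough non-fraudulent firms for this ratio")
    rng = random.Random(seed)
    # Draw all controls at once so that no control firm is reused.
    controls = rng.sample(non_fraud_ids, needed)
    return {
        fraud: controls[i * ratio:(i + 1) * ratio]
        for i, fraud in enumerate(fraud_ids)
    }

# Made-up identifiers, for illustration only.
fraud = ["F1", "F2", "F3"]
clean = [f"N{i}" for i in range(1, 11)]

one_to_one = build_matched_sample(fraud, clean, ratio=1)    # 3 fraud + 3 controls
one_to_three = build_matched_sample(fraud, clean, ratio=3)  # 3 fraud + 9 controls
```

With a 1:3 ratio the same three fraudulent firms yield a data set of twelve firms instead of six, which is the proportional increase in sample size that the abstract refers to.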


Symmetry ◽  
2019 ◽  
Vol 11 (3) ◽  
pp. 367 ◽  
Author(s):  
Martín López-Nores ◽  
Omar Bravo-Quezada ◽  
Maddalena Bassani ◽  
Angeliki Antoniou ◽  
Ioanna Lykourentzou ◽  
...  

Recent advances in semantic web and deep learning technologies enable new means for the computational analysis of vast amounts of information from the field of digital humanities. We discuss how some of the techniques can be used to identify historical and cultural symmetries between different characters, locations, events or venues, and how these can be harnessed to develop new strategies to promote intercultural and cross-border aspects that support the teaching and learning of history and heritage. The strategies have been put to the test in the context of the European project CrossCult, revealing enormous potential to encourage curiosity to discover new information and increase retention of learned information.


Author(s):  
Yu Zhu

The objective is to predict and analyze the behavior of users on social network platforms using personality theory and computational technologies, and thereby to identify the personality characteristics of social network users more effectively. First, social network data are analyzed, which shows that text accounts for the majority of the data. Data mining technology is used to obtain the raw data of a large number of social network users. Based on the random walk model, the text status data of social network users are analyzed, and a user personality prediction method integrating multi-label learning is proposed. The online social network platform Weibo is taken as the research object. The blog information of Weibo users is obtained through crawler technology, and the users are then labeled according to their personality characteristics. The Pearson correlation coefficient is used to evaluate the relation between the personality characteristics and the behavior characteristics of the Weibo users. The correlation between the network behaviors and personality characteristics of Weibo users is analyzed, and the scientific validity of the prediction method is verified against the Big Five Model of Personality. Applying relevant data mining and deep learning technologies and algorithms can improve the ability of neural networks to learn data characteristics. In analyzing the text information of social network users, the personality prediction method that integrates multi-label learning with the random walk model shows a large advantage. For the problem of predicting the personality of social network users, combining data mining technology with deep neural networks makes the processing results for user behavior data more accurate.
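
As an illustration of the Pearson correlation coefficient mentioned in this abstract, here is a minimal Python sketch. The function name `pearson_r` and the sample values are assumptions made for the example; they are not data or code from the study.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sums of centered cross-products and squares; the common 1/n factor cancels.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical values: one behavior feature and one trait score per user.
posts_per_week = [2, 5, 9, 14, 20]
extraversion_score = [1.8, 2.4, 3.1, 3.9, 4.6]

r = pearson_r(posts_per_week, extraversion_score)
```

A value of `r` close to 1 or -1 indicates a strong linear relation between the behavior feature and the trait score, while a value near 0 indicates little linear relation.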


Author(s):  
Shian-Chang Huang ◽  
Cheng-Feng Wu ◽  
Chei-Chang Chiou ◽  
Meng-Chen Lin

Author(s):  
Yujie Chen ◽  
Tengfei Ma ◽  
Xixi Yang ◽  
Jianmin Wang ◽  
Bosheng Song ◽  
...  

Abstract Motivation: Adverse drug–drug interactions (DDIs) are crucial for drug research and are a major cause of morbidity and mortality. Thus, the identification of potential DDIs is essential for doctors, patients and society. Existing traditional machine learning models rely heavily on handcrafted features and generalize poorly. Recently, deep learning approaches that can automatically learn drug features from the molecular graph or a drug-related network have improved the ability of computational models to predict unknown DDIs. However, previous works relied on large amounts of labeled data and either considered only the structure or sequence information of drugs, ignoring the relations or topological information between drugs and other biomedical objects (e.g. gene, disease and pathway), or considered the knowledge graph (KG) while ignoring the information in the drug molecular structure. Results: Accordingly, to effectively explore the joint effect of drug molecular structure and the semantic information of drugs in the knowledge graph for DDI prediction, we propose a multi-scale feature fusion deep learning model named MUFFIN. MUFFIN can jointly learn the drug representation from both the structure information of the drug itself and the KG with its rich biomedical information. In MUFFIN, we designed a bi-level cross strategy that includes cross- and scalar-level components to fuse multi-modal features well. MUFFIN can alleviate the restriction that limited labeled data places on deep learning models by crossing the features learned from the large-scale KG and the drug molecular graph. We evaluated our approach on three datasets and three different tasks: binary-class, multi-class and multi-label DDI prediction. The results showed that MUFFIN outperformed other state-of-the-art baselines. Availability and implementation: The source code and data are available at https://github.com/xzenglab/MUFFIN.


Author(s):  
Chong Chen ◽  
Ying Liu ◽  
Xianfang Sun ◽  
Shixuan Wang ◽  
Carla Di Cairano-Gilfedder ◽  
...  

Over the last few decades, reliability analysis has gained increasing attention because it can help lower maintenance costs. Time between failures (TBF) is an essential topic in reliability analysis. If the TBF can be accurately predicted, preventive maintenance can be scheduled in advance to avoid critical failures. The purpose of this paper is to study TBF prediction using deep learning techniques. Deep learning, which is capable of capturing highly complex and nonlinear patterns, can be a useful tool for TBF prediction. The general principles of designing a deep learning model are introduced. Using a sizeable automobile TBF dataset, we conduct an experimental study on TBF prediction with deep learning and several data mining approaches. The empirical results show the performance merits of deep learning, which come at the cost of a high computational load.
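
For readers unfamiliar with the quantity being predicted, the sketch below shows one simple way to derive TBF values from a list of failure timestamps. It is an assumption-laden illustration: the abstract does not say how TBF is defined in the automobile dataset, so the function name `time_between_failures`, the choice of hours as the unit, and the sample timestamps are all hypothetical.

```python
from datetime import datetime

def time_between_failures(failure_times):
    """Return the gaps, in hours, between consecutive failure timestamps.

    Assumes TBF is the interval between successive failure records of a
    single asset; the paper may define it differently.
    """
    ordered = sorted(failure_times)
    return [
        (later - earlier).total_seconds() / 3600.0
        for earlier, later in zip(ordered, ordered[1:])
    ]

# Made-up failure records for a single vehicle.
failures = [
    datetime(2020, 1, 2, 12, 0),
    datetime(2020, 1, 1, 0, 0),
    datetime(2020, 1, 2, 0, 0),
]

tbf_hours = time_between_failures(failures)  # [24.0, 12.0]
```

A series of such intervals is the kind of target that a regression model, deep or otherwise, would be trained to predict.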

