A General Methodology to Quantify Biases in Natural Language Data

Author(s):  
Jiawei Chen ◽  
Anbang Xu ◽  
Zhe Liu ◽  
Yufan Guo ◽  
Xiaotong Liu ◽  
...  
2021 ◽  
Vol 21 (2) ◽  
pp. 1-25
Author(s):  
Pin Ni ◽  
Yuming Li ◽  
Gangmin Li ◽  
Victor Chang

Cyber-Physical Systems (CPS), multi-dimensional complex systems that connect the physical world and the cyber world, have a strong demand for processing large amounts of heterogeneous data. This processing includes Natural Language Inference (NLI) tasks based on text from different sources. However, current research on natural language processing in CPS has not explored this area. Therefore, this study proposes a Siamese Network structure that combines Stacked Residual bidirectional Long Short-Term Memory with the Attention mechanism and a Capsule Network for the NLI module in CPS, which infers the relationship between text/language data from different sources. The model is evaluated in detail on three main NLI benchmarks as the basic semantic understanding module in CPS. Comparative experiments show that the proposed method achieves competitive performance, has some generalization ability, and balances performance against the number of trained parameters.


2002 ◽  
Vol 8 (2-3) ◽  
pp. 93-96
Author(s):  
AFZAL BALLIM ◽  
VINCENZO PALLOTTA

The automated analysis of natural language data has become a central issue in the design of intelligent information systems. Processing unconstrained natural language data is still considered an AI-hard task. However, various analysis techniques have been proposed to address specific aspects of natural language. In particular, recent interest has focused on approximate analysis techniques, on the assumption that when perfect analysis is not possible, partial results may still be very useful.


1983 ◽  
Vol 9 (1) ◽  
pp. 233-244
Author(s):  
Bonnie Webber ◽  
Aravind Joshi ◽  
Eric Mays ◽  
Kathleen McKeown

Author(s):  
Phyo Htet Hein ◽  
Varun Menon ◽  
Beshoy Morkos

Prior research performed by Morkos [1] culminated in the automated requirement change propagation prediction (ARCPP) tool, which uses natural language data in requirements to predict change propagation throughout a requirements document as a result of an initiating requirement change. Whereas the prior research showed that requirements can be used to predict change propagation, the purpose of this case study is to understand why. Specifically, what parts of a requirement affect its ability to predict change propagation? This is addressed through two key research questions: (1) Is the requirement review depth affected by the number of relators selected to relate requirements? and (2) Which elements of a requirement are responsible for instigating change propagation: the physical (nouns) or the functional (verbs) domain? The results of this study help in understanding whether the physical or the functional domain has a greater effect on the instigation of change propagation. The results indicate that the review depth, an indicator of the performance of the ARCPP tool, is not affected by the number of relators, but rather by the ability of the relators to relate the propagating relationships. Further, nouns are found to contribute more to predicting change propagation in requirements. Therefore, the physical domain is more effective than the functional domain in predicting requirement change propagation.


1995 ◽  
Vol 04 (04) ◽  
pp. 387-403
Author(s):  
SHUSAKU TSUMOTO ◽  
HIROSHI TANAKA ◽  
HIROMI AMANO ◽  
KIMIE OHYAMA ◽  
TAKAYUKI KURODA

Medical data consist of many kinds of data from different sources, such as natural language data, sound data from physical examinations, numerical data from laboratory examinations, time-series data from monitoring systems, and medical images (e.g. X-ray, Computed Tomography, and Magnetic Resonance images). It has therefore been pointed out that medical databases should be implemented as multidatabases. However, few systems integrate these data into multidatabases. In this paper, we report a system called COBRA (Computer-Operated Birth-defect Recognition Aid), which supports diagnosis and information retrieval for congenital malformation diseases and integrates natural language data, sound data, numerical data, and medical images into multidatabases on congenital malformation syndromes. The results show that an object-oriented scheme makes it easy to implement and integrate these knowledge-databases in COBRA, which suggests that such clinical databases should be implemented as object-oriented databases.

