Complex Terminology Extraction Model from Unstructured Web Text Based Linguistic and Statistical Knowledge

2012 ◽  
Vol 2 (3) ◽  
pp. 1-18 ◽  
Author(s):  
Fethi Fkih ◽  
Mohamed Nazih Omri

Textual data remain the most interesting source of information on the web. In the authors’ research, they focus on a very specific kind of information, namely “complex terms”. Indeed, complex terms are defined as semantic units composed of several lexical units that can describe the text content in a relevant and exhaustive way. In this paper, they present a new model for complex terminology extraction (COTEM), which integrates linguistic and statistical knowledge. The authors focus on three main contributions: firstly, they show the possibility of using a linear Conditional Random Field (CRF) for complex terminology extraction from a specialized text corpus; secondly, they prove the ability of a Conditional Random Field to model linguistic knowledge by incorporating grammatical observations in the CRF’s features; finally, they present the benefits that integrating statistical knowledge brings to the quality of the terminology extraction.
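
As a rough illustration of the kind of setup the abstract describes, the sketch below trains a linear-chain CRF over BIO-labelled tokens and encodes grammatical observations (here, part-of-speech tags) as features. The library, feature names, and toy data are assumptions chosen for illustration; this is not the authors' COTEM implementation.

```python
# Sketch: linear-chain CRF for complex-term extraction with grammatical features.
# Assumes the sklearn-crfsuite package; corpus and feature names are illustrative.
import sklearn_crfsuite

def token_features(sent, i):
    word, pos = sent[i]
    return {
        "word.lower": word.lower(),
        "pos": pos,                                          # grammatical observation
        "prev_pos": sent[i - 1][1] if i > 0 else "BOS",      # left context
        "next_pos": sent[i + 1][1] if i < len(sent) - 1 else "EOS",  # right context
    }

# Toy corpus: (token, POS) pairs with BIO labels marking a complex term.
train_sents = [
    [("conditional", "ADJ"), ("random", "ADJ"), ("field", "NOUN"), ("model", "NOUN")],
]
train_labels = [["B-TERM", "I-TERM", "I-TERM", "O"]]

X = [[token_features(s, i) for i in range(len(s))] for s in train_sents]
crf = sklearn_crfsuite.CRF(algorithm="lbfgs", c1=0.1, c2=0.1, max_iterations=50)
crf.fit(X, train_labels)
print(crf.predict(X))
```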

2005 ◽  
Vol 55 (3) ◽  
pp. 255-269 ◽  
Author(s):  
András Simonovits

According to the dominant view, the quality of individual scientific papers can be evaluated by the standard of the journal in which they are published. This paper attempts to demonstrate the limits of this view in the field of economics. According to our main findings, a publication frequently serves as a signal of high professional standards rather than as a source of information; referees and editors frequently reject good papers and accept bad ones; citation indices only partially balance the distortions deriving from the selection process; there are essential “entry costs” to the publication process. Moreover, financial interests of publishers may contradict scientific interests. As long as leading economists do not give voice to their dissatisfaction, there is no hope for any reform of the selection process.


2021 ◽  
Vol 8 ◽  
pp. 205435812098562
Author(s):  
Cassiano Augusto Braga Silva ◽  
José A. Moura-Neto ◽  
Marlene Antônia dos Reis ◽  
Osvaldo Merege Vieira Neto ◽  
Fellype Carvalho Barreto

Purpose of review: In this narrative review, we describe general aspects, histological alterations, treatment, and implications of Fabry disease (FD) nephropathy. This information should be used to guide physicians and patients in a shared decision-making process. Source of information: Original peer-reviewed articles, review articles, and opinion pieces were identified from the PubMed and Google Scholar databases. Only sources in English were accessed. Methods: We performed a focused narrative review assessing the main aspects of FD nephropathy. The literature was critically analyzed from a theoretical and contextual perspective, and thematic analysis was performed. Key findings: FD nephropathy is related to the progressive accumulation of GL3, which occurs in all types of renal cells. It is more prominent in podocytes, which seem to play an important role in the pathogenesis of this nephropathy. Precise detection of renal disorders is of fundamental importance because specific treatment of FD is usually delayed, making reversibility unlikely and leading to a worse prognosis. Limitations: As no formal tool was applied to assess the quality of the included studies, selection bias may have occurred. Nonetheless, we have attempted to provide a comprehensive review of the topic using current studies from experts in FD and an extensive review of the literature.


Proceedings ◽  
2021 ◽  
Vol 77 (1) ◽  
pp. 17
Author(s):  
Andrea Giussani

In the last decade, advances in statistical modeling and computer science have boosted the production of machine-generated content in different fields: from language to image generation, the quality of the generated outputs is remarkably high, sometimes better than that produced by a human being. Modern technological advances such as OpenAI’s GPT-2 (and recently GPT-3) permit automated systems to dramatically alter reality with synthetic outputs, so that humans are not able to distinguish the real item from its synthetic counterpart. An example is given by an article entirely written by GPT-2, but many other examples exist. In the field of computer vision, Nvidia’s Generative Adversarial Network, commonly known as StyleGAN (Karras et al. 2018), has become the de facto reference point for the production of a huge number of fake human face portraits; additionally, recent algorithms were developed to create both musical scores and mathematical formulas. This presentation aims to engage participants with the state-of-the-art results in this field: we will cover both GANs and language modeling with recent applications. The novelty here is that we apply a transformer-based machine learning technique, namely RoBERTa (Liu et al. 2019), to the detection of human-produced versus machine-produced text in the context of fake news detection. RoBERTa is a recent algorithm based on the well-known Bidirectional Encoder Representations from Transformers algorithm, known as BERT (Devlin et al. 2018); this is a bidirectional transformer for natural language processing developed by Google and pre-trained over a huge amount of unlabeled textual data to learn embeddings. We then use these representations as the input of our classifier to detect real vs. machine-produced text. The application is demonstrated in the presentation.
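
A minimal sketch of the kind of classifier the abstract mentions: RoBERTa used for binary sequence classification (human-written vs. machine-generated text) via the Hugging Face transformers API. The checkpoint name, label mapping, and example text are assumptions for illustration, and the classification head shown here is untrained; the presenter's actual fine-tuned model is not reproduced.

```python
# Sketch: RoBERTa-based detector for human- vs. machine-produced text.
# The classifier head is freshly initialized here and would need fine-tuning
# on labelled data before the predictions mean anything.
import torch
from transformers import RobertaTokenizer, RobertaForSequenceClassification

tokenizer = RobertaTokenizer.from_pretrained("roberta-base")
model = RobertaForSequenceClassification.from_pretrained("roberta-base", num_labels=2)
model.eval()

texts = ["Example news paragraph whose origin we want to classify."]
inputs = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
pred = logits.argmax(dim=-1)  # assumed label map: 0 = human-written, 1 = machine-generated
print(pred)
```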


2021 ◽  
Vol 13 (2) ◽  
pp. 301-316
Author(s):  
Hasruddin Dute

This study aims to explain Pluralism and Islam as inseparable entities: they can be distinguished conceptually but cannot be separated in actual reality. The method used in this research is library research, which draws on library materials as a source of information to answer problems concerning educators in education. Islam and religious-education pluralism make religion a concept for creating a sense of unity in the realm of ukhuwah basyariyah, in order to advance and improve the quality of education; therefore, it is the education system that makes religion a moral value rather than a formal institution.


Author(s):  
Mehmet Fatih Yılmaz ◽  
Sedat Kalkan

Objectives: The aim of the study is to evaluate the quality and reliability of videos on manual blood pressure measurement on YouTube. Patients and Methods: In January 2021, the first 100 videos returned by a search with the keywords 'manual blood pressure measurement' on YouTube were watched and evaluated. After applying the exclusion criteria, 75 videos were included in the study; duplicate videos, irrelevant videos, and videos in languages other than English were excluded. Each video was scored according to questions prepared based on the guidelines. The GQS and the 'Reliability' score were used to assess the quality of the videos. Results: According to the checklist prepared from the hypertension consensus report, the mean score of the videos was 8.33 ± 2.1. When the videos were evaluated by source, the average score of videos from health websites was 9 ± 2.5, that of videos from individual health workers was 8.66 ± 1.8, and that of videos from unidentified uploaders was 7.54 ± 2.1. Conclusion: Manual blood pressure measurement videos on YouTube have little educational value; videos from health websites should be preferred for education.


2007 ◽  
Vol 26 (2) ◽  
pp. 27
Author(s):  
Sam Brooks ◽  
Mark Herrick

Index Blending is the process of database development whereby various components are merged and refined to create a single encompassing source of information. Once a research need is determined for a given area of study, existing resources are examined for value and possible contribution to the end product. Index Blending focuses on the quality of bibliographic records as the primary factor with the addition of full text to enhance the end user’s research experience as an added convenience. Key examples of the process of Index Blending involve the fields of communication and mass media, hospitality and tourism, as well as computers and applied sciences. When academia, vendors, subject experts, lexicographers, and other contributors are brought together through the various factors associated with Index Blending, relevant discipline-specific research may be greatly enhanced.


Corruptio ◽  
2020 ◽  
Vol 1 (2) ◽  
pp. 75
Author(s):  
Frisca Tyara M Fanhar

The ambiguity regarding the mechanism and limits of the authority for the arrest operations carried out by the Corruption Eradication Commission raises the public assumption that the authority exercised by the Commission has violated the law and even human rights, namely by taking arbitrary (non-procedural) actions. The problems addressed in this study are: What is the legal force of the Commission's arrest operations when viewed from the Criminal Procedure Code and Law Number 30 of 2002 Concerning the Corruption Eradication Commission? What are the criteria for an alleged crime that warrants an arrest operation? What is the ideal way for the Corruption Eradication Commission to carry out arrest operations? This study uses normative and empirical juridical approaches: normative research examines the theoretical principles of law, whereas the empirical approach studies law as it operates in reality. The results show that, viewed from the Criminal Procedure Code and Law Number 30 of 2002 Concerning the Corruption Eradication Commission, the Commission's arrest operations do not in fact have a strong legal basis in the juridical aspects of criminal law, raising the problem of violating the principle of due process of law. As for the criteria for an alleged crime warranting an arrest operation, the type or quality of the targeted corruption is not a simple crime, hence the need for such an operation. Ideally, in carrying out arrest operations the Commission should have a good case-administration system, starting from the stage of collecting data and information based on accurate and reliable sources, then conducting an investigation in accordance with the established standard operating procedures, and at the execution stage carrying out monitoring, undercover work, and wiretapping; the arrest operation itself must be conducted in accordance with the legislation. The suggestion offered is that the legal basis for arrest operations should be promptly included in the articles of the Corruption Eradication Commission law so that this authority is no longer at issue.


Author(s):  
Victoria Chen

The purpose of this study is to examine whether multimedia learning theory (Mayer, 1997; Schnotz & Kürschner, 2007) holds true when images are the primary source of information and text information is secondary. I will test how the temporal arrangement of audio and image presentations affects the quality of learning in this situation. I hypothesize that when audio is played before or after the image, participants will require increased cognitive processing to mentally integrate the two sources of information, resulting in deeper learning and transfer of learning. On the other hand, when audio is played while the image is shown, I hypothesize that participants with high prior knowledge of the subject will score lower than participants with low prior knowledge, because prior knowledge will interfere with knowledge from the two sources, causing a redundancy effect. This experiment will lead to greater understanding of multimedia teaching and learning in classrooms, as well as how it affects deeper learning.


Author(s):  
Myneni Madhu Bala ◽  
Venkata Krishnaiah Ravilla ◽  
Kamakshi Prasad V ◽  
Akhil Dandamudi

This chapter mainly discusses the dynamic behavior of railway passengers, using Twitter data from regular and emergency situations. Social network data provides dynamic and realistic data in various fields. In line with the chapter's theme, Twitter data from the railway domain can be used to enhance railway services. Using this data, a comprehensive framework for modeling passenger tweet data that incorporates passenger opinions towards facilities provided by railways is discussed. The major issues, namely dynamic data extraction, preparation of Twitter text content, and text processing for finding sentiment levels, are presented through two case studies: sentiment analysis of passengers' opinions about the quality of railway services, and identification of passenger travel demands using geotagged Twitter data. The sentiment analysis classifies passenger opinions towards facilities provided by railways as either positive or negative, based on their journey experiences.
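
A minimal sketch of the sentiment-level step described above, assuming NLTK's VADER analyzer as the scoring tool; the sample tweets, cleaning regex, and decision threshold are illustrative and not the chapter's actual pipeline.

```python
# Sketch: classify tweets about railway services as positive or negative.
import re
import nltk
from nltk.sentiment import SentimentIntensityAnalyzer

nltk.download("vader_lexicon", quiet=True)
sia = SentimentIntensityAnalyzer()

tweets = [
    "Great journey today, the coach was clean and on time!",
    "Train delayed again and no announcement at the station.",
]

for tweet in tweets:
    text = re.sub(r"(@\w+|https?://\S+)", "", tweet)   # strip mentions and URLs
    score = sia.polarity_scores(text)["compound"]      # -1 (negative) .. +1 (positive)
    label = "positive" if score >= 0 else "negative"
    print(f"{label:8s} {score:+.2f}  {tweet}")
```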


2020 ◽  
Vol 2020 ◽  
pp. 1-27
Author(s):  
Jinghua Zhang ◽  
Chen Li ◽  
Frank Kulwa ◽  
Xin Zhao ◽  
Changhao Sun ◽  
...  

To assist researchers to identify Environmental Microorganisms (EMs) effectively, a Multiscale CNN-CRF (MSCC) framework for the EM image segmentation is proposed in this paper. There are two parts in this framework: The first is a novel pixel-level segmentation approach, using a newly introduced Convolutional Neural Network (CNN), namely, “mU-Net-B3”, with a dense Conditional Random Field (CRF) postprocessing. The second is a VGG-16 based patch-level segmentation method with a novel “buffer” strategy, which further improves the segmentation quality of the details of the EMs. In the experiment, compared with the state-of-the-art methods on 420 EM images, the proposed MSCC method reduces the memory requirement from 355 MB to 103 MB, improves the overall evaluation indexes (Dice, Jaccard, Recall, Accuracy) from 85.24%, 77.42%, 82.27%, and 96.76% to 87.13%, 79.74%, 87.12%, and 96.91%, respectively, and reduces the volume overlap error from 22.58% to 20.26%. Therefore, the MSCC method shows great potential in the EM segmentation field.
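
For reference, the evaluation indexes quoted in the abstract (Dice, Jaccard, Recall, Accuracy) and the volume overlap error can be computed from binary masks as in the sketch below; the example masks are made up and this is not the authors' evaluation code.

```python
# Sketch: segmentation metrics for binary masks with NumPy.
import numpy as np

def segmentation_metrics(pred, truth):
    pred, truth = pred.astype(bool), truth.astype(bool)
    tp = np.logical_and(pred, truth).sum()    # true positives
    fp = np.logical_and(pred, ~truth).sum()   # false positives
    fn = np.logical_and(~pred, truth).sum()   # false negatives
    tn = np.logical_and(~pred, ~truth).sum()  # true negatives
    union = tp + fp + fn
    return {
        "Dice": 2 * tp / (2 * tp + fp + fn),
        "Jaccard": tp / union,
        "Recall": tp / (tp + fn),
        "Accuracy": (tp + tn) / (tp + tn + fp + fn),
        "VOE": 1 - tp / union,  # volume overlap error = 1 - Jaccard
    }

pred = np.array([[1, 1, 0], [0, 1, 0], [0, 0, 0]])
truth = np.array([[1, 0, 0], [0, 1, 1], [0, 0, 0]])
print(segmentation_metrics(pred, truth))
```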

