Information Extraction from Unstructured Data Sets: An Application to Cardiac Arrhythmia Detection

Author(s):  
Omar Behadada
2019 ◽  
Vol 11 ◽  
pp. 184797901989077 ◽  
Author(s):  
Kiran Adnan ◽  
Rehan Akbar

In the current era of big data, a huge volume of unstructured data is being produced in the form of audio, video, images, text, and animation. Making effective use of this unstructured big data is laborious and tedious. Information extraction (IE) systems help to extract useful information from this large variety of unstructured data, and several techniques and methods have been presented for the purpose. However, most studies on IE from unstructured data are limited to a single data type such as text, image, audio, or video. This article reviews the existing IE techniques along with their subtasks, limitations, and challenges across the variety of unstructured data, highlighting the impact of unstructured big data on IE techniques. To the best of our knowledge, no comprehensive study has investigated the limitations of existing IE techniques for the variety of unstructured big data. The objective of the structured review presented in this article is twofold. First, it gives an overview, in one place, of IE techniques for a variety of unstructured data such as text, image, audio, and video. Second, it investigates the limitations of these existing IE techniques that arise from the heterogeneity, dimensionality, and volume of unstructured big data. The review finds that organizations urgently need advanced IE techniques, particularly for multifaceted unstructured big data sets, in order to manage big data and derive strategic information. Potential solutions for improving unstructured big data IE systems are also presented as directions for future research. These solutions will help to increase the efficiency and effectiveness of the data analytics process in terms of context-aware analytics systems, data-driven decision-making, and knowledge management.


2008 ◽  
Vol 31 ◽  
pp. 543-590 ◽  
Author(s):  
M. Michelson ◽  
C. A. Knoblock

In order for agents to act on behalf of users, they will have to retrieve and integrate vast amounts of textual data on the World Wide Web. However, much of the useful data on the Web is neither grammatical nor formally structured, making querying difficult. Examples of these types of data sources are online classifieds like Craigslist and auction item listings like eBay. We call this unstructured, ungrammatical data "posts." The unstructured nature of posts makes query and integration difficult because the attributes are embedded within the text. Also, these attributes do not conform to standardized values, which prevents queries based on a common attribute value. The schema is unknown and the values may vary dramatically, making accurate search difficult. Creating relational data for easy querying requires that we define a schema for the embedded attributes and extract values from the posts while standardizing these values. Traditional information extraction (IE) is inadequate to perform this task because it relies on clues from the data, such as structure or natural language, neither of which is found in posts. Furthermore, traditional information extraction does not incorporate data cleaning, which is necessary to accurately query and integrate the source. The two-step approach described in this paper creates relational data sets from unstructured and ungrammatical text by addressing both issues. To do this, we require a set of known entities called a "reference set." The first step aligns each post to each member of each reference set. This allows our algorithm to define a schema over the post and include standard values for the attributes defined by this schema. The second step performs information extraction for the attributes, including attributes not easily represented by reference sets, such as a price. In this manner we create a relational structure over previously unstructured data, supporting deep and accurate queries over the data as well as standard values for integration. Our experimental results show that our technique matches the posts to the reference set accurately and efficiently and outperforms state-of-the-art extraction systems on the extraction task from posts.
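The two-step idea in this abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the reference set, the fuzzy-matching score (Python's `difflib.SequenceMatcher`), and the price pattern are all assumptions chosen only to show how aligning a post to a known record can supply a schema and standard values, with a separate extraction step for attributes such as price that a reference set cannot enumerate.

```python
import re
from difflib import SequenceMatcher

# Hypothetical reference set of known entities (illustrative, not from the paper).
REFERENCE_SET = [
    {"make": "Honda", "model": "Civic"},
    {"make": "Honda", "model": "Accord"},
    {"make": "Toyota", "model": "Corolla"},
]


def token_score(post_tokens, value):
    """Best fuzzy similarity between any post token and one reference value."""
    target = value.lower()
    return max(
        (SequenceMatcher(None, tok, target).ratio() for tok in post_tokens),
        default=0.0,
    )


def align(post):
    """Step 1 (assumed scoring): pick the reference record that best matches the post."""
    tokens = re.findall(r"[a-z0-9]+", post.lower())
    return max(
        REFERENCE_SET,
        key=lambda rec: sum(token_score(tokens, v) for v in rec.values()),
    )


def extract(post):
    """Step 2: take standard values from the matched record, then pull the price by pattern."""
    record = dict(align(post))
    match = re.search(r"\$\s?(\d[\d,]*)", post)
    record["price"] = int(match.group(1).replace(",", "")) if match else None
    return record


row = extract("2004 hnda civc lx, runs great $4,500 obo")
```

Here the misspelled tokens "hnda" and "civc" are still mapped to the standard values "Honda" and "Civic", which is what makes the resulting row queryable by a common attribute value. A real system would need a match threshold so that posts with no counterpart in the reference set are not forced onto the nearest record.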


Author(s):  
Hadeel AbdElraheem Altejani Badawi ◽  
Maysaa AbdAlgader Abdalrahman Megdar ◽  
Mohammed A. Zarrouq Yousif ◽  
Ebtisam Muawia MohammedKhair Mustafa ◽  
Najwan Othman Mohammed Abdalrheem

2019 ◽  
Vol 35 (10) ◽  
pp. 1659-1670 ◽  
Author(s):  
Mihran Yenikomshian ◽  
John Jarvis ◽  
Cody Patton ◽  
Christopher Yee ◽  
Richard Mortimer ◽  
...  

Author(s):  
Chetan M. Jadhav ◽  
V. K. Bairagi

The term arrhythmia refers to any change from the normal sequence of the electrical impulses; it is also described as an abnormal heart rhythm or irregular heartbeat. The incidence of cardiac arrhythmia is growing rapidly, and its effects can be observed in every age group in society. Arrhythmia can be detected in many ways, but a simple and effective method for its detection and diagnosis is the analysis of electrocardiogram (ECG) signals from ECG sensors. The ECG signal gives detailed information about heart activity, so it can be used to determine the rhythm and behaviour of heartbeats and thereby detect and diagnose cardiac arrhythmia. This paper proposes a new and improved methodology for the early detection and classification of cardiac arrhythmia. ECG signals are captured using ECG sensors and processed to obtain the required heartbeat data, and the proposed methodology is then applied for detection and classification. Detecting cardiac arrhythmia from ECG signals offers an easy, reliable, and low-cost way to diagnose arrhythmia at an early stage.
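The abstract does not specify how heartbeat data are derived from the ECG, so the following is only a generic sketch of one common rhythm check, not the proposed methodology. It assumes R-peak timestamps (in seconds) have already been detected, computes RR intervals and mean heart rate, and flags the rhythm using the usual resting-adult 60 to 100 bpm range plus a hypothetical variability threshold (`cv_threshold`) chosen purely for illustration.

```python
from statistics import mean, pstdev


def rr_intervals(r_peak_times):
    """Time between consecutive R peaks, in seconds."""
    return [later - earlier for earlier, later in zip(r_peak_times, r_peak_times[1:])]


def screen_rhythm(r_peak_times, cv_threshold=0.15):
    """Coarse rhythm label from R-peak times.

    cv_threshold is an illustrative assumption, not a clinical value.
    """
    rr = rr_intervals(r_peak_times)
    if len(rr) < 2:
        raise ValueError("need at least three R peaks")
    avg = mean(rr)
    bpm = 60.0 / avg
    cv = pstdev(rr) / avg  # coefficient of variation of the RR intervals
    if cv > cv_threshold:
        label = "irregular"
    elif bpm < 60:
        label = "slow"
    elif bpm > 100:
        label = "fast"
    else:
        label = "regular"
    return {"bpm": bpm, "rr_cv": cv, "label": label}


steady = screen_rhythm([0.0, 0.8, 1.6, 2.4, 3.2])
erratic = screen_rhythm([0.0, 0.5, 1.6, 2.0, 3.3])
```

A check like this only screens for rate and regularity; classifying specific arrhythmia types would require morphology features or a trained classifier, which the abstract leaves unspecified.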


2019 ◽  
Vol 40 (5) ◽  
pp. 054009 ◽  
Author(s):  
Shenda Hong ◽  
Yuxi Zhou ◽  
Meng Wu ◽  
Junyuan Shang ◽  
Qingyun Wang ◽  
...  

2018 ◽  
Vol 7 (4.19) ◽  
pp. 1041 ◽  
Author(s):  
Santosh V. Chobe ◽  
Dr. Shirish S. Sane

The explosive growth of information on the Internet makes extracting relevant data from various sources a difficult task for users. Information extraction (IE) systems are therefore needed to transform Web pages into databases. Relevant information in Web documents can be extracted using IE and presented in a structured format. IE techniques can be applied to structured, semi-structured, and unstructured data. This paper presents some of the major information extraction tools and discusses their advantages and limitations from a user's perspective.

