Factors Affecting The Usability Of Unstructured Big Data

Author(s):  
Joshua Devadason ◽  
Rehan Akbar

Big data is a valuable asset for organisations because its analysis helps them understand their customers, changes within their business environment, market conditions and future trends. Big data is multifaceted (comprising different, versatile data types) and mostly exists in unstructured formats. Extracting value from this data is challenging, and the usability and productivity of this multifaceted unstructured data are greatly compromised. A number of factors and associated reasons affect the usability of unstructured big data. The present research work investigates these factors and the associated reasons behind the usability issues of multifaceted unstructured big data. Identifying these factors contributes to developing solutions that reduce the lack of usability of highly unstructured big data. A detailed study of the existing literature, followed by a survey questionnaire, was conducted to identify the factors and their reasons. Descriptive statistics were used to analyse and interpret the data and results.

2019 ◽  
Vol 6 (1) ◽  
Author(s):  
Kiran Adnan ◽  
Rehan Akbar

Information extraction (IE) is used to extract useful information from unstructured or semi-structured data. Big data poses new challenges for IE techniques with the rapid growth of multifaceted, also called multidimensional, unstructured data. Traditional IE systems are inefficient at dealing with this deluge of unstructured big data. The volume and variety of big data demand improved computational capabilities from these IE systems. It is necessary to understand the competency and limitations of the existing IE techniques related to data pre-processing, data extraction and transformation, and representations for huge volumes of multidimensional unstructured data. Numerous studies have been conducted on IE, addressing the challenges and issues for different data types such as text, image, audio and video. Very little consolidated research has been conducted to investigate the task-dependent and task-independent limitations of IE covering all data types in a single study. This research work addresses this limitation and presents a systematic literature review of state-of-the-art techniques for a variety of big data, consolidating all data types. Recent challenges of IE are also identified and summarized. Potential solutions are proposed, giving future research directions in big data IE. The research is significant in terms of recent trends and challenges related to big data analytics. The outcome of the research and its recommendations will help to improve big data analytics by making it more productive.


2019 ◽  
Vol 11 ◽  
pp. 184797901989077 ◽  
Author(s):  
Kiran Adnan ◽  
Rehan Akbar

During the recent era of big data, a huge volume of unstructured data is being produced in various forms such as audio, video, images, text, and animation. Effective use of this unstructured big data is a laborious and tedious task. Information extraction (IE) systems help to extract useful information from this large variety of unstructured data. Several techniques and methods have been presented for IE from unstructured data. However, numerous studies conducted on IE from a variety of unstructured data are limited to single data types such as text, image, audio, or video. This article reviews the existing IE techniques along with their subtasks, limitations, and challenges for the variety of unstructured data, highlighting the impact of unstructured big data on IE techniques. To the best of our knowledge, no comprehensive study has been conducted to investigate the limitations of existing IE techniques for the variety of unstructured big data. The objective of the structured review presented in this article is twofold. First, it presents an overview of IE techniques for a variety of unstructured data such as text, image, audio, and video in one place. Second, it investigates the limitations of these existing IE techniques due to the heterogeneity, dimensionality, and volume of unstructured big data. The review finds that advanced techniques for IE, particularly for multifaceted unstructured big data sets, are the utmost requirement of organizations to manage big data and derive strategic information. Further, potential solutions are also presented to improve unstructured big data IE systems in future research. These solutions will help to increase the efficiency and effectiveness of the data analytics process in terms of context-aware analytics systems, data-driven decision-making, and knowledge management.


2020 ◽  
Vol 83 ◽  
pp. 01008
Author(s):  
Matej Černý

This paper focuses on how a business can analyze all data types (structured and unstructured) in one cooperative environment. Structured data is handled by Business Intelligence, whereas unstructured data is handled by Big Data. As a solution to this issue, we propose our Business Intelligence and Big Data ecosystem. This ecosystem model is based on already proven data processing processes running in the Business Intelligence and Big Data areas. Both processes are integrated into one unit. We also describe how they function together.


Author(s):  
Patil N. S. ◽  
Kiran P ◽  
Kiran N. P. ◽  
Naresh Patel K. M.

Data analysis, data management, and big data have played a major role from both social and business perspectives over the last decade. Nowadays, the graph database is a trending research topic. A graph database is preferred for dealing with the dynamic and complex relationships in connected data and offers better results. Every data element is represented as a node. For example, on a social media site, a person is represented as a node with properties such as name, age, likes, and dislikes, and the nodes are connected by relationships via edges. The use of graph databases is expected to be beneficial in business and on social networking sites that generate huge amounts of unstructured data, as such Big Data requires proper and efficient computational techniques to handle. This paper reviews the existing graph data computational techniques and research work in order to outline future research directions in graph database management.
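
As an illustration of the node, property and edge model described in this abstract, the following is a minimal in-memory sketch. It is not taken from the paper and does not use any real graph database API. The names `Node`, `PropertyGraph`, `add_node`, `add_edge` and `neighbours`, the sample people, and the `FOLLOWS` relationship are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A data element with an identifier and free-form properties (hypothetical type)."""
    node_id: str
    properties: dict = field(default_factory=dict)


class PropertyGraph:
    """Toy in-memory graph: nodes keyed by id, edges as (source, relationship, target)."""

    def __init__(self):
        self.nodes = {}
        self.edges = []

    def add_node(self, node_id, **properties):
        self.nodes[node_id] = Node(node_id, properties)

    def add_edge(self, source_id, relationship, target_id):
        # Refuse edges that point at nodes that were never added.
        if source_id not in self.nodes or target_id not in self.nodes:
            raise KeyError("both endpoints must exist before adding an edge")
        self.edges.append((source_id, relationship, target_id))

    def neighbours(self, node_id, relationship):
        # Follow outgoing edges of one relationship type from node_id.
        return [t for s, r, t in self.edges if s == node_id and r == relationship]


# A person is a node whose properties are name, age, likes and dislikes.
graph = PropertyGraph()
graph.add_node("alice", name="Alice", age=30, likes=["cycling"], dislikes=["spam"])
graph.add_node("bob", name="Bob", age=27, likes=["chess"], dislikes=[])
graph.add_edge("alice", "FOLLOWS", "bob")
```

A production graph database would add indexing, persistence and a query language; the sketch only shows how properties attach to nodes and how relationships are stored as edges.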


2020 ◽  
Vol 34 (6) ◽  
pp. 701-708
Author(s):  
Venkat Rayala ◽  
Satyanarayan Reddy Kalli

Clustering has emerged as a powerful mechanism for analyzing the massive data generated by modern applications; its main aim is to categorize the data into clusters, with objects grouped into particular categories. However, clustering big data poses various challenges. Deep learning has been a powerful paradigm for big data analysis, but it requires a huge number of samples for training the model, which is time consuming and expensive. This can be avoided through a fuzzy approach. In this research work, we design and develop an Improvised Fuzzy C-Means (IFCM), which comprises an encoder-decoder Convolutional Neural Network (CNN) model and the Fuzzy C-Means (FCM) technique, to enhance the clustering mechanism. The encoder-decoder based CNN is used for feature learning and faster computation. In the general FCM, we introduce a function that measures the distance between the cluster center and an instance, which helps in achieving better clustering, and later we introduce an Optimized Encoder Decoder (OED) CNN model to improve performance and speed up computation. To evaluate the proposed mechanism, three distinct datasets, namely Modified National Institute of Standards and Technology (MNIST), Fashion MNIST and United States Postal Service (USPS), are used, and the evaluation considers performance metrics such as Accuracy, Adjusted Rand Index (ARI) and Normalized Mutual Information (NMI). Moreover, a comparative analysis carried out on each dataset shows that IFCM outperforms the existing model.
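
For readers unfamiliar with the baseline technique, the following is a minimal sketch of standard Fuzzy C-Means, not the IFCM or OED CNN model proposed in the paper. It assumes a Euclidean distance between instance and cluster center and a fuzzifier `m = 2`. The function names, the toy points and the initial centers are hypothetical.

```python
import math


def distance(point, center):
    # Euclidean distance between an instance and a cluster center (assumed metric).
    return math.sqrt(sum((p - c) ** 2 for p, c in zip(point, center)))


def update_memberships(points, centers, m=2.0):
    # u_ij = 1 / sum_k (d_ij / d_ik)^(2 / (m - 1)), the standard FCM membership rule.
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for point in points:
        dists = [distance(point, c) for c in centers]
        if any(d == 0.0 for d in dists):
            # The point sits exactly on a center: give it full membership there.
            hits = [1.0 if d == 0.0 else 0.0 for d in dists]
            memberships.append([h / sum(hits) for h in hits])
            continue
        memberships.append(
            [1.0 / sum((d_j / d_k) ** exponent for d_k in dists) for d_j in dists]
        )
    return memberships


def update_centers(points, memberships, m=2.0):
    # Each center is the mean of all points weighted by membership^m.
    dim = len(points[0])
    centers = []
    for j in range(len(memberships[0])):
        weights = [row[j] ** m for row in memberships]
        total = sum(weights)
        centers.append(
            [sum(w * p[d] for w, p in zip(weights, points)) / total for d in range(dim)]
        )
    return centers


def fuzzy_c_means(points, centers, m=2.0, iterations=20):
    # Alternate the two updates for a fixed number of iterations.
    for _ in range(iterations):
        memberships = update_memberships(points, centers, m)
        centers = update_centers(points, memberships, m)
    return centers, update_memberships(points, centers, m)


# Toy data: two well-separated groups of 2-D points.
points = [[0.0, 0.0], [0.1, 0.2], [5.0, 5.1], [5.2, 4.9]]
centers, memberships = fuzzy_c_means(points, centers=[[0.0, 1.0], [4.0, 4.0]])
```

Each row of `memberships` sums to one, so every instance belongs to every cluster to some degree. This soft assignment is what distinguishes FCM from hard clustering such as k-means.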


Chapter 6 provides a summary of the topics around the Community of Inquiry, big data frameworks and tools, and additional commentary on these constructs. Additionally, the authors provide a concrete example of research work that has been updated with use of emerging big data technologies, provide concrete advice for future researchers working in these same or similar research areas, and describe further insights and sharing of the authors' research as it connects to constructs related to the CoI framework and online teaching and learning. Finally, the chapter includes predictions for future trends relating to big data and the constructs of the Community of Inquiry. Overall predictions are towards automated data analysis tools that are capable of looking into newer areas of analyses such as affective computing. A list of additional readings is included.


Author(s):  
Stephen H. Kiasler ◽  
William H. Money ◽  
Stephen J. Cohen

The world of data has been evolving due to the expansion of operations and the complexity of the data processed by systems. Big Data no longer consists of numbers and characters but of unstructured data types collected by a variety of devices. Recent work has postulated that the Big Data evolutionary process is making a conceptual leap to incorporate intelligence. This challenges system engineers with new issues as they envision and create service systems to process and incorporate these new data sets and structures. This article proposes that Big Data has not yet made a complete evolutionary leap, but rather that a new class of data (a higher level of abstraction) is needed to integrate this “intelligence” concept. This article examines previous definitions of Smart Data, offers a new conceptualization for smart objects (SO), examines the smart data concept, and identifies issues and challenges of understanding smart objects as a new data managed software paradigm. It concludes that smart objects incorporate new features and have different properties from passive and inert Big Data.


2021 ◽  
Vol 2 (1) ◽  
pp. 61-85
Author(s):  
Akshay Kumar ◽  
T. V. Vijay Kumar

Advances in technology have resulted in the generation of a large volume of heterogeneous big data for large enterprises engaged in e-commerce, healthcare, education, etc. This data is being created at a rapid rate but is low in veracity. It includes large sets of semi-structured and unstructured data and is stored over a distributed file system (DFS). This data can be processed in a fault tolerant manner using several frameworks, tools, and advanced database technologies. Big data can provide important information, which can be used for business decision making. View materialization, which has been widely studied for structured databases or data warehouses, has been extended to big data to enhance the efficiency of big data query processing. This paper focuses on the selection of big data views for materialization. The big data views can be identified by extracting a set of query attributes from the query workload of an enterprise. The query attributes are interrelated, resulting in the creation of alternate access paths for query evaluation. The cost of query processing using big data views involves the integrity of different data types of heterogeneous big data, frequency of queries, change in the size of big data, selected sets of big data materialized views, and updates on big data and these sets of materialized views. The cost of query processing is computed using the stored size of big data views on the DFS system, which is a consistent processing framework of DFS. A big data view selection algorithm that is capable of selecting views from structured, semi-structured, and unstructured data has been proposed in this paper. The proposed algorithm would select big data views that would result in faster processing of most user queries, resulting in efficient decision making.


OENO One ◽  
2019 ◽  
Vol 53 (2) ◽  
pp. 107-127 ◽  
Author(s):  
Gabriella Petrovic ◽  
Jose-Luis Aleixandre-Tudo ◽  
Astrid Buica

Aim: Yeast Assimilable Nitrogen (YAN) has been identified as one of the main drivers of wine quality, influencing the production of various aromas and ensuring a successful fermentation to dryness. Due to the number of factors affecting YAN concentration and composition, paired with the complexities of yeast metabolism, more data is required to enable a comprehensive understanding of this important component of the grape juice matrix. The use of high-throughput and information-rich techniques such as infrared spectroscopy can lead to the fast generation of a large amount of data. In addition, it is possible to maximise the information output of the generated data when it is combined with various descriptive and exploratory statistical techniques. Conclusion: Given the recent developments in the fields of analytical equipment and chemometrics, the review explores the possibility of a Big Data approach for the research of one of the most important and versatile grape juice parameters, namely YAN.

