A Knowledge Graph Approach for the Detection of Digital Human Profiles in Big Data

Author(s):  
Cu Kim Long ◽  
Ha Quoc Trung ◽  
Tran Ngoc Thang ◽  
Nguyen Tien Dong ◽  
Pham Van Hai

Digital transformation is a long process that changes how human profiles are managed, both offline and online. It generates huge amounts of data stored in relational databases as well as other sources such as social networks and graph databases. To exploit this big data effectively, several measures and algorithms on Picture Fuzzy Graphs (PFGs) have been applied to complex real-world problems. This paper presents a novel approach that uses a knowledge graph to find human profiles, including the detection of individuals in large datasets. In the proposed model, digital human profiles are collected in real time from conventional databases combined with social networks, and a knowledge graph is created to represent the complex relational attributes of each profile. A PFG is applied to quantify the degree centrality of nodes, and graph techniques and algorithms are then used to classify them. Experiments on the knowledge graph illustrate the proposed model. The main contribution of this paper is identifying the right persons among complex relational groups and locations in real time, based on large datasets drawn from social networks, relational databases, and graph databases.
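As a toy illustration of the centrality step, the sketch below computes degree centrality over a crisp (non-fuzzy) profile graph with Python's networkx and flags high-centrality nodes as candidate profile hubs. It is a minimal sketch only: the node names and the threshold are invented, and the picture-fuzzy membership degrees of a real PFG are deliberately omitted.

```python
# Minimal sketch, assuming a crisp simplification of the PFG model:
# profile attributes become nodes, shared attributes become edges,
# and degree centrality ranks candidate identities. Names are illustrative.
import networkx as nx

G = nx.Graph()
G.add_edges_from([
    ("person_A", "email_1"), ("person_A", "phone_1"),
    ("person_B", "email_1"), ("person_B", "location_1"),
    ("person_C", "location_1"),
])

centrality = nx.degree_centrality(G)   # degree / (n - 1) for each node
threshold = 0.4                        # illustrative cut-off, not from the paper
candidates = [n for n, c in centrality.items() if c >= threshold]

print(sorted(centrality.items(), key=lambda kv: -kv[1]))
print("candidate hub nodes:", candidates)
```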

2021 ◽  
Vol 75 (3) ◽  
pp. 76-82
Author(s):  
G.T. Balakayeva ◽  
D.K. Darkenbayev ◽  
M. Turdaliyev ◽  
...  

The volume of enterprise data has grown significantly in the last decade. Research has shown that over the past two decades the amount of data has increased approximately tenfold every two years, outpacing Moore's law, under which processor power doubles. About thirty thousand gigabytes of data are accumulated every second, and processing them demands ever greater efficiency. Users uploading videos, photos, and messages to social networks accumulate large amounts of data, much of it unstructured. Enterprises therefore have to work with big data in different formats, which must be prepared in a specific way before modeling and calculation can yield results. For these reasons, the research presented in this article on processing and storing enterprise data, on developing a model and algorithms, and on applying new technologies is relevant. Information flows within enterprises will undoubtedly grow every year, so storing and processing large amounts of data are important problems to solve. The relevance of the article also stems from growing digitalization and the increasing shift of professional activity online in many areas of modern society. The article provides a detailed analysis and study of these new technologies.


Author(s):  
Antonio Sarasa-Cabezuelo

The appearance of the "big data" phenomenon has changed storage and information-processing needs. This new context is characterized by three features: 1) enormous amounts of information are available in heterogeneous formats and types, 2) information must be processed almost in real time, and 3) data models evolve periodically. Relational databases are limited in how well they can respond to these needs. For these reasons, companies such as Google and Amazon decided to create new database models, different from the relational model, that meet the needs of the big data context without the limitations of relational databases. These new models are the origin of the so-called NonSQL databases. NonSQL databases are now established as an alternative to the relational model, and their use is widespread. The main objective of this chapter is to introduce NonSQL databases.
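As a hedged illustration of the document model that several NonSQL stores use, the sketch below inserts and queries schemaless records with pymongo; the connection URI, database, and field names are assumptions for this example, not drawn from the chapter.

```python
# Minimal sketch of the document (NoSQL) model with pymongo.
# The connection URI, database, and field names are illustrative assumptions.
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017/")
db = client["bookstore"]

# Schemaless insert: two documents with different shapes coexist.
db.books.insert_one({"title": "Big Data Basics", "tags": ["nosql", "intro"]})
db.books.insert_one({"title": "Graph Stores", "pages": 320})

# Query by an array field without any join.
for doc in db.books.find({"tags": "nosql"}):
    print(doc["title"])
```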


2018 ◽  
Vol 7 (3.33) ◽  
pp. 248
Author(s):  
Young-Woon Kim ◽  
Hyeopgeon Lee

In the automobile industry, the contract information of vehicles contracted through sales activities, the order data of customers who purchased cars, and vehicle maintenance histories all accumulate in relational databases over time. Although this accumulated customer and vehicle information is used for marketing purposes, processing and analyzing it is difficult, as its volume constantly increases. This big data management problem is commonly solved by combining Hadoop's MapReduce distributed processing technology with R, a widely used big data analysis technology. Among the methods that interconnect Hadoop and R, this study used the R and Hadoop Integrated Programming Environment (RHIPE) to build a real-time big data analysis system for marketing in the automobile industry. RHIPE maintains an interactive environment and exposes the powerful analytical features of R, an interpreted language, while achieving high processing speed through Map and Reduce functions. Using the RHIPE method, we developed a real-time big data analysis system that can analyze the orders, reservations, and maintenance histories contained in big data.
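RHIPE itself exposes Hadoop's Map and Reduce phases from R; the sketch below is only an in-memory Python imitation of those two phases on invented maintenance records, intended to make the pattern concrete rather than to reproduce the system's actual code.

```python
# Minimal map/reduce sketch (pure Python, in-memory) mirroring the two
# phases RHIPE delegates to Hadoop. Record fields are illustrative.
from collections import defaultdict

maintenance = [
    {"model": "sedan", "cost": 120}, {"model": "suv", "cost": 300},
    {"model": "sedan", "cost": 80},
]

# Map phase: emit (key, value) pairs per record.
mapped = [(r["model"], r["cost"]) for r in maintenance]

# Shuffle: group values by key.
groups = defaultdict(list)
for key, value in mapped:
    groups[key].append(value)

# Reduce phase: aggregate each group (total maintenance cost per model).
totals = {key: sum(values) for key, values in groups.items()}
print(totals)  # {'sedan': 200, 'suv': 300}
```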


2019 ◽  
Vol 8 (2) ◽  
pp. 2490-2494

Big data is a new technology defined by large amounts of data from which value can be extracted through capture and analysis. Big data poses many challenges owing to characteristics such as volume, velocity, variety, value, complexity, and performance. Many organizations struggle to define test strategies for validating structured and unstructured data, to establish a proper testing environment, to work with non-relational databases, and to maintain functional testing. These challenges result in low-quality data in production, delays in execution, and increased cost. MapReduce provides a parallel and scalable programming model for data-intensive business and scientific applications. The performance of big data applications is defined in terms of response time, maximum online user data capacity, and a certain maximum processing capacity. The proposed work tests healthcare big data, which contains text, image, audio, and video files. Big data documents are tested using two concepts: preprocessing testing and postprocessing testing. An SVM algorithm classifies the data from unstructured into structured format. Preprocessing testing checks all the data for accuracy and includes file size testing, file extension testing, and de-duplication testing. Postprocessing implements the MapReduce concept so that data can be fetched easily.
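A minimal sketch of the three preprocessing tests named above (file size, file extension, and de-duplication) follows; the size cap, allowed extensions, and the use of SHA-256 hashing for de-duplication are illustrative assumptions, not the paper's exact parameters.

```python
# Minimal sketch of the three preprocessing tests; the size limit,
# allowed extensions, and SHA-256 de-duplication are assumptions.
import hashlib
from pathlib import Path

ALLOWED = {".txt", ".jpg", ".wav", ".mp4"}   # text, image, audio, video
MAX_BYTES = 50 * 1024 * 1024                 # assumed 50 MB cap

def preprocess_checks(paths):
    seen_hashes, accepted = set(), []
    for p in map(Path, paths):
        if p.suffix.lower() not in ALLOWED:      # file-extension test
            continue
        if p.stat().st_size > MAX_BYTES:         # file-size test
            continue
        digest = hashlib.sha256(p.read_bytes()).hexdigest()
        if digest in seen_hashes:                # de-duplication test
            continue
        seen_hashes.add(digest)
        accepted.append(p)
    return accepted
```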


2020 ◽  
Vol 69 (1) ◽  
pp. 323-326
Author(s):  
N.B. Zhapsarbek

In the modern world, specialists and the information systems they create increasingly face the need to store, process, and move huge amounts of data. The term Big Data denotes technologies for storing and analyzing large amounts of data that demand high speed and real-time decision making during processing. Big data is characterized here by large volume, a high accumulation rate, and the lack of a strict internal structure, which also means that classic relational databases are poorly suited to storing it. In this article, we present NoSQL-based solutions for processing large amounts of data for pharmacy chains. The paper describes technologies for modeling large amounts of data with NoSQL, including MongoDB, and analyzes possible solutions along with the limitations that prevent this from being done effectively. It provides an overview of three modern approaches to working with big data: NoSQL, data mining, and real-time processing of event streams. As an implementation of the studied methods and technologies, we consider a pharmacy database for processing, searching, analyzing, and forecasting big data. Using NoSQL, we also demonstrate parallel work with structured and poorly structured data in different aspects and give a comparative analysis of the newly developed application for pharmacy workers.
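As a hedged illustration of how such semi-structured pharmacy records can be modeled and aggregated in MongoDB, the sketch below uses pymongo; the collection layout and field names are assumptions for this example, not the article's actual schema.

```python
# Minimal sketch, assuming an invented pharmacy sales collection.
from pymongo import MongoClient

db = MongoClient("mongodb://localhost:27017/")["pharmacy_chain"]

# Semi-structured records: optional fields are simply absent.
db.sales.insert_many([
    {"drug": "ibuprofen", "branch": "north", "qty": 3, "rx": False},
    {"drug": "insulin",   "branch": "north", "qty": 1},
    {"drug": "ibuprofen", "branch": "south", "qty": 2, "rx": False},
])

# Aggregation pipeline: total quantity sold per drug.
pipeline = [{"$group": {"_id": "$drug", "total": {"$sum": "$qty"}}}]
for row in db.sales.aggregate(pipeline):
    print(row)   # e.g. {'_id': 'ibuprofen', 'total': 5}
```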


2015 ◽  
Vol 09 (04) ◽  
pp. 523-545 ◽  
Author(s):  
Shao-Ting Wang ◽  
Jennifer Jin ◽  
Pete Rivett ◽  
Atsushi Kitazawa

Graph databases can be defined as databases that use graph structures with nodes, edges, and properties to store data; they are accessed through semantic queries and graph-oriented operations. With the rapid growth of information on the Internet in recent years, relational databases suffer performance degradation as large numbers of entities are added, because of the growing number of entries in join tables. Reflecting the network nature of Internet activity, graph databases are therefore designed for fast access to the complex data found in social networks, recommendation engines, and networked systems. The main objective of this survey is to present the work that has been done in the area of graph databases, including query languages, processing, and related applications.
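A minimal sketch of the node/edge/property model the survey describes, using networkx: traversal follows adjacency directly rather than scanning join tables. All names are illustrative.

```python
# Nodes and edges carry properties; a multi-hop query walks adjacency
# lists directly instead of joining tables. Names are illustrative.
import networkx as nx

G = nx.DiGraph()
G.add_node("alice", role="user")
G.add_node("bob", role="user")
G.add_node("post_42", kind="post")
G.add_edge("alice", "bob", rel="follows")
G.add_edge("bob", "post_42", rel="likes")

# Two-hop traversal: what do the people Alice follows like?
for friend in G.successors("alice"):
    for item, attrs in G[friend].items():
        if attrs.get("rel") == "likes":
            print(f"{friend} likes {item}")   # bob likes post_42
```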


2018 ◽  
Vol 2 (1) ◽  
Author(s):  
Fernando Almeida

The evolution of information systems and the growth in the use of the Internet and social networks have caused an explosion in the amount of available data relevant to companies' activities. The treatment of these data is therefore vital to support operational, tactical, and strategic decisions. This paper presents the concept of big data and the main technologies that support the analysis of large data volumes. The potential of big data is explored across nine sectors of activity: financial, retail, healthcare, transport, agriculture, energy, manufacturing, public, and media and entertainment. In addition, the main current opportunities, vulnerabilities, and privacy challenges of big data are discussed. The paper concludes that despite the potential for big data to grow in the identified areas, several challenges still need to be considered and mitigated, namely the privacy of information, the availability of qualified human resources to work with big data, and the promotion of a data-driven organizational culture.


2019 ◽  
Vol 16 (8) ◽  
pp. 3332-3337 ◽  
Author(s):  
S. Dhamodaran ◽  
G. Mahalakshmi ◽  
P. Harika ◽  
J. Refonaa ◽  
K. AshokKumar

Authorities can make real-time decisions and plan for the future by analyzing geo-social media posts in geo-social networks. However, millions of geo-social network users produce an overwhelming amount of data, called "big data," that is challenging to analyze for the required real-time decisions. We propose an efficient system for querying geo-social networks and harvesting their data as well as users' location information (Dhamodaran, S. and Lakshmi, M., 2017. Design and Analysis of Spatial-Temporal Model Using Hydrological Techniques. IEEE International Conference on Computing of Power, Energy and Communication. pp. 1–4). A system architecture is proposed that processes an abundant amount of data from various social networks to monitor Earth events, incidents, medical diseases, user trends, and opinions, in order to make real-time decisions and facilitate future planning (Dhamodaran, S., et al., Identification of User POI in Spatial Data Using Android Application. International Conference on Computation of Power, Energy Information and Communication (ICCPEIC), IEEE. ISBN: 978-1-5090-0901-5). Twitter and Flickr have been analyzed using the proposed architecture to identify current events or disasters, such as earthquakes, snow, the Ebola virus, and fires. The system is evaluated with respect to data efficiency while considering system throughput.
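The following is not the paper's architecture, only a minimal sketch of the kind of keyword-based event filtering such a system might perform over geo-tagged posts; the post format and keyword sets are invented for illustration.

```python
# Minimal sketch: bucket geo-tagged posts by event type using keyword
# sets. The post format and keyword lists are illustrative assumptions.
EVENT_KEYWORDS = {
    "earthquake": {"earthquake", "tremor", "quake"},
    "fire": {"fire", "wildfire", "smoke"},
}

def classify_posts(posts):
    hits = {event: [] for event in EVENT_KEYWORDS}
    for post in posts:
        words = set(post["text"].lower().split())
        for event, keywords in EVENT_KEYWORDS.items():
            if words & keywords:
                hits[event].append((post["lat"], post["lon"]))
    return hits

posts = [{"text": "Strong tremor felt downtown", "lat": 35.7, "lon": 51.4}]
print(classify_posts(posts))   # {'earthquake': [(35.7, 51.4)], 'fire': []}
```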


2019 ◽  
Vol 11 (12) ◽  
pp. 249 ◽  
Author(s):  
Ilaria Bartolini ◽  
Marco Patella

The avalanche of (both user- and device-generated) multimedia data published in online social networks poses serious challenges to researchers seeking to analyze such data for many different tasks, like recommendation, event recognition, and so on. For some such tasks, the classical "batch" approach of big data analysis is not suitable, due to constraints of real-time or near-real-time processing. This led to the rise of stream processing big data platforms, like Storm and Flink, that are able to process data with very low latency. However, this complicates the task of data analysis, since any implementation has to deal with the technicalities of such platforms, like distributed processing, synchronization, node faults, etc. In this paper, we show how the RAM3S framework can be profitably used to easily implement a variety of applications (such as clothing recommendations, job suggestions, and alert generation for dangerous events) while remaining independent of the particular stream processing big data platform used. Indeed, by using RAM3S, researchers can concentrate on the development of their data analysis application, completely ignoring the details of the underlying platform.
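RAM3S's actual API is not reproduced here; the sketch below only illustrates the platform-independence idea in Python: the analysis logic implements one abstract interface, and a thin per-platform adapter (Storm, Flink, or a local stand-in) feeds it stream items.

```python
# Minimal sketch of the platform-independence idea (not RAM3S's actual
# API): analyzers implement one interface; adapters supply the stream.
from abc import ABC, abstractmethod

class StreamAnalyzer(ABC):
    @abstractmethod
    def process(self, item: dict) -> None: ...

class AlertAnalyzer(StreamAnalyzer):
    def process(self, item: dict) -> None:
        if item.get("danger_score", 0.0) > 0.9:   # illustrative rule
            print("ALERT:", item["id"])

def run_with_local_adapter(analyzer: StreamAnalyzer, stream):
    # Local stand-in for a Storm/Flink adapter: same analyzer, any engine.
    for item in stream:
        analyzer.process(item)

run_with_local_adapter(AlertAnalyzer(), [{"id": "frame-1", "danger_score": 0.95}])
```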

