Big Data Market Optimization Pricing Model Based on Data Quality

Complexity ◽  
2019 ◽  
Vol 2019 ◽  
pp. 1-10 ◽  
Author(s):  
Jian Yang ◽  
Chongchong Zhao ◽  
Chunxiao Xing

In recent years, data has become a special kind of information commodity, and its distribution has promoted the development of the information commodity economy. With the growth of big data, data markets have emerged to facilitate data transactions. However, the issues of optimal pricing and data quality allocation in the big data market have not yet been fully studied. In this paper, we propose a big data market pricing model based on data quality. We first analyze the dimensional indicators that affect data quality and establish a linear evaluation model. Then, from the perspective of data science, we analyze the impact of quality level on big data analysis (i.e., machine learning algorithms) and define a utility function of data quality. Experiments on real data sets demonstrate the applicability of the proposed quality utility function. In addition, we formulate the profit maximization problem and provide a theoretical analysis. Finally, numerical examples illustrate how the data market can maximize profit through the proposed model.
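To make the pipeline concrete, the sketch below strings together the three ingredients the abstract names: a linear quality evaluation over dimensional indicators, a quality utility function, and a profit maximization step. The weights, the concave utility shape, and the demand model are illustrative assumptions, not the authors' exact formulation.

```python
import numpy as np

# 1) Linear quality evaluation over dimensional indicators (e.g., accuracy,
#    completeness, consistency, timeliness), each scored in [0, 1].
#    Weights and scores below are assumed for illustration.
weights = np.array([0.4, 0.3, 0.2, 0.1])
indicators = np.array([0.9, 0.8, 0.7, 0.95])
quality = float(weights @ indicators)          # linear evaluation model

# 2) Quality utility: assumed concave (diminishing returns), capturing the
#    effect of quality level on downstream analysis accuracy.
def quality_utility(q, k=3.0):
    return 1.0 - np.exp(-k * q)

# 3) Profit maximization: assumed linear demand, decreasing in price and
#    increasing in quality utility; grid search over candidate prices.
def profit(price, q, a=100.0, b=8.0, c=60.0, unit_cost=2.0):
    demand = max(a - b * price + c * quality_utility(q), 0.0)
    return (price - unit_cost) * demand

prices = np.linspace(0.0, 20.0, 401)
best_price = max(prices, key=lambda p: profit(p, quality))
print(f"quality score = {quality:.3f}, profit-maximizing price ~ {best_price:.2f}")
```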

2019 ◽  
Vol 9 (15) ◽  
pp. 3065 ◽  
Author(s):  
Dresp-Langley ◽  
Ekseth ◽  
Fesl ◽  
Gohshi ◽  
Kurz ◽  
...  

Detecting quality in large unstructured datasets requires capacities far beyond the limits of human perception and communicability and, as a result, there is an emerging trend towards increasingly complex analytic solutions in data science to cope with this problem. This new trend towards analytic complexity represents a severe challenge for the principle of parsimony (Occam's razor) in science. This review article combines insight from various domains such as physics, computational science, data engineering, and cognitive science to review the specific properties of big data. The problems of detecting data quality without abandoning the principle of parsimony are then highlighted on the basis of specific examples. Computational building block approaches for data clustering can help to deal with large unstructured datasets in minimal computation time, and meaning can be extracted rapidly and parsimoniously from large sets of unstructured image or video data through relatively simple unsupervised machine learning algorithms. The review then examines why we still largely lack the expertise to exploit big data wisely, whether to extract relevant information for specific tasks, recognize patterns and generate new information, or simply store and further process large amounts of sensor data, and it brings forward examples illustrating why subjective views and pragmatic methods are needed to analyze big data contents. The review concludes by considering how cultural differences between East and West are likely to affect the course of big data analytics, and the development of increasingly autonomous artificial intelligence (AI) aimed at coping with the big data deluge in the near future.
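As a concrete illustration of the "simple unsupervised algorithm on large unstructured data" idea mentioned above, the following is a minimal sketch using scikit-learn's MiniBatchKMeans on flattened image patches. The data, feature representation, and cluster count are illustrative assumptions, not examples from the review.

```python
import numpy as np
from sklearn.cluster import MiniBatchKMeans

# Illustrative only: random "image patches" stand in for real unstructured data,
# and the cluster count is an arbitrary choice.
rng = np.random.default_rng(0)
patches = rng.random((10_000, 8 * 8))        # 10k flattened 8x8 grayscale patches

# MiniBatchKMeans keeps computation time modest on large datasets by fitting
# on small random batches rather than the full data at every iteration.
kmeans = MiniBatchKMeans(n_clusters=16, batch_size=1024, random_state=0)
labels = kmeans.fit_predict(patches)

# Each cluster centroid can be inspected as an 8x8 "prototype" patch:
# a simple, parsimonious summary of the dataset's recurring structure.
prototypes = kmeans.cluster_centers_.reshape(-1, 8, 8)
print(labels[:10], prototypes.shape)
```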


MapReduce is a programming model used for processing Big Data, and there has been considerable research into improving its performance. This paper examines the performance of the MapReduce model using the K-Means algorithm on a Hadoop cluster. Different input sizes were run on various configurations to discover the impact of the number of CPU cores and the primary memory size. The results of this evaluation show that the number of cores had the greatest impact on the performance of the MapReduce model.
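For readers unfamiliar with how K-Means maps onto the MapReduce model studied here, the sketch below expresses one K-Means iteration as mapper and reducer functions. Plain Python stands in for Hadoop's mapper/reducer machinery, and the data, cluster count, and iteration count are illustrative assumptions.

```python
import numpy as np
from collections import defaultdict

def mapper(point, centroids):
    """Emit (nearest-centroid index, (point, 1)) for a single data point."""
    distances = np.linalg.norm(centroids - point, axis=1)
    return int(np.argmin(distances)), (point, 1)

def reducer(pairs_for_key):
    """Recompute one centroid as the mean of all points assigned to it."""
    total = sum(p for p, _ in pairs_for_key)
    count = sum(c for _, c in pairs_for_key)
    return total / count

rng = np.random.default_rng(0)
data = rng.random((1000, 2))                          # example input split
centroids = data[rng.choice(len(data), 3, replace=False)]

for _ in range(10):                                   # fixed number of iterations
    grouped = defaultdict(list)                       # the "shuffle" phase
    for point in data:
        key, value = mapper(point, centroids)
        grouped[key].append(value)
    centroids = np.array([reducer(grouped[k]) for k in sorted(grouped)])

print(centroids)
```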


Author(s):  
H. Li ◽  
W. Huang ◽  
Z. Zha ◽  
J. Yang

Abstract. With the wide application of Big Data, Artificial Intelligence, and the Internet of Things in geographic information technology and industry, geospatial big data has emerged. In addition to the traditional "5V" characteristics of big data (Volume, Velocity, Variety, Veracity, and Value), geospatial big data also has the characteristic of a "location attribute". At present, research on geospatial big data is mainly concentrated in four aspects: knowledge mining and discovery from geospatial data; spatiotemporal big data mining; the impact of geospatial big data on visualization, social perception, and smart cities; and geospatial big data services for government decision-making support. Based on the connotation and extension of geospatial big data, this paper gives a comprehensive definition of geospatial big data. The application of geospatial big data in location visualization, industrial thematic geographic information services, and geographic data science and knowledge services is introduced in detail. Furthermore, the key technologies and design indicators of the National Geospatial Big Data Platform are elaborated from the perspectives of infrastructure, functional requirements, and non-functional requirements, and the design and application of the National Geospatial Public Service Big Data Platform are illustrated. The challenges and opportunities of geospatial big data are discussed from the perspectives of open resource sharing, management decision support, and data security. Finally, development trends and directions for geospatial big data are summarized, with the aim of building a high-quality geospatial big data platform that can play a greater role in public application services and administrative decision-making.


2016 ◽  
Vol 20 (1) ◽  
pp. 13-28 ◽  
Author(s):  
David H. Olsen ◽  
Pamela A. Dupin-Bryant

Big data and data science have experienced unprecedented growth in recent years. The big data market continues to exhibit strong momentum as countless businesses transform into data-driven companies. From salary surges to incredible growth in the number of positions, data science is one of the hottest areas in the job market. Significant demand for, and limited supply of, professionals with data competencies have greatly affected the hiring market, and this demand/supply imbalance will likely continue in the future. A major key to supplying the market with qualified big data professionals is bridging the gap from traditional Information Systems (IS) learning outcomes to the outcomes requisite in this emerging field. The purpose of this paper is to share an SQL Character Data Tutorial. Utilizing the 5E Instructional Model, this tutorial helps students (a) become familiar with SQL code, (b) learn when and how to use SQL string functions, (c) understand and apply the concept of data cleansing, (d) gain problem-solving skills in the context of typical string manipulations, and (e) gain an understanding of typical needs related to string queries. The tutorial utilizes common, recognizable quotes from popular culture to engage students in the learning process and enhance understanding. This tutorial should prove helpful to educators who seek to provide a rigorous, practical, and relevant big data experience in their courses.
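To give a flavor of the kind of string-function cleansing the tutorial covers, here is a minimal sketch using Python's built-in sqlite3 module to run SQL string functions over a small quotes table. The table, column names, and quotes are illustrative placeholders; they are not the paper's actual dataset or exercises.

```python
import sqlite3

# Build a tiny in-memory quotes table with deliberately messy text.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE quotes (id INTEGER PRIMARY KEY, source TEXT, line TEXT)")
conn.executemany(
    "INSERT INTO quotes (source, line) VALUES (?, ?)",
    [
        ("  star wars ", "may the force be with you.  "),
        ("CASABLANCA", "  Here's looking at you, kid."),
    ],
)

# Data cleansing with standard SQL string functions:
# TRIM removes stray whitespace, UPPER standardizes case, REPLACE strips
# punctuation, and SUBSTR extracts a fixed-length preview of each quote.
rows = conn.execute(
    """
    SELECT UPPER(TRIM(source))           AS source_clean,
           TRIM(REPLACE(line, '.', ''))  AS line_clean,
           SUBSTR(TRIM(line), 1, 10)     AS preview
    FROM quotes
    ORDER BY source_clean
    """
).fetchall()

for row in rows:
    print(row)
```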


Information ◽  
2021 ◽  
Vol 12 (11) ◽  
pp. 459
Author(s):  
Jose Antonio Jijon-Vorbeck ◽  
Isabel Segura-Bedmar

Due to the global spread of the COVID-19 pandemic, and the expansion of social media as the main source of information for many people, there has been a great variety of reactions surrounding the topic. The World Health Organization (WHO) announced in December 2020 that it was fighting an “infodemic” in the same way as it was fighting the pandemic. An “infodemic” refers to the spread of information that is not controlled or filtered and can have a negative impact on society. If not managed properly, an aggressive or negative tweet can be very harmful and misleading to its recipients. The WHO has therefore called for action and asked the academic and scientific community to develop tools for managing the infodemic through digital technologies and data science. The goal of this study is to develop and apply natural language processing models using deep learning to classify a collection of tweets that refer to the COVID-19 pandemic. Several simpler, widely used models are applied first and serve as a benchmark for deep learning methods, namely Long Short-Term Memory (LSTM) and Bidirectional Encoder Representations from Transformers (BERT). The experimental results show that the deep learning models outperform the traditional machine learning algorithms, with the BERT-based model performing best.
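The study uses simpler, widely used models as a benchmark before the LSTM and BERT models; the sketch below shows what such a baseline can look like (TF-IDF features plus logistic regression). The tweet texts and labels are placeholders, and the paper's actual corpus, label scheme, preprocessing, and BERT fine-tuning setup are not reproduced here.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.metrics import classification_report

# Placeholder tweets and labels for illustration only.
train_texts = ["stay home and stay safe", "the vaccine is a hoax", "masks reduce transmission"]
train_labels = ["neutral", "misinformation", "informative"]
test_texts = ["vaccines cause more harm than good"]
test_labels = ["misinformation"]

# A simple TF-IDF + logistic regression baseline of the kind used as a
# benchmark before moving to LSTM and BERT models.
baseline = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2), min_df=1),
    LogisticRegression(max_iter=1000),
)
baseline.fit(train_texts, train_labels)
predictions = baseline.predict(test_texts)
print(classification_report(test_labels, predictions, zero_division=0))
```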


2019 ◽  
Vol 8 (S1) ◽  
pp. 67-69
Author(s):  
S. Palaniammal ◽  
V. S. Thangamani

As noted in the Journal of Banking and Finance [1], we are living in the era of big data. The rapid development of scientific and data technology over the past decade has not only brought new and sophisticated analytical tools into financial and banking services, but has also introduced the power of data science applications into everyday strategic and operational management. Developments in data analytics and data science have been particularly valuable to financial organizations that depend heavily on financial information in their decision-making processes. The article presents research that focuses on the impact of data and technology trends on decision making, particularly in finance and banking services. It provides an overview of the benefits associated with decision analytics and the use of big data by financial organizations. The aim of the research is to highlight the areas of impact where big data trends are creating disruptive changes to the way the finance and banking industry traditionally operates. For example, we can see rapid changes to organisation structures, approaches to competition and customers, as well as recognition of the importance of data analytics in strategic and tactical decision making. Investment in data analytics is no longer considered a luxury but a necessity, especially for financial organizations in developing countries. Technology and data science are both forcing and enabling the financial and banking industry to respond to transformative demands and adapt to rapidly changing market conditions in order to survive and thrive in a highly competitive global environment. Financial companies operating in developing countries must develop a strong understanding of data-related trends, impacts, and opportunities. This knowledge should not only be utilized for survival efforts, but also be seen as an opportunity to engage at the global level through innovation, flexibility, and early adoption of data science benefits. The paper also recommends further studies in related areas, which would provide additional value and awareness to organizations that are considering their participation in global data and analytics trends.


2021 ◽  
Vol 8 ◽  
Author(s):  
Yoshihiko Raita ◽  
Carlos A. Camargo ◽  
Liming Liang ◽  
Kohei Hasegawa

Clinicians handle a growing amount of clinical, biometric, and biomarker data. In this “big data” era, there is an emerging faith that the answer to all clinical and scientific questions resides in “big data” and that data will transform medicine into precision medicine. However, data by themselves are useless. It is the algorithms encoding causal reasoning and domain (e.g., clinical and biological) knowledge that prove transformative. The recent introduction of (health) data science presents an opportunity to re-think this data-centric view. For example, while precision medicine seeks to provide the right prevention and treatment strategy to the right patients at the right time, its realization cannot be achieved by algorithms that operate exclusively in data-driven prediction modes, as most machine learning algorithms do. A better understanding of data science and its tasks is vital to interpret findings and translate new discoveries into clinical practice. In this review, we first discuss the principles and major tasks of data science by organizing it into three defining tasks: (1) association and prediction, (2) intervention, and (3) counterfactual causal inference. Second, we review commonly used data science tools with examples in the medical literature. Lastly, we outline current challenges and future directions in the field of medicine, elaborating on how data science can enhance clinical effectiveness and inform medical practice. As machine learning algorithms become ubiquitous tools to handle “big data” quantitatively, their integration with causal reasoning and domain knowledge is instrumental to qualitatively transform medicine, which will, in turn, improve the health outcomes of patients.
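To make the distinction between the review's first and third tasks tangible, the sketch below contrasts a purely predictive model with a simple adjustment-based (standardization, g-formula style) estimate of an average treatment effect on synthetic data. The variables, effect sizes, and model choices are illustrative assumptions, not the authors' examples.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic data: a confounder C affects both treatment T and outcome Y.
rng = np.random.default_rng(0)
n = 5000
confounder = rng.normal(size=n)
treatment = rng.binomial(1, 1 / (1 + np.exp(-confounder)))               # confounded assignment
outcome = rng.binomial(1, 1 / (1 + np.exp(-(0.5 * treatment + confounder))))

X = np.column_stack([treatment, confounder])

# Task 1 (association and prediction): model P(Y | T, C) to predict outcomes.
model = LogisticRegression().fit(X, outcome)

# Task 3 (counterfactual causal inference): a simple standardization estimate
# of the average treatment effect, averaging predicted risk over the observed
# confounder distribution with treatment set to 1 vs. 0 for everyone.
risk_treated = model.predict_proba(np.column_stack([np.ones(n), confounder]))[:, 1]
risk_untreated = model.predict_proba(np.column_stack([np.zeros(n), confounder]))[:, 1]
ate = risk_treated.mean() - risk_untreated.mean()
print(f"estimated average treatment effect ~ {ate:.3f}")
```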

