Methodological framework for data processing based on the Data Science paradigm

Author(s):  
F. Pacheco ◽  
C. Rangel ◽  
J. Aguilar ◽  
M. Cerrada ◽  
J. Altamiranda
2020 ◽  
Vol 30 (Supplement_5) ◽  
Author(s):  
J Doetsch ◽  
I Lopes ◽  
R Redinha ◽  
H Barros

Abstract The usage and exchange of “big data” is at the forefront of the data science agenda, where Record Linkage plays a prominent role in biomedical research. In an era of ubiquitous data exchange and big data, Record Linkage is almost inevitable, but it raises ethical and legal problems, namely personal data and privacy protection. Record Linkage refers to the merging of data to consolidate facts about an individual or an event that are not available in any single record. This article provides an overview of ethical challenges and research opportunities in linking routine data on health and education with cohort data from very preterm (VPT) infants in Portugal. Portuguese, European and international law on data processing, protection and privacy was reviewed. A three-stage analysis was carried out: i) the interplay of the threefold levelling of law for Record Linkage at different levels; ii) the impact of data protection and privacy rights on data processing; iii) the challenges and opportunities of the data linkage process for research. A framework to discuss the process and its implications for data protection and privacy was created. The GDPR is the most substantial legal basis for the protection of personal data in Record Linkage, and explicit written consent is considered the appropriate basis for processing sensitive data. In Portugal, retrospective access to routine data is permitted if the data are anonymised; for health data, if the processing meets requirements declared with explicit consent; for education data, if the data processing rules are complied with. Routine health and education data can be linked to cohort data if the rights of the data subject and the requirements and duties of processors and controllers are respected.
A strong ethical context, through the application of the GDPR in all phases of research, needs to be established to achieve Record Linkage between cohort and routinely collected health and education records of VPT infants in Portugal. Key messages: The GDPR is the most important legal framework for the protection of personal data; however, its uniform approach, which grants freedom to its Member States, hampers Record Linkage processes among EU countries. The question remains whether the gap between data protection and privacy is adequately balanced at the three legal levels to guarantee freedom for research and the improvement of the health of data subjects.
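At a technical level, the Record Linkage described above reduces to joining records that share an identifier. A minimal pandas sketch follows; the table and column names are hypothetical, and pseudonymized keys stand in for direct identifiers, in keeping with the anonymisation requirements discussed:

```python
import pandas as pd

# Hypothetical pseudonymized tables; fields are illustrative only.
cohort = pd.DataFrame({
    "pseudo_id": ["a1", "a2", "a3"],
    "gestational_age_weeks": [29, 31, 27],
})
routine_health = pd.DataFrame({
    "pseudo_id": ["a1", "a3", "a4"],
    "readmissions": [2, 0, 1],
})

# Left join keeps every cohort infant; the indicator column flags
# cohort records that found no match in the routine data.
linked = cohort.merge(routine_health, on="pseudo_id",
                      how="left", indicator=True)
print(linked)
```

A left join is the natural choice here: the cohort is the population of interest, and unlinked infants must remain visible so that linkage coverage can itself be audited.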


Author(s):  
Sabrina Lechler ◽  
Angelo Canzaniello ◽  
Bernhard Roßmann ◽  
Heiko A. von der Gracht ◽  
Evi Hartmann

Purpose Particularly in volatile, uncertain, complex and ambiguous (VUCA) business conditions, staff in supply chain management (SCM) look to real-time (RT) data processing to reduce uncertainties. However, such expectations rest on the premise that data processing can be perfectly mastered, which does not reflect reality. The purpose of this paper is to investigate whether RT data processing reduces SCM uncertainties under real-world conditions. Design/methodology/approach To facilitate communication on the research question, a Delphi expert survey was conducted to identify the challenges of RT data processing in SCM operations and to assess its influence on the reduction of SCM uncertainty. In total, 14 prospective statements concerning RT data processing in SCM operations were developed and evaluated by 68 SCM and data-science experts. Findings RT data processing was found to have an ambivalent influence on the reduction of SCM complexity and the associated uncertainty. Analysis of the data collected from the study participants revealed a new type of uncertainty related to SCM data itself. Originality/value This paper discusses the challenges of gathering relevant, timely and accurate data sets in VUCA environments and creates awareness of the relationship between data-related uncertainty and SCM uncertainty. It thus provides valuable insights for practitioners and a basis for further research on this subject.


Author(s):  
Andrew McCullum

In 2015, Central Asia made some vital improvements in the environment for cross-border e-commerce: Kazakhstan's accession to the World Trade Organization (WTO) will help trade transparency, while the Kyrgyz Republic's membership in the Eurasian Customs Union expands its consumer base. Why e-commerce? Two reasons. First, e-commerce reduces the cost of distance. Central Asia is the highest-trade-cost region in the world: vast distances from major markets make finding buyers challenging, shipping goods slow, and export costs high. Second, e-commerce can attract populations that are traditionally under-represented in export markets, such as women, small businesses and rural entrepreneurs.


Author(s):  
Wajid Ali ◽  
Muhammad Usman Shafique ◽  
Muhammad Arslan Majeed ◽  
Muhammad Faizan ◽  
Ahmad Raza

Data Science has emerged as an important discipline, and education in it is essential for success in almost every aspect of life. We are now in the age of Big Data: it affects all aspects of our lives, and society is acknowledging this. Data processing and other techniques are combined to convert abundant data into valuable information for society, organizations, and individuals. Specific strategies and approaches are needed to better educate future data scientists to overcome the challenges of Big Data. In this paper, we discuss the general concept of data science, Big Data, and the areas of Big Data computing.


2021 ◽  
Vol 2021 ◽  
pp. 1-10
Author(s):  
Zichun Tian

This paper uses Python and its external data processing packages to conduct an in-depth machine-learning analysis of Airbnb review data. Increasingly, travellers now use Airbnb instead of staying in traditional hotels. However, in such a growing and competitive Airbnb market, many hosts may find it difficult to make their listings stand out among the many available. With the development of data science, large amounts of data can now be analysed to obtain compelling evidence that helps Airbnb hosts find patterns common to popular properties. By learning and emulating these patterns, many hosts may be able to increase the popularity of their own properties. Using Python to analyse data covering all aspects of Airbnb listings, the author proposes to test for and identify correlations between certain variables and listing popularity. To ensure that the results are representative and general, the author used a database containing multidimensional details and information about Airbnb listings to date. The author uses the Pandas, NLTK, and matplotlib packages to process and visualize the data. Finally, the author makes recommendations to Airbnb hosts based on the evidence generated from the data. In the future, the author will build on this work to further optimize the design.
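The kind of correlation analysis described can be illustrated with a minimal pandas sketch; the toy table and column names below are hypothetical and do not reflect the actual Airbnb dataset schema used in the paper:

```python
import pandas as pd

# Hypothetical toy listing table; columns are illustrative only.
listings = pd.DataFrame({
    "price":       [50, 80, 120, 200, 65, 90],
    "num_reviews": [120, 95, 60, 15, 110, 70],
    "avg_rating":  [4.8, 4.6, 4.4, 4.1, 4.7, 4.5],
})

# Pairwise Pearson correlations against the review count, which often
# serves as a crude popularity proxy in Airbnb analyses.
corr = listings.corr()["num_reviews"].drop("num_reviews")
print(corr)
```

On real data, such a correlation table is only a starting point: it flags candidate variables (price, rating, amenities) whose relationship with popularity is then examined more carefully.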


Entropy ◽  
2020 ◽  
Vol 22 (12) ◽  
pp. 1438
Author(s):  
Carlos Alberto de Bragança Pereira ◽  
Adriano Polpo ◽  
Agatha Sacramento Rodrigues

With the increase in data processing and storage capacity, a large amount of data is available [...]


2022 ◽  
Vol 15 (1) ◽  
pp. 1-20
Author(s):  
Ravinder Kumar ◽  
Lokesh Kumar Shrivastav

Designing a system for the analytics of high-frequency data (Big Data) is a challenging and crucial task in data science. Big Data analytics involves the development of efficient machine learning algorithms and Big Data processing techniques or frameworks. Today, data processing systems are in high demand for processing high-frequency data efficiently. This paper proposes the processing and analytics of stochastic high-frequency stock market data using a suitably modified Gradient Boosting Machine (GBM). The experimental results obtained are compared with deep learning and Auto-Regressive Integrated Moving Average (ARIMA) methods. The results obtained using the modified GBM achieve the highest accuracy (R2 = 0.98) and the minimum error (RMSE = 0.85) compared with the other two approaches.
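The paper's modification of the GBM is not detailed here, but the core of any GBM, squared-error gradient boosting of regression stumps on residuals, can be sketched from scratch. The data below is synthetic, and n_rounds and lr are illustrative hyperparameters, not the paper's settings:

```python
import numpy as np

def fit_stump(x, residual):
    """Find the threshold split on one feature minimizing squared error."""
    order = np.argsort(x)
    xs, rs = x[order], residual[order]
    csum, total, n = np.cumsum(rs), rs.sum(), len(rs)
    best_err, best = np.inf, None
    for i in range(1, n):
        if xs[i] == xs[i - 1]:
            continue
        left_mean = csum[i - 1] / i
        right_mean = (total - csum[i - 1]) / (n - i)
        # SSE = const - between-group term, so minimize the negated term.
        err = -(i * left_mean**2 + (n - i) * right_mean**2)
        if err < best_err:
            best_err = err
            best = ((xs[i] + xs[i - 1]) / 2, left_mean, right_mean)
    return best

def gbm_fit(x, y, n_rounds=100, lr=0.1):
    """Boost stumps on residuals; lr shrinks each stage's contribution."""
    f0 = y.mean()
    pred = np.full_like(y, f0, dtype=float)
    stumps = []
    for _ in range(n_rounds):
        thr, lv, rv = fit_stump(x, y - pred)
        pred += lr * np.where(x <= thr, lv, rv)
        stumps.append((thr, lv, rv))
    return f0, stumps

def gbm_predict(x, f0, stumps, lr=0.1):
    pred = np.full(len(x), f0, dtype=float)
    for thr, lv, rv in stumps:
        pred += lr * np.where(x <= thr, lv, rv)
    return pred

# Synthetic noisy signal standing in for a high-frequency series.
rng = np.random.default_rng(0)
x = np.linspace(0, 6, 300)
y = np.sin(x) + rng.normal(0, 0.1, x.size)

f0, stumps = gbm_fit(x, y)
pred = gbm_predict(x, f0, stumps)
rmse = np.sqrt(np.mean((y - pred) ** 2))
r2 = 1 - np.sum((y - pred) ** 2) / np.sum((y - y.mean()) ** 2)
print(f"R2 = {r2:.3f}, RMSE = {rmse:.3f}")
```

Production GBMs (and presumably the paper's modified variant) differ in using deeper trees, many features, and regularization, but the residual-fitting loop above is the mechanism they all share.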


2020 ◽  
Vol 64 (6) ◽  
pp. 368-372
Author(s):  
Aleksandr A. Zavyalov ◽  
Dmitry A. Andreev

Introduction. In Moscow, state-of-the-art information technologies for cancer care data processing are widely used in routine practice. Data Science approaches are increasingly applied in the field of radiation oncology, and novel arrays of radiotherapy performance indices can be introduced into real-time cancer care quality and safety monitoring. Purpose of the study. To briefly review the critical structural elements of automated Big Data processing and its prospects in light of the organization of internal quality and safety control in radiation oncology departments. Material and methods. The PubMed (Medline) and E-Library databases were searched for articles published mainly in the last 2-3 years; in total, about 20 reports were selected. Results. This paper highlights the applicability of next-generation Data Science approaches to quality and safety assurance in radiation oncology units. The structural pillars of automated Big Data processing are considered. Big Data processing technologies can facilitate improvements in quality management at any radiotherapy stage. At the same time, high requirements for the quality and integrity of the indices in the databases are crucial. Detailed dose data may also be linked to outcome and survival indices integrated into larger registries. Discussion. Radiotherapy quality control could be automated to some extent through the further introduction of information technologies that compare real-time quality measures with digital targets in terms of minimum norms and standards. The implementation of automated systems generating early electronic notifications and rapid alerts in case of serious quality violations could drastically improve internal medical processes in local clinics. Conclusion. The role of Big Data tools in internal quality and safety control will increase dramatically over time.


Author(s):  
Janga Vijay Kumar ◽  
Syed Abdul Moeed ◽  
C. Madan Kumar ◽  
G. Ashmitha
