“Data Engineering” in University Teaching

Author(s):  
Ralf Schenkel ◽  
Stefanie Scherzinger ◽  
Marina Tropmann-Frick

Abstract: The special issue on “Data Engineering for Data Science” prompted us to run a small survey on the role of this topic in academic database teaching. This article summarizes the results. We received 17 responses from the GI special interest group on database systems (GI-Fachgruppe Datenbanksysteme). Compared with an earlier survey on teaching in the area of “Cloud”, presented in Datenbank-Spektrum in 2014, it appears that data engineering content is increasingly taught in undergraduate courses as well, and also outside core computer science. Data engineering seems to be establishing itself as a cross-cutting topic that is not reserved for master's programs.

2019 ◽  
Vol 8 (2S11) ◽  
pp. 3491-3495

The term Data Engineering has not gained as much popularity as terms like Data Science or Data Analytics, mainly because the importance of this concept is normally experienced only while working with or handling data as a Data Scientist or Data Analyst. Written from the perspective of an academician who is neither of these but has the urge to learn, this paper was motivated by work with Python, during which Data Engineering and one of its major subtopics, Data Wrangling, drew attention. The paper is a small step towards explaining the experience of handling data with wrangling concepts in Python. Data Wrangling, earlier referred to as Data Munging (when done by hand), is the method of transforming and mapping data from one available format into another with the idea of making it more appropriate and valuable for a variety of related purposes such as analytics. Data wrangling is the modern name for data pre-processing, replacing the term munging. The Python library used for the research work shown here is Pandas. Though the major research area is ‘Application of Data Analytics on Academic Data using Python’, this paper focuses on a small preliminary topic of that research work: data wrangling using Python (Pandas library).
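To make the idea of mapping data from one format into another concrete, here is a minimal sketch and not the paper's own code. The paper works with Pandas; this example deliberately uses only the Python standard library so that it runs without extra dependencies. The sample data, the column names, and the `wrangle` helper are all hypothetical.

```python
import csv
import io
import json

# Hypothetical raw input: stray whitespace, mixed case, and a missing mark.
RAW_CSV = """student, course ,mark
 alice ,Databases,78
BOB,databases,
carol,Data Science,91
"""


def wrangle(raw_text):
    """Map messy CSV text to a list of cleaned records (dicts)."""
    reader = csv.DictReader(io.StringIO(raw_text))
    records = []
    for row in reader:
        # Strip whitespace from both the header names and the cell values.
        clean = {key.strip(): (value or "").strip() for key, value in row.items()}
        # Drop rows without a mark (an assumed cleaning rule for this sketch).
        if not clean["mark"]:
            continue
        records.append({
            "student": clean["student"].title(),
            "course": clean["course"].lower(),
            "mark": int(clean["mark"]),
        })
    return records


records = wrangle(RAW_CSV)

# Re-serialise the cleaned records into a second format (JSON).
as_json = json.dumps(records, indent=2)
print(as_json)
```

The same steps (trimming, normalising case, dropping incomplete rows, converting types) are the kind of operation a Pandas-based workflow would typically express on a DataFrame instead of a list of dicts.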


Author(s):  
Yingxu Wang ◽  
Jun Peng

Big data are pervasively generated by human cognitive processes, formal inferences, and system quantifications. This paper presents the cognitive foundations of big data systems towards big data science. The key perceptual model of big data systems is the recursively typed hyperstructure (RTHS). The RTHS model reveals the inherited complexities and unprecedented difficulty in big data engineering. This finding leads to a set of mathematical and computational models for efficiently processing big data systems. The cognitive relationship between data, information, knowledge, and intelligence is formally described.


Author(s):  
Meike Klettke ◽  
Uta Störl

Abstract: Data-driven methods and data science are important scientific methods in many research fields. All data science approaches require professional data engineering components. At the moment, computer science experts are needed to solve these data engineering tasks. At the same time, scientists from many fields (such as natural sciences, medicine, environmental sciences, and engineering) want to analyse their data autonomously. The resulting task for data engineering is the development of tools that support automated data curation and are usable by domain experts. In this article, we introduce four generations of data engineering approaches, classifying the data engineering technologies of the past and present. We show which data engineering tools are needed for the scientific landscape of the next decade.


CITAS ◽  
2016 ◽  
Vol 2 (1) ◽  
pp. 39-46
Author(s):  
Ixent Galpin

The role of data scientist has been described as the “sexiest job of the 21st Century”. While there is possibly a degree of hype associated with such a claim, there are factors at play such as the unprecedented growth in the amount of data being generated. This paper characterises the already established disciplines which underpin data science, viz., data engineering, statistics, and data mining. Following a characterisation of these fields, data science is found to be most closely related to data mining. However, in contrast to data mining, data science promises to operate over datasets that exhibit significant challenges in terms of the four Vs: Volume, Variety, Velocity and Veracity. This paper notes that the current emphasis, both in industry and academia, is on the first three Vs, which pose mainly scientific or technological challenges, rather than Veracity, which is a truly scientific (and arguably a more complex) challenge. Data science can be seen to have a more ambitious objective than data mining traditionally has: as a science, data science aims to lead to the creation of new theories and knowledge. This paper notes that, ironically, the veracity dimension, which is arguably the one most closely related to this objective, is being neglected. Despite the current media frenzy about data science, the paper concludes that more time is needed to see whether it will emerge as a discipline in its own right.


2018 ◽  
Vol 115 (50) ◽  
pp. 12630-12637 ◽  
Author(s):  
Katy Börner ◽  
Olga Scrivner ◽  
Mike Gallant ◽  
Shutian Ma ◽  
Xiaozhong Liu ◽  
...  

Rapid research progress in science and technology (S&T) and continuously shifting workforce needs exert pressure on each other and on the educational and training systems that link them. Higher education institutions aim to equip new generations of students with skills and expertise relevant to workforce participation for decades to come, but their offerings sometimes misalign with commercial needs and new techniques forged at the frontiers of research. Here, we analyze and visualize the dynamic skill (mis-)alignment between academic push, industry pull, and educational offerings, paying special attention to the rapidly emerging areas of data science and data engineering (DS/DE). The visualizations and computational models presented here can help key decision makers understand the evolving structure of skills so that they can craft educational programs that serve workforce needs. Our study uses millions of publications, course syllabi, and job advertisements published between 2010 and 2016. We show how courses mediate between research and jobs. We also discover responsiveness in the academic, educational, and industrial system in how skill demands from industry are as likely to drive skill attention in research as the converse. Finally, we reveal the increasing importance of uniquely human skills, such as communication, negotiation, and persuasion. These skills are currently underexamined in research and undersupplied through education for the labor market. In an increasingly data-driven economy, the demand for “soft” social skills, like teamwork and communication, increases with greater demand for “hard” technical skills and tools.


Author(s):  
Nicolas Alder ◽  
Tobias Bleifuß ◽  
Leon Bornemann ◽  
Felix Naumann ◽  
Tim Repke

Abstract: In January and February 2020, we offered a Massive Open Online Course (MOOC) on the openHPI platform with the aim of introducing non-experts to the terms, ideas, and challenges of data science. In over one hundred small course units spread over six weeks, we explained as many buzzwords. We report on the structure of the course, our goals, the interaction with the participants, and the results of the course.

