Data, data management, data analytics, and data science technologies

2021 ◽  
pp. 183-196
Author(s):  
Richard Busulwa ◽  
Nina Evans
Author(s):  
Silvia Chiusano ◽  
Tania Cerquitelli ◽  
Robert Wrembel ◽  
Daniele Quercia

Author(s):  
Richard Busulwa ◽  
Nina Evans ◽  
Aaron Oh ◽  
Moon Kang

2021 ◽  
Vol 14 (11) ◽  
pp. 2296-2304
Author(s):  
Phanwadee Sinthong ◽  
Michael J. Carey

In the last few years, the field of data science has been growing rapidly as various businesses have adopted statistical and machine learning techniques to empower their decision-making and applications. Scaling data analyses to large volumes of data requires the utilization of distributed frameworks. This can lead to serious technical challenges for data analysts and reduce their productivity. AFrame, a data analytics library, is implemented as a layer on top of Apache AsterixDB and addresses these issues by giving data scientists a familiar interface, the Pandas DataFrame, while transparently scaling out the evaluation of analytical operations through a Big Data management system. While AFrame is able to leverage data management facilities (e.g., indexes and query optimization) and allows users to interact with a large volume of data, the initial version only generated SQL++ queries and only operated against AsterixDB. In this work, we describe a new design that retargets AFrame's incremental query formation to other query-based database systems, making it more flexible for deployment against other data management systems with composable query languages.
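The idea of incremental query formation over a composable query language can be illustrated with a minimal sketch. This is not AFrame's actual API: the `LazyFrame` class, its method names, and the `users` table are hypothetical, and plain SQL subquery nesting stands in for whatever rewrite rules a given backend would need. Each DataFrame-style call wraps the previous query string instead of executing anything.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LazyFrame:
    """Hypothetical DataFrame-like wrapper that stores only a query string."""
    query: str

    @classmethod
    def from_table(cls, table: str) -> "LazyFrame":
        # Base query: nothing is executed, only text is built.
        return cls(f"SELECT * FROM {table}")

    def filter(self, condition: str) -> "LazyFrame":
        # Nest the current query as a subquery and add a predicate.
        return LazyFrame(f"SELECT * FROM ({self.query}) t WHERE {condition}")

    def select(self, *columns: str) -> "LazyFrame":
        # Nest the current query and project the requested columns.
        cols = ", ".join(columns)
        return LazyFrame(f"SELECT {cols} FROM ({self.query}) t")

    def head(self, n: int) -> str:
        # Final query text that would be sent to the database for execution.
        return f"SELECT * FROM ({self.query}) t LIMIT {n}"


frame = LazyFrame.from_table("users").filter("age > 30").select("name")
print(frame.head(5))
```

Because every operation is a string template applied to the previous query, retargeting such a design to another backend would, under these assumptions, amount to swapping the templates for ones written in that system's query language.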


2021 ◽  
Vol 3 (6) ◽  
Author(s):  
César de Oliveira Ferreira Silva ◽  
Mariana Matulovic ◽  
Rodrigo Lilla Manzione

Groundwater governance uses modeling to support decision making. Therefore, data science techniques are essential. Specific difficulties arise because variables must be used that cannot be directly measured, such as aquifer recharge and groundwater flow. However, such techniques involve dealing with ethical questions that are often not stated very explicitly. In supporting groundwater governance, these ethical questions cannot be resolved in a straightforward way. In this study, we propose an approach called the “open-minded roadmap” to guide data analytics and modeling for groundwater governance decision making. To frame the ethical questions, we use the concept of geoethical thinking, a method that combines geoscience expertise with the societal responsibility of the geoscientist. We present a case study of a groundwater monitoring modeling experiment using data analytics methods in southeast Brazil. A model based on fuzzy logic (with high expert intervention) and three data-driven models (with low expert intervention) are tested and evaluated for aquifer recharge in watersheds. The roadmap approach addresses three issues: (a) data acquisition, (b) modeling and (c) the open-minded (geo)ethical attitude. The level of expert intervention in the modeling stage and model validation are discussed. A search for gaps in the model use is made, anticipating issues through the development of application scenarios, to reach a final decision. When the model was validated in one watershed and then extrapolated to neighboring watersheds, we found large asymmetries in the recharge estimates. Hence, we can show that more information (data, expertise, etc.) is needed to improve the models’ predictive skill. In the resulting iterative approach, new questions will arise as new information becomes available, and therefore, steady recourse to the open-minded roadmap is recommended.


2021 ◽  
Vol 29 (1) ◽  
pp. 177-185
Author(s):  
Gunasekaran Manogaran ◽  
P. Mohamed Shakeel ◽  
S. Baskar ◽  
Ching-Hsien Hsu ◽  
Seifedine Nimer Kadry ◽  
...  

2019 ◽  
Author(s):  
Sitti Zuhaerah Thalhah ◽  
Mohammad Tohir ◽  
Phong Thanh Nguyen ◽  
K. Shankar ◽  
Robbi Rahim

Predictive analytics and decision models have long been cornerstones of development in military, industrial, and government applications. In modern healthcare systems, big data analytics and the modeling of multi-source data play an increasingly important role. Various problems arising in these domains can be formulated as mathematical models and analyzed using computational techniques, sophisticated optimization, and decision analysis. This paper studies the use of data science in healthcare applications and the mathematical issues in data science.

