Navigating Diverse Data Science Learning: Critical Reflections Towards Future Practice

Author(s):  
Yehia Elkhatib
2017 · Vol 1387 (1) · pp. 5-11
Author(s):  
Nabil R. Adam
Robert Wieder
Debopriya Ghosh

2019
Author(s):  
Mathew Abrams
Jan G. Bjaalie
Samir Das
Gary F. Egan
Satrajit S. Ghosh
...  

There is great need for coordination around standards and best practices in neuroscience to support efforts to make neuroscience a data-centric discipline. Major brain initiatives launched around the world are poised to generate huge stores of neuroscience data. At the same time, neuroscience, like many domains in biomedicine, is confronting the issues of transparency, rigor, and reproducibility. Widely used, validated standards and best practices are key to addressing the challenges in both big and small data science, as they are essential for integrating diverse data and for developing a robust, effective and sustainable infrastructure to support open and reproducible neuroscience. However, developing community standards and gaining their adoption is difficult. The current landscape is characterized both by a lack of robust, validated standards and a plethora of overlapping, underdeveloped, untested and underutilized standards and best practices. The International Neuroinformatics Coordinating Facility (INCF), established in 2005, is an independent organization dedicated to promoting data sharing through the coordination of infrastructure and standards. INCF has recently implemented a formal procedure for evaluating and endorsing community standards and best practices in support of the FAIR principles. By formally serving as a standards organization dedicated to open and FAIR neuroscience, INCF helps evaluate, promulgate and coordinate standards and best practices across neuroscience. Here, we provide an overview of the process and discuss how neuroscience can benefit from having a dedicated standards body.


Author(s):  
Karl E. Misulis
Mark E. Frisse

Data science is the study of how analytics techniques can be applied to large and diverse data sets. The field is emerging because of the availability of massive data sets in both the consumer and health sectors, new machine learning and other analytics methods requiring large-scale computation, and the vital need to identify risk factors, trends, and other relationships that are not apparent when traditional analytics methods are applied to smaller structured data sets. In some organizations, the primary role of a clinical informatics professional is no longer focused on how electronic health records are used in healthcare delivery but instead on how patient encounter information can be collected efficiently, aggregated with information from other encounters or sources, and analyzed to improve our understanding of how population studies can improve the care of individuals. Such an understanding is critical to improving care quality and lowering healthcare costs.
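As a simplified illustration of that shift, the short Python/pandas sketch below aggregates patient encounter records from two hypothetical sources and derives a population-level signal (a 30-day readmission rate by diagnosis). The data, column names, and the particular analysis are invented for illustration and are not taken from the chapter.

import pandas as pd

# Hypothetical encounter extracts from two sources (e.g., an EHR export and
# a claims feed); columns and values are invented for illustration.
ehr = pd.DataFrame({
    "patient_id": [1, 1, 2, 3],
    "diagnosis": ["diabetes", "diabetes", "asthma", "diabetes"],
    "readmitted_30d": [0, 1, 0, 1],
})
claims = pd.DataFrame({
    "patient_id": [2, 3, 4],
    "diagnosis": ["asthma", "diabetes", "asthma"],
    "readmitted_30d": [1, 0, 0],
})

# Aggregate encounters across sources, then look for population-level
# patterns that can inform the care of individual patients.
encounters = pd.concat([ehr, claims], ignore_index=True)
readmission_rate = encounters.groupby("diagnosis")["readmitted_30d"].mean()
print(readmission_rate)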


2021 · Vol 49 (4) · pp. 6-11
Author(s):  
Jonas Traub
Zoi Kaoudi
Jorge-Arnulfo Quiané-Ruiz
Volker Markl

Data science and artificial intelligence are driven by a plethora of diverse data-related assets, including datasets, data streams, algorithms, processing software, compute resources, and domain knowledge. As providing all these assets requires a huge investment, data science and artificial intelligence technologies are currently dominated by a small number of providers who can afford these investments. This leads to lock-in effects and hinders features that require a flexible exchange of assets among users. In this paper, we introduce Agora, our vision towards a unified ecosystem that brings together data, algorithms, models, and computational resources and provides them to a broad audience. Agora (i) treats assets as first-class citizens and leverages a fine-grained exchange of assets, (ii) allows for combining assets into novel applications, and (iii) flexibly executes such applications on available resources. As a result, it enables the easy creation and composition of data science pipelines as well as their scalable execution. In contrast to existing data management systems, Agora operates in a heavily decentralized and dynamic environment: data, algorithms, and even compute resources are dynamically created, modified, and removed by different stakeholders. Agora presents novel research directions for the data management community as a whole: it requires combining our traditional expertise in scalable data processing and management with infrastructure provisioning as well as economic and application aspects of data, algorithms, and infrastructure.
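The abstract describes a vision rather than a public API, so the following Python sketch is purely illustrative: it models the core idea of datasets, algorithms, and compute resources as first-class, exchangeable assets contributed by different stakeholders and composed into a pipeline that can be executed on whichever resource is available. All class and function names below are invented for illustration and are not part of Agora.

from dataclasses import dataclass
from typing import Any, Callable, List, Union

# Illustrative stand-ins for the asset kinds named in the abstract
# (datasets, algorithms, compute resources), each owned by a potentially
# different stakeholder.

@dataclass
class Dataset:
    name: str
    owner: str
    load: Callable[[], Any]

@dataclass
class Algorithm:
    name: str
    owner: str
    run: Callable[[Any], Any]

@dataclass
class ComputeResource:
    name: str
    owner: str
    submit: Callable[[Callable[[Any], Any], Any], Any]

@dataclass
class Pipeline:
    steps: List[Union[Dataset, Algorithm]]

    def execute(self, compute: ComputeResource) -> Any:
        # Thread data through the steps, running each algorithm asset on
        # the chosen compute resource.
        data = None
        for step in self.steps:
            if isinstance(step, Dataset):
                data = step.load()
            else:
                data = compute.submit(step.run, data)
        return data

# Assets from different providers are combined into one application and
# executed on a locally available resource.
sales = Dataset("sales", owner="provider-a", load=lambda: [1, 2, 3])
scale = Algorithm("scale", owner="provider-b", run=lambda xs: [2 * x for x in xs])
laptop = ComputeResource("laptop", owner="me", submit=lambda fn, arg: fn(arg))
print(Pipeline([sales, scale]).execute(laptop))   # [2, 4, 6]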


2021 · Vol 14 (6) · pp. 1102-1110
Author(s):  
Anton Tsitsulin
Marina Munkhoeva
Davide Mottin
Panagiotis Karras
Ivan Oseledets
...  

Low-dimensional representations, or embeddings, of a graph's nodes facilitate several practical data science and data engineering tasks. As such embeddings rely, explicitly or implicitly, on a similarity measure among nodes, they require the computation of a quadratic similarity matrix, inducing a tradeoff between space complexity and embedding quality. To date, no graph embedding work combines (i) linear space complexity, (ii) a nonlinear transform as its basis, and (iii) nontrivial quality guarantees. In this paper we introduce FREDE (FREquent Directions Embedding), a graph embedding based on matrix sketching that combines those three desiderata. Starting from the observation that embedding methods aim to preserve the covariance among the rows of a similarity matrix, FREDE iteratively improves on quality while individually processing rows of a nonlinearly transformed PPR similarity matrix derived from a state-of-the-art graph embedding method, and it provides, at any iteration, column-covariance approximation guarantees that in due course become almost indistinguishable from those of the optimal approximation by SVD. Our experimental evaluation on networks of varying size shows that FREDE performs almost as well as SVD and competitively against state-of-the-art embedding methods in diverse data science tasks, even when it is based on as little as 10% of node similarities.
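To make the matrix-sketching idea concrete, here is a minimal Python/NumPy sketch of the Frequent Directions routine that FREDE builds on: rows of a (here synthetic) similarity matrix are processed one at a time while only an ell x d sketch B is kept, so that B.T @ B approximates the column covariance A.T @ A. The PPR-based nonlinear transform and the embedding extraction that FREDE adds on top are omitted, and the function and parameter names are ours, not the authors'.

import numpy as np

def frequent_directions(rows, d, ell):
    # Maintain an ell x d sketch B so that B.T @ B approximates A.T @ A,
    # where A is the (never materialized) matrix whose rows are streamed in.
    B = np.zeros((ell, d))
    for row in rows:
        zero_rows = np.flatnonzero(~B.any(axis=1))
        if zero_rows.size == 0:
            # Sketch is full: shrink the singular values so that the bottom
            # half of the rows become free again.
            _, s, Vt = np.linalg.svd(B, full_matrices=False)
            delta = s[ell // 2] ** 2
            B = np.sqrt(np.maximum(s ** 2 - delta, 0.0))[:, None] * Vt
            zero_rows = np.flatnonzero(~B.any(axis=1))
        B[zero_rows[0]] = row
    return B

# Toy usage on a random 1000 x 64 "similarity" matrix; in FREDE the rows
# would instead come from a nonlinearly transformed PPR similarity matrix.
rng = np.random.default_rng(0)
A = rng.standard_normal((1000, 64))
B = frequent_directions(A, d=64, ell=16)
# Frequent Directions bounds the covariance error by ||A||_F^2 / (ell / 2).
print(np.linalg.norm(A.T @ A - B.T @ B, 2) <= np.linalg.norm(A, "fro") ** 2 / 8)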


2017 · Vol 6 (1) · pp. 242
Author(s):  
Muhammad MH

This research was motivated by the low science (IPA) learning outcomes of fourth-grade students at SD Negeri 004 Tembilahan, as evidenced by an average score of 63.31. The study aims to improve student learning outcomes through the implementation of the task-assignment method. This is classroom action research with the 32 fourth-grade students of SD Negeri 004 Tembilahan as subjects, conducted over two cycles of two meetings each. The data collected focus on the improvement of learning outcomes and on student learning completeness. Based on the data obtained, science learning outcomes increased in each cycle: in the pre-cycle (baseline score), the students' average result was 63.31, with 16 students (50.00%) reaching completeness; in cycle I it rose to 70.31, with 25 students (78.12%) reaching completeness; and in cycle II it increased to 73.59, with all 32 students reaching completeness. Based on these results, it can be concluded that applying the task-assignment method can improve the science (IPA) learning outcomes of fourth-grade students at SD Negeri 004 Tembilahan.

