complex datasets
Recently Published Documents



Entropy ◽  
2021 ◽  
Vol 23 (12) ◽  
pp. 1648
Author(s):  
Amelia Carolina Sparavigna

Image analysis basically refers to any extraction of information from images, which can be as simple as QR codes required in logistics and digital certifications or related to large and complex datasets, such as the collections of images used for biometric identification or the sets of satellite surveys employed in the monitoring of Earth’s climate changes [...]


2021 ◽  
Author(s):  
Vaishali Dhanoa ◽  
Conny Walchshofer ◽  
Andreas Hinterreiter ◽  
Holger Stitz ◽  
Eduard Gröller ◽  
...  

Dashboards are used ubiquitously to gain and present insights into data by means of interactive visualizations. To bridge the gap between non-expert dashboard users and potentially complex datasets and/or visualizations, a variety of onboarding strategies are employed, including videos, narration, and interactive tutorials. We propose a process model for dashboard onboarding which formalizes and unifies such diverse onboarding strategies. Our model introduces the onboarding loop alongside the dashboard usage loop. Unpacking the onboarding loop reveals how each onboarding strategy combines selected building blocks of the dashboard with an onboarding narrative. Specific means are applied to this narration sequence for onboarding, which results in onboarding artifacts that are presented to the user via an interface. We concretize these concepts by showing how our process model can be used to describe a selection of real-world onboarding examples. Finally, we discuss how our model can serve as an actionable blueprint for developing new onboarding systems.


F1000Research ◽  
2021 ◽  
Vol 10 ◽  
pp. 979
Author(s):  
Pierre-Luc Germain ◽  
Aaron Lun ◽  
Will Macnair ◽  
Mark D. Robinson

Doublets are prevalent in single-cell sequencing data and can lead to artifactual findings. A number of strategies have therefore been proposed to detect them. Building on the strengths of existing approaches, we developed scDblFinder, a fast, flexible and accurate Bioconductor-based doublet detection method. Here we present the method, justify its design choices, demonstrate its performance on both single-cell RNA and accessibility sequencing data, and provide some observations on doublet formation, detection, and enrichment analysis. Even in complex datasets, scDblFinder can accurately identify most heterotypic doublets, and was already found by an independent benchmark to outcompete alternatives.


2021 ◽  
Vol 6 ◽  
Author(s):  
Sabrina D. Robertson ◽  
Andrea Bixler ◽  
Melissa R. Eslinger ◽  
Monica M. Gaudier-Diaz ◽  
Adam J. Kleinschmit ◽  
...  

As educators and researchers, we often enjoy enlivening classroom discussions by including examples of cutting-edge high-throughput (HT) technologies that propelled scientific discovery and created repositories of new information. We also call for the use of evidence-based teaching practices to engage students in ways that promote equity and learning. The complex datasets produced by HT approaches can open the doors to discovery of novel genes, drugs, and regulatory networks, so students need experience with the effective design, implementation, and analysis of HT research. Nevertheless, we miss opportunities to contextualize, define, and explain the potential and limitations of HT methods. One evidence-based approach is to engage students in realistic HT case studies. HT cases immerse students in messy data, asking them to critically consider data analysis, experimental design, ethical implications, and HT technologies. The NSF HITS (High-throughput Discovery Science and Inquiry-based Case Studies for Today’s Students) Research Coordination Network in Undergraduate Biology Education seeks to improve student quantitative skills and participation in HT discovery. Researchers and instructors in the network learn about case pedagogy, HT technologies, publicly available datasets, and computational tools. Leveraging this training and interdisciplinary teamwork, HITS participants then create and implement HT cases. Our initial case collection has been used in >15 different courses at a variety of institutions, engaging >600 students in HT discovery. We share here our rationale for engaging students in HT science, our HT cases, and our network model to encourage other life science educators to join us and to further develop and integrate complex HT datasets into curricula.


2021 ◽  
Vol 1 (8) ◽  
pp. 516-520
Author(s):  
Sayantan Dutta ◽  
Aleena L. Patel ◽  
Shannon E. Keenan ◽  
Stanislav Y. Shvartsman

2021 ◽  
Vol 6 ◽  
Author(s):  
Chris Cummins ◽  
Michael Franke

Numerical descriptions furnish us with an apparently precise and objective way of summarising complex datasets. In practice, the issue is less clear-cut, partly because the use of numerical expressions in natural language invites inferences that go beyond their mathematical meaning, and consequently quantitative descriptions can be true but misleading. This raises important practical questions for the hearer: how should they interpret a quantitative description that is being used to further a particular argumentative agenda, and to what extent should they treat it as a good argument for a particular conclusion? In this paper, we discuss this issue with reference to notions of argumentative strength, and consider the strategy that a rational hearer should adopt in interpreting quantitative information that is being used argumentatively by the speaker. We exemplify this with reference to United Kingdom universities’ reporting of their REF 2014 evaluations. We argue that this reporting is typical of argumentative discourse involving quantitative information in two important respects. Firstly, a hearer must take into account the speaker’s agenda in order not to be misled by the information provided; but secondly, the speaker’s choice of utterance is typically suboptimal in its argumentative strength, and this creates a considerable challenge for accurate interpretation.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Nathan Wong ◽  
Daehwan Kim ◽  
Zachery Robinson ◽  
Connie Huang ◽  
Irina M. Conboy

Flow cytometry (FCM) is an analytic technique that is capable of detecting and recording the emission of fluorescence and light scattering of cells or particles (that are collectively called “events”) in a population [1]. A typical FCM experiment can produce a large array of data, making the analysis computationally intensive [2]. Current FCM data analysis platforms (FlowJo [3], etc.), while very useful, do not allow interactive data processing online due to the data size limitations. Here we report a more effective way to analyze FCM data on the web. Freecyto is a free and intuitive Python-flask-based web application that uses a weighted k-means clustering algorithm to facilitate the interactive analysis of flow cytometry data. A key limitation of web browsers is their inability to interactively display large amounts of data. Freecyto addresses this bottleneck through the use of the k-means algorithm to quantize the data, allowing the user to access a representative set of data points for interactive visualization of complex datasets. Moreover, Freecyto enables the interactive analyses of large complex datasets while preserving the standard FCM visualization features, such as the generation of scatterplots (dotplots), histograms, heatmaps, boxplots, as well as a SQL-based sub-population gating feature [2]. We also show that Freecyto can be applied to the analysis of various experimental setups that frequently require the use of FCM. Finally, we demonstrate that the data accuracy is preserved when Freecyto is compared to conventional FCM software.
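
The quantization idea in this abstract can be illustrated with a minimal sketch. This is not Freecyto's code: the function name `quantize_events`, its parameters, and the choice to use each cluster's event count as its weight are assumptions made for illustration, and the abstract does not specify how the weighting in "weighted k-means" is defined. The sketch runs plain Lloyd-style k-means and returns one representative point per cluster, so a browser would only need to draw `k` points instead of every event.

```python
import random


def _nearest(event, centroids):
    """Index of the centroid closest to `event` (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(event, centroids[i])),
    )


def quantize_events(events, k, iterations=20, seed=0):
    """Reduce `events` (a list of equal-length numeric tuples) to at most k
    (centroid, weight) pairs. The weight is the number of events assigned to
    that centroid; this weighting is an assumption, not Freecyto's definition.
    """
    rng = random.Random(seed)
    centroids = rng.sample(events, k)  # initialise from k distinct positions in the list
    for _ in range(iterations):
        # Assignment step: group every event with its nearest centroid.
        clusters = [[] for _ in range(k)]
        for event in events:
            clusters[_nearest(event, centroids)].append(event)
        # Update step: move each centroid to the mean of its members.
        # An empty cluster keeps its previous centroid.
        centroids = [
            tuple(sum(col) / len(members) for col in zip(*members))
            if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
    # Final pass: count how many events each centroid represents.
    weights = [0] * k
    for event in events:
        weights[_nearest(event, centroids)] += 1
    return [(c, w) for c, w in zip(centroids, weights) if w > 0]


if __name__ == "__main__":
    # Synthetic two-channel "events" drawn around two arbitrary centres.
    rng = random.Random(42)
    events = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(500)]
    events += [(rng.gauss(8, 1), rng.gauss(8, 1)) for _ in range(500)]
    for centroid, weight in quantize_events(events, k=10):
        print(centroid, weight)
```

A plotting layer could then scale marker size or colour by the weight, so that dense regions remain visually distinguishable after the reduction.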


SoftwareX ◽  
2021 ◽  
Vol 13 ◽  
pp. 100653
Author(s):  
Marcos Nieto ◽  
Orti Senderos ◽  
Oihana Otaegui