neuPrint: Analysis Tools for EM Connectomics

Author(s):  
Jody Clements ◽  
Tom Dolafi ◽  
Lowell Umayam ◽  
Nicole L. Neubarth ◽  
Stuart Berg ◽  
...  

Due to technological advances in electron microscopy (EM) and deep learning, it is now practical to reconstruct a connectome, a description of neurons and the connections between them, for significant volumes of neural tissue. The limited scope of past reconstructions meant they were primarily used by domain experts, and performance was not a serious problem. But the new reconstructions, of common laboratory creatures such as the fruit fly Drosophila melanogaster, upend these assumptions. These natural neural networks now contain tens of thousands of neurons and tens of millions of connections between them, with yet larger reconstructions pending, and are of interest to a large community of non-specialists. This requires new tools that are easy to use and handle large data efficiently. We introduce neuPrint to address these data analysis challenges. neuPrint is a database and analysis ecosystem that organizes connectome data in a manner conducive to biological discovery. In particular, we propose a data model that allows users to access the connectome at different levels of abstraction, primarily through a graph database, neo4j, and its powerfully expressive query language, Cypher. neuPrint is compatible with modern connectome reconstruction workflows, providing tools for assessing reconstruction quality and supporting both batch and incremental updates. Finally, we introduce a web interface and programmer API that target a diverse range of user skills. We demonstrate the effectiveness and efficiency of neuPrint through example database queries.
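
To give a flavor of how such a graph data model is queried, the sketch below runs a Cypher query through the official neo4j Python driver. It is a minimal illustration only: the connection details are placeholders, and the node label (Neuron), relationship type (ConnectsTo), and property names (bodyId, weight) are assumptions modeled on the data model described above rather than a guaranteed schema.

```python
# Minimal sketch: querying a connectome graph with Cypher via the neo4j Python driver.
# The label Neuron, relationship ConnectsTo, and the property names below are
# illustrative of the data model described in the abstract, not an exact schema.
from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

query = """
MATCH (a:Neuron)-[c:ConnectsTo]->(b:Neuron)
WHERE c.weight >= $min_weight
RETURN a.bodyId AS pre, b.bodyId AS post, c.weight AS weight
ORDER BY weight DESC
LIMIT 10
"""

with driver.session() as session:
    # Find the ten strongest connections above a weight threshold.
    for record in session.run(query, min_weight=10):
        print(record["pre"], "->", record["post"], record["weight"])

driver.close()
```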

2016 ◽  
Vol 4 (3) ◽  
pp. 22-37 ◽  
Author(s):  
Nayem Rahman

Scorecard-based measurement techniques are used by organizations to measure the performance of their business operations. A scorecard approach can also be applied to a database system to measure the performance of the SQL (Structured Query Language) being executed and the extent of the resources it uses. In a large data warehouse, thousands of jobs run daily via batch cycles to refresh different subject areas. Simultaneously, thousands of queries from business intelligence tools and ad-hoc queries are executed around the clock. A controlling mechanism is needed to make sure these batch jobs and queries are efficient and do not consume more database system resources than necessary. The authors propose measuring SQL query performance via a scorecard tool. The motivation behind using a scorecard tool is to make sure that the resource consumption of SQL queries is predictable and the database system environment is stable. The experimental results show that queries that pass the scorecard evaluation criteria tend to utilize an optimal level of database system computing resources. These queries also show improved parallel efficiency (PE) in using computing resources (CPU, I/O and spool space), which demonstrates the usefulness of the SQL scorecard.
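
As a rough illustration of what a scorecard check on query resource consumption might look like, here is a small sketch in Python. The thresholds, metric names, and the parallel-efficiency formula used (mean per-unit CPU divided by the busiest unit's CPU) are assumptions for demonstration, not the authors' actual evaluation criteria or tool.

```python
# Illustrative sketch of a scorecard-style check on a query's resource profile.
# Thresholds and the parallel-efficiency formula (mean per-unit CPU / max per-unit CPU)
# are assumptions for demonstration, not the authors' actual evaluation criteria.
from dataclasses import dataclass

@dataclass
class QueryStats:
    cpu_seconds_per_unit: list[float]  # CPU used on each parallel processing unit
    io_count: int                      # logical I/Os performed by the query
    spool_gb: float                    # temporary (spool) space consumed

def parallel_efficiency(stats: QueryStats) -> float:
    """Skew-sensitive efficiency: 1.0 means perfectly even work across units."""
    cpu = stats.cpu_seconds_per_unit
    return (sum(cpu) / len(cpu)) / max(cpu) if max(cpu) > 0 else 1.0

def scorecard_pass(stats: QueryStats,
                   min_pe: float = 0.8,
                   max_io: int = 1_000_000,
                   max_spool_gb: float = 50.0) -> bool:
    """A query 'passes' if its work is well parallelized and I/O and spool stay bounded."""
    return (parallel_efficiency(stats) >= min_pe
            and stats.io_count <= max_io
            and stats.spool_gb <= max_spool_gb)

print(scorecard_pass(QueryStats([12.0, 11.5, 12.3, 11.9], io_count=40_000, spool_gb=8.2)))
```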


2021 ◽  
Vol 8 (1) ◽  
Author(s):  
Peter Baumann ◽  
Dimitar Misev ◽  
Vlad Merticariu ◽  
Bang Pham Huu

Multi-dimensional arrays (also known as raster data or gridded data) play a key role in many, if not all, science and engineering domains, where they typically represent spatio-temporal sensor, image, simulation-output, or statistics “datacubes”. As classic database technology does not support arrays adequately, such data today are maintained mostly in silo solutions, with architectures that tend to erode and not keep up with the increasing requirements on performance and service quality. Array Database systems attempt to close this gap by providing declarative query support for flexible ad-hoc analytics on large n-D arrays, similar to what SQL offers on set-oriented data, XQuery on hierarchical data, and SPARQL and Cypher on graph data. Today, Petascale Array Database installations exist, employing massive parallelism and distributed processing. Hence, questions arise about the technology and standards available, usability, and overall maturity. Several papers have compared models and formalisms, and benchmarks have been undertaken as well, typically comparing two systems against each other. While each of these represents valuable research, to the best of our knowledge there is no comprehensive survey combining model, query language, architecture, practical usability, and performance aspects. The scale of this comparison also differentiates our study: 19 systems are compared and four benchmarked, to an extent and depth clearly exceeding previous papers in the field; for example, the subsetting tests were designed so that systems cannot be tuned specifically for these queries. It is hoped that this gives a representative overview to all who want to immerse themselves in the field, as well as clear guidance to those who need to choose the best-suited datacube tool for their application. This article presents results of the Research Data Alliance (RDA) Array Database Assessment Working Group (ADA:WG), a subgroup of the Big Data Interest Group. It has elicited the state of the art in Array Databases, technically supported by IEEE GRSS and CODATA Germany, to answer the question: how can data scientists and engineers benefit from Array Database technology? As it turns out, Array Databases can offer significant advantages in terms of flexibility, functionality, extensibility, as well as performance and scalability—in total, the database approach of offering analysis-ready “datacubes” heralds a new level of service quality. Investigation shows that there is a lively ecosystem of technology with increasing uptake, and proven array analytics standards are in place. Consequently, such approaches have to be considered a serious option for datacube services in science, engineering and beyond. Tools, though, vary greatly in functionality and performance.
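
To make the “datacube” idea concrete, the sketch below places a declarative array query (shown only as a string, with syntax loosely modeled on Array DBMS query languages) next to the same subsetting-plus-aggregation operation written out in NumPy. The collection name, axis order, and slice bounds are invented for illustration.

```python
# Sketch: what a declarative datacube query expresses, next to the equivalent
# hand-written array code. The query string is loosely modeled on Array DBMS
# languages; the collection name, axis order, and slice bounds are invented.
import numpy as np

# Declarative form an Array DBMS might accept (illustrative only):
datacube_query = """
SELECT avg_cells( c[ 0:99, 0:99, 42 ] )
FROM   temperature_cube AS c
"""

# The same computation spelled out imperatively over an in-memory array
# standing in for the stored datacube (x, y, time):
rng = np.random.default_rng(0)
temperature_cube = rng.random((200, 200, 365), dtype=np.float32)

subset = temperature_cube[0:100, 0:100, 42]   # spatio-temporal subsetting
print(float(subset.mean()))                   # aggregation over the subset
```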


2014 ◽  
Vol 67 (5) ◽  
pp. 791-809 ◽  
Author(s):  
Philipp Last ◽  
Christian Bahlke ◽  
Martin Hering-Bertram ◽  
Lars Linsen

AIS was primarily developed to exchange vessel-related data among vessels or AIS stations using very high frequency (VHF) radio technology, in order to increase safety at sea. This study evaluates the formal integrity, availability, and reporting intervals of AIS data with a focus on vessel movement prediction. In contrast to former studies, this study is based on a large data collection of over 85 million AIS messages, which were received continuously over a period of two months. Thus, the evaluated data represent a comprehensive and up-to-date view of the current usage of AIS systems installed on vessels. Results of previous studies concerning the availability of AIS data are confirmed and extended, and new aspects such as reporting intervals are additionally evaluated. Received messages are stored in a database, which allows database queries to evaluate the obtained data automatically. This study shows that, almost ten years after becoming mandatory for professionally operated vessels, AIS still lacks availability for both static and dynamic data and that the reporting intervals are not as reliable as specified in the technical AIS standard.
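
A hedged sketch of how reporting intervals could be derived from the stored messages in such an automatic, database-driven evaluation is shown below, here using pandas. The column names (mmsi, received_at), MMSI values, and timestamps are assumptions for illustration, not the study's actual schema or data.

```python
# Sketch: deriving per-vessel reporting intervals from stored AIS position reports.
# Column names (mmsi, received_at) and the sample rows are illustrative assumptions.
import pandas as pd

# In practice this frame would come from a database query over the stored messages.
messages = pd.DataFrame({
    "mmsi": [211000001, 211000001, 211000001, 244000002, 244000002],
    "received_at": pd.to_datetime([
        "2013-05-01 12:00:02", "2013-05-01 12:00:12", "2013-05-01 12:00:31",
        "2013-05-01 12:00:05", "2013-05-01 12:00:45",
    ]),
})

messages = messages.sort_values(["mmsi", "received_at"])
messages["interval_s"] = (
    messages.groupby("mmsi")["received_at"].diff().dt.total_seconds()
)

# Observed intervals can then be compared against the nominal reporting interval
# required by the AIS standard for a vessel's speed and status.
print(messages.groupby("mmsi")["interval_s"].describe())
```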


2017 ◽  
pp. 99
Author(s):  
Pamela S. Soltis ◽  
Douglas E. Soltis

Technological advances in molecular biology have greatly increased the speed and efficiency of DNA sequencing, making it possible to construct large molecular data sets for phylogeny reconstruction relatively quickly. Despite their potential for improving our understanding of phylogeny, these large data sets also provide many challenges. In this paper, we discuss several of these challenges, including 1) the failure of a search to find the most parsimonious trees (the local optimum) in a reasonable amount of time, 2) the difference between a local optimum and the global optimum, and 3) the existence of multiple classes (islands) of most parsimonious trees. We also discuss possible strategies to improve the likelihood of finding the most parsimonious tree(s) and present two examples from our work on angiosperm phylogeny. We conclude with a discussion of two alternatives to analyses of entire large data sets, the exemplar approach and compartmentalization, and suggest that additional consideration must be given to issues of data analysis for large data sets, whether morphological or molecular.


Author(s):  
Suchith Reddy Arukala ◽  
Rathish Kumar Pancharathi

The construction sector is a resource-driven and resource-dependent industry. Rising global interest in incorporating sustainability principles into policy-making requires carefully balancing economic growth with sustainability. To achieve this end in the Indian building sector, triple-bottom-line-based building assessment tools such as GRIHA and IGBC were introduced for assessing building sustainability. However, to revitalize the ideas of Reduce, Replace, Reuse, Recycle and Renovate (the ‘5Rs’) into implementable solutions, a technological dimension is introduced to form a quadruple bottom line (QBL) approach, i.e., social, environmental, economic and technological (SEET), for achieving sustainable construction. This study aims to address the need to add this new dimension, namely technological advances, to the sustainability arena of the construction industry. The objective of the study is to include technological advances in building materials, construction processes and techniques, and design philosophies in the developed SBAT framework. In the extended and upgraded SBAT 2.0, the advances in sustainability (AS) criterion accounts for 11.5 per cent, showing its significance in achieving building sustainability. The use of discrete reinforcement, additive manufacturing, 3D printing, design based on the packing density and rheological properties of concrete, the use of alkali-activated materials in mix design, and performance-based design concepts that affect future sustainability are successfully brought into the fold of the SBAT framework.


Author(s):  
Kate Leader

The live presence of a defendant at trial is a long-standing feature of adversarial criminal trial. So much of what constitutes the adversarial method of adjudication is dependent on qualities that arise from this presence: confrontation and demeanor assessment, among other factors, play important roles in how truth is constructed. As such, performative matters—how a defendant enacts and inhabits her role, how she is positioned or silenced—have long been of concern to legal scholars. These performative concerns are also centrally implicated in defendant rights, such as the right to a fair trial. But today we face new challenges that call into question fundamental beliefs around trials, defendant presence, and fairness. First, technological advances have led to defendants appearing remotely in hearings from the prison in which they are held. Second, the trial itself is arguably vanishing in most adversarial jurisdictions. Third, the use of trials in absentia means that criminal trials may take place in a defendant’s absence; in England and Wales, for less serious offenses, this can be done without inquiring why a defendant isn’t there. This chapter therefore seeks to understand the performative implications of these challenges by shifting the conversation from presence to absence. What difference does it make if a defendant is no longer there? Does being there facilitate greater fairness, despite the obvious issues of constraint and silencing? Drawing on sociolegal, political, and performance theory, the chapter considers the implications of absence in the criminal trial, asking what happens when the defendant disappears.


2021 ◽  
pp. 2141001
Author(s):  
Sanqiang Wei ◽  
Hongxia Hou ◽  
Hua Sun ◽  
Wei Li ◽  
Wenxia Song

The plots of certain literary works are very complicated and hinder readers from understanding them. Tools are therefore needed to support readers’ comprehension of complex literary works by providing them with the most important information. A human reader must capture multiple levels of abstraction and meaning to formulate an understanding of a document. Hence, in this paper, an improved K-means clustering algorithm (IKCA) is proposed for literary word classification. For text data, words that express the exact semantics of a class are generally better features. The proposed technique captures multiple cluster centroids for every class and then selects the high-frequency words in those centroids as text features for classification. Furthermore, neural networks are used to classify text documents and K-means to cluster them, so that the model combines unsupervised and supervised techniques to identify similarity between documents. The numerical results show that the suggested model improves on the existing K-means algorithm: in the accuracy comparison of ALA and IKCA, IKCA reaches 95.2%, the time taken for clustering is less than 2 hours, the success rate is 97.4%, and the performance ratio is 98.1%.
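
As a point of reference for the general idea of taking centroid-dominant terms as class features, here is a minimal sketch using plain scikit-learn K-means; it is not the paper's IKCA variant, and the toy documents, cluster counts, and term counts are invented for illustration.

```python
# Sketch of the general idea: cluster each class's documents with K-means and take
# the terms weighted most heavily in the cluster centroids as class features.
# This is a plain scikit-learn baseline for illustration, not the paper's IKCA variant.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

docs_per_class = {
    "romance": ["her heart raced as the letter arrived", "love and longing in the garden"],
    "mystery": ["the detective examined the locked room", "a clue hidden beneath the floor"],
}

def centroid_terms(docs, n_clusters=1, top_k=5):
    vec = TfidfVectorizer(stop_words="english")
    X = vec.fit_transform(docs)
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(X)
    terms = vec.get_feature_names_out()
    features = set()
    for centroid in km.cluster_centers_:
        top = centroid.argsort()[::-1][:top_k]     # highest-weight terms in this centroid
        features.update(terms[i] for i in top)
    return features

for label, docs in docs_per_class.items():
    print(label, centroid_terms(docs))
```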


Author(s):  
Janusz Kacprzyk ◽  
Slawomir Zadrozny

We consider linguistic database summaries in the sense of Yager (1982), in an implementable form proposed by Kacprzyk & Yager (2001) and Kacprzyk, Yager & Zadrozny (2000), exemplified, for a personnel database, by “most employees are young and well paid” (with some degree of truth), and their extensions as a very general tool for human-consistent summarization of large data sets. We advocate the use of the concept of a protoform (prototypical form), vividly advocated by Zadeh and shown by Kacprzyk & Zadrozny (2005) to be a general form of a linguistic data summary. Then, we present an extension of our interactive approach to fuzzy linguistic summaries, based on fuzzy logic and fuzzy database queries with linguistic quantifiers. We show how fuzzy queries are related to linguistic summaries, and that one can introduce a hierarchy of protoforms, or abstract summaries in the sense of Zadeh’s latest (2002) ideas, meant mainly to increase the deduction capabilities of search engines. We show an implementation for the summarization of Web server logs.
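
The degree of truth of a summary such as “most employees are young and well paid” can be computed with Zadeh's calculus of linguistically quantified propositions, which underlies Yager-style summaries. The sketch below is a minimal illustration; the membership functions, the definition of the quantifier “most”, and the toy records are all invented.

```python
# Sketch: degree of truth of the linguistic summary "most employees are young and
# well paid" in the style of Yager's calculus. The membership functions and the
# definition of the quantifier "most" below are illustrative choices only.
def mu_young(age):            # fully young below 30, fading out linearly by 45
    return max(0.0, min(1.0, (45 - age) / 15))

def mu_well_paid(salary):     # 0 below 30k, fully satisfied at 60k
    return max(0.0, min(1.0, (salary - 30_000) / 30_000))

def mu_most(proportion):      # fuzzy quantifier "most": starts above 0.3, certain above 0.8
    return max(0.0, min(1.0, (proportion - 0.3) / 0.5))

employees = [(28, 55_000), (34, 62_000), (52, 48_000), (41, 35_000), (25, 70_000)]

# t-norm min for "young AND well paid", averaged over the records, then quantified.
satisfaction = sum(min(mu_young(a), mu_well_paid(s)) for a, s in employees) / len(employees)
print(f"degree of truth: {mu_most(satisfaction):.2f}")
```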


2019 ◽  
Vol 18 ◽  
pp. 160940691988069 ◽  
Author(s):  
Rebecca L. Brower ◽  
Tamara Bertrand Jones ◽  
La’Tara Osborne-Lampkin ◽  
Shouping Hu ◽  
Toby J. Park-Gaghan

Big qualitative data (Big Qual), or research involving large qualitative data sets, has introduced many newly evolving conventions that have begun to change the fundamental nature of some qualitative research. In this methodological essay, we first distinguish big data from big qual. We define big qual as data sets containing either primary or secondary qualitative data from at least 100 participants analyzed by teams of researchers, often funded by a government agency or private foundation, conducted either as a stand-alone project or in conjunction with a large quantitative study. We then present a broad debate about the extent to which big qual may be transforming some forms of qualitative inquiry. We present three questions, which examine the extent to which large qualitative data sets offer both constraints and opportunities for innovation related to funded research, sampling strategies, team-based analysis, and computer-assisted qualitative data analysis software (CAQDAS). The debate is framed by four related trends to which we attribute the rise of big qual: the rise of big quantitative data, the growing legitimacy of qualitative and mixed methods work in the research community, technological advances in CAQDAS, and the willingness of government and private foundations to fund large qualitative projects.


2020 ◽  
Vol 6 (6) ◽  
pp. 55
Author(s):  
Gerasimos Arvanitis ◽  
Aris S. Lalos ◽  
Konstantinos Moustakas

Recently, spectral methods have been extensively used in the processing of 3D meshes. They usually take advantage of some unique properties of the eigenvalues and eigenvectors of the decomposed Laplacian matrix. However, despite their superior behavior and performance, they suffer from computational complexity, especially as the number of vertices of the model increases. In this work, we suggest the use of a fast and efficient spectral processing approach applied to dense static and dynamic 3D meshes, which is ideally suited for real-time denoising and compression applications. To increase the computational efficiency of the method, we exploit potential spectral coherence between adjacent parts of a mesh and then apply an orthogonal iteration approach for tracking the graph Laplacian eigenspaces. Additionally, we present a dynamic version that automatically identifies the optimal subspace size that satisfies a given reconstruction quality threshold. In this way, we overcome the perceptual distortions caused by using a fixed subspace size for all the separated parts. Extensive simulations carried out using different 3D models in different use cases (i.e., compression and denoising) showed that the proposed approach is very fast, especially in comparison with SVD-based spectral processing approaches, while the reconstructed models are of similar or even better quality. The experimental analysis also showed that the proposed approach could be used by other denoising methods as a preprocessing step, in order to optimize the reconstruction quality of their results and decrease their computational complexity, since they then need fewer iterations to converge.
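
A compact NumPy sketch of orthogonal (subspace) iteration for tracking a k-dimensional Laplacian eigenspace is given below. The warm start from a previous basis stands in for the spectral-coherence idea between adjacent mesh parts; the random graph, shift, subspace size, and iteration count are illustrative assumptions, not the authors' implementation.

```python
# Sketch of orthogonal (subspace) iteration for tracking a k-dimensional Laplacian
# eigenspace, warm-started from a previous basis as a stand-in for the spectral-coherence
# idea in the abstract. Illustrative only, not the authors' code.
import numpy as np

def orthogonal_iteration(A, Q0, iters=20):
    """Iterate Q <- orth(A @ Q); converges to the dominant eigenspace of symmetric A."""
    Q = Q0
    for _ in range(iters):
        Q, _ = np.linalg.qr(A @ Q)
    return Q

rng = np.random.default_rng(0)
n, k = 200, 8

# Random sparse symmetric weight matrix and its graph Laplacian (mesh-like graph stand-in).
W = rng.random((n, n)) * (rng.random((n, n)) < 0.05)
W = np.triu(W, 1); W = W + W.T
L = np.diag(W.sum(axis=1)) - W

# Shift so that the LOW-frequency eigenspace of L becomes the dominant eigenspace
# (Gershgorin bound: eigenvalues of L lie below twice the maximum weighted degree).
s = 2.0 * W.sum(axis=1).max() + 1.0
B = s * np.eye(n) - L

Q_prev = np.linalg.qr(rng.standard_normal((n, k)))[0]   # basis from the "previous part"
Q = orthogonal_iteration(B, Q_prev)                     # tracked low-frequency basis

x = rng.standard_normal(n)          # a vertex signal (e.g., one coordinate channel)
x_hat = Q @ (Q.T @ x)               # projection onto the subspace: smoothing/compression
print(np.linalg.norm(x - x_hat))
```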

