complex data structure
Recently Published Documents


TOTAL DOCUMENTS: 12 (FIVE YEARS: 6)

H-INDEX: 1 (FIVE YEARS: 1)

PLoS ONE ◽  
2021 ◽  
Vol 16 (7) ◽  
pp. e0253926
Author(s):  
Xiang Zhang ◽  
Taolin Yuan ◽  
Jaap Keijer ◽  
Vincent C. J. de Boer

Background: Mitochondrial dysfunction is involved in many complex diseases. Efficient and accurate evaluation of mitochondrial functionality is crucial for understanding pathology as well as for facilitating novel therapeutic developments. The Seahorse extracellular flux (XF) analyzer is a popular platform widely used for measuring the mitochondrial oxygen consumption rate (OCR) in living cells. A hidden feature of Seahorse XF OCR data is its complex structure, caused by nesting and crossing between measurement cycles, wells, and plates. Surprisingly, statistical analysis of Seahorse XF data has not received sufficient attention, and current methods completely ignore this complex structure, impairing the robustness of statistical inference.
Results: To rigorously incorporate the complex structure into data analysis, we developed a Bayesian hierarchical modeling framework, OCRbayes, and demonstrated its applicability by analyzing published data sets.
Conclusions: We showed that OCRbayes can analyze Seahorse XF OCR experimental data derived from either single or multiple plates. Moreover, OCRbayes has the potential to be used for diagnosing patients with mitochondrial diseases.
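To make the nested structure concrete, here is a minimal sketch of a Bayesian hierarchical model for OCR readings nested within wells, which are in turn nested within plates. This is a generic illustration using PyMC on synthetic data, not the authors' actual OCRbayes model; all variable names, priors, and the fake data are assumptions.

```python
# A minimal hierarchical-model sketch (NOT the authors' OCRbayes model):
# OCR readings vary by measurement cycle, nested in wells, nested in plates.
import numpy as np
import pymc as pm

rng = np.random.default_rng(0)
n_plates, wells_per_plate, cycles = 2, 6, 3
plate_idx = np.repeat(np.arange(n_plates), wells_per_plate * cycles)
well_idx = np.repeat(np.arange(n_plates * wells_per_plate), cycles)
ocr = rng.normal(100, 10, size=well_idx.size)  # fake OCR readings (pmol/min)

with pm.Model() as model:
    mu = pm.Normal("mu", 100, 50)                 # grand mean OCR (assumed prior)
    sd_plate = pm.HalfNormal("sd_plate", 10)      # between-plate variation
    sd_well = pm.HalfNormal("sd_well", 10)        # between-well variation
    plate_eff = pm.Normal("plate_eff", 0, sd_plate, shape=n_plates)
    well_eff = pm.Normal("well_eff", 0, sd_well, shape=n_plates * wells_per_plate)
    sigma = pm.HalfNormal("sigma", 10)            # measurement-cycle noise
    pm.Normal("obs", mu + plate_eff[plate_idx] + well_eff[well_idx], sigma,
              observed=ocr)
    idata = pm.sample(1000, tune=1000)
```

Partial pooling across wells and plates is what lets such a model separate biological signal from plate- and well-level technical variation, which is the point the abstract makes about nesting and crossing.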


2021 ◽  
Author(s):  
Panagiotis Bouros ◽  
Nikos Mamoulis ◽  
Dimitrios Tsitsigkos ◽  
Manolis Terrovitis

Abstract: The interval join is a popular operation in temporal, spatial, and uncertain databases. The majority of interval join algorithms assume that the input data reside on disk, so their focus is on minimizing I/O accesses. Recently, an in-memory approach based on plane sweep (PS) for modern hardware was proposed, which greatly outperforms previous work. However, this approach relies on a complex data structure, and its parallelization has not been adequately studied. In this article, we investigate in-memory interval joins in two directions. First, we explore the applicability of a largely ignored forward scan (FS)-based plane sweep algorithm for single-threaded join evaluation. We propose four optimizations for FS that greatly reduce its cost, making it competitive with or even faster than the state of the art. Second, we study in depth the parallel computation of interval joins. We design a non-partitioning-based approach that determines independent tasks of the join algorithm to run in parallel. Then, we address the drawbacks of the previously proposed hash-based partitioning and suggest a domain-based partitioning approach that does not produce duplicate results. Within our approach, we propose a novel breakdown of the partition-joins into mini-joins to be scheduled on the available CPU threads, and we propose an adaptive domain partitioning aimed at load balancing. We also investigate how the partitioning phase can benefit from modern parallel hardware. Our thorough experimental analysis demonstrates the advantage of our novel partitioning-based approach for parallel computation.
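As a rough illustration of the forward-scan idea this work builds on, here is a minimal single-threaded sketch. It assumes closed intervals given as (start, end) tuples and omits the paper's four FS optimizations and all parallelization.

```python
# A minimal forward-scan (FS) plane-sweep interval join sketch.
# Assumes closed intervals (start, end) with start <= end; simplified,
# single-threaded, without the paper's optimizations.
def fs_interval_join(R, S):
    R = sorted(R)  # sort both inputs by interval start
    S = sorted(S)
    out, r, s = [], 0, 0
    while r < len(R) and s < len(S):
        if R[r][0] <= S[s][0]:
            # R[r] starts first: scan S forward while intervals still overlap.
            k = s
            while k < len(S) and S[k][0] <= R[r][1]:
                out.append((R[r], S[k]))
                k += 1
            r += 1
        else:
            # Symmetric case: S[s] starts first, scan R forward.
            k = r
            while k < len(R) and R[k][0] <= S[s][1]:
                out.append((R[k], S[s]))
                k += 1
            s += 1
    return out

pairs = fs_interval_join([(1, 5), (6, 9)], [(3, 7)])
print(pairs)  # [((1, 5), (3, 7)), ((6, 9), (3, 7))]
```

Because both inputs are sorted by start, each interval is consumed exactly once and every overlapping pair is reported exactly once, with no duplicate elimination needed.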


2021 ◽  
Vol 31 (1) ◽  
pp. 65-73
Author(s):  
M. Z. Benenson ◽  
E. A. Alekseeva

Problem statement: When creating monitoring systems for industrial facilities for a range of purposes, it becomes necessary to process and store objects with a complex data structure. The user must be provided with tools for processing and storing the data and object types that they have defined.
Objective: To develop a software implementation of the interface for interacting with the database built into an industrial facility monitoring system.
Results: A software interface for interacting with an object-oriented database has been developed. Three programming classes are used to describe the various types of industrial system objects. Class methods have been developed that allow a variable number of attributes to be set for different object types. The authors propose a method for extracting an object with specified attribute values, similar to the QBE method, and a method for complex (natural) queries written in the application development language.
Practical implications: The proposed software implementation of the interface for interacting with the built-in database can be used to create a wide range of industrial monitoring systems. This approach significantly reduces the computing resources required to implement such systems and reduces the time and cost of their development.
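The QBE-style extraction can be illustrated with a short sketch. The class and method names below are hypothetical stand-ins, not the authors' implementation: each object carries a variable set of attributes, and a query returns every stored object matching all attribute values given in the example.

```python
# A hypothetical sketch (not the authors' code) of objects with a variable
# attribute set and query-by-example (QBE) extraction.
class MonitoredObject:
    def __init__(self, obj_type, **attributes):
        self.obj_type = obj_type            # e.g. "sensor", "pump", "valve"
        self.attributes = dict(attributes)  # variable number of attributes

class ObjectStore:
    def __init__(self):
        self._objects = []

    def add(self, obj):
        self._objects.append(obj)

    def query_by_example(self, **example):
        # Return objects whose attributes match ALL specified example values.
        return [o for o in self._objects
                if all(o.attributes.get(k) == v for k, v in example.items())]

store = ObjectStore()
store.add(MonitoredObject("sensor", location="boiler", unit="C", value=85))
store.add(MonitoredObject("pump", location="boiler", rpm=1450))
hits = store.query_by_example(location="boiler", unit="C")  # -> the sensor only
```

"Complex (natural)" queries in the application language would correspond to passing an arbitrary predicate instead of an example dictionary.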


2021 ◽  
pp. 25-30
Author(s):  
Khalid Alnajjar ◽  
◽  
Mika Hämäläinen

Every NLP researcher has to work with XML- or JSON-encoded files. This often involves writing code that serves a very specific purpose. Corpona is meant to streamline any workflow that involves XML- and JSON-based corpora by offering easy, reusable functionality. The current functionality covers easy parsing of and access to XML files, easy access to sub-items in a nested JSON structure, and visualization of complex data structures. Corpona is fully open source and is available on GitHub and Zenodo.
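As an illustration of the kind of nested-JSON access such a toolkit streamlines, here is a plain-Python sketch; the get_path helper is a hypothetical example of the pattern and is not claimed to be Corpona's actual API.

```python
# A plain-Python sketch of dotted-path access to nested JSON, the kind of
# convenience described above; get_path is hypothetical, not Corpona's API.
import json

def get_path(data, path, default=None):
    # Walk a nested dict/list structure with a dotted path like "a.b.0.c".
    for key in path.split("."):
        if isinstance(data, list):
            try:
                data = data[int(key)]
            except (ValueError, IndexError):
                return default
        elif isinstance(data, dict):
            if key not in data:
                return default
            data = data[key]
        else:
            return default
    return data

doc = json.loads('{"corpus": {"sentences": [{"tokens": ["Hello", "world"]}]}}')
print(get_path(doc, "corpus.sentences.0.tokens.1"))  # -> "world"
```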


2021 ◽  
Author(s):  
Xiang Zhang ◽  
Taolin Yuan ◽  
Jaap Keijer ◽  
Vincent C. J. de Boer

Mitochondrial dysfunction is involved in many complex diseases. Efficient and accurate evaluation of mitochondrial functionality is crucial for understanding pathology as well as for facilitating novel therapeutic developments. The Seahorse extracellular flux (XF) analyzer is a popular platform widely used for measuring the mitochondrial oxygen consumption rate (OCR) in living cells. A hidden feature of Seahorse XF OCR data is its complex structure, caused by nesting and crossing between measurement cycles, wells, and plates. Surprisingly, statistical analysis of Seahorse XF data has not received sufficient attention, and current methods completely ignore this complex structure, impairing the robustness of statistical inference. To rigorously incorporate the complex structure into data analysis, we developed a Bayesian hierarchical modeling framework, OCRbayes, and demonstrated its applicability by analyzing published data sets. We showed that OCRbayes can analyze Seahorse XF OCR experimental data derived from either single or multiple plates. Moreover, OCRbayes has the potential to be used for diagnosing patients with mitochondrial diseases.


Author(s):  
Xinpeng Ding ◽  
Nannan Wang ◽  
Xinbo Gao ◽  
Jie Li ◽  
Xiaoyu Wang

In capsule networks, the mapping of low-level capsules to high-level capsules is achieved by a routing-by-agreement algorithm. Since each capsule is made up of a collection of neurons, and the routing mechanism involves all the capsules instead of simply discarding some of the neurons as Max-Pooling does, the capsule network has stronger representation ability than a traditional neural network. However, taking too much information from the low-level capsules causes the corresponding upper-layer capsules to be disturbed by irrelevant information or noise capsules. As a result, the original capsule network does not perform well on complex data structures. Worse, its computational complexity becomes a bottleneck when dealing with large networks. To address these shortcomings, this paper proposes a group reconstruction and max-pooling residual capsule network (GRMR-CapsNet). We build a block in which all capsules are divided into different groups and a group reconstruction routing algorithm is performed to obtain the corresponding high-level capsules. Between the lower and higher layers, Capsule Max-Pooling is adopted to prevent overfitting. We conduct experiments on the CIFAR-10/100 and SVHN datasets, and the results show that our method performs better than the state of the art.
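For reference, here is a minimal NumPy sketch of the original routing-by-agreement procedure (Sabour et al., 2017) that the abstract builds on; it illustrates plain dynamic routing only, not the paper's group reconstruction routing or Capsule Max-Pooling.

```python
# Minimal dynamic routing-by-agreement sketch (the baseline the paper
# modifies), not the proposed GRMR-CapsNet routing.
import numpy as np

def squash(s, axis=-1, eps=1e-8):
    # Squashing non-linearity: scales vector length into [0, 1) while
    # preserving direction, so length can encode an activation probability.
    norm2 = np.sum(s * s, axis=axis, keepdims=True)
    return (norm2 / (1.0 + norm2)) * s / np.sqrt(norm2 + eps)

def routing_by_agreement(u_hat, n_iters=3):
    # u_hat: (n_low, n_high, dim) prediction vectors, one per pair of
    # low-level capsule i and high-level capsule j.
    n_low, n_high, _ = u_hat.shape
    b = np.zeros((n_low, n_high))                             # routing logits
    for _ in range(n_iters):
        c = np.exp(b) / np.exp(b).sum(axis=1, keepdims=True)  # coupling coeffs
        s = (c[:, :, None] * u_hat).sum(axis=0)               # (n_high, dim)
        v = squash(s)                                         # capsule outputs
        b = b + np.einsum('ijd,jd->ij', u_hat, v)             # agreement update
    return v

# Toy usage: 8 low-level capsules routing into 3 high-level 4-D capsules.
v = routing_by_agreement(np.random.randn(8, 3, 4))
print(v.shape)  # (3, 4)
```

The agreement update is where every low-level capsule influences every high-level capsule, which is exactly the noise-sensitivity the abstract's grouping scheme targets.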


2017 ◽  
Author(s):  
◽  
Danlu Liu

People are born with the curiosity to see differences between groups. These differences are useful for understanding the root causes of certain discrepancies, such as those between populations or diseases. However, without prior knowledge of the data, it is extremely challenging to identify which groups differ most, let alone to discover which associations contribute to the differences. The challenges stem mainly from the large search space with a complex data structure, as well as from the lack of efficient quantitative measurements closely related to the meaning of the differences. To tackle these issues, we developed a novel exploratory data mining method to identify ranked subgroups that are highly contrasted, for further in-depth analyses. The underpinning components of this method include (1) a semi-greedy forward floating selection algorithm to reduce the search space, (2) a deep-exploring approach to aggregate a collection of sizable and credible candidate feature sets for subgroup identification using in-memory computing techniques, (3) a G-index contrast measurement to guide the exploratory process and to evaluate the patterns of subgroup pairs, and (4) a ranking method to report mined results from highly contrasted subgroups. Computational experiments were conducted on both synthesized and real data. The algorithm performed adequately in recognizing known subgroups and discovering new and unexpected subgroups. This exploratory data analysis method provides a new paradigm for selecting data-driven hypotheses that can produce actionable outcomes tailored to subpopulations of individuals, such as consumers in e-commerce and patients in clinical trials.
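To make the search strategy concrete, here is a generic sketch of sequential forward floating selection, the family to which the dissertation's semi-greedy algorithm belongs. The contrast argument is a hypothetical stand-in for the G-index contrast measurement described above; none of this is the author's actual implementation.

```python
# Generic forward floating selection sketch (NOT the dissertation's code).
# `contrast` is a hypothetical scoring function standing in for the G-index.
def forward_floating_select(features, contrast, k, max_rounds=100):
    selected = []
    for _ in range(max_rounds):              # hard cap guarantees termination
        candidates = [f for f in features if f not in selected]
        if len(selected) >= k or not candidates:
            break
        # Forward step: add the candidate that maximizes the contrast score.
        selected.append(max(candidates, key=lambda f: contrast(selected + [f])))
        # Floating step: drop one earlier feature if removal improves contrast.
        if len(selected) > 2:
            trimmed = {f: [g for g in selected if g != f] for f in selected[:-1]}
            f_drop = max(trimmed, key=lambda f: contrast(trimmed[f]))
            if contrast(trimmed[f_drop]) > contrast(selected):
                selected = trimmed[f_drop]
    return selected

# Toy usage with a made-up contrast function that favors even features.
best = forward_floating_select(range(10),
                               lambda s: sum(1 for f in s if f % 2 == 0)
                                         - 0.1 * len(s),
                               k=4)
print(best)  # e.g. [0, 2, 4, 6]
```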


2007 ◽  
Vol 10 (1) ◽  
Author(s):  
Jorge Villalobos ◽  
Danilo Pérez ◽  
Juan Castro ◽  
Camilo Jiménez

In a computer science curriculum, the data structures course is considered fundamental. In that course, students must develop the ability to design the most suitable data structures for solving a problem. They must also write an efficient algorithm to solve the problem. Students must understand that there are different types of data structures, each with associated algorithms of different complexity. A data structures laboratory is a set of computational tools that helps students experiment with the concepts introduced in the course. The main objective of this experimentation is to develop the abilities students need for manipulating complex data structures. This paper presents the main characteristics of the laboratory built as a support for the course. We illustrate the broad possibilities of the tool with an example.

