Generating realistic data sets for combinatorial auctions

Author(s):  
A. Bonaccorsi ◽  
B. Codenotti ◽  
N. Dimitri ◽  
M. Leoncini ◽  
G. Resta ◽  
...  


2016 ◽  
Vol 6 (1) ◽  
Author(s):  
Stinus Lindgreen ◽  
Karen L. Adair ◽  
Paul P. Gardner

Abstract Metagenome studies are becoming increasingly widespread, yielding important insights into microbial communities covering diverse environments from terrestrial and aquatic ecosystems to human skin and gut. With the advent of high-throughput sequencing platforms, the use of large scale shotgun sequencing approaches is now commonplace. However, a thorough independent benchmark comparing state-of-the-art metagenome analysis tools is lacking. Here, we present a benchmark where the most widely used tools are tested on complex, realistic data sets. Our results clearly show that the most widely used tools are not necessarily the most accurate, that the most accurate tool is not necessarily the most time consuming and that there is a high degree of variability between available tools. These findings are important as the conclusions of any metagenomics study are affected by errors in the predicted community composition and functional capacity. Data sets and results are freely available from http://www.ucbioinformatics.org/metabenchmark.html
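As a concrete illustration of what "errors in the predicted community composition" mean in practice, the minimal Python sketch below compares a true and a predicted taxonomic abundance profile using an L1 distance; this metric and the example profiles are chosen for illustration only and are not necessarily those used in the benchmark.

```python
def composition_error(true_abund, pred_abund):
    """L1 distance between true and predicted relative abundances over the
    union of taxa (0 = perfect agreement, 2 = completely disjoint profiles).
    Illustrative metric only; the benchmark's own evaluation may differ."""
    taxa = set(true_abund) | set(pred_abund)
    t_sum = sum(true_abund.values())
    p_sum = sum(pred_abund.values())
    return sum(abs(true_abund.get(k, 0.0) / t_sum - pred_abund.get(k, 0.0) / p_sum)
               for k in taxa)

# Hypothetical example: a tool misassigns 10% of reads from one genus to another.
true_profile = {"Bacteroides": 0.5, "Faecalibacterium": 0.5}
pred_profile = {"Bacteroides": 0.4, "Faecalibacterium": 0.5, "Prevotella": 0.1}
print(composition_error(true_profile, pred_profile))  # 0.2
```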


2009 ◽  
Vol 31 (1) ◽  
pp. 7-11 ◽  
Author(s):  
Robert N. Goldman ◽  
John D. McKenzie Jr.

Author(s):  
Siddhartha V. Jayanti ◽  
Robert E. Tarjan

Abstract We develop and analyze concurrent algorithms for the disjoint set union (“union-find”) problem in the shared memory, asynchronous multiprocessor model of computation, with CAS (compare and swap) or DCAS (double compare and swap) as the synchronization primitive. We give a deterministic bounded wait-free algorithm that uses DCAS and has a total work bound of $O\bigl(m \cdot \bigl(\log\bigl(\frac{np}{m} + 1\bigr) + \alpha\bigl(n, \frac{m}{np}\bigr)\bigr)\bigr)$ for a problem with n elements and m operations solved by p processes, where $\alpha$ is a functional inverse of Ackermann’s function. We give two randomized algorithms that use only CAS and have the same work bound in expectation. The analysis of the second randomized algorithm is valid even if the scheduler is adversarial. Our DCAS and randomized algorithms take $O(\log n)$ steps per operation, worst-case for the DCAS algorithm, high-probability for the randomized algorithms. Our work and step bounds grow only logarithmically with p, making our algorithms truly scalable. We prove that for a class of symmetric algorithms that includes ours, no better step or work bound is possible. Our work is theoretical, but Alistarh et al (In search of the fastest concurrent union-find algorithm, 2019), Dhulipala et al (A framework for static and incremental parallel graph connectivity algorithms, 2020) and Hong et al (Exploring the design space of static and incremental graph connectivity algorithms on gpus, 2020) have implemented some of our algorithms on CPUs and GPUs and experimented with them. On many realistic data sets, our algorithms run as fast or faster than all others.
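As a point of reference for the abstract above, the Python sketch below shows only the underlying sequential data structure: union-find with linking by index and path splitting. It is an illustration, not the authors' algorithm; the concurrent algorithms in the paper replace the plain parent-pointer writes with CAS or DCAS on a shared parent array, which Python cannot express directly.

```python
class UnionFind:
    """Sequential sketch of disjoint set union with linking by index and path
    splitting. The paper's concurrent algorithms perform the parent-pointer
    updates below with CAS or DCAS on a shared array instead of plain writes."""

    def __init__(self, n):
        self.parent = list(range(n))

    def find(self, u):
        # Path splitting: while walking to the root, repoint each visited
        # node to its grandparent.
        while self.parent[u] != u:
            self.parent[u], u = self.parent[self.parent[u]], self.parent[u]
        return u

    def union(self, u, v):
        ru, rv = self.find(u), self.find(v)
        if ru == rv:
            return False
        if ru < rv:               # link by index: the larger index becomes the root
            ru, rv = rv, ru
        self.parent[rv] = ru      # concurrent version: CAS(parent[rv], rv, ru)
        return True
```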


2020 ◽  
Vol 21 (12) ◽  
pp. 4380
Author(s):  
Viet-Khoa Tran-Nguyen ◽  
Didier Rognan

Developing realistic data sets for evaluating virtual screening methods is a task that has been tackled by the cheminformatics community for many years. Numerous artificially constructed data collections have been developed, such as DUD, DUD-E, or DEKOIS. However, they all suffer from multiple drawbacks, one of which is the absence of experimental results confirming the inactivity of presumably inactive molecules, leading to possible false negatives in the ligand sets. In light of this problem, the PubChem BioAssay database, an open-access repository providing the bioactivity information of compounds that were already tested on a biological target, is now a recommended source for data set construction. Nevertheless, there exist several issues with the use of such data that need to be properly addressed. In this article, an overview of benchmarking data collections built upon experimental PubChem BioAssay input is provided, along with a thorough discussion of noteworthy issues that one must consider during the design of new ligand sets from this database. The points raised in this review are expected to guide future developments in this regard, in hopes of offering better evaluation tools for novel in silico screening procedures.
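As a hedged illustration of the kind of ligand-set construction the review discusses, the pandas snippet below splits a PubChem BioAssay activity table into confirmed actives and confirmed inactives. The column names (PUBCHEM_CID, PUBCHEM_ACTIVITY_OUTCOME) follow the usual CSV export layout but should be verified against the actual file, and the filtering rule shown is an assumption rather than the authors' protocol.

```python
import pandas as pd

def split_actives_inactives(csv_path):
    """Split a PubChem BioAssay data table (CSV export) into confirmed active
    and confirmed inactive compound IDs. Column names are assumptions based on
    the usual export format and should be checked against the actual file."""
    df = pd.read_csv(csv_path)
    outcome = df["PUBCHEM_ACTIVITY_OUTCOME"].astype(str).str.strip().str.lower()
    actives = set(df.loc[outcome == "active", "PUBCHEM_CID"].dropna().astype(int))
    inactives = set(df.loc[outcome == "inactive", "PUBCHEM_CID"].dropna().astype(int))
    # Inconclusive or unspecified outcomes are deliberately discarded, since
    # presumed inactives without experimental support are a source of false
    # negatives, as the review points out.
    return actives, inactives
```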


Author(s):  
Beatriz Andrés ◽  
Raquel Sanchis ◽  
Raúl Poler ◽  
Manuel Díaz-Madroñero ◽  
Josefa Mula

The goal of the C2NET European H2020 funded project is the creation of cloud-enabled tools for supporting the supply network optimisation of SMEs' manufacturing and logistics assets, based on collaborative demand, production and delivery plans. Within the scope of the C2NET project, and particularly its Optimisation module (C2NET OPT), this paper proposes a novel holistic mixed integer linear programming (MILP) model to optimise injection sequencing in a multi-machine setting. The results of the MILP support the production planner's decision-making process by calculating (i) the mould setups on each machine and (ii) the amount of each product to produce, so as to minimise setup, inventory and backorder costs. The designed MILP forms part of the algorithms repository created in the C2NET project to solve realistic industry planning problems. The MILP is verified on realistic data comprising three data sets of different sizes, in order to test its computational efficiency.
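The MILP itself is not given in the abstract; the toy PuLP model below only sketches the general shape of a lot-sizing formulation with setup, inventory and backorder costs under a shared capacity limit. All data, parameter values and variable names are invented for illustration, and the actual C2NET model, which covers multiple machines and mould assignments, is not reproduced here.

```python
from pulp import LpProblem, LpMinimize, LpVariable, lpSum, LpBinary

# Toy data (invented): two products, four periods, one aggregate capacity.
products, periods = ["A", "B"], range(4)
demand = {(p, t): 40 if p == "A" else 30 for p in products for t in periods}
capacity, setup_c, hold_c, back_c, big_m = 90, 100, 1, 5, 1000

x = LpVariable.dicts("prod", (products, periods), lowBound=0)      # units produced
y = LpVariable.dicts("setup", (products, periods), cat=LpBinary)   # setup performed
inv = LpVariable.dicts("inv", (products, periods), lowBound=0)     # end inventory
back = LpVariable.dicts("back", (products, periods), lowBound=0)   # backorders

model = LpProblem("toy_injection_lot_sizing", LpMinimize)
model += lpSum(setup_c * y[p][t] + hold_c * inv[p][t] + back_c * back[p][t]
               for p in products for t in periods)

for p in products:
    for t in periods:
        prev_inv = inv[p][t - 1] if t > 0 else 0
        prev_back = back[p][t - 1] if t > 0 else 0
        # Inventory balance with backorders allowed.
        model += prev_inv - prev_back + x[p][t] - demand[p, t] == inv[p][t] - back[p][t]
        # Production of p in period t requires a setup of its mould.
        model += x[p][t] <= big_m * y[p][t]

for t in periods:
    model += lpSum(x[p][t] for p in products) <= capacity

model.solve()
```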


2021 ◽  
Vol 4 (1) ◽  
Author(s):  
Francesco Tudisco ◽  
Desmond J. Higham

Abstract Network scientists have shown that there is great value in studying pairwise interactions between components in a system. From a linear algebra point of view, this involves defining and evaluating functions of the associated adjacency matrix. Recent work indicates that there are further benefits from accounting directly for higher order interactions, notably through a hypergraph representation where an edge may involve multiple nodes. Building on these ideas, we motivate, define and analyze a class of spectral centrality measures for identifying important nodes and hyperedges in hypergraphs, generalizing existing network science concepts. By exploiting the latest developments in nonlinear Perron-Frobenius theory, we show how the resulting constrained nonlinear eigenvalue problems have unique solutions that can be computed efficiently via a nonlinear power method iteration. We illustrate the measures on realistic data sets.
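A minimal sketch of the kind of alternating nonlinear power iteration the abstract refers to is given below, using simple elementwise power nonlinearities on the node-hyperedge incidence matrix. The paper's framework is considerably more general, so this should be read as an assumption-laden illustration rather than the authors' method.

```python
import numpy as np

def hypergraph_centralities(B, p=2.0, q=2.0, tol=1e-10, max_iter=1000):
    """Alternating nonlinear power iteration on a nonnegative n x m
    node-hyperedge incidence matrix B. Returns node centralities x and
    hyperedge centralities y, each normalised to sum to one. Simplified
    illustration with fixed power nonlinearities."""
    n, m = B.shape
    x, y = np.ones(n), np.ones(m)
    for _ in range(max_iter):
        x_new = (B @ (y ** q)) ** (1.0 / q)    # nodes inherit weight from their hyperedges
        y_new = (B.T @ (x ** p)) ** (1.0 / p)  # hyperedges inherit weight from their nodes
        x_new /= x_new.sum()
        y_new /= y_new.sum()
        if np.abs(x_new - x).sum() + np.abs(y_new - y).sum() < tol:
            return x_new, y_new
        x, y = x_new, y_new
    return x, y

# Example: three nodes, two hyperedges ({0,1,2} and {1,2}); node 0 is in one edge only.
B = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 1.0]])
print(hypergraph_centralities(B))
```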


2020 ◽  
Author(s):  
J. Griffié ◽  
T.A. Pham ◽  
C. Sieben ◽  
R. Lang ◽  
V. Cevher ◽  
...  

Abstract Although single molecule localisation microscopy (SMLM) enables the visualisation of cells' nanoscale organisation, its dissemination remains limited, mainly due to the complexity of the associated image acquisition, which impacts the reliability and reproducibility of its outputs. We propose here the first all-in-one fully virtual environment for SMLM acquisition, Virtual-SMLM, including on-the-fly interactivity and real-time display. It relies on a novel realistic approach to simulating fluorophore photo-physics based on independent pseudo-continuous emission traces. It also facilitates user-specific experimental and optical environment design. As such, it constitutes a unique tool for the training of both users and machine learning approaches to automated SMLM, as well as for experimental validation, whilst providing realistic data sets for the development of image reconstruction algorithms and data analysis software.
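To give a flavour of what an "independent pseudo-continuous emission trace" might look like in its simplest form, the Python sketch below simulates a single fluorophore as a two-state on/off process with exponential dwell times and bins the fraction of "on" time per camera frame. This toy model and its parameters are assumptions for illustration and do not reproduce Virtual-SMLM's actual photo-physics engine.

```python
import numpy as np

def emission_trace(t_total, mean_on, mean_off, frame_time, rng=None):
    """Toy fluorophore blinking model: alternate exponentially distributed
    'off' and 'on' dwell times, then record the fraction of each camera frame
    during which the emitter was on. Times are in the same (arbitrary) unit."""
    rng = np.random.default_rng() if rng is None else rng
    n_frames = int(t_total / frame_time)
    frac_on = np.zeros(n_frames)
    t, on = 0.0, False                      # start in the dark state
    while t < t_total:
        dwell = rng.exponential(mean_on if on else mean_off)
        if on:
            # Spread this on-interval over the frames it overlaps.
            start, end = t, min(t + dwell, t_total)
            f0 = int(start / frame_time)
            f1 = int(min(end / frame_time, n_frames - 1))
            for f in range(f0, f1 + 1):
                lo, hi = f * frame_time, (f + 1) * frame_time
                frac_on[f] += max(0.0, min(end, hi) - max(start, lo)) / frame_time
        t += dwell
        on = not on
    return frac_on

# Example: 10 s trace, 50 ms frames, short bright bursts between long dark periods.
print(emission_trace(t_total=10.0, mean_on=0.02, mean_off=0.5, frame_time=0.05))
```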

