submodular set function
Recently Published Documents


TOTAL DOCUMENTS: 17 (FIVE YEARS 8)

H-INDEX: 7 (FIVE YEARS 1)

2021 ◽  
Author(s):  
◽  
Susan Jowett

A connectivity function is a symmetric, submodular set function. Connectivity functions arise naturally from graphs, matroids and other structures. This thesis focuses mainly on recognition problems for connectivity functions, that is, on deciding when a connectivity function comes from a particular type of structure. In particular, we give a method for identifying when a connectivity function comes from a graph, which uses no more than a polynomial number of evaluations of the connectivity function. We also give a proof that no such method can exist for matroids.
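As a small illustration of the definition, the sketch below brute-forces the two defining properties (symmetry and submodularity) on a tiny ground set. It uses the edge-cut function of a 4-cycle as a familiar example of a function arising from a graph; this choice, and the helper names `cut_function`, `is_symmetric` and `is_submodular`, are assumptions made for illustration and are not taken from the thesis, which may define graph connectivity differently. The check is exponential in the ground-set size and is unrelated to the polynomial-evaluation recognition method the abstract describes.

```python
from itertools import combinations


def powerset(ground):
    """Yield every subset of `ground` as a frozenset."""
    items = sorted(ground)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)


def cut_function(edges):
    """Return f(S) = number of edges with exactly one endpoint in S."""
    def f(subset):
        return sum(1 for u, v in edges if (u in subset) != (v in subset))
    return f


def is_symmetric(f, ground):
    """Check f(S) == f(ground - S) for every subset S."""
    ground = frozenset(ground)
    return all(f(s) == f(ground - s) for s in powerset(ground))


def is_submodular(f, ground):
    """Check f(A) + f(B) >= f(A | B) + f(A & B) for all pairs A, B."""
    subsets = list(powerset(ground))
    return all(f(a) + f(b) >= f(a | b) + f(a & b)
               for a in subsets for b in subsets)


# Hypothetical example: the cycle on four vertices.
vertices = {0, 1, 2, 3}
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
cut = cut_function(edges)

print(is_symmetric(cut, vertices), is_submodular(cut, vertices))
```

By contrast, a function such as `lambda s: len(s) ** 2` fails the submodularity check, since adding an element to a larger set increases it by more.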


2021 ◽  
Vol 14 (13) ◽  
pp. 3281-3294
Author(s):  
Theofilos Mailis ◽  
Yannis Kotidis ◽  
Stamatis Christoforidis ◽  
Evgeny Kharlamov ◽  
Yannis Ioannidis

Knowledge Graphs (KGs) are collections of interconnected and annotated entities that have become powerful assets for data integration, search enhancement, and other industrial applications. Knowledge Graphs such as DBPEDIA may contain billions of triple relations and are intensively queried, with millions of queries per day. A prominent approach to enhancing query answering on Knowledge Graph databases is View Materialization, i.e., the materialization of an appropriate set of computations that will improve query performance. We study the problem of view materialization and propose a view selection methodology for processing query workloads with more than a million queries. Our approach relies heavily on subgraph pattern mining techniques that allow us to create efficient summarizations of massive query workloads while also identifying the candidate views for materialization. At the core of our work is the correspondence between the view selection problem and that of Maximizing a Nondecreasing Submodular Set Function Subject to a Knapsack Constraint. The latter leads to a tractable view-selection process for native triple stores that allows a (1 − 1/e)-approximation of the optimal selection of views. Our experimental evaluation shows that all the steps of the view-selection process are completed in a few minutes, while the corresponding rewritings accelerate 67.68% of the queries in the DBPEDIA query workload. Those queries are executed in 2.19% of their initial time on average.
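To make the knapsack-constrained formulation concrete, here is a minimal sketch of a cost-benefit greedy for a nondecreasing submodular objective. It is not the authors' view-selection procedure, and it is not claimed to reach the (1 − 1/e) bound: it is the simpler "greedy by gain per unit cost, then compare with the best single item" variant. The toy data (views covering query ids, storage costs, a budget) and the names `coverage` and `knapsack_greedy` are hypothetical.

```python
def coverage(views, chosen):
    """Number of distinct queries covered by the chosen views (monotone, submodular)."""
    covered = set()
    for v in chosen:
        covered |= views[v]
    return len(covered)


def knapsack_greedy(views, costs, budget):
    """Pick views by marginal gain per unit cost; costs are assumed positive."""
    chosen, spent = [], 0.0
    remaining = set(views)
    while remaining:
        base = coverage(views, chosen)
        best, best_ratio = None, 0.0
        for v in sorted(remaining):
            if spent + costs[v] > budget:
                continue  # would exceed the knapsack budget
            ratio = (coverage(views, chosen + [v]) - base) / costs[v]
            if ratio > best_ratio:
                best, best_ratio = v, ratio
        if best is None:
            break  # nothing affordable adds value
        chosen.append(best)
        spent += costs[best]
        remaining.discard(best)

    # Density greedy alone can be arbitrarily bad, so compare it
    # with the best single affordable view.
    singles = [v for v in views if costs[v] <= budget]
    best_single = max(singles, key=lambda v: coverage(views, [v]), default=None)
    if best_single is not None and coverage(views, [best_single]) > coverage(views, chosen):
        return [best_single]
    return chosen


# Hypothetical workload: each view answers a set of query ids.
views = {"v1": {1, 2, 3, 4}, "v2": {1, 2}, "v3": {5}}
costs = {"v1": 4, "v2": 1, "v3": 1}

print(knapsack_greedy(views, costs, budget=4))
print(knapsack_greedy(views, costs, budget=2))
```

With a budget of 4, the density greedy picks the two cheap views and covers three queries, whereas the single expensive view covers four; the final comparison is what catches this case.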


2021 ◽  
pp. 104741
Author(s):  
Francesco Cellinese ◽  
Gianlorenzo D'Angelo ◽  
Gianpiero Monaco ◽  
Yllka Velaj

2019 ◽  
Vol 12 (01) ◽  
pp. 2050007 ◽  
Author(s):  
Shuyang Gu ◽  
Ganquan Shi ◽  
Weili Wu ◽  
Changhong Lu

We study the problem of maximizing a non-monotone diminishing-return (DR) submodular function on the bounded integer lattice, which is a generalization of a submodular set function. DR-submodular functions cover the case in which we can choose multiple copies of each element in the ground set. This generalization has many applications in machine learning. In this paper, we propose a [Formula: see text]-approximation algorithm with a running time of [Formula: see text], where [Formula: see text] is the size of the ground set and [Formula: see text] is the upper bound of the integer lattice. By discovering important properties of DR-submodular functions, we propose a fast double greedy algorithm which improves the running time.
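For reference, the diminishing-return property on the integer lattice is usually stated as follows. The notation here (ground set $E$, upper bound vector $\mathbf{B}$, unit vector $\chi_e$) is an assumption chosen for this note, since the abstract's own symbols were lost in extraction.

```latex
% f is defined on the bounded integer lattice
%   \{ x \in \mathbb{Z}^{E} : 0 \le x \le \mathbf{B} \}.
% f is DR-submodular if, for all lattice points x \le y (componentwise)
% and every e \in E with y + \chi_e still inside the lattice,
\[
  f(x + \chi_e) - f(x) \;\ge\; f(y + \chi_e) - f(y).
\]
```

When every coordinate is bounded by 1, lattice points correspond to subsets of $E$ and the inequality reduces to the familiar diminishing-returns characterization of a submodular set function.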


2019 ◽  
Author(s):  
Gizem Caylak ◽  
Oznur Tastan ◽  
A. Ercument Cicek

Genome-wide association studies explain a fraction of the underlying heritability of genetic diseases. Investigating epistatic interactions between two or more loci helps close this gap. Unfortunately, the sheer number of loci combinations to process and hypotheses to test makes the process prohibitive both computationally and statistically. Epistasis test prioritization algorithms rank likely-epistatic SNP pairs to limit the number of tests. Yet, they still suffer from very low precision. It has been shown in the literature that selecting SNPs that are individually correlated with the phenotype and also diverse with respect to genomic location leads to better phenotype prediction due to genetic complementation. Here, we propose that an algorithm that pairs SNPs from such diverse regions and ranks them can improve prediction power. We propose an epistasis test prioritization algorithm which optimizes a submodular set function to select a diverse and complementary set of genomic regions that span the underlying genome. SNP pairs from these regions are then further ranked w.r.t. their co-coverage of the case cohort. We compare our algorithm with the state of the art on three GWAS and show that we (i) substantially improve precision (from 0.003 to 0.652) while maintaining the significance of selected pairs, (ii) decrease the number of tests 25-fold, and (iii) decrease the runtime 4-fold. We also show that promoting SNPs from regulatory/coding regions improves the performance (up to 0.8). Potpourri is available at http://ciceklab.cs.bilkent.edu.tr/potpourri.


2018 ◽  
Author(s):  
Serhan Yilmaz ◽  
Oznur Tastan ◽  
A. Ercument Cicek

Phenotypic heritability of complex traits and diseases is seldom explained by individual genetic variants. Algorithms that select SNPs which are close and connected on a biological network have been successful in finding biologically interpretable and predictive loci. However, we argue that the connectedness constraint favors selecting redundant features that affect similar biological processes and therefore does not necessarily yield better predictive performance. In this paper, we propose a novel method called SPADIS that selects SNPs that cover diverse regions in the underlying SNP-SNP network. SPADIS favors the selection of remotely located SNPs in order to account for the complementary additive effects of SNPs that are associated with the phenotype. This is achieved by maximizing a submodular set function with a greedy algorithm that ensures a constant-factor (1−1/e) approximation. We compare SPADIS to the state-of-the-art method SConES on a dataset of Arabidopsis thaliana genotypes and continuous flowering time phenotypes. SPADIS has better regression performance in 12 out of 17 phenotypes on average; it also identifies more candidate genes and runs faster. We also investigate, for the first time, the use of Hi-C data to construct the SNP-SNP network in the context of the SNP selection problem, which yields slight but consistent improvements in regression performance. SPADIS is available at http://ciceklab.cs.bilkent.edu.tr/spadis
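The greedy scheme mentioned in the abstract can be sketched generically as below: repeatedly add the element with the largest marginal gain until k elements are chosen. This is the textbook greedy for a monotone submodular function under a cardinality constraint, for which the (1−1/e) guarantee is classical; it is not the SPADIS objective itself. The toy coverage objective, the region names, and the function `greedy_max` are hypothetical.

```python
def greedy_max(f, ground, k):
    """Greedily build a set of at most k elements maximizing f.

    `f` takes a list of elements and returns a number; it is assumed
    to be monotone and submodular for the classical guarantee to apply.
    """
    selected = []
    candidates = sorted(ground)
    for _ in range(min(k, len(candidates))):
        current = f(selected)
        pool = [c for c in candidates if c not in selected]
        # Element with the largest marginal gain over the current set.
        best = max(pool, key=lambda c: f(selected + [c]) - current)
        if f(selected + [best]) - current <= 0:
            break  # no remaining element improves the objective
        selected.append(best)
    return selected


# Hypothetical toy objective: regions "cover" sets of SNP ids.
regions = {"a": {1, 2, 3}, "b": {3, 4}, "c": {4, 5, 6, 7}, "d": {1}}


def covered(chosen):
    """Number of distinct SNP ids covered by the chosen regions."""
    return len(set().union(*(regions[r] for r in chosen))) if chosen else 0


picked = greedy_max(covered, regions, k=2)
print(picked, covered(picked))
```

Each round re-evaluates the objective for every remaining candidate, so the sketch makes on the order of k times |ground| evaluations; lazy evaluation of marginal gains is a common speed-up that is omitted here.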


2015 ◽  
Vol 247 (3) ◽  
pp. 1013-1016 ◽  
Author(s):  
Camilo Ortiz-Astorquiza ◽  
Ivan Contreras ◽  
Gilbert Laporte
