The Power of Linear-Time Data Reduction for Maximum Matching

Algorithmica ◽  
2020 ◽  
Vol 82 (12) ◽  
pp. 3521-3565
Author(s):  
George B. Mertzios ◽  
André Nichterlein ◽  
Rolf Niedermeier

Abstract Finding maximum-cardinality matchings in undirected graphs is arguably one of the most central graph primitives. For m-edge and n-vertex graphs, it is well known to be solvable in $$O(m\sqrt{n})$$ time; however, for several applications this running time is still too slow. We investigate how linear-time (and almost linear-time) data reduction, used as preprocessing, can alleviate the situation. More specifically, we focus on linear-time kernelization. We initiate a deeper and systematic study for both general graphs and bipartite graphs. Our data reduction algorithms readily combine (as a preprocessing step) with every solution strategy (exact, approximate, heuristic), making them attractive in various settings.
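As a concrete illustration of the kind of linear-time data reduction the abstract refers to, the sketch below applies two classical rules in Python: drop degree-0 vertices and greedily match degree-1 vertices. It is a minimal sketch under our own naming and data-structure assumptions (adjacency as a dict of neighbour sets), not the paper's full rule set.

```python
from collections import deque

def reduce_matching_instance(adj):
    """Minimal sketch of two classical linear-time reduction rules for
    maximum-cardinality matching: remove degree-0 vertices and greedily
    match degree-1 vertices. 'adj' maps each vertex to a set of neighbours."""
    adj = {v: set(nbrs) for v, nbrs in adj.items()}   # work on a copy
    forced = []                                       # matching edges forced by the degree-1 rule
    queue = deque(v for v in adj if len(adj[v]) <= 1)

    while queue:
        v = queue.popleft()
        if v not in adj:
            continue
        if not adj[v]:                    # degree-0 rule: v cannot be matched
            del adj[v]
            continue
        u = next(iter(adj[v]))            # degree-1 rule: some maximum matching contains {v, u}
        forced.append((v, u))
        for w in adj.pop(u):              # delete u and update its neighbours' degrees
            adj[w].discard(u)
            if len(adj[w]) <= 1:
                queue.append(w)
        adj.pop(v, None)                  # v was among u's neighbours and is now gone too

    return forced, adj                    # forced edges plus the reduced (kernel) graph

# Example: a path 1-2-3-4 plus an isolated vertex 5
print(reduce_matching_instance({1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3}, 5: set()}))
# -> ([(1, 2), (4, 3)], {}): the rules already solve this instance completely.
```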

2021 ◽  
Vol 26 ◽  
pp. 1-30
Author(s):  
Tomohiro Koana ◽  
Viatcheslav Korenwein ◽  
André Nichterlein ◽  
Rolf Niedermeier ◽  
Philipp Zschoche

Finding a maximum-cardinality or maximum-weight matching in (edge-weighted) undirected graphs is among the most prominent problems of algorithmic graph theory. For n-vertex and m-edge graphs, the best-known algorithms run in Õ(m√n) time. We build on recent theoretical work on linear-time data reduction rules for finding maximum-cardinality matchings and complement those results by presenting and analyzing (employing the kernelization methodology of parameterized complexity analysis) new (near-)linear-time data reduction rules for both the unweighted and the positive-integer-weighted case. Moreover, we experimentally demonstrate that these data reduction rules provide significant speedups over state-of-the-art implementations for computing matchings in real-world graphs: the average speedup factor is 4.7 in the unweighted case and 12.72 in the weighted case.
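The rules themselves are spelled out in the paper; purely to illustrate how such data reduction slots in as preprocessing in front of an off-the-shelf solver, the hedged sketch below removes isolated vertices (a trivially safe reduction in both the unweighted and the weighted setting) and then calls NetworkX's max_weight_matching. The actual (near-)linear-time rules evaluated in the experiments are considerably stronger.

```python
import networkx as nx

def matching_with_preprocessing(G):
    """Illustrative pipeline only: strip isolated vertices (safe for both
    maximum-cardinality and maximum-weight matching) and hand the reduced
    graph to an off-the-shelf solver."""
    H = G.copy()
    H.remove_nodes_from(list(nx.isolates(H)))          # degree-0 rule
    return nx.max_weight_matching(H, maxcardinality=True)

# Small weighted example (edge weights 5, 1, 4); vertex 99 is isolated.
G = nx.Graph()
G.add_weighted_edges_from([(1, 2, 5), (2, 3, 1), (3, 4, 4)])
G.add_node(99)
print(matching_with_preprocessing(G))                   # e.g. {(1, 2), (4, 3)}; edge orientation may vary
```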


Algorithmica ◽  
2022 ◽  
Author(s):  
Boris Klemz ◽  
Günter Rote

Abstract A bipartite graph $$G=(U,V,E)$$ is convex if the vertices in V can be linearly ordered such that for each vertex $$u\in U$$, the neighbors of u are consecutive in the ordering of V. An induced matching H of G is a matching for which no edge of E connects endpoints of two different edges of H. We show that in a convex bipartite graph with n vertices and m weighted edges, an induced matching of maximum total weight can be computed in $$O(n+m)$$ time. An unweighted convex bipartite graph has a representation of size O(n) that records for each vertex $$u\in U$$ the first and last neighbor in the ordering of V. Given such a compact representation, we compute an induced matching of maximum cardinality in O(n) time. In convex bipartite graphs, maximum-cardinality induced matchings are dual to minimum chain covers. A chain cover is a covering of the edge set by chain subgraphs, that is, subgraphs that do not contain induced matchings of more than one edge. Given a compact representation, we compute a representation of a minimum chain cover in O(n) time. If no compact representation is given, the cover can be computed in $$O(n+m)$$ time. All of our algorithms achieve optimal linear running time for the respective problem and model, and they improve and generalize the previous results in several ways: The best algorithms for the unweighted problem versions had a running time of $$O(n^2)$$ (Brandstädt et al., Theor. Comput. Sci. 381(1–3):260–265, 2007, doi:10.1016/j.tcs.2007.04.006). The weighted case has not been considered before.
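To make the O(n) compact representation mentioned above concrete, the sketch below encodes a convex bipartite graph by storing, for each u in U, only the interval of its (consecutive) neighbors in the ordering of V, and expands it back to the full edge list when needed. The type alias and example graph are hypothetical and illustrate only the representation, not the paper's matching or chain-cover algorithms.

```python
from typing import Dict, Tuple, Iterator

# Compact O(n) representation of a convex bipartite graph: V is identified
# with 0..|V|-1 in its convex ordering, and each u in U stores only the
# interval [first, last] of its (consecutive) neighbours.
CompactConvexGraph = Dict[str, Tuple[int, int]]

def edges(G: CompactConvexGraph) -> Iterator[Tuple[str, int]]:
    """Expand the compact representation into the full edge list; the
    expansion costs O(n + m), while the representation itself has size O(n)."""
    for u, (first, last) in G.items():
        for v in range(first, last + 1):
            yield (u, v)

# Hypothetical example with U = {a, b, c} and V = {0, 1, 2, 3}
G: CompactConvexGraph = {"a": (0, 1), "b": (1, 3), "c": (2, 2)}
print(list(edges(G)))   # [('a', 0), ('a', 1), ('b', 1), ('b', 2), ('b', 3), ('c', 2)]
```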


Author(s):  
Atheer Alahmed ◽  
Amal Alrasheedi ◽  
Maha Alharbi ◽  
Norah Alrebdi ◽  
Marwan Aleasa ◽  
...  

2017 ◽  
Vol 27 (04) ◽  
pp. 277-296 ◽  
Author(s):  
Vincent Froese ◽  
Iyad Kanj ◽  
André Nichterlein ◽  
Rolf Niedermeier

We study the General Position Subset Selection problem: given a set of points in the plane, find a maximum-cardinality subset of points in general position. We prove that General Position Subset Selection is NP-hard and APX-hard, and we present several fixed-parameter tractability results for the problem as well as a subexponential running-time lower bound based on the Exponential Time Hypothesis.
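Here "general position" means that no three of the selected points lie on a common line. The brute-force check below (our own O(n³) sketch, not one of the paper's algorithms) makes the feasibility condition of the problem concrete:

```python
from itertools import combinations

def in_general_position(points):
    """Check that no three of the given planar points are collinear,
    i.e. that the point set is in general position."""
    def collinear(p, q, r):
        # zero cross product <=> p, q, r lie on a common line
        return (q[0] - p[0]) * (r[1] - p[1]) == (q[1] - p[1]) * (r[0] - p[0])
    return not any(collinear(p, q, r) for p, q, r in combinations(points, 3))

print(in_general_position([(0, 0), (1, 1), (2, 3)]))   # True
print(in_general_position([(0, 0), (1, 1), (2, 2)]))   # False: all three lie on y = x
```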


2019 ◽  
Author(s):  
Jaclyn Marjorie Smith ◽  
Melvin Lathara ◽  
Hollis Wright ◽  
Brian Hill ◽  
Nalini Ganapati ◽  
...  

Abstract
Background: The affordability of next-generation genomic sequencing and the improvement of medical data management have contributed largely to the evolution of biological analysis from both a clinical and a research perspective. Precision medicine is a response to these advancements that places individuals into better-defined subsets based on shared clinical and genetic features. The identification of personalized diagnosis and treatment options depends on the ability to draw insights from large-scale, multi-modal analysis of biomedical datasets. Driven by a real use case, we premise that platforms supporting precision medicine analysis should maintain data in their optimal data stores, should support distributed storage and query mechanisms, and should scale as more samples are added to the system.
Results: We extended a genomics-based columnar data store, GenomicsDB, for ease of use within a distributed analytics platform for clinical and genomic data integration, known as the ODA framework. The framework supports interaction from an i2b2 plugin as well as a notebook environment. We show that the ODA framework exhibits worst-case linear scaling for array size (storage), import time (data construction), and query time for an increasing number of samples. We further show worst-case linear time for both clinical-data import and aggregate query execution within a distributed environment.
Conclusions: This work highlights the integration of a distributed genomic database with a distributed compute environment to support scalable and efficient precision medicine queries from a HIPAA-compliant cohort system in a real-world setting. The ODA framework is currently deployed in production to support precision medicine exploration and analysis by clinicians and researchers at the UCLA David Geffen School of Medicine.


Author(s):  
Marwa F. Mohamed ◽  
Abd El-Rahman Shabayek ◽  
Mahmoud El-Gayyar ◽  
Hamed Nassar

Author(s):  
Beda Büchel ◽  
Francesco Corman

Understanding the variability of bus travel time is a key issue in the optimization of schedules, transit reliability, route choice analysis, and transit simulation. The statistical modeling of bus travel time data is of growing importance given the increasing availability of such data. In this paper, we introduce a novel approach to modeling the day-to-day variability of urban bus running times at the section level. First, the explanatory power of conventionally used distributions is examined based on likelihood and effect size. We show that a mixture model is a powerful tool for increasing fitting performance, but the components used need to be justified. To overcome this issue, we propose a novel model consisting of two individual characteristic distributions representing off-peak and peak-hour dynamics. The observed running time distribution at every hour of the day can then be described as a combination (mixture) of these two dynamics. The proposed time-varying model uses a small set of parameters that are physically interpretable and capable of accurately describing running time distributions. With our modeling approach, we reduce the complexity of mixture models while increasing explanatory power and fit compared with conventional models.
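Concretely, such a model evaluates, for each hour t, a density of the form f_t(x) = w(t) · f_peak(x) + (1 − w(t)) · f_off-peak(x), where only the mixture weight w(t) varies over the day. The sketch below illustrates this with lognormal components; the component families, parameter values, and function names are our own assumptions for illustration, not the distributions fitted in the paper.

```python
import numpy as np
from scipy import stats

def running_time_density(x, w_peak, peak_params, offpeak_params):
    """Two-component mixture density for bus running times:
    f_t(x) = w_peak * f_peak(x) + (1 - w_peak) * f_offpeak(x).
    Lognormal components are assumed here purely for illustration."""
    f_peak = stats.lognorm.pdf(x, *peak_params)       # peak-hour component
    f_off = stats.lognorm.pdf(x, *offpeak_params)     # off-peak component
    return w_peak * f_peak + (1.0 - w_peak) * f_off

# Hypothetical parameters: (shape, loc, scale) for scipy's lognorm, running times in seconds
peak = (0.4, 0.0, 120.0)      # peak hours: larger scale, heavier spread
offpeak = (0.25, 0.0, 90.0)   # off-peak hours: faster, tighter running times
x = np.linspace(40, 300, 5)
print(running_time_density(x, w_peak=0.7, peak_params=peak, offpeak_params=offpeak))
```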

