diff --git a/README.md b/README.md index a2856b4..fd2fc43 100644 --- a/README.md +++ b/README.md @@ -1,39 +1,73 @@ # Maximum Weighted Matching -This repository contains a Python 3 implementation of maximum weighted matching in general graphs. +This repository contains implementations of maximum weighted matching for general graphs. In graph theory, a _matching_ is a subset of edges that does not use any vertex more than once. For an edge-weighted graph, a _maximum weight matching_ is a matching that achieves the largest possible sum of weights of matched edges. The code in this repository is based on a variant of the blossom algorithm that runs in -_O(n^3)_ steps. +_O(n*m*log(n))_ steps. See the file [Algorithm.md](doc/Algorithm.md) for a detailed description. -You may find this repository useful if ... - - you want a stand-alone, pure Python module that calculates maximum weight matchings - with reasonable efficiency; - - or you want to play around with a maximum weight matching algorithm to learn how it works. +## Python -This repository is probably not the best place if ... +The folder [python/](python/) contains a Python 3 package `mwmatching` that implements +maximum weighted matching. +The Python code is self-contained -- it has no dependencies outside the standard library. - you need a very fast routine that quickly matches large graphs (more than ~ 1000 vertices). - In that case I recommend [LEMON](http://lemon.cs.elte.hu/trac/lemon), - a C++ library that provides a blazingly fast implementation. - - or you want a high quality Python package that provides abstract graph data structures - and graph algorithms. - In that case I recommend [NetworkX](https://networkx.org/). - - or you are only interested in bipartite graphs. - There are simpler and faster algorithms for matching bipartite graphs. +To use the package, set your `PYTHONPATH` to the location of the package `mwmatching`.
+Alternatively, you can install the package into your Python environment by running + +``` +cd python +pip install . +``` + +Using the algorithm is easy. +You describe the input graph by listing its edges. +Each edge is represented as a pair of vertex indices and the weight of the edge. + +The example below finds a matching in a graph with 5 vertices and 5 edges. +The maximum weight matching contains two edges and has total weight 11. + +``` +from mwmatching import maximum_weight_matching +edges = [(0, 1, 3), (1, 2, 8), (1, 4, 6), (2, 3, 5), (2, 4, 7)] +matching = maximum_weight_matching(edges) +print(matching) # prints [(1, 4), (2, 3)] +``` + +## C++ + +The folder [cpp/](cpp/) contains a header-only C++ implementation of maximum weighted matching. + +**NOTE:** +The C++ code currently implements a slower algorithm that runs in _O(n^3)_ steps. +I plan to eventually update the C++ code to implement the faster _O(n*m*log(n))_ algorithm. + +The C++ code is self-contained and can easily be linked into an application. +It is also reasonably efficient. + +For serious use cases, [LEMON](http://lemon.cs.elte.hu/trac/lemon) may be a better choice. +LEMON is a C++ library that provides a very fast and robust implementation of +maximum weighted matching and many other graph algorithms. +To my knowledge, it is the only free software library that provides a high-quality +matching algorithm.
## Repository structure ``` python/ - mwmatching.py : Python implementation of maximum weight matching - test_mwmatching.py : Unit tests + mwmatching/ : Python package for maximum weight matching + __init__.py + algorithm.py : Algorithm implementation + datastruct.py : Internal data structures + tests/ + test_algorithm.py : Unit tests for the algorithm + test_datastruct.py : Unit tests for data structures run_matching.py : Command-line program to run the matching algorithm cpp/ @@ -76,6 +110,11 @@ My implementation follows the description of the algorithm as it appears in the However, earlier versions of the algorithm were invented and improved by several other scientists. See the file [Algorithm.md](doc/Algorithm.md) for links to the most important papers. +I used some ideas from the source code of the `MaxWeightedMatching` class in +[LEMON](http://lemon.cs.elte.hu/trac/lemon): +the technique to implement lazy updates of vertex dual variables, +and the approach to re-use alternating trees after augmenting the matching. + I used Fortran programs `hardcard.f`, `t.f` and `tt.f` by R. B. Mattingly and N. Ritchey to generate test graphs. These programs are part of the DIMACS Network Flows and Matching implementation challenge. @@ -83,7 +122,7 @@ They can be found in the [DIMACS Netflow archive](http://archive.dimacs.rutgers.edu/pub/netflow/). To check the correctness of my results, I used other maximum weight matching solvers: -the `MaxWeightedMatching` module in [LEMON](http://lemon.cs.elte.hu/trac/lemon), +the `MaxWeightedMatching` module in LEMON, and the program [`wmatch`](http://archive.dimacs.rutgers.edu/pub/netflow/matching/weighted/solver-1/) by Edward Rothberg. @@ -94,7 +133,7 @@ by Edward Rothberg. The following license applies to the software in this repository, excluding the folder `doc`.
This license is sometimes called the MIT License or the Expat License: -> Copyright (c) 2023 Joris van Rantwijk +> Copyright (c) 2023-2024 Joris van Rantwijk > > Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: >