Add implementation notes to algorithm description
parent b53e58902a
commit 3b80109cff
234
Algorithm.md
@@ -793,7 +793,239 @@ Therefore the total cost of dual variable updates is _O(n<sup>2</sup>)_ per stag

## Implementation details

This section describes some choices I made while implementing the algorithm.
There may be easier or faster or better ways to do it, but this is what I did
and it seems to work mostly okay.

### Data structures

#### Input graph

Vertices are represented as non-negative integers in the range _0_ to _n-1_.

Edges are represented as an array of tuples _(x, y, w)_ where _x_ and _y_ are indices
of the incident vertices, and _w_ is the numerical weight of the edge.
The edges are listed in no particular order.
Each edge has an index _e_ in the range _0_ to _m-1_.

`edges[e] = (x, y, w)`

In addition, edges are organized in an array of adjacency lists, indexed by vertex index.
Each adjacency list contains the edge indices of all edges incident on a specific vertex.
Every edge therefore appears in two adjacency lists.

`adjacent_edges[x] = [e1, e2, ...]`

These data structures are initialized at the start of the matching algorithm
and never change while the algorithm runs.
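
As an illustration, the two input structures could be built roughly as follows.
This is a minimal sketch, not necessarily how the actual implementation does it:
the helper name `build_adjacency` and the example graph are made up here, and the
sketch assumes the graph has no self-loops.

```python
def build_adjacency(num_vertices, edges):
    """Return adjacent_edges, where adjacent_edges[x] lists the edge indices incident on x."""
    adjacent_edges = [[] for _ in range(num_vertices)]
    for e, (x, y, _w) in enumerate(edges):
        # Every edge is recorded once for each of its two endpoints.
        adjacent_edges[x].append(e)
        adjacent_edges[y].append(e)
    return adjacent_edges


# Hypothetical example graph: a triangle 0-1-2 plus a pendant vertex 3.
edges = [(0, 1, 5), (1, 2, 7), (0, 2, 4), (2, 3, 1)]
adjacent_edges = build_adjacency(4, edges)
```

Because each edge is appended exactly twice, the adjacency lists together hold _2m_ entries.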

#### General data

`vertex_mate[x] = y` if the edge between vertex _x_ and vertex _y_ is matched. <br>
`vertex_mate[x] = -1` if vertex _x_ is unmatched.

`vertex_top_blossom[x] =` pointer to _B(x)_, the top-level blossom that contains vertex _x_.

`vertex_dual[x]` holds the value of _u<sub>x</sub>_.

A FIFO queue holds vertex indices of S-vertices whose edges have not yet been scanned.
Vertices are inserted in this queue as soon as their top-level blossom gets label S.
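
A possible way to set up these arrays is sketched below.
The function name `init_general_data` is hypothetical, `vertex_top_blossom` is left out
because it needs blossom objects, and the initial dual value of 0.5 times the greatest
edge weight follows the initialization described elsewhere in this document.
`collections.deque` is just one convenient choice for the FIFO queue.

```python
from collections import deque


def init_general_data(num_vertices, edges):
    """Create the per-vertex arrays and the scan queue for an empty matching."""
    max_weight = max((w for (_x, _y, w) in edges), default=0)
    vertex_mate = [-1] * num_vertices                  # -1 means "unmatched"
    vertex_dual = [0.5 * max_weight] * num_vertices    # u_x for every vertex
    queue = deque()                                    # S-vertices waiting to be scanned
    return vertex_mate, vertex_dual, queue


vertex_mate, vertex_dual, queue = init_general_data(3, [(0, 1, 5), (1, 2, 7)])

# Vertices enter at the back and are scanned from the front.
queue.append(2)
queue.append(0)
first = queue.popleft()
```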

#### Blossoms

A blossom is either a single vertex or a non-trivial blossom.
Both types of blossoms are represented as class instances with the following attributes:

 * `B.base_vertex` is the vertex index of the base vertex of blossom _B_.
 * `B.parent` is a pointer to the parent of _B_ in the blossom structure tree,
   or `None` if _B_ is a top-level blossom.
 * `B.label` is `S` or `T` or `None`.
 * `B.tree_edge = (x, y)` if _B_ is a labeled top-level blossom, where _y_ is a vertex in _B_
   and _(x, y)_ is the edge that links _B_ to its parent in the alternating tree.

A non-trivial blossom additionally has the following attributes:

 * `B.subblossoms` is an array of pointers to the sub-blossoms of _B_,
   starting with the sub-blossom that contains the base vertex.
 * `B.edges` is an array of alternating edges connecting the sub-blossoms.
 * `B.dual_var` holds the value of _z<sub>B</sub>_.

Single-vertex blossoms are kept in an array indexed by vertex index. <br>
Non-trivial blossoms are kept in a separate array. <br>
These arrays are used to iterate over blossoms and to find the trivial blossom
that consists of a given vertex.
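
The attribute lists above map fairly directly onto two small classes.
The sketch below is only one way to lay this out: the class names, the constructor
signatures and the representation of `B.edges` as _(x, y)_ vertex pairs are assumptions
made for illustration.

```python
class Blossom:
    """A blossom; instances of this base class represent single vertices."""

    def __init__(self, base_vertex):
        self.base_vertex = base_vertex
        self.parent = None       # enclosing blossom, or None if top-level
        self.label = None        # "S", "T" or None
        self.tree_edge = None    # (x, y) linking this blossom to its tree parent


class NonTrivialBlossom(Blossom):
    """A blossom made of an odd cycle of sub-blossoms."""

    def __init__(self, subblossoms, edges):
        # The first sub-blossom contains the base vertex.
        super().__init__(subblossoms[0].base_vertex)
        self.subblossoms = subblossoms
        self.edges = edges       # assumed here to be (x, y) vertex pairs
        self.dual_var = 0        # z_B
        for sub in subblossoms:
            sub.parent = self


# One array of trivial blossoms indexed by vertex, one array of non-trivial blossoms.
trivial_blossom = [Blossom(x) for x in range(3)]
nontrivial_blossom = []

b = NonTrivialBlossom(trivial_blossom[:], [(0, 1), (1, 2), (2, 0)])
nontrivial_blossom.append(b)
```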

#### Least-slack edge tracking

`vertex_best_edge` is an array where `vertex_best_edge[x]` holds _e<sub>x</sub>_, the edge index of
the least-slack edge between vertex _x_ and any S-vertex, or -1 if there is no such edge.
This value is only meaningful if _x_ is a T-vertex or unlabeled vertex.

`B.best_edge` is a blossom attribute holding _e<sub>B</sub>_, the edge index of the least-slack
edge between blossom _B_ and any other S-blossom, or -1 if there is no such edge.
This value is only meaningful if _B_ is a top-level S-blossom.

For non-trivial S-blossoms _B_, attribute `B.best_edge_set` holds the list _L<sub>B</sub>_
of potential least-slack edges to other blossoms.
This list is not maintained for single-vertex blossoms, since _L<sub>B</sub>_ of a single vertex
can be efficiently constructed from its adjacency list.
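
To illustrate the last point, the candidate edges of a single S-vertex can be collected
by one pass over its adjacency list.
This is a rough sketch under simplifying assumptions: top-level blossoms are identified
by plain integers instead of pointers, `is_s_blossom` is a hypothetical lookup table for
the S label, and the function name is made up.

```python
def single_vertex_edge_candidates(x, edges, adjacent_edges, top_blossom, is_s_blossom):
    """Return indices of edges from vertex x to S-vertices in other top-level blossoms."""
    candidates = []
    for e in adjacent_edges[x]:
        (p, q, _w) = edges[e]
        y = q if p == x else p          # the endpoint of e that is not x
        by = top_blossom[y]
        if by != top_blossom[x] and is_s_blossom[by]:
            candidates.append(e)
    return candidates


# Hypothetical example: four single-vertex top-level blossoms, vertex 1 is not an S-vertex.
edges = [(0, 1, 5), (1, 2, 7), (0, 2, 4), (2, 3, 1)]
adjacent_edges = [[0, 2], [0, 1], [1, 2, 3], [3]]
top_blossom = [0, 1, 2, 3]
is_s_blossom = [True, False, True, True]

candidates = single_vertex_edge_candidates(2, edges, adjacent_edges, top_blossom, is_s_blossom)
```

The cost is proportional to the degree of the vertex, so no stored list is needed.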

#### Memory usage

The data structures described above use a constant amount of memory per vertex and per edge
and per blossom.
Therefore the total memory requirement is _O(m + n)_.

The memory usage of _L<sub>B</sub>_ is a little tricky.
Any given list _L<sub>B</sub>_ can have length _O(n)_, and _O(n)_ of these lists can exist
simultaneously.
Naively allocating space for _O(n)_ elements per list will drive memory usage
up to _O(n<sup>2</sup>)_.
However, at any moment, an edge can be in at most two of these lists, therefore the sum
of the lengths of these lists is limited to _O(m)_.
A possible solution is to implement the _L<sub>B</sub>_ as linked lists.

### Performance critical routines

Calculations that happen very frequently in the algorithm are:
determining the top-level blossom of a given vertex, and calculating the slack of a given edge.
These calculations must run in constant time per call in any case, but it makes sense to put
some extra effort into making these calculations _fast_.
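
With flat arrays, both operations reduce to a few index lookups.
The sketch below shows the slack calculation _π<sub>x,y</sub> = u<sub>x</sub> + u<sub>y</sub> - w<sub>x,y</sub>_
for an edge whose endpoints are in different top-level blossoms.
The function name `edge_slack` and the example values are assumptions for illustration.

```python
def edge_slack(e, edges, vertex_dual):
    """Slack of edge e; only valid if its endpoints are in different top-level blossoms."""
    (x, y, w) = edges[e]
    return vertex_dual[x] + vertex_dual[y] - w


# Hypothetical example: duals still at their initial value 0.5 * max_weight.
edges = [(0, 1, 5), (1, 2, 7)]
vertex_dual = [3.5, 3.5, 3.5]

slack_0 = edge_slack(0, edges, vertex_dual)   # edge (0, 1) has positive slack
slack_1 = edge_slack(1, edges, vertex_dual)   # edge (1, 2) is tight
```

Finding the top-level blossom is likewise a single lookup in `vertex_top_blossom[x]`,
provided that array is kept up to date whenever blossoms are created or expanded.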

### Recursion

Certain tasks in the algorithm are recursive in nature:
enumerating the vertices in a given blossom, and augmenting the matching along
a path through a blossom.
It seems natural to implement such tasks as recursive subroutines, which handle the task
for a given blossom and make recursive calls to handle sub-blossoms as necessary.
But the recursion depth of such implementations can grow to _O(n)_ in case
of deeply nested blossoms.

Deep recursion may cause problems in certain programming languages and runtime environments.
In such cases, it may be better to avoid recursive calls and instead implement an iterative
control flow with an explicit stack.
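
As an example of the iterative approach, here is a sketch that enumerates the vertices
of a blossom with an explicit stack.
The classes `Leaf` and `Nested` are simplified stand-ins for single-vertex and non-trivial
blossoms, introduced only to keep the example self-contained.

```python
class Leaf:
    """Stand-in for a single-vertex blossom."""

    def __init__(self, vertex):
        self.vertex = vertex


class Nested:
    """Stand-in for a non-trivial blossom."""

    def __init__(self, subblossoms):
        self.subblossoms = subblossoms


def blossom_vertices(blossom):
    """Return all vertices in the blossom, without using recursive calls."""
    vertices = []
    stack = [blossom]
    while stack:
        b = stack.pop()
        if isinstance(b, Nested):
            # Push in reverse so that sub-blossoms are visited in their stored order.
            stack.extend(reversed(b.subblossoms))
        else:
            vertices.append(b.vertex)
    return vertices


inner = Nested([Leaf(0), Leaf(1), Leaf(2)])
outer = Nested([inner, Leaf(3), Leaf(4)])
result = blossom_vertices(outer)
```

The stack lives on the heap, so the nesting depth is limited by available memory
rather than by the call stack of the runtime.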

### Handling integer edge weights

If all edge weights in the input graph are integers, it is possible and often desirable
to implement the algorithm such that only integer calculations are used.

If all edge weights are integers, then all vertex dual variables _u<sub>x</sub>_
are integer multiples of 0.5, and all blossom dual variables _z<sub>B</sub>_ are integers.
Storing the vertex duals as _2\*u<sub>x</sub>_ allows all calculations to be done with integers.

Proof by induction that all vertex duals are multiples of 0.5 and all blossom duals are integers:

 - Vertex duals are initialized to 0.5 times the greatest edge weight.
   Blossom duals are initialized to 0.
   Therefore the proposition is initially true.
 - All unmatched vertices have the same dual value.
   Proof: Initially all vertices have the same dual value.
   All unmatched vertices have been unmatched since the beginning,
   therefore always had label S in every dual variable update,
   therefore always got changed by the same amount.
 - Either all duals of S-vertices are integers or all duals of S-vertices are odd multiples of 0.5.
   Proof: The root nodes of alternating trees are unmatched vertices which all have the same dual.
   Within an alternating tree, all edges are tight, therefore all duals of vertices in the tree
   differ by an integer amount, therefore either all duals are integers or all duals
   are odd multiples of 0.5.
 - _δ_ is a multiple of 0.5. Proof:
   - _δ<sub>1</sub> = u<sub>x</sub>_ and _u<sub>x</sub>_ is a multiple of 0.5,
     therefore _δ<sub>1</sub>_ is a multiple of 0.5.
   - _δ<sub>2</sub> = π<sub>x,y</sub> = u<sub>x</sub> + u<sub>y</sub> - w<sub>x,y</sub>_
     where _u<sub>x</sub>_ and _u<sub>y</sub>_ and _w<sub>x,y</sub>_ are multiples of 0.5,
     therefore _δ<sub>2</sub>_ is a multiple of 0.5.
   - _δ<sub>3</sub> = 0.5 \* π<sub>x,y</sub> = 0.5 \* (u<sub>x</sub> + u<sub>y</sub> - w<sub>x,y</sub>)_.
     Since _x_ and _y_ are S-vertices, either _u<sub>x</sub>_ and _u<sub>y</sub>_ are
     both integers or both are odd multiples of 0.5.
     In either case _u<sub>x</sub> + u<sub>y</sub>_ is an integer.
     Since _w<sub>x,y</sub>_ is also an integer, _π<sub>x,y</sub>_ is an integer.
     Therefore _δ<sub>3</sub>_ is a multiple of 0.5.
   - _δ<sub>4</sub> = 0.5 \* z<sub>B</sub>_ where _z<sub>B</sub>_ is an integer,
     therefore _δ<sub>4</sub>_ is a multiple of 0.5.
 - Vertex duals increase or decrease by _δ_ which is a multiple of 0.5,
   therefore updated vertex duals are still multiples of 0.5.
 - Blossom duals increase or decrease by _2\*δ_,
   therefore updated blossom duals are still integers.
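
The effect of storing _2\*u<sub>x</sub>_ can be seen in a small sketch.
The names `init_doubled_duals` and `doubled_slack` are hypothetical; the point is only
that the initial value _2 \* 0.5 \* max-weight_ and the doubled slack
_2\*π<sub>x,y</sub> = 2\*u<sub>x</sub> + 2\*u<sub>y</sub> - 2\*w<sub>x,y</sub>_ never leave the integers.

```python
def init_doubled_duals(num_vertices, edges):
    """Return the array of 2*u_x; initially 2 * (0.5 * max_weight) = max_weight."""
    max_weight = max(w for (_x, _y, w) in edges)
    return [max_weight] * num_vertices


def doubled_slack(e, edges, vertex_dual_2x):
    """Return 2 * slack of edge e, computed with integer arithmetic only."""
    (x, y, w) = edges[e]
    return vertex_dual_2x[x] + vertex_dual_2x[y] - 2 * w


# Hypothetical example with integer weights.
edges = [(0, 1, 5), (1, 2, 7)]
vertex_dual_2x = init_doubled_duals(3, edges)
slack2_0 = doubled_slack(0, edges, vertex_dual_2x)
slack2_1 = doubled_slack(1, edges, vertex_dual_2x)
```

Comparisons such as "is this edge tight" work unchanged on the doubled values,
since _2\*π = 0_ exactly when _π = 0_.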

The value of vertex dual variables and blossom dual variables never exceeds the
greatest edge weight in the graph.
This may be helpful for choosing an integer data type for the dual variables.
(Alternatively, choose a programming language with unlimited integer range.
This is perhaps the thing I love most about Python.)

Proof that dual variables do not exceed _max-weight_:

 - Vertex dual variables start at _u<sub>x</sub> = 0.5\*max-weight_.
 - While the algorithm runs, there is at least one vertex which has been unmatched
   since the beginning.
   This vertex has always had label S, therefore its dual always decreased by _δ_
   during a dual variable update.
   Since it started at _0.5\*max-weight_ and cannot become negative,
   the sum of _δ_ over all dual variable updates cannot exceed _0.5\*max-weight_.
 - Vertex dual variables increase by at most _δ_ per update.
   Therefore no vertex dual can increase by more than _0.5\*max-weight_ in total.
   Therefore no vertex dual can exceed _max-weight_.
 - Blossom dual variables start at _z<sub>B</sub> = 0_.
 - Blossom dual variables increase by at most _2\*δ_ per update.
   Therefore no blossom dual can increase by more than _max-weight_ in total.
   Therefore no blossom dual can exceed _max-weight_.
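
For languages with fixed-width integers, the bound can be turned into a quick check.
The sketch below is a conservative estimate of my own, not something the algorithm
prescribes: it assumes non-negative integer weights and vertex duals stored as
_2\*u<sub>x</sub>_, so that a doubled slack _2\*u<sub>x</sub> + 2\*u<sub>y</sub> - 2\*w_ stays
below _4\*max-weight_. The function name is hypothetical.

```python
def smallest_signed_width(max_weight, widths=(8, 16, 32, 64)):
    """Return the smallest signed integer width (in bits) that can hold 4 * max_weight.

    Returns None if none of the candidate widths is large enough.
    """
    bound = 4 * max_weight   # conservative bound on intermediate doubled values
    for bits in widths:
        if bound <= 2 ** (bits - 1) - 1:
            return bits
    return None


width_small = smallest_signed_width(1000)
width_large = smallest_signed_width(10000)
```

A tighter analysis may allow a smaller type; the factor 4 simply leaves room
for sums of two doubled duals.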

### Handling floating point edge weights

Floating point calculations are subject to rounding errors.
This has two consequences for the matching algorithm:

 - The algorithm may return a matching which has slightly lower weight than
   the actual maximum weight.

 - The algorithm may not reliably recognize tight edges.
   To check whether an edge is tight, its slack is compared to zero.
   Rounding errors may cause the slack to appear positive even when an exact calculation
   would show it to be zero.
   The slack of some edges may even become slightly negative.

   I believe this does not affect the correctness of the algorithm.
   An edge that should be tight but is not recognized as tight due to rounding errors
   can be pulled tight through an additional dual variable update.
   As a side effect of this update, the edge will immediately be used to grow the alternating tree,
   or construct a blossom or augmenting path.
   This mechanism allows the algorithm to make progress, even if slack comparisons
   are repeatedly thrown off by rounding errors.
   Rounding errors may cause the algorithm to perform more dual variable updates
   than strictly necessary.
   But this will still not cause the run time of the algorithm to exceed _O(n<sup>3</sup>)_.
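
The rounding effect on slack is easy to reproduce.
The values below are made up for illustration and are not taken from a real run of the
algorithm; they only show that a slack which is zero in exact arithmetic can come out
as a tiny positive number in floating point.

```python
# Hypothetical dual values and edge weight; in exact arithmetic 0.1 + 0.2 - 0.3 = 0.
u_x = 0.1
u_y = 0.2
w = 0.3

slack = u_x + u_y - w

# The comparison with zero fails, although the edge "should" be tight.
is_tight = (slack == 0.0)
print(is_tight, slack)
```

With IEEE 754 double precision, `is_tight` is `False` here and `slack` is a small positive value.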

It seems to me that the matching algorithm is stable for floating point weights.
And it seems to me that it returns a matching which is close to optimal,
and could have been optimal if edge weights were changed by small amounts.

I must admit these arguments are mostly based on intuition.
Unfortunately I don't know how to properly analyze the floating point accuracy of this algorithm.

### Finding a maximum weight matching out of all maximum cardinality matchings

It is sometimes useful to find a maximum cardinality matching which has maximum weight
out of all maximum cardinality matchings.
A simple way to achieve this is to increase the weight of all edges by the same amount.

In general, a maximum weight matching is not necessarily a maximum cardinality matching.
However if all edge weights are positive and at least _n_ times the difference between
the maximum and minimum edge weight, any maximum weight matching is guaranteed
to have maximum cardinality.
Proof:
The weight of a non-maximum-cardinality matching can be increased by matching
an additional edge.
In order to match that extra edge, some high-weight edges may have to be removed from
the matching.
Their place might be taken by low-weight edges.
But even if all previously matched edges had maximum weight, and all newly matched edges
have minimum weight, the weight of the matching will still increase.

Changing all edge weights by the same amount does not change the relative preference
for a certain group of edges over another group of the same size.
Therefore, given that we only consider maximum-cardinality matchings,
changing all weights by the same amount doesn't change which of these matchings has maximum weight.
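
A possible way to compute such an offset is sketched below.
The function name is hypothetical, and the extra lower bound of 1 on the new minimum
weight is my own safeguard for the case where all weights are equal and not positive;
it assumes that a weight of 1 is a sensible "small positive" value for the input.

```python
def weights_for_max_cardinality(num_vertices, edges):
    """Return a copy of edges with all weights raised by the same offset.

    After the shift, every weight is at least num_vertices times the weight range
    (and at least 1), so a maximum weight matching also has maximum cardinality.
    """
    if not edges:
        return []
    min_w = min(w for (_x, _y, w) in edges)
    max_w = max(w for (_x, _y, w) in edges)
    target_min = max(1, num_vertices * (max_w - min_w))
    delta = max(0, target_min - min_w)
    return [(x, y, w + delta) for (x, y, w) in edges]


# Hypothetical example: path 0-1-2-3 where the middle edge outweighs both outer edges.
edges = [(0, 1, 1), (1, 2, 10), (2, 3, 1)]
adjusted = weights_for_max_cardinality(4, edges)
```

In the original weights the single middle edge (weight 10) beats the two outer edges
(total weight 2); after the shift the two outer edges together outweigh the middle edge.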

## References