Update Algorithm.md
This commit is contained in:
parent
4670cf1dca
commit
e103a493fc
|
@ -7,7 +7,7 @@ For an edge-weighted graph, a _maximum weight matching_ is a matching that achie
|
|||
the largest possible sum of weights of matched edges.
|
||||
|
||||
The code in this repository is based on a variant of the blossom algorithm that runs in
|
||||
_O(n \* m \* log(n))_ steps.
|
||||
_O(n m log n)_ steps.
|
||||
See the file [Algorithm.md](doc/Algorithm.md) for a detailed description.
|
||||
|
||||
|
||||
|
@ -44,7 +44,7 @@ The folder [cpp/](cpp/) contains a header-only C++ implementation of maximum wei
|
|||
|
||||
**NOTE:**
|
||||
The C++ code currently implements a slower algorithm that runs in _O(n<sup>3</sup>)_ steps.
|
||||
I plan to eventually update the C++ code to implement the faster _O(n*m*log(n))_ algorithm.
|
||||
I plan to eventually update the C++ code to implement the faster _O(n m log n)_ algorithm.
|
||||
|
||||
The C++ code is self-contained and can easily be linked into an application.
|
||||
It is also reasonably efficient.
|
||||
|
|
747
doc/Algorithm.md
747
doc/Algorithm.md
|
@ -4,19 +4,20 @@
|
|||
## Introduction
|
||||
|
||||
This document describes the implementation of an algorithm that computes a maximum weight matching
|
||||
in a general graph in time _O(n<sup>3</sup>)_, where _n_ is the number of vertices in the graph.
|
||||
in a general graph in time _O(n (n + m) log n)_, where _n_ is the number of vertices in
|
||||
the graph and _m_ is the number of edges.
|
||||
|
||||
In graph theory, a _matching_ is a subset of edges such that none of them share a common vertex.
|
||||
|
||||
A _maximum cardinality matching_ is a matching that contains the largest possible number of edges
|
||||
(or equivalently, the largest possible number of vertices).
|
||||
|
||||
For a graph that has weights attached to its edges, a _maximum weight matching_
|
||||
If a graph has weights assigned to its edges, a _maximum weight matching_
|
||||
is a matching that achieves the largest possible sum of weights of matched edges.
|
||||
An algorithm for maximum weight matching can obviously also be used to calculate a maximum
|
||||
cardinality matching by simply assigning weight 1 to all edges.
|
||||
|
||||
Certain computer science problems can be understood as _restrictions_ of maximum weighted matching
|
||||
Certain related problems can be understood as _restrictions_ of maximum weighted matching
|
||||
in general graphs.
|
||||
Examples are: maximum matching in bipartite graphs, maximum cardinality matching in general graphs,
|
||||
and maximum weighted matching in general graphs with edge weights limited to integers
|
||||
|
@ -48,7 +49,7 @@ It is based on the ideas of Edmonds, but uses different data structures to reduc
|
|||
of work.
|
||||
|
||||
In 1983, Galil, Micali and Gabow published a maximum weighted matching algorithm that runs in
|
||||
time _O(n\*m\*log(n))_ [[4]](#galil_micali_gabow1986) .
|
||||
time _O(n m log n)_ [[4]](#galil_micali_gabow1986) .
|
||||
It is an implementation of Edmonds' blossom algorithm that uses advanced data structures
|
||||
to speed up critical parts of the algorithm.
|
||||
This algorithm is asymptotically faster than _O(n<sup>3</sup>)_ for sparse graphs,
|
||||
|
@ -63,45 +64,32 @@ of the literature.
|
|||
The paper describes a maximum weighted matching algorithm that is similar to Edmonds'
|
||||
blossom algorithm, but carefully implemented to run in time _O(n<sup>3</sup>)_.
|
||||
It then sketches how advanced data structures can be added to arrive at the Galil-Micali-Gabow
|
||||
algorithm that runs in time _O(n\*m\*log(n))_.
|
||||
algorithm that runs in time _O(n m log n)_.
|
||||
|
||||
In 1990, Gabow published a maximum weighted matching algorithm that runs in time
|
||||
_O(n\*m + n<sup>2</sup>\*log(n))_ [[6]](#gabow1990) .
|
||||
_O(n m + n<sup>2</sup> log n)_ [[6]](#gabow1990) .
|
||||
It uses several advanced data structures, including Fibonacci heaps.
|
||||
Unfortunately I don't understand this algorithm at all.
|
||||
|
||||
|
||||
## Choosing an algorithm
|
||||
|
||||
I selected the _O(n<sup>3</sup>)_ variant of Edmonds' blossom algorithm as described by
|
||||
Galil [[5]](#galil1986) .
|
||||
This algorithm is usually credited to Gabow [[3]](#gabow1974), but I find the description
|
||||
in [[5]](#galil1986) easier to understand.
|
||||
I selected the _O(n m log n)_ algorithm by Galil, Micali and Gabow
|
||||
[[4]](#galil_micali_gabow1986).
|
||||
|
||||
This is generally a fine algorithm.
|
||||
One of its strengths is that it is relatively easy to understand, especially compared
|
||||
to the more recent algorithms.
|
||||
Its run time is asymptotically optimal for complete graphs (graphs that have an edge
|
||||
between every pair of vertices).
|
||||
|
||||
On the other hand, the algorithm is suboptimal for sparse graphs.
|
||||
It is possible to construct highly sparse graphs, having _m = O(n)_,
|
||||
that cause this algorithm to perform _Θ(n<sup>3</sup>)_ steps.
|
||||
In such cases the Galil-Micali-Gabow algorithm would be significantly faster.
|
||||
|
||||
Then again, there is a question of how important the worst case is for practical applications.
|
||||
I suspect that the simple algorithm typically needs only about _O(n\*m)_ steps when running
|
||||
on random sparse graphs with random weights, i.e. much faster than its worst case bound.
|
||||
|
||||
After trading off these properties, I decided that I prefer an algorithm that is understandable
|
||||
and has decent performance, over an algorithm that is faster in specific cases but also
|
||||
significantly more complicated.
|
||||
This algorithm is asymptotically optimal for sparse graphs.
|
||||
It has also been shown to be quite fast in practice on several types of graphs
|
||||
including random graphs [[7]](#mehlhorn_schafer2002).
|
||||
|
||||
This algorithm is more difficult to implement than the older _O(n<sup>3</sup>)_ algorithm.
|
||||
In particular, it requires a specialized data structure to implement concatenable priority queues.
|
||||
This increases the size and complexity of the code quite a bit.
|
||||
However, in my opinion the performance improvement is worth the extra effort.
|
||||
|
||||
## Description of the algorithm
|
||||
|
||||
My implementation closely follows the description by Zvi Galil in [[5]](#galil1986) .
|
||||
I recommend to read that paper before diving into my description below.
|
||||
My implementation roughly follows the description by Zvi Galil in [[5]](#galil1986) .
|
||||
I recommend reading that paper before diving into my description below.
|
||||
The paper explains the algorithm in depth and shows how it relates to matching
|
||||
in bipartite graphs and non-weighted graphs.
|
||||
|
||||
|
@ -253,14 +241,14 @@ have _slack_.
|
|||
An augmenting path that consists only of tight edges is _guaranteed_ to increase the weight
|
||||
of the matching as much as possible.
|
||||
|
||||
While searching for an augmenting path, we simply restrict the search to tight edges,
|
||||
While searching for an augmenting path, we restrict the search to tight edges,
|
||||
ignoring all edges that have slack.
|
||||
Certain explicit actions of the algorithm cause edges to become tight or slack.
|
||||
How this works will be explained later.
|
||||
|
||||
To find an augmenting path, the algorithm searches for alternating paths that start
|
||||
in an unmatched vertex.
|
||||
The collection of alternating paths forms a forest of trees.
|
||||
The collection of such alternating paths forms a forest of trees.
|
||||
Each tree is rooted in an unmatched vertex, and all paths from the root to the leaves of a tree
|
||||
are alternating paths.
|
||||
The nodes in these trees are top-level blossoms.
|
||||
|
@ -289,8 +277,8 @@ an odd-length alternating cycle.
|
|||
The lowest common ancestor node in the alternating tree forms the beginning and end
|
||||
of the alternating cycle.
|
||||
In this case a new blossom must be created by shrinking the cycle.
|
||||
If the two S-blossoms are in different alternating trees, the edge that links the blossoms
|
||||
is part of an augmenting path between the roots of the two trees.
|
||||
On the other hand, if the two S-blossoms are in different alternating trees,
|
||||
the edge that links the blossoms is part of an augmenting path between the roots of the two trees.
|
||||
|
||||
![Figure 3](figures/graph3.png) <br>
|
||||
*Figure 3: Growing alternating trees*
|
||||
|
@ -314,7 +302,7 @@ The search procedure considers these vertices one-by-one and tries to use them t
|
|||
either grow the alternating tree (thus adding new vertices to the queue),
|
||||
or discover an augmenting path or a new blossom.
|
||||
|
||||
In detail, the search for an augmenting path proceeds as follows:
|
||||
The search for an augmenting path proceeds as follows:
|
||||
|
||||
- Mark all top-level blossoms as _unlabeled_.
|
||||
- Initialize an empty queue _Q_.
|
||||
|
@ -469,9 +457,9 @@ $$ \pi_{x,y} = u_x + u_y + \sum_{(x,y) \in B} z_B - w_{x,y} $$
|
|||
|
||||
An edge is _tight_ if and only if its slack is zero.
|
||||
Given the values of the dual variables, it is very easy to calculate the slack of an edge
|
||||
which is not contained in any blossom: simply add the duals of its incident vertices and
|
||||
which is not contained in any blossom: add the duals of its incident vertices and
|
||||
subtract the weight.
|
||||
To check whether an edge is tight, simply compute its slack and check whether it is zero.
|
||||
To check whether an edge is tight, simply compute its slack and compare it to zero.
|
||||
|
||||
Calculating the slack of an edge that is contained in one or more blossoms is a little tricky,
|
||||
but fortunately we don't need such calculations.
|
||||
|
@ -504,7 +492,7 @@ At that point the maximum weight matching has been found.
|
|||
When the matching algorithm is finished, the constraints can be checked to verify
|
||||
that the matching is optimal.
|
||||
This check is simpler and faster than the matching algorithm itself.
|
||||
It can therefore be a useful way to guard against bugs in the matching algorithm.
|
||||
It can therefore be a useful way to guard against bugs in the algorithm.
|
||||
|
||||
### Rules for updating dual variables
|
||||
|
||||
|
@ -534,11 +522,12 @@ It then changes dual variables as follows:
|
|||
- _z<sub>B</sub> ← z<sub>B</sub> − 2 * δ_ for every non-trivial T-blossom _B_
|
||||
|
||||
Dual variables of unlabeled blossoms and their vertices remain unchanged.
|
||||
Dual variables _z<sub>B</sub>_ of non-trivial sub-blossoms also remain changed;
|
||||
Dual variables _z<sub>B</sub>_ of non-trivial sub-blossoms also remain unchanged;
|
||||
only top-level blossoms have their _z<sub>B</sub>_ updated.
|
||||
|
||||
Note that this update does not change the slack of edges that are either matched,
|
||||
or linked in the alternating tree, or contained in a blossom.
|
||||
Note that these rules ensure that no change occurs to the slack of any edge which is matched,
|
||||
or part of an alternating tree, or contained in a blossom.
|
||||
Such edges are tight and remain tight through the update.
|
||||
However, the update reduces the slack of edges between S blossoms and edges between S-blossoms
|
||||
and unlabeled blossoms.
|
||||
It may cause some of these edges to become tight, allowing them to be used
|
||||
|
@ -569,25 +558,62 @@ If the dual update is limited by _δ<sub>4</sub>_, it causes the dual varia
|
|||
a T-blossom to become zero.
|
||||
The next step is to expand that blossom.
|
||||
|
||||
A dual update may find that _δ = 0_, implying that the dual variables don't change.
|
||||
### Discovering tight edges through delta steps
|
||||
|
||||
A delta step may find that _δ = 0_, implying that the dual variables don't change.
|
||||
This can still be useful since all types of updates have side effects (adding an edge
|
||||
to an alternating tree, or expanding a blossom) that allow the algorithm to make progress.
|
||||
In fact, it is convenient to let the dual update mechanism drive the entire process of discovering
|
||||
tight edges and growing alternating trees.
|
||||
|
||||
During a single _stage_, the algorithm may iterate several times between scanning tight edges and
|
||||
updating dual variables.
|
||||
These iterations are called _substages_.
|
||||
To clarify: A stage is the process of growing alternating trees until an augmenting path is found.
|
||||
In my description of the search algorithm above, I stated that upon discovery of a tight edge
|
||||
between a newly labeled S-vertex and an unlabeled vertex or a different S-blossom, the edge should
|
||||
be used to grow the alternating tree or to create a new blossom or to form an augmenting path.
|
||||
However, it turns out to be easier to postpone the use of such edges until the next delta step.
|
||||
While scanning newly labeled S-vertices, edges to unlabeled vertices or different S-blossoms
|
||||
are discovered but not yet used.
|
||||
Such edges are merely registered in a suitable data structure.
|
||||
Even if the edge is tight, it is registered rather than used right away.
|
||||
|
||||
Once the scan completes, a delta step will be done.
|
||||
If any tight edges were discovered during the scan, the delta step will find that either
|
||||
_δ<sub>2</sub> = 0_ or _δ<sub>3</sub> = 0_.
|
||||
The corresponding step (growing the alternating tree, creating a blossom or augmenting
|
||||
the matching) will occur at that point.
|
||||
If no suitable tight edges exist, a real (non-zero) change of dual variables will occur.
|
||||
|
||||
The search for an augmenting path becomes as follows:
|
||||
|
||||
- Mark all top-level blossoms as _unlabeled_.
|
||||
- Initialize an empty queue _Q_.
|
||||
- Assign label S to all top-level blossoms that contain an unmatched vertex.
|
||||
Add all vertices inside such blossoms to _Q_.
|
||||
- Repeat until either an augmenting path is found or _δ<sub>1</sub> = 0_:
|
||||
- Scan all vertices in Q as described earlier.
|
||||
Register edges to unlabeled vertices or other S-blossoms.
|
||||
Do not yet use such edges to change the alternating tree, even if the edge is tight.
|
||||
- Calculate _δ_ and update dual variables as described above.
|
||||
- If _δ = δ<sub>1</sub>_, end the search.
|
||||
The maximum weight matching has been found.
|
||||
- If _δ = δ<sub>2</sub>_, use the corresponding edge to grow the alternating tree.
|
||||
Assign label T to the unlabeled blossom.
|
||||
Then assign label S to its mate and add the new S-vertices to _Q_.
|
||||
- If _δ = δ<sub>3</sub>_ and the corresponding edge connects two S-blossoms
|
||||
in the same alternating tree, use the edge to create a new blossom.
|
||||
Add the new S-vertices to _Q_.
|
||||
- If _δ = δ<sub>3</sub>_ and the corresponding edge connects two S-blossoms
|
||||
in different alternating trees, use the edge to construct an augmenting path.
|
||||
End the search and return the augmenting path.
|
||||
- If _δ = δ<sub>4</sub>_, expand the corresponding T-blossom.
|
||||
|
||||
It may seem complicated, but this is actually easier.
|
||||
The code that scans newly labeled S-vertices, no longer needs special treatment of tight edges.
|
||||
|
||||
In general, multiple updates of the dual variables are necessary during a single _stage_ of
|
||||
the algorithm.
|
||||
Remember that a stage is the process of growing alternating trees until an augmenting path is found.
|
||||
A stage ends either by augmenting the matching, or by concluding that no further improvement
|
||||
is possible.
|
||||
Each stage consists of one or more substages.
|
||||
A substage scans tight edges to grow the alternating trees.
|
||||
When a substage runs out of tight edges, it ends by performing a dual variable update.
|
||||
A substage also ends immediately when it finds an augmenting path.
|
||||
At the end of each stage, labels and alternating trees are removed.
|
||||
|
||||
The matching algorithm ends when a substage ends in a dual variable update limited
|
||||
by _δ<sub>1</sub>_.
|
||||
At that point the matching has maximum weight.
|
||||
|
||||
### Expanding a blossom
|
||||
|
||||
|
@ -635,161 +661,322 @@ All vertices of sub-blossoms that got label S are inserted into _Q_.
|
|||
![Figure 5](figures/graph5.png) <br>
|
||||
*Figure 5: Expanding a T-blossom*
|
||||
|
||||
### Keeping track of least-slack edges
|
||||
### Keeping track of the top-level blossom of each vertex
|
||||
|
||||
To perform a dual variable update, the algorithm needs to compute the values
|
||||
The algorithm often needs to find the top-level blossom _B(x)_ that contains a given vertex _x_.
|
||||
|
||||
A naive implementation may keep this information in an array where the element with
|
||||
index _x_ holds a pointer to blossom _B(x)_.
|
||||
Lookup in this array would be fast, but keeping the array up-to-date takes too much time.
|
||||
There can be _O(n)_ stages, and _O(n)_ blossoms can be created or expanded during a stage,
|
||||
and a blossom can contain _O(n)_ vertices,
|
||||
therefore the total number of updates to the array could add up to _O(n<sup>3</sup>)_.
|
||||
|
||||
To solve this, we use a special data structure: a _concatenable priority queue_.
|
||||
Each top-level blossom maintains an instance of this type of queue, containing its vertices.
|
||||
Each vertex is a member in precisely one such queue.
|
||||
|
||||
To find the top-level blossom _B(x)_ that contains a given vertex _x_, we determine
|
||||
the queue instance in which the vertex is a member.
|
||||
This takes time _O(log n)_.
|
||||
The queue instance corresponds directly to a specific blossom, which we can find
|
||||
for example by storing a pointer to the blossom inside the queue instance.
|
||||
|
||||
When a new blossom is created, the concatenable queues of its sub-blossoms are merged
|
||||
to form one concatenable queue for the new blossom.
|
||||
The merged queue contains all vertices of the original queues.
|
||||
Merging a pair of queues takes time _O(log n)_.
|
||||
To merge the queues of _k_ sub-blossoms, the concatenation step is repeated _k-1_ times,
|
||||
taking total time _O(k log n)_.
|
||||
|
||||
When a blossom is expanded, its concatenable queue is un-concatenated to recover separate queues
|
||||
for the sub-blossoms.
|
||||
This also takes time _O(log n)_ for each sub-blossom.
|
||||
|
||||
Implementation details of concatenable queues are discussed later in this document.
|
||||
|
||||
### Lazy updating of dual variables
|
||||
|
||||
During a delta step, the dual variables of labeled vertices and blossoms change as described above.
|
||||
Updating these variables directly would take time _O(n)_ per delta step.
|
||||
The total number of delta steps during a matching may be _Θ(n<sup>2</sup>)_,
|
||||
pushing the total time to update dual variables to _O(n<sup>3</sup>)_ which is too slow.
|
||||
|
||||
To solve this, [[4]](#galil_micali_gabow1986) describes a technique that stores dual values
|
||||
in a _modified_ form which is invariant under delta steps.
|
||||
The modified values can be converted back to the true dual values when necessary.
|
||||
[[7]](#mehlhorn_schafer2002) describes a slightly different technique which I find easier
|
||||
to understand.
|
||||
My implementation is very similar to theirs.
|
||||
|
||||
The first trick is to keep track of the running sum of _δ_ values since the beginning of the algorithm.
|
||||
Let's call that number _Δ_.
|
||||
At the start of the algorithm _Δ = 0_, but the value increases as the algorithm goes through delta steps.
|
||||
|
||||
For each non-trivial blossom, rather than storing its true dual value, we store a _modified_ dual value:
|
||||
|
||||
- For an S-blossom, the modified dual value is _z'<sub>B</sub> = z<sub>B</sub> - 2 Δ_
|
||||
- For a T-blossom, the modified dual value is _z'<sub>B</sub> = z<sub>B</sub> + 2 Δ_
|
||||
- For an unlabeled blossom or non-top-level blossom, the modified dual value is equal
|
||||
to the true dual value.
|
||||
|
||||
These modified values are invariant under delta steps.
|
||||
Thus, there is no need to update the stored values during a delta step.
|
||||
|
||||
Since the modified blossom dual value depends on the label (S or T) of the blossom,
|
||||
the modified value must be updated whenever the label of the blossom changes.
|
||||
This update can be done in constant time, and changing the label of a blossom is
|
||||
in any case an explicit step, so this won't increase the asymptotic run time.
|
||||
|
||||
For each vertex, rather than storing its true dual value, we store a _modified_ dual value:
|
||||
|
||||
- For an S-vertex, the modified dual value is _u'<sub>x</sub> = u<sub>x</sub> + Δ_
|
||||
- For a T-vertex, the modified dual value is _u'<sub>x</sub> = u<sub>x</sub> - offset<sub>B(x)</sub> - Δ_
|
||||
- For an unlabeled vertex, the modified dual value is _u'<sub>x</sub> = u<sub>x</sub> - offset<sub>B(x)</sub>_
|
||||
|
||||
where _offset<sub>B</sub>_ is an extra variable which is maintained for each top-level blossom.
|
||||
|
||||
Again, the modified values are invariant under delta steps, which implies that no update
|
||||
to the stored values is necessary during a delta step.
|
||||
|
||||
Since the modified vertex dual value depends on the label (S or T) of its top-level blossom,
|
||||
an update is necessary when that label changes.
|
||||
For S-vertices, we can afford to apply that update directly to the vertices involved.
|
||||
This is possible since a vertex becomes an S-vertex at most once per stage.
|
||||
|
||||
The situation is more complicated for T-vertices.
|
||||
During a stage, a T-vertex can become unlabeled if it is part of a T-blossom that gets expanded.
|
||||
The same vertex can again become a T-vertex, then again become unlabeled during a subsequent
|
||||
blossom expansion.
|
||||
In this way, a vertex can transition between T-vertex and unlabeled vertex up to _O(n)_ times
|
||||
within a stage.
|
||||
We can not afford to update the stored modified vertex dual so many times.
|
||||
This is where the _offset_ variables come in.
|
||||
If a blossom becomes a T-blossom, rather than updating the modified duals of all vertices,
|
||||
we update only the _offset_ variable of the blossom such that the modified vertex duals
|
||||
remain unchanged.
|
||||
If a blossom is expanded, we push the _offset_ values down to its sub-blossoms.
|
||||
|
||||
### Efficiently computing _δ_
|
||||
|
||||
To perform a delta step, the algorithm computes the values
|
||||
of _δ<sub>1</sub>_, _δ<sub>2</sub>_, _δ<sub>3</sub>_ and _δ<sub>4</sub>_
|
||||
and determine which edge (_δ<sub>2</sub>_, _δ<sub>3</sub>_) or
|
||||
and determines which edge (_δ<sub>2</sub>_, _δ<sub>3</sub>_) or
|
||||
blossom (_δ<sub>4</sub>_) achieves the minimum value.
|
||||
|
||||
The total number of dual updates during a matching may be _Θ(n<sup>2</sup>)_.
|
||||
Since we want to limit the total number of steps of the matching algorithm to _O(n<sup>3</sup>)_,
|
||||
each dual update may use at most _O(n)_ steps.
|
||||
A naive implementation might compute _δ_ by looping over the vertices, blossoms and edges
|
||||
in the graph.
|
||||
The total number of delta steps during a matching may be _Θ(n<sup>2</sup>)_,
|
||||
pushing the total time for _δ_ calculations to _O(n<sup>2</sup> m)_ which is much too slow.
|
||||
[[4]](#galil_micali_gabow1986) introduces a combination of data structures from which
|
||||
the value of _δ_ can be computed efficiently.
|
||||
|
||||
We can find _δ<sub>1</sub>_ in _O(n)_ steps by simply looping over all vertices
|
||||
and checking their dual variables.
|
||||
We can find _δ<sub>4</sub>_ in _O(n)_ steps by simply looping over all non-trivial blossoms
|
||||
(since there are fewer than _n_ non-trivial blossoms).
|
||||
We could find _δ<sub>2</sub>_ and _δ<sub>3</sub>_ by simply looping over
|
||||
all edges of the graph in _O(m)_ steps, but that exceeds our budget of _O(n)_ steps.
|
||||
So we need better techniques.
|
||||
_δ<sub>1</sub>_ is the minimum dual value of any S-vertex.
|
||||
This value can be computed in constant time.
|
||||
The dual value of an unmatched vertex is reduced by _δ_ during every delta step.
|
||||
Since all vertex duals start with the same dual value _u<sub>start</sub>_,
|
||||
all unmatched vertices have dual value _δ<sub>1</sub> = u<sub>start</sub> - Δ_.
|
||||
|
||||
For _δ<sub>2</sub>_, we determine the least-slack edge between an S-blossom and unlabeled
|
||||
blossom as follows.
|
||||
For every vertex _y_ in any unlabeled blossom, keep track of _e<sub>y</sub>_:
|
||||
the least-slack edge that connects _y_ to any S-vertex.
|
||||
The thing to keep track of is the identity of the edge, not the slack value.
|
||||
This information is kept up-to-date as part of the procedure that considers S-vertices.
|
||||
The scanning procedure eventually considers all edges _(x, y, w)_ where _x_ is an S-vertex.
|
||||
At that moment _e<sub>y</sub>_ is updated if the new edge has lower slack.
|
||||
_δ<sub>3</sub>_ is half of the minimum slack of any edge between two different S-blossoms.
|
||||
To compute this efficiently, we keep edges between S-blossoms in a priority queue.
|
||||
The edges are inserted into the queue during scanning of newly labeled S-vertices.
|
||||
To compute _δ<sub>3</sub>_, we simply find the minimum-priority element of the queue.
|
||||
|
||||
Calculating _δ<sub>2</sub>_ then becomes a matter of looping over all vertices _x_,
|
||||
checking whether _B(x)_ is unlabeled and calculating the slack of _e<sub>x</sub>_.
|
||||
A complication may occur when a new blossom is created.
|
||||
Edges that connect different top-level S-blossoms before creation of the new blossom,
|
||||
may end up as internal edges inside the newly created blossom.
|
||||
This implies that such edges would have to be removed from the _δ<sub>3</sub>_ priority queue,
|
||||
but that would be quite difficult.
|
||||
Instead, we just let those edges stay in the queue.
|
||||
When computing the value of _δ<sub>3</sub>_, we thus have to check whether the minimum
|
||||
element represents an edge between different top-level blossoms.
|
||||
If not, we discard such stale elements until we find an edge that does.
|
||||
|
||||
One subtle aspect of this technique is that a T-vertex can loose its label when
|
||||
the containing T-blossom gets expanded.
|
||||
At that point, we suddenly need to have kept track of its least-slack edge to any S-vertex.
|
||||
It is therefore necessary to do the same tracking also for T-vertices.
|
||||
So the technique is: For any vertex that is not an S-vertex, track its least-slack edge
|
||||
to any S-vertex.
|
||||
A complication occurs when dual variables are updated.
|
||||
At that point, the slack of any edge between different S-blossoms decreases by _2\*δ_.
|
||||
But we can not afford to update the priorities of all elements in the queue.
|
||||
To solve this, we set the priority of each edge to its _modified slack_.
|
||||
|
||||
Another subtle aspect is that a T-vertex may have a zero-slack edge to an S-vertex.
|
||||
Even though these edges are already tight, they must still be tracked.
|
||||
If the T-vertex loses its label, this edge needs to be reconsidered by the scanning procedure.
|
||||
By including these edges in least-slack edge tracking, they will be rediscovered
|
||||
through a _δ<sub>2</sub>=0_ update after the vertex becomes unlabeled.
|
||||
The _modified slack_ of an edge is defined as follows:
|
||||
|
||||
For _δ<sub>3</sub>_, we determine the least-slack edge between any pair of S-blossoms
|
||||
as follows.
|
||||
For every S-blossom _B_, keep track of _e<sub>B</sub>_:
|
||||
the least-slack edge between _B_ and any other S-blossom.
|
||||
Note that this is done per S-blossoms, not per S-vertex.
|
||||
This information is kept up-to-date as part of the procedure that considers S-vertices.
|
||||
Calculating _δ<sub>3</sub>_ then becomes a matter of looping over all S-blossoms _B_
|
||||
and calculating the slack of _e<sub>B</sub>_.
|
||||
$$ \pi'_{x,y} = u'_x + u'_y - w_{x,y} $$
|
||||
|
||||
A complication occurs when S-blossoms are merged.
|
||||
Some of the least-slack edges of the sub-blossoms may be internal edges in the merged blossom,
|
||||
and therefore irrelevant for _δ<sub>3</sub>_.
|
||||
As a result, the proper _e<sub>B</sub>_ of the merged blossom may be different from all
|
||||
least-slack edges of its sub-blossoms.
|
||||
An additional data structure is needed to find _e<sub>B</sub>_ of the merged blossom.
|
||||
The modified slack is computed in the same way as true slack, except it uses
|
||||
the modified vertex duals instead of true vertex duals.
|
||||
Blossom duals are ignored since we will never compute the modified slack of an edge that
|
||||
is contained inside a blossom.
|
||||
Because modified vertex duals are invariant under delta steps, so is the modified edge slack.
|
||||
As a result, the priorities of edges in the priority queue remain unchanged during a delta step.
|
||||
|
||||
For every S-blossom _B_, maintain a list _L<sub>B</sub>_ of edges between _B_ and
|
||||
other S-blossoms.
|
||||
The purpose of _L<sub>B</sub>_ is to contain, for every other S-blossom, the least-slack edge
|
||||
between _B_ and that blossom.
|
||||
These lists are kept up-to-date as part of the procedure that considers S-vertices.
|
||||
While considering vertex _x_, if edge _(x, y, w)_ has positive slack,
|
||||
and _B(y)_ is an S-blossom, the edge is inserted in _L<sub>B(x)</sub>_.
|
||||
This may cause _L<sub>B(x)</sub>_ to contain multiple edges to _B(y)_.
|
||||
That's okay as long as it definitely contains the least-slack edge to _B(y)_.
|
||||
_δ<sub>4</sub>_ is half of the minimum dual variable of any T-blossom.
|
||||
To compute this efficiently, we keep non-trivial T-blossoms in a priority queue.
|
||||
The blossoms are inserted into the queue when they become a T-blossom and removed from
|
||||
the queue when they stop being a T-blossom.
|
||||
|
||||
When a new S-blossom is created, form its list _L<sub>B</sub>_ by merging the lists
|
||||
of its sub-blossoms.
|
||||
Ignore edges that are internal to the merged blossom.
|
||||
If there are multiple edges to the same target blossom, keep only the least-slack of these edges.
|
||||
Then find _e<sub>B</sub>_ of the merged blossom by simply taking the least-slack edge
|
||||
out of _L<sub>B</sub>_.
|
||||
A complication occurs when dual variables are updated.
|
||||
At that point, the dual variable of any T-blossom decreases by _2\*δ_.
|
||||
But we can not afford to update the priorities of all elements in the queue.
|
||||
To solve this, we set the priority of each blossom to its _modified_ dual value
|
||||
_z'<sub>B</sub> = z<sub>B</sub> + 2\*Δ_.
|
||||
These values are invariant under delta steps.
|
||||
|
||||
_δ<sub>2</sub>_ is the minimum slack of any edge between an S-vertex and unlabeled vertex.
|
||||
To compute this efficiently, we use a fairly complicated strategy that involves
|
||||
three levels of priority queues.
|
||||
|
||||
At the lowest level, every T-vertex or unlabeled vertex maintains a separate priority queue
|
||||
of edges between itself and any S-vertex.
|
||||
Edges are inserted into this queue during scanning of newly labeled S-vertices.
|
||||
Note that S-vertices do not maintain this type of queue.
|
||||
|
||||
The priorities of edges in these queues are set to their _modified slack_.
|
||||
This ensures that the priorities remain unchanged during delta steps.
|
||||
The priorities also remain unchanged when the T-vertex becomes unlabeled or the unlabeled
|
||||
vertex becomes a T-vertex.
|
||||
|
||||
At the middle level, every T-blossom or unlabeled top-level blossom maintains a priority queue
|
||||
containing its vertices.
|
||||
This is in fact the _concatenable priority queue_ that is maintained by every top-level blossom
|
||||
as was described earlier in this document.
|
||||
The priority of each vertex in the mid-level queue is set to the minimum priority of any edge
|
||||
in the low-level queue of that vertex.
|
||||
If edges are added to (or removed from) the low-level queue, the priority of the corresponding
|
||||
vertex in the mid-level queue may change.
|
||||
If the low-level queue of a vertex is empty, that vertex has priority _Infinity_
|
||||
in the mid-level queue.
|
||||
|
||||
At the highest level, unlabeled top-level blossoms are tracked in one global priority queue.
|
||||
The priority of each blossom in this queue is set to the minimum slack of any edge
|
||||
from that blossom to an S-vertex plus _Δ_.
|
||||
These priorities are invariant under delta steps.
|
||||
|
||||
To compute _δ<sub>2</sub>_, we find the minimum priority in the high-level queue
|
||||
and adjust it by _Δ_.
|
||||
To find the edge associated with _δ<sub>2</sub>_,
|
||||
we use the high-level queue to find the unlabeled blossom with minimum priority,
|
||||
then use that blossom's mid-level queue to find the vertex with minimum priority,
|
||||
then use that vertex's low-level queue to find the edge with minimum priority.
|
||||
|
||||
The whole thing is a bit tricky, but it works.
|
||||
|
||||
### Re-using alternating trees
|
||||
|
||||
According to [[5]](#galil1986), labels and alternating trees should be erased at the end of each stage.
|
||||
However, the algorithm can be optimized by keeping some of the labels and re-using them
|
||||
in the next stage.
|
||||
The optimized algorithm erases _only_ the two alternating trees that are part of
|
||||
the augmenting path.
|
||||
All blossoms in those two trees lose their labels and become free blossoms again.
|
||||
Other alternating trees, which are not involved in the augmenting path, are preserved
|
||||
into the next stage, and so are the labels on the blossoms in those trees.
|
||||
|
||||
This optimization is well known and is described for example in [[7]](#mehlhorn_schafer2002).
|
||||
It does not affect the worst-case asymptotic run time of the algorithm,
|
||||
but it provides a significant practical speed up for many types of graphs.
|
||||
|
||||
Erasing alternating trees is easy enough, but selectively stripping labels off blossoms
|
||||
has a few implications.
|
||||
The blossoms that lose their labels need to have their modified dual values updated.
|
||||
The T-blossoms additionally need to have their _offset<sub>B</sub>_ variables updated
|
||||
to keep the vertex dual values consistent.
|
||||
For S-blossoms that lose their labels, the modified vertex dual variables are updated directly.
|
||||
|
||||
The various priority queues also need updating.
|
||||
Former T-blossoms must be removed from the priority queue for _δ<sub>4</sub>_.
|
||||
Edges incident on former S-vertices must be removed from the priority queues for _δ<sub>3</sub>_ and _δ<sub>2</sub>_.
|
||||
Finally, S-vertices that become unlabeled need to construct a proper priority queue
|
||||
of incident edges to other S-vertices for _δ<sub>2</sub>_ tracking.
|
||||
This involves visiting every incident edge of every vertex in each S-blossom that loses its label.
|
||||
|
||||
## Run time of the algorithm
|
||||
|
||||
Every stage of the algorithm either increases the number of matched vertices by 2 or
|
||||
ends the matching.
|
||||
Therefore the number of stages is at most _n/2_.
|
||||
Every stage runs in _O(n<sup>2</sup>)_ steps, therefore the complete algorithm runs in
|
||||
_O(n<sup>3</sup>)_ steps.
|
||||
|
||||
During each stage, edge scanning is driven by the queue _Q_.
|
||||
Every vertex enters _Q_ at most once.
|
||||
Each vertex that enters _Q_ has its incident edges scanned, causing every edge in the graph
|
||||
to be scanned at most twice per stage.
|
||||
Scanning an edge is done in constant time, unless it leads to the discovery of a blossom
|
||||
or an augmenting path, which will be separately accounted for.
|
||||
Therefore edge scanning needs _O(m)_ steps per stage.
|
||||
Every stage runs in time _O((n + m) log n)_, therefore the complete algorithm runs in
|
||||
time _O(n (n + m) log n)_.
|
||||
|
||||
Creating a blossom reduces the number of top-level blossoms by at least 2,
|
||||
thus limiting the number of simultaneously existing blossoms to _O(n)_.
|
||||
Blossoms that are created during a stage become S-blossoms and survive until the end of that stage,
|
||||
therefore _O(n)_ blossoms are created during a stage.
|
||||
Creating a blossom with _k_ sub-blossoms reduces the number of top-level blossoms by _k-1_,
|
||||
thus limiting the total number of sub-blossoms that can be involved in blossom creation
|
||||
during a stage to _O(n)_.
|
||||
|
||||
Creating a blossom involves tracing the alternating path to the closest common ancestor,
|
||||
and some bookkeeping per sub-blossom,
|
||||
and inserting new S-vertices _Q_,
|
||||
all of which can be done in _O(n)_ steps per blossom creation.
|
||||
The cost of managing least-slack edges between S-blossoms will be separately accounted for.
|
||||
Therefore blossom creation needs _O(n<sup>2</sup>)_ steps per stage
|
||||
(excluding least-slack edge management).
|
||||
which takes time _O(k log n)_ for a blossom with _k_ sub-blossoms.
|
||||
It also involves bookkeeping per sub-blossom, which takes time _O(log n)_ per sub-blossom.
|
||||
It also involves relabeling former T-vertices as S-vertices, but I account for that
|
||||
time separately below so I can ignore it here.
|
||||
It also involves merging the concatenable queues which track the vertices in top-level blossoms.
|
||||
Merging two queues takes time _O(log n)_, therefore merging the queues of all sub-blossoms
|
||||
takes time _O(k log n)_.
|
||||
Creating a blossom thus takes time _O(k log n)_.
|
||||
Blossom creation thus takes total time _O(n log n)_ per stage.
|
||||
|
||||
As part of creating a blossom, a list _L<sub>B</sub>_ of least-slack edges must be formed.
|
||||
This involves processing every element of all least-slack edge lists of its sub-blossoms,
|
||||
and removing redundant edges from the merged list.
|
||||
Merging and removing redundant edges can be done in one sweep via a temporary array indexed
|
||||
by target blossom.
|
||||
Collect the least-slack edges of the sub-blossoms into this array,
|
||||
indexed by the target blossom of the edge,
|
||||
keeping only the edge with lowest slack per target blossom.
|
||||
Then convert the array back into a list by removing unused indices.
|
||||
This takes _O(1)_ steps per candidate edge, plus _O(n)_ steps to manage the temporary array.
|
||||
I choose to shift the cost of collecting the candidate edges from the sub-blossoms to
|
||||
the actions that inserted those edges into the sub-blossom lists.
|
||||
There are two processes which insert edges into _L<sub>B</sub>_: edge scanning and blossom
|
||||
creation.
|
||||
Edge scanning inserts each graph edge at most twice per stage for a total cost of _O(m)_ steps
|
||||
per stage.
|
||||
Blossom creation inserts at most _O(n)_ edges per blossom creation.
|
||||
Therefore the total cost of S-blossom least-slack edge management is
|
||||
_O(m + n<sup>2</sup>) = O(n<sup>2</sup>)_ steps per stage.
|
||||
During each stage, a blossom becomes an S-blossom or T-blossom at most once.
|
||||
A blossom also becomes unlabeled at most once, at the end of the stage.
|
||||
Changing the label of a blossom takes some simple bookkeeping, as well as operations
|
||||
on priority queues (_δ<sub>4</sub>_ for T-blossoms, _δ<sub>2</sub>_ for unlabeled
|
||||
blossoms) which take time _O(log n)_ per blossom.
|
||||
Assigning label S or removing label S also involves work for the vertices in the blossom
|
||||
and their edges, but I account for that time separately below so I can ignore it here.
|
||||
Blossom labeling thus takes total time _O(n log n)_ per stage.
|
||||
|
||||
The number of blossom expansions during a stage is _O(n)_.
|
||||
Expanding a blossom involves some bookkeeping per sub-blossom,
|
||||
and reconstructing the alternating path through the blossom,
|
||||
and inserting any new S-vertices into _Q_,
|
||||
all of which can be done in _O(n)_ steps per blossom.
|
||||
Therefore blossom expansion needs _O(n<sup>2</sup>)_ steps per stage.
|
||||
During each stage, a vertex becomes an S-vertex at most once, and an S-vertex becomes
|
||||
unlabeled at most once.
|
||||
In both cases, the incident edges of the affected vertex are scanned and potentially
|
||||
added to or removed from priority queues.
|
||||
This involves finding the top-level blossoms of the endpoints of each edge, which
|
||||
takes time _O(log n)_ per edge.
|
||||
The updates to priority queues also take time _O(log n)_ per edge.
|
||||
Edge scanning thus takes total time _O(m log n)_ per stage.
|
||||
|
||||
Note that _m ≤ n<sup>2</sup>_ therefore _log m ≤ 2 log n_.
|
||||
This implies that an operation on a priority queue with _m_ elements takes time _O(log n)_.
|
||||
|
||||
Expanding a blossom involves some bookkeeping which takes time _O(log n)_ per sub-blossom.
|
||||
It also involves splitting the concatenable queue that tracks the vertices in top-level blossoms,
|
||||
which takes time _O(log n)_ per sub-blossom.
|
||||
In case of a T-blossom, it also involves reconstructing the alternating path through
|
||||
the blossom which takes time _O(k log n)_ for _k_ sub-blossoms.
|
||||
Also in case of a T-blossom, some sub-blossoms will become S-blossoms and their
|
||||
vertices become S-vertices, but I have already accounted for that cost above
|
||||
so I can ignore it here.
|
||||
Expanding a blossom thus takes time _O(k log n)_.
|
||||
Any blossom is involved as a sub-blossom in an expanding blossom at most once per stage.
|
||||
Blossom expansion thus takes total time _O(n log n)_ per stage.
|
||||
|
||||
The length of an augmenting path is _O(n)_.
|
||||
Tracing the augmenting path and augmenting the matching along the path can be done in _O(n)_ steps.
|
||||
Augmenting through a blossom takes a number of steps that is proportional in the number of
|
||||
Tracing the augmenting path and augmenting the matching along the path can be done
|
||||
in _O(n log n)_ steps.
|
||||
Augmenting through a blossom takes a number of steps that is proportional to the number of
|
||||
its sub-blossoms.
|
||||
Since there are fewer than _n_ non-trivial blossoms, the total cost of augmenting through
|
||||
blossoms is _O(n)_ steps.
|
||||
Therefore the total cost of augmenting is _O(n)_ steps per stage.
|
||||
Augmenting thus takes total time _O(n log n)_ per stage.
|
||||
|
||||
A dual variable update limited by _δ<sub>1</sub>_ ends the algorithm and therefore
|
||||
happens at most once.
|
||||
An update limited by _δ<sub>2</sub>_ labels a previously labeled blossom
|
||||
and therefore happens _O(n)_ times per stage.
|
||||
An update limited by _δ<sub>3</sub>_ either creates a blossom or finds an augmenting path
|
||||
and therefore happens _O(n)_ times per stage.
|
||||
An update limited by _δ<sub>4</sub>_ expands a blossom and therefore happens
|
||||
_O(n)_ times per stage.
|
||||
Therefore the number of dual variable updates is _O(n)_ per stage.
|
||||
The cost of calculating the _δ_ values is _O(n)_ per update as discussed above.
|
||||
Applying changes to the dual variables can be done by looping over all vertices and looping over
|
||||
all top-level blossoms in _O(n)_ steps per update.
|
||||
Therefore the total cost of dual variable updates is _O(n<sup>2</sup>)_ per stage.
|
||||
A delta step limited by _δ<sub>1</sub>_ ends the algorithm and therefore happens at most once.
|
||||
A _δ<sub>2</sub>_ step assigns a label to a previously unlabeled blossom and therefore
|
||||
happens _O(n)_ times per stage.
|
||||
A _δ<sub>3</sub>_ step either creates a blossom or finds an augmenting path and therefore
|
||||
happens _O(n)_ times per stage.
|
||||
A _δ<sub>4</sub>_ step expands a blossom and therefore happens _O(n)_ times per stage.
|
||||
Therefore the number of delta steps is _O(n)_ per stage.
|
||||
|
||||
Calculating _δ<sub>1</sub>_ takes constant time.
|
||||
Calculating _δ<sub>2</sub>_ and _δ<sub>4</sub>_ requires a constant number
|
||||
of lookups in priority queues which takes time _O(log n)_ per delta step.
|
||||
During calculation of _δ<sub>3</sub>_, it may be necessary to remove stage edges
|
||||
from the priority queue.
|
||||
Since every edge is inserted into the priority queue at most once per stage,
|
||||
at most _O(m)_ edges are removed per stage, which takes total time _O(m log n)_ per stage.
|
||||
Calculation of _δ_ thus takes total time _O((n + m) log n)_ per stage.
|
||||
|
||||
Applying updates to dual variables is done in a lazy fashion as discussed above.
|
||||
The only variable that is updated directly is _Δ_, which takes time _O(1)_ per delta step.
|
||||
Updating dual variables thus takes total time _O(n)_ per stage.
|
||||
|
||||
## Implementation details
|
||||
|
||||
|
@ -819,17 +1006,100 @@ Every edge therefore appears in two adjacency lists.
|
|||
These data structures are initialized at the start of the matching algorithm
|
||||
and never change while the algorithm runs.
|
||||
|
||||
#### Priority queue
|
||||
|
||||
Priority queues are used for a number of purposes:
|
||||
|
||||
- a priority queue to find the least-slack edge between S-blossoms;
|
||||
- a priority queue to find the minimum-dual T-blossom;
|
||||
- a priority queue to find the unlabeled blossom with least-slack edge to an S-blossom;
|
||||
- a separate priority queue per vertex to find the least-slack edge between that vertex
|
||||
and any S-vertex.
|
||||
|
||||
These queues are implemented as a binary heaps.
|
||||
This type of queue supports the following operations:
|
||||
|
||||
- _insert_ a new element with specified priority in time _O(log n)_;
|
||||
- find the element with _minimum_ priority in time _O(1)_;
|
||||
- _delete_ a specified element in time _O(log n)_;
|
||||
- _change_ the priority of a specified element in time _O(log n)_.
|
||||
|
||||
#### Concatenable priority queue
|
||||
|
||||
Each top-level blossom maintains a concatenable priority queue containing its vertices.
|
||||
We use a specific type of concatenable queue that supports the following operations
|
||||
[[4]](#galil_micali_gabow1986) [[8]](#aho_hopcroft_ullman1974):
|
||||
|
||||
- _create_ a new queue containing 1 new element;
|
||||
- find the element with _minimum_ priority in time _O(1)_;
|
||||
- _change_ the priority of a given element in time _O(log n)_;
|
||||
- _merge_ two queues into one new queue in time _O(log n)_;
|
||||
- _split_ a queue, thus undoing the previous _merge_ step in time _O(log n)_.
|
||||
|
||||
The efficient _merge_ and _split_ operations make it possible to adapt the queue during
|
||||
blossom creation and blossom expansion steps.
|
||||
The priorities in the queue are used to find, for a given top-level blossom, its vertex
|
||||
with least-slack edge to an S-blossom.
|
||||
|
||||
The concatenable queue is implemented as a balanced tree, specifically a _2-3 tree_.
|
||||
Each internal node of a 2-3 tree has either 2 or 3 children.
|
||||
The leaf nodes of the tree represent the elements of the queue.
|
||||
All leaf nodes are at the same distance from the root.
|
||||
Each node has a pointer to its parent, and each internal node has pointers to its children.
|
||||
Each internal node also stores its height (distance to its leaf nodes).
|
||||
|
||||
Only leaf nodes have a priority.
|
||||
However, each internal node maintains a pointer to the leaf node with minimum priority
|
||||
within its subtree.
|
||||
As a consequence, the root of the tree has a pointer to the least-priority element in the queue.
|
||||
To keep this information consistent, any change in the priority of a leaf node must
|
||||
be followed by updating the internal nodes along a path from the leaf node to the root.
|
||||
The same must be done when the structure of the tree is adjusted.
|
||||
|
||||
The left-to-right order of the leaf nodes is preserved during all operations, including _merge_
|
||||
and _split_.
|
||||
When trees _A_ and _B_ are merged, the sequence of leaf nodes in the merged tree will consist of
|
||||
the leaf nodes of _A_ followed by the leaf nodes of _B_.
|
||||
Note that the left-to-right order of the leaf nodes is unrelated to the priorities of the elements.
|
||||
|
||||
To merge two trees, the root of the smaller tree is inserted as a child of an appropriate node
|
||||
in the larger tree.
|
||||
Subsequent adjustments are needed restore the consistency the 2-3 tree and to update
|
||||
the minimum-element pointers of the internal nodes along a path from the insertion point
|
||||
to the root of the merged tree.
|
||||
This can be done in a number of steps proportional to the difference in height between
|
||||
the two trees, which is in any case _O(log n)_.
|
||||
|
||||
To split a tree, a _split node_ is identified: the left-most leaf node that must end up
|
||||
in the right-side tree after splitting.
|
||||
Internal nodes are deleted along the path from the _split node_ to the root of the tree.
|
||||
This creates a forest of disconnected subtrees on the left side of the path,
|
||||
and a similar forest of subtrees on the right side of the split path.
|
||||
The left-side subtrees are reassembled into a single tree through a series of _merge_ steps.
|
||||
Although _O(log n)_ merge steps may be needed, the total time required for reassembly
|
||||
is also _O(log n)_ because each merge step combines trees of similar height.
|
||||
A similar reassembly is done for the forest on the right side of the split path.
|
||||
|
||||
The concatenable queues have an additional purpose in the matching algorithm:
|
||||
finding the top-level blossom that contains a given vertex.
|
||||
To do this, we assign a _name_ to each concatenable queue instance, which is simply
|
||||
a pointer to the top-level blossom that maintains the queue.
|
||||
An extra operation is defined:
|
||||
_find_ the name of the queue instance that contains a given element in time _O(log n)_.
|
||||
Implementing the _find_ operation is easy:
|
||||
Starting at the leaf node that represents the element, follow _parent_ pointers
|
||||
to the root of the tree.
|
||||
The root node contains the name of the queue.
|
||||
|
||||
#### General data
|
||||
|
||||
`vertex_mate[x] = y` if the edge between vertex _x_ and vertex _y_ is matched. <br>
|
||||
`vertex_mate[x] = -1` if vertex _x_ is unmatched.
|
||||
|
||||
`vertex_top_blossom[x] =` pointer to _B(x)_, the top-level blossom that contains vertex _x_.
|
||||
`vertex_dual[x]` holds the modified vertex dual _u'<sub>x</sub>_.
|
||||
|
||||
`vertex_dual[x]` holds the value of _u<sub>x</sub>_.
|
||||
|
||||
A FIFO queue holds vertex indices of S-vertices whose edges have not yet been scanned.
|
||||
Vertices are inserted in this queue as soon as their top-level blossom gets label S.
|
||||
A global list holds vertex indices of S-vertices whose edges have not yet been scanned.
|
||||
Vertices are inserted in this list when their top-level blossom gets label S.
|
||||
|
||||
#### Blossoms
|
||||
|
||||
|
@ -842,56 +1112,28 @@ Both types of blossoms are represented as class instances with the following att
|
|||
* `B.label` is `S` or `T` or `None`
|
||||
* `B.tree_edge = (x, y)` if _B_ is a labeled top-level blossom, where _y_ is a vertex in _B_
|
||||
and _(x, y)_ is the edge that links _B_ to its parent in the alternating tree.
|
||||
* `B.tree_blossoms` points to a list of blossoms in the same alternating tree, if _B_
|
||||
is a labeled top-level blossom.
|
||||
* `B.vertex_dual_offset` holds the pending change to vertex duals _offset<sub>B</sub>_.
|
||||
|
||||
A non-trivial blossom additionally has the following attributes:
|
||||
|
||||
* `B.subblossoms` is an array of pointers to the sub-blossoms of _B_,
|
||||
starting with the sub-blossom that contains the base vertex.
|
||||
* `B.edges` is an array of alternating edges connecting the sub-blossoms.
|
||||
* `B.dual_var` holds the value of _z<sub>B</sub>_.
|
||||
* `B.dual_var` holds the modified blossom dual _z'<sub>B</sub>_.
|
||||
|
||||
Single-vertex blossoms are kept in an array indexed by vertex index. <br>
|
||||
Non-trivial blossoms are kept in a separate array. <br>
|
||||
Non-trivial blossoms are kept in a separate list. <br>
|
||||
These arrays are used to iterate over blossoms and to find the trivial blossom
|
||||
that consists of a given vertex.
|
||||
|
||||
#### Least-slack edge tracking
|
||||
|
||||
`vertex_best_edge[x]` is an array holding _e<sub>x</sub>_, the edge index of
|
||||
the least-slack edge between vertex _x_ and any S-vertex, or -1 if there is no such edge.
|
||||
This value is only meaningful if _x_ is a T-vertex or unlabeled vertex.
|
||||
|
||||
`B.best_edge` is a blossom attribute holding _e<sub>B</sub>_, the edge index of the least-slack
|
||||
edge between blossom _B_ and any other S-blossom, or -1 if there is no such edge.
|
||||
This value is only meaningful if _B_ is a top-level S-blossom.
|
||||
|
||||
For non-trivial S-blossoms _B_, attribute `B.best_edge_set` holds the list _L<sub>B</sub>_
|
||||
of potential least-slack edges to other blossoms.
|
||||
This list is not maintained for single-vertex blossoms, since _L<sub>B</sub>_ of a single vertex
|
||||
can be efficiently constructed from its adjacency list.
|
||||
|
||||
#### Memory usage
|
||||
|
||||
The data structures described above use a constant amount of memory per vertex and per edge
|
||||
and per blossom.
|
||||
Therefore the total memory requirement is _O(m + n)_.
|
||||
|
||||
The memory usage of _L<sub>B</sub>_ is a little tricky.
|
||||
Any given list _L<sub>B</sub>_ can have length _O(n)_, and _O(n)_ of these lists can exist
|
||||
simultaneously.
|
||||
Naively allocating space for _O(n)_ elements per list will drive memory usage
|
||||
up to _O(n<sup>2</sup>)_.
|
||||
However, at any moment, an edge can be in at most two of these lists, therefore the sum
|
||||
of the lengths of these lists is limited to _O(m)_.
|
||||
A possible solution is to implement the _L<sub>B</sub>_ as linked lists.
|
||||
|
||||
### Performance critical routines
|
||||
|
||||
Calculations that happen very frequently in the algorithm are:
|
||||
determining the top-level blossom of a given vertex, and calculating the slack of a given edge.
|
||||
These calculations must run in constant time per call in any case, but it makes sense to put
|
||||
some extra effort into making these calculations _fast_.
|
||||
|
||||
### Recursion
|
||||
|
||||
Certain tasks in the algorithm are recursive in nature:
|
||||
|
@ -948,59 +1190,41 @@ Proof by induction that all vertex duals are multiples of 0.5 and all blossom du
|
|||
- Blossom duals increase or decrease by _2\*δ_,
|
||||
therefore updated blossom duals are still integers.
|
||||
|
||||
The value of vertex dual variables and blossom dual variables never exceeds the
|
||||
greatest edge weight in the graph.
|
||||
This may be helpful for choosing an integer data type for the dual variables.
|
||||
It is useful to know that (modified) dual variables and (modified) edge slacks
|
||||
are limited to a finite range of values which depends on the maximum edge weight.
|
||||
This may be helpful when choosing an integer data type for these variables.
|
||||
(Alternatively, choose a programming language with unlimited integer range.
|
||||
This is perhaps the thing I love most about Python.)
|
||||
|
||||
Proof that dual variables do not exceed _max-weight_:
|
||||
|
||||
- Vertex dual variables start at _u<sub>x</sub> = 0.5\*max-weight_.
|
||||
- While the algorithm runs, there is at least one vertex which has been unmatched
|
||||
since the beginning.
|
||||
This vertex has always had label S, therefore its dual always decreased by _δ_
|
||||
during a dual variable update.
|
||||
Since it started at _0.5\*max-weight_ and can not become negative,
|
||||
the sum of _δ_ over all dual variable updates can not exceed _0.5\*max-weight_.
|
||||
- Vertex dual variables increase by at most _δ_ per update.
|
||||
Therefore no vertex dual can increase by more than _0.5\*max-weight_ in total.
|
||||
Therefore no vertex dual can exceed _max-weight_.
|
||||
- Blossom dual variables start at _z<sub>B</sub> = 0_.
|
||||
- Blossom dual variables increase by at most _2\*δ_ per update.
|
||||
Therefore no blossom dual can increase by more than _max-weight_ in total.
|
||||
Therefore no blossom dual can exceed _max-weight_.
|
||||
- The value of _Δ_ (sum over _δ_ steps) does not exceed _maxweight / 2_.
|
||||
Proof:
|
||||
- Vertex dual variables start at _u<sub>x</sub> = maxweight_ / 2.
|
||||
- While the algorithm runs, there is at least one vertex which has been unmatched
|
||||
since the beginning.
|
||||
This vertex has always had label S, therefore its dual is _maxweight/2 - Δ_.
|
||||
Vertex deltas can not be negative, therefore _Δ ≤ maxweight/2_.
|
||||
- Vertex duals are limited to the range 0 to _maxweight_.
|
||||
- Blossom duals are limited to the range 0 to _maxweight_.
|
||||
- Edge slack is limited to the range 0 to _2\*maxweight_.
|
||||
- Modified vertex duals are limited to the range 0 to _1.5\*maxweight_.
|
||||
- Modified blossom duals are limited to the range _-maxweight to 2\*maxweight_.
|
||||
- Modified edge slack is limited to the range 0 to _3\*maxweight_.
|
||||
- Dual offsets are limited to the range _-maxweight/2_ to _maxweight/2_.
|
||||
|
||||
### Handling floating point edge weights
|
||||
|
||||
Floating point calculations are subject to rounding errors.
|
||||
This has two consequences for the matching algorithm:
|
||||
As a result, the algorithm may return a matching which has slightly lower weight than
|
||||
the actual maximum weight.
|
||||
|
||||
- The algorithm may return a matching which has slightly lower weight than
|
||||
the actual maximum weight.
|
||||
The algorithm will allways return a valid matching, even if rounding errors occur.
|
||||
Floating point comparisons affect which actions are taken during delta steps,
|
||||
and thus eventually determine which edges are matched.
|
||||
But the overall structure of the algorithm guarantees that it will eventually return
|
||||
a valid (if possibly suboptimal) matching.
|
||||
|
||||
- The algorithm may not reliably recognize tight edges.
|
||||
To check whether an edge is tight, its slack is compared to zero.
|
||||
Rounding errors may cause the slack to appear positive even when an exact calculation
|
||||
would show it to be zero.
|
||||
The slack of some edges may even become slightly negative.
|
||||
|
||||
I believe this does not affect the correctness of the algorithm.
|
||||
An edge that should be tight but is not recognized as tight due to rounding errors,
|
||||
can be pulled tight through an additional dual variable update.
|
||||
As side-effect of this update, the edge will immediately be used to grow the alternating tree,
|
||||
or construct a blossom or augmenting path.
|
||||
This mechanism allows the algorithm to make progress, even if slack comparisons
|
||||
are repeatedly thrown off by rounding errors.
|
||||
Rounding errors may cause the algorithm to perform more dual variable updates
|
||||
than strictly necessary.
|
||||
But this will still not cause the run time of the algorithm to exceed _O(n<sup>3</sup>)_.
|
||||
|
||||
It seems to me that the matching algorithm is stable for floating point weights.
|
||||
And it seems to me that it returns a matching which is close to optimal,
|
||||
and could have been optimal if edge weights were changed by small amounts.
|
||||
|
||||
I must admit these arguments are mostly based on intuition.
|
||||
The most challenging cases are probably graphs with edge weights that differ by many orders
|
||||
of magnitude.
|
||||
Unfortunately I don't know how to properly analyze the floating point accuracy of this algorithm.
|
||||
|
||||
### Finding a maximum weight matching out of all maximum cardinality matchings
|
||||
|
@ -1057,7 +1281,14 @@ changing all weights by the same amount doesn't change which of these matchings
|
|||
([link](https://dl.acm.org/doi/abs/10.5555/320176.320229))
|
||||
([pdf](https://dl.acm.org/doi/pdf/10.5555/320176.320229))
|
||||
|
||||
7. <a id="mehlhorn_schafer2002"></a>
|
||||
Kurt Mehlhorn, Guido Schäfer, "Implementation of O(nm log(n)) Weighted Matchings in General Graphs: The Power of Data Structures", _Journal of Experimental Algorithmics vol. 7_, 2002.
|
||||
([link](https://dl.acm.org/doi/10.1145/944618.944622))
|
||||
|
||||
8. <a id="aho_hopcroft_ullman1974"></a>
|
||||
Alfred V. Aho, John E. Hopcroft, Jeffrey D. Ullman,
|
||||
_The Design and Analysis of Computer Algorithms_, Addison-Wesley, 1974.
|
||||
|
||||
---
|
||||
Written in 2023 by Joris van Rantwijk.
|
||||
Written in 2023-2024 by Joris van Rantwijk.
|
||||
This work is licensed under [CC BY-ND 4.0](https://creativecommons.org/licenses/by-nd/4.0/).
|
||||
|
|
|
@ -470,6 +470,9 @@ class NonTrivialBlossom(Blossom):
|
|||
# The value of the dual variable changes through delta steps,
|
||||
# but these changes are implemented as lazy updates.
|
||||
#
|
||||
# blossom.dual_var holds the modified blossom dual value.
|
||||
# The modified blossom dual is invariant under delta steps.
|
||||
#
|
||||
# The true dual value of a top-level S-blossom is
|
||||
# blossom.dual_var + ctx.delta_sum_2x
|
||||
#
|
||||
|
@ -479,7 +482,6 @@ class NonTrivialBlossom(Blossom):
|
|||
# The true dual value of any other type of blossom is simply
|
||||
# blossom.dual_var
|
||||
#
|
||||
# Note that "dual_var" is invariant under delta steps.
|
||||
self.dual_var: float = 0
|
||||
|
||||
# If this is a top-level T-blossom,
|
||||
|
@ -575,6 +577,9 @@ class MatchingContext:
|
|||
# The value of the dual variable changes through delta steps,
|
||||
# but these changes are implemented as lazy updates.
|
||||
#
|
||||
# vertex_dual_2x[x] holds 2 times the modified vertex dual value of
|
||||
# vertex "x". The modified vertex dual is invariant under delta steps.
|
||||
#
|
||||
# The true dual value of an S-vertex is
|
||||
# (vertex_dual_2x[x] - delta_sum_2x) / 2
|
||||
#
|
||||
|
@ -584,7 +589,6 @@ class MatchingContext:
|
|||
# The true dual value of an unlabeled vertex is
|
||||
# (vertex_dual_2x[x] + B(x).vertex_dual_offset) / 2
|
||||
#
|
||||
# Note that "vertex_dual_2x" is invariant under delta steps.
|
||||
self.vertex_dual_2x: list[float]
|
||||
self.vertex_dual_2x = num_vertex * [self.start_vertex_dual_2x]
|
||||
|
||||
|
@ -648,14 +652,14 @@ class MatchingContext:
|
|||
The pseudo-slack of an edge is related to its true slack, but
|
||||
adjusted in a way that makes it invariant under delta steps.
|
||||
|
||||
If the edge connects two S-vertices in different top-level blossoms,
|
||||
the true slack is the pseudo-slack minus 2 times the running sum
|
||||
of delta steps.
|
||||
The true slack of an edge between to S-vertices in different
|
||||
top-level blossoms is
|
||||
edge_pseudo_slack_2x(e) / 2 - delta_sum_2x
|
||||
|
||||
If the edge connects an S-vertex to an unlabeled vertex,
|
||||
the true slack is the pseudo-slack minus the running sum of delta
|
||||
steps, plus the pending offset of the top-level blossom that contains
|
||||
the unlabeled vertex.
|
||||
The true slack of an edge between an S-vertex and an unlabeled
|
||||
vertex "y" inside top-level blossom B(y) is
|
||||
(edge_pseudo_slack_2x(e)
|
||||
- delta_sum_2x + B(y).vertex_dual_offset) / 2
|
||||
"""
|
||||
(x, y, w) = self.graph.edges[e]
|
||||
return self.vertex_dual_2x[x] + self.vertex_dual_2x[y] - 2 * w
|
||||
|
@ -1485,7 +1489,7 @@ class MatchingContext:
|
|||
def augment_matching(self, path: AlternatingPath) -> None:
|
||||
"""Augment the matching through the specified augmenting path.
|
||||
|
||||
This function takes time O(n).
|
||||
This function takes time O(n * log(n)).
|
||||
"""
|
||||
|
||||
# Check that the augmenting path starts and ends in
|
||||
|
|
Loading…
Reference in New Issue