diff --git a/README.md b/README.md index ef058a5..3522def 100644 --- a/README.md +++ b/README.md @@ -7,7 +7,7 @@ For an edge-weighted graph, a _maximum weight matching_ is a matching that achie the largest possible sum of weights of matched edges. The code in this repository is based on a variant of the blossom algorithm that runs in -_O(n \* m \* log(n))_ steps. +_O(n m log n)_ steps. See the file [Algorithm.md](doc/Algorithm.md) for a detailed description. @@ -44,7 +44,7 @@ The folder [cpp/](cpp/) contains a header-only C++ implementation of maximum wei **NOTE:** The C++ code currently implements a slower algorithm that runs in _O(n3)_ steps. -I plan to eventually update the C++ code to implement the faster _O(n*m*log(n))_ algorithm. +I plan to eventually update the C++ code to implement the faster _O(n m log n)_ algorithm. The C++ code is self-contained and can easily be linked into an application. It is also reasonably efficient. diff --git a/doc/Algorithm.md b/doc/Algorithm.md index 1d08728..686e125 100644 --- a/doc/Algorithm.md +++ b/doc/Algorithm.md @@ -82,7 +82,7 @@ It has also been shown to be quite fast in practice on several types of graphs including random graphs [[7]](#mehlhorn_schafer2002). This algorithm is more difficult to implement than the older _O(n3)_ algorithm. -In particular, it requires a specialized data structure to implement mergeable priority queues. +In particular, it requires a specialized data structure to implement concatenable priority queues. This increases the size and complexity of the code quite a bit. However, in my opinion the performance improvement is worth the extra effort. @@ -241,7 +241,7 @@ have _slack_. An augmenting path that consists only of tight edges is _guaranteed_ to increase the weight of the matching as much as possible. 
-While searching for an augmenting path, we simply restrict the search to tight edges, +While searching for an augmenting path, we restrict the search to tight edges, ignoring all edges that have slack. Certain explicit actions of the algorithm cause edges to become tight or slack. How this works will be explained later. @@ -277,8 +277,8 @@ an odd-length alternating cycle. The lowest common ancestor node in the alternating tree forms the beginning and end of the alternating cycle. In this case a new blossom must be created by shrinking the cycle. -If the two S-blossoms are in different alternating trees, the edge that links the blossoms -is part of an augmenting path between the roots of the two trees. +On the other hand, if the two S-blossoms are in different alternating trees, +the edge that links the blossoms is part of an augmenting path between the roots of the two trees. ![Figure 3](figures/graph3.png)
*Figure 3: Growing alternating trees* @@ -457,9 +457,9 @@ $$ \pi_{x,y} = u_x + u_y + \sum_{(x,y) \in B} z_B - w_{x,y} $$ An edge is _tight_ if and only if its slack is zero. Given the values of the dual variables, it is very easy to calculate the slack of an edge -which is not contained in any blossom: simply add the duals of its incident vertices and +which is not contained in any blossom: add the duals of its incident vertices and subtract the weight. -To check whether an edge is tight, simply compute its slack and check whether it is zero. +To check whether an edge is tight, simply compute its slack and compare it to zero. Calculating the slack of an edge that is contained in one or more blossoms is a little tricky, but fortunately we don't need such calculations. @@ -492,7 +492,7 @@ At that point the maximum weight matching has been found. When the matching algorithm is finished, the constraints can be checked to verify that the matching is optimal. This check is simpler and faster than the matching algorithm itself. -It can therefore be a useful way to guard against bugs in the matching algorithm. +It can therefore be a useful way to guard against bugs in the algorithm. ### Rules for updating dual variables @@ -522,7 +522,7 @@ It then changes dual variables as follows: - _zB ← zB − 2 * δ_ for every non-trivial T-blossom _B_ Dual variables of unlabeled blossoms and their vertices remain unchanged. -Dual variables _zB_ of non-trivial sub-blossoms also remain changed; +Dual variables _zB_ of non-trivial sub-blossoms also remain unchanged; only top-level blossoms have their _zB_ updated. Note that these rules ensure that no change occurs to the slack of any edge which is matched, @@ -566,21 +566,21 @@ to an alternating tree, or expanding a blossom) that allow the algorithm to make In fact, it is convenient to let the dual update mechanism drive the entire process of discovering tight edges and growing alternating trees. 
-In my description of the search algorithm above, I stated that a tight edge between -a newly labeled S-vertex and an unlabeled vertex or a different S-blossom should be used to -grow the alternating tree or to create a new blossom or to form an augmenting path. +In my description of the search algorithm above, I stated that upon discovery of a tight edge +between a newly labeled S-vertex and an unlabeled vertex or a different S-blossom, the edge should +be used to grow the alternating tree or to create a new blossom or to form an augmenting path. However, it turns out to be easier to postpone the use of such edges until the next delta step. While scanning newly labeled S-vertices, edges to unlabeled vertices or different S-blossoms are discovered but not yet used. -Such edges will merely be indexed in a suitable data structure. -Even if the edge is tight, it will be indexed rather than used right away. +Such edges are merely registered in a suitable data structure. +Even if the edge is tight, it is registered rather than used right away. Once the scan completes, a delta step will be done. If any tight edges were discovered during the scan, the delta step will find that either _δ2 = 0_ or _δ3 = 0_. The corresponding step (growing the alternating tree, creating a blossom or augmenting the matching) will occur at that point. -If no suitable tight edges exist, a real change of dual variables will occur. +If no suitable tight edges exist, a real (non-zero) change of dual variables will occur. The search for an augmenting path becomes as follows: @@ -590,7 +590,7 @@ The search for an augmenting path becomes as follows: Add all vertices inside such blossoms to _Q_. - Repeat until either an augmenting path is found or _δ1 = 0_: - Scan all vertices in Q as described earlier. - Build an index of edges to unlabeled vertices or other S-blossoms. + Register edges to unlabeled vertices or other S-blossoms. 
Do not yet use such edges to change the alternating tree, even if the edge is tight. - Calculate _δ_ and update dual variables as described above. - If _δ = δ1_, end the search. @@ -607,7 +607,7 @@ The search for an augmenting path becomes as follows: - If _δ = δ4_, expand the corresponding T-blossom. It may seem complicated, but this is actually easier. -The code that scans newly labeled S-vertices, no longer needs to treat tight edges specially. +The code that scans newly labeled S-vertices no longer needs special treatment of tight edges. In general, multiple updates of the dual variables are necessary during a single _stage_ of the algorithm. @@ -665,7 +665,7 @@ All vertices of sub-blossoms that got label S are inserted into _Q_. The algorithm often needs to find the top-level blossom _B(x)_ that contains a given vertex _x_. -A naive implementation may keep this information is an array where the element with +A naive implementation may keep this information in an array where the element with index _x_ holds a pointer to blossom _B(x)_. Lookup in this array would be fast, but keeping the array up-to-date takes too much time. There can be _O(n)_ stages, and _O(n)_ blossoms can be created or expanded during a stage, @@ -684,8 +684,8 @@ for example by storing a pointer to the blossom inside the queue instance. When a new blossom is created, the concatenable queues of its sub-blossoms are merged to form one concatenable queue for the new blossom. -Concatenating two queues produces a new queue that contains all members of the original queues. -This operation takes time _O(log n)_. +The merged queue contains all vertices of the original queues. +Merging a pair of queues takes time _O(log n)_. To merge the queues of _k_ sub-blossoms, the concatenation step is repeated _k-1_ times, taking total time _O(k log n)_. @@ -693,7 +693,7 @@ When a blossom is expanded, its concatenable queue is un-concatenated to recover for the sub-blossoms. 
This also takes time _O(log n)_ for each sub-blossom. -Implementation details of a concatenable queue will be discussed later in this document. +Implementation details of concatenable queues are discussed later in this document. ### Lazy updating of dual variables @@ -775,8 +775,7 @@ _δ1_ is the minimum dual value of any S-vertex. This value can be computed in constant time. The dual value of an unmatched vertex is reduced by _δ_ during every delta step. Since all vertex duals start with the same dual value _ustart_, -all unmatched vertices have dual value _ustart - Δ_, -which is the minimum dual value among all vertices. +all unmatched vertices have dual value _δ1 = ustart - Δ_. _δ3_ is half of the minimum slack of any edge between two different S-blossoms. To compute this efficiently, we keep edges between S-blossoms in a priority queue. @@ -836,15 +835,16 @@ This ensures that the priorities remain unchanged during delta steps. The priorities also remain unchanged when the T-vertex becomes unlabeled or the unlabeled vertex becomes a T-vertex. -At the middle level, every T-blossom or unlabeled top-level maintains a priority queue +At the middle level, every T-blossom or unlabeled top-level blossom maintains a priority queue containing its vertices. -This is in fact the _concatenable priority queue_ instance that is maintained by every -top-level blossom as described earlier in this document. -The priority of each vertex in the queue is set to the minimum priority of any edge +This is in fact the _concatenable priority queue_ that is maintained by every top-level blossom +as was described earlier in this document. +The priority of each vertex in the mid-level queue is set to the minimum priority of any edge in the low-level queue of that vertex. If edges are added to (or removed from) the low-level queue, the priority of the corresponding vertex in the mid-level queue may change. 
-If the low-level queue of a vertex is empty, that vertex has priority _Inf_ in the mid-level queue. +If the low-level queue of a vertex is empty, that vertex has priority _Infinity_ +in the mid-level queue. At the highest level, unlabeled top-level blossoms are tracked in one global priority queue. The priority of each blossom in this queue is set to the minimum slack of any edge @@ -862,7 +862,7 @@ The whole thing is a bit tricky, but it works. ### Re-using alternating trees -According to [[5]], labels and alternating trees should be erased at the end of each stage. +According to [[5]](#galil1986), labels and alternating trees should be erased at the end of each stage. However, the algorithm can be optimized by keeping some of the labels and re-using them in the next stage. The optimized algorithm erases _only_ the two alternating trees that are part of @@ -884,7 +884,7 @@ For S-blossoms that lose their labels, the modified vertex dual variables are up The various priority queues also need updating. Former T-blossoms must be removed from the priority queue for _δ4_. -Edges incident on former S-vertices must be removed from the priority queue for _δ3_. +Edges incident on former S-vertices must be removed from the priority queues for _δ3_ and _δ2_. Finally, S-vertices that become unlabeled need to construct a proper priority queue of incident edges to other S-vertices for _δ2_ tracking. This involves visiting every incident edge of every vertex in each S-blossom that loses its label. @@ -894,8 +894,8 @@ This involves visiting every incident edge of every vertex in each S-blossom tha Every stage of the algorithm either increases the number of matched vertices by 2 or ends the matching. Therefore the number of stages is at most _n/2_. -Every stage runs in _O((n + m) log n)_ steps, therefore the complete algorithm runs in -_O(n (n + m) log n)_ steps. +Every stage runs in time _O((n + m) log n)_, therefore the complete algorithm runs in +time _O(n (n + m) log n)_. 
Creating a blossom reduces the number of top-level blossoms by at least 2, thus limiting the number of simultaneously existing blossoms to _O(n)_. @@ -921,11 +921,11 @@ A blossom also becomes unlabeled at most once, at the end of the stage. Changing the label of a blossom takes some simple bookkeeping, as well as operations on priority queues (_δ4_ for T-blossoms, _δ2_ for unlabeled blossoms) which take time _O(log n)_ per blossom. -Assigning label S or removing label S also involves some work per vertex in the blossom, -but I account for that time separately below so I can ignore it here. +Assigning label S or removing label S also involves work for the vertices in the blossom +and their edges, but I account for that time separately below so I can ignore it here. Blossom labeling thus takes total time _O(n log n)_ per stage. -During each stage, an vertex becomes an S-vertex at most once, and an S-vertex becomes +During each stage, a vertex becomes an S-vertex at most once, and an S-vertex becomes unlabeled at most once. In both cases, the incident edges of the affected vertex are scanned and potentially added to or removed from priority queues. @@ -946,7 +946,7 @@ Also in case of a T-blossom, some sub-blossoms will become S-blossoms and their vertices become S-vertices, but I have already accounted for that cost above so I can ignore it here. Expanding a blossom thus takes time _O(k log n)_. -The number of blossom expansions during a stage is _O(n)_. +Any blossom is involved as a sub-blossom in an expanding blossom at most once per stage. Blossom expansion thus takes total time _O(n log n)_ per stage. The length of an augmenting path is _O(n)_. @@ -1017,8 +1017,8 @@ Priority queues are used for a number of purposes: - a separate priority queue per vertex to find the least-slack edge between that vertex and any S-vertex. -This type of queue is implemented as a binary heap. -It supports the following operations: +These queues are implemented as binary heaps. 
+This type of queue supports the following operations: - _insert_ a new element with specified priority in time _O(log n)_; - find the element with _minimum_ priority in time _O(1)_; @@ -1029,11 +1029,11 @@ It supports the following operations: Each top-level blossom maintains a concatenable priority queue containing its vertices. We use a specific type of concatenable queue that supports the following operations -[[4]](#galil_micali_gabow1986) [[5]](#aho_hopcroft_ullman1974): +[[4]](#galil_micali_gabow1986) [[8]](#aho_hopcroft_ullman1974): - _create_ a new queue containing 1 new element; - find the element with _minimum_ priority in time _O(1)_; - - _change_ the priority of a given element; + - _change_ the priority of a given element in time _O(log n)_; - _merge_ two queues into one new queue in time _O(log n)_; - _split_ a queue, thus undoing the previous _merge_ step in time _O(log n)_. @@ -1052,7 +1052,7 @@ Each internal node also stores its height (distance to its leaf nodes). Only leaf nodes have a priority. However, each internal node maintains a pointer to the leaf node with minimum priority within its subtree. -As a consequence, the root of the tree has a pointer to the element with minimum priority. +As a consequence, the root of the tree has a pointer to the least-priority element in the queue. To keep this information consistent, any change in the priority of a leaf node must be followed by updating the internal nodes along a path from the leaf node to the root. The same must be done when the structure of the tree is adjusted. @@ -1060,7 +1060,7 @@ The same must be done when the structure of the tree is adjusted. The left-to-right order of the leaf nodes is preserved during all operations, including _merge_ and _split_. When trees _A_ and _B_ are merged, the sequence of leaf nodes in the merged tree will consist of -the leaf nodes _A_ followed by the leaf nodes of _B_. +the leaf nodes of _A_ followed by the leaf nodes of _B_. 
Note that the left-to-right order of the leaf nodes is unrelated to the priorities of the elements. To merge two trees, the root of the smaller tree is inserted as a child of an appropriate node @@ -1087,7 +1087,6 @@ To do this, we assign a _name_ to each concatenable queue instance, which is sim a pointer to the top-level blossom that maintains the queue. An extra operation is defined: _find_ the name of the queue instance that contains a given element in time _O(log n)_. - Implementing the _find_ operation is easy: Starting at the leaf node that represents the element, follow _parent_ pointers to the root of the tree. @@ -1285,7 +1284,6 @@ changing all weights by the same amount doesn't change which of these matchings 7. Kurt Mehlhorn, Guido Schäfer, "Implementation of O(nm log(n)) Weighted Matchings in General Graphs: The Power of Data Structures", _Journal of Experimental Algorithmics vol. 7_, 2002. ([link](https://dl.acm.org/doi/10.1145/944618.944622)) - ([pdf](https://sci-hub.se/https://doi.org/10.1145/944618.944622)) 8. Alfred V. Aho, John E. Hopcroft, Jeffrey D. Ullman,