Add implementation notes to algorithm description
This commit is contained in:
		
							parent
							
								
									b53e58902a
								
							
						
					
					
						commit
						3b80109cff
					
				
							
								
								
									
										234
									
								
								Algorithm.md
								
								
								
								
							
							
						
						
									
										234
									
								
								Algorithm.md
								
								
								
								
							|  | @ -793,7 +793,239 @@ Therefore the total cost of dual variable updates is _O(n<sup>2</sup>)_ per stag | |||
| 
 | ||||
| ## Implementation details | ||||
| 
 | ||||
| _TO BE WRITTEN_ | ||||
| This section describes some choices I made while implementing the algorithm. | ||||
| There may be easier or faster or better ways to do it, but this is what I did | ||||
| and it seems to work mostly okay. | ||||
| 
 | ||||
| ### Data structures | ||||
| 
 | ||||
| #### Input graph | ||||
| 
 | ||||
| Vertices are represented as non-negative integers in range _0_ to _n-1_. | ||||
| 
 | ||||
| Edges are represented as an array of tuples _(x, y, w)_ where _x_ and _y_ are indices | ||||
| of the incident vertices, and _w_ is the numerical weight of the edge. | ||||
| The edges are listed in no particular order. | ||||
| Each edge has an index in _e_ range _0_ to _m-1_. | ||||
| 
 | ||||
| `edges[e] = (x, y, w)` | ||||
| 
 | ||||
| In addition, edges are organized in an array of adjacency lists, indexed by vertex index. | ||||
| Each adjacency list contains the edge indices of all edges incident on a specific vertex. | ||||
| Every edge therefore appears in two adjacency lists. | ||||
| 
 | ||||
| `adjacent_edges[x] = [e1, e2, ...]` | ||||
| 
 | ||||
| These data structures are initialized at the start of the matching algorithm | ||||
| and never change while the algorithm runs. | ||||
| 
 | ||||
| #### General data | ||||
| 
 | ||||
| `vertex_mate[x] = y` if the edge between vertex _x_ and vertex _y_ is matched. <br> | ||||
| `vertex_mate[x] = -1` if vertex _x_ is unmatched. | ||||
| 
 | ||||
| `vertex_top_blossom[x] =` pointer to _B(x)_, the top-level blossom that contains vertex _x_. | ||||
| 
 | ||||
| `vertex_dual[x]` holds the value of _u<sub>x</sub>_. | ||||
| 
 | ||||
| A FIFO queue holds vertex indices of S-vertices whose edges have not yet been scanned. | ||||
| Vertices are inserted in this queue as soon as their top-level blossom gets label S. | ||||
| 
 | ||||
| #### Blossoms | ||||
| 
 | ||||
| A blossom is either a single vertex or a non-trivial blossom. | ||||
| Both types of blossoms are represented as class instances with the following attributes: | ||||
| 
 | ||||
|  * `B.base_vertex` is the vertex index of the base vertex of blossom _B_. | ||||
|  * `B.parent` is a pointer to the parent of _B_ in the blossom structure tree, | ||||
|     or `None` if _B_ is a top-level blossom. | ||||
|  * `B.label` is `S` or `T` or `None` | ||||
|  * `B.tree_edge = (x, y)` if _B_ is a labeled top-level blossom, where _y_ is a vertex in _B_ | ||||
|    and _(x, y)_ is the edge that links _B_ to its parent in the alternating tree. | ||||
| 
 | ||||
| A non-trivial blossom additionally has the following attributes: | ||||
| 
 | ||||
|  * `B.subblossoms` is an array of pointers to the sub-blossoms of _B_, | ||||
|    starting with the sub-blossom that contains the base vertex. | ||||
|  * `B.edges` is an array of alternating edges connecting the sub-blossoms. | ||||
|  * `B.dual_var` holds the value of _z<sub>B</sub>_. | ||||
| 
 | ||||
| Single-vertex blossoms are kept in an array indexed by vertex index. <br> | ||||
| Non-trivial blossoms are kept in a separate array. <br> | ||||
| These arrays are used to iterate over blossoms and to find the trivial blossom | ||||
| that consists of a given vertex. | ||||
| 
 | ||||
| #### Least-slack edge tracking | ||||
| 
 | ||||
| `vertex_best_edge[x]` is an array holding _e<sub>x</sub>_, the edge index of | ||||
| the least-slack edge between vertex _x_ and any S-vertex, or -1 if there is no such edge. | ||||
| This value is only meaningful if _x_ is a T-vertex or unlabeled vertex. | ||||
| 
 | ||||
| `B.best_edge` is a blossom attribute holding _e<sub>B</sub>_, the edge index of the least-slack | ||||
| edge between blossom _B_ and any other S-blossom, or -1 if there is no such edge. | ||||
| This value is only meaningful if _B_ is a top-level S-blossom. | ||||
| 
 | ||||
| For non-trivial S-blossoms _B_, attribute `B.best_edge_set` holds the list _L<sub>B</sub>_ | ||||
| of potential least-slack edges to other blossoms. | ||||
| This list is not maintained for single-vertex blossoms, since _L<sub>B</sub>_ of a single vertex | ||||
| can be efficiently constructed from its adjacency list. | ||||
| 
 | ||||
| #### Memory usage | ||||
| 
 | ||||
| The data structures described above use a constant amount of memory per vertex and per edge | ||||
| and per blossom. | ||||
| Therefore the total memory requirement is _O(m + n)_. | ||||
| 
 | ||||
| The memory usage of _L<sub>B</sub>_ is a little tricky. | ||||
| Any given list _L<sub>B</sub>_ can have length _O(n)_, and _O(n)_ of these lists can exist | ||||
| simultaneously. | ||||
| Naively allocating space for _O(n)_ elements per list will drive memory usage | ||||
| up to _O(n<sup>2</sup>)_. | ||||
| However, at any moment, an edge can be in at most two of these lists, therefore the sum | ||||
| of the lengths of these lists is limited to _O(m)_. | ||||
| A possible solution is to implement the _L<sub>B</sub>_ as linked lists. | ||||
| 
 | ||||
| ### Performance critical routines | ||||
| 
 | ||||
| Calculations that happen very frequently in the algorithm are: | ||||
| determining the top-level blossom of a given vertex, and calculating the slack of a given edge. | ||||
| These calculations must run in constant time per call in any case, but it makes sense to put | ||||
| some extra effort into making these calculations _fast_. | ||||
| 
 | ||||
| ### Recursion | ||||
| 
 | ||||
| Certain tasks in the algorithm are recursive in nature: | ||||
| enumerating the vertices in a given blossom, and augmenting the matching along | ||||
| a path through a blossom. | ||||
| It seems natural to implement such tasks as recursive subroutines, which handle the task | ||||
| for a given blossom and make recursive calls to handle sub-blossoms as necessary. | ||||
| But the recursion depth of such implementations can grow to _O(n)_ in case | ||||
| of deeply nested blossoms. | ||||
| 
 | ||||
| Deep recursion may cause problems in certain programming languages and runtime environments. | ||||
| In such cases, it may be better to avoid recursive calls and instead implement an iterative | ||||
| control flow with an explicit stack. | ||||
| 
 | ||||
| ### Handling integer edge weights | ||||
| 
 | ||||
| If all edge weights in the input graph are integers, it is possible and often desirable | ||||
| to implement the algorithm such that only integer calculations are used. | ||||
| 
 | ||||
| If all edge weights are integers, then all vertex dual variables _u<sub>x</sub>_ | ||||
| are integer multiples of 0.5, and all blossom dual variables _z<sub>B</sub>_ are integers. | ||||
| Storing the vertex duals as _2\*u<sub>x</sub>_ allows all calculations to be done with integers. | ||||
| 
 | ||||
| Proof by induction that all vertex duals are multiples of 0.5 and all blossom duals are integers: | ||||
| 
 | ||||
|  - Vertex duals are initialized to 0.5 times the greatest edge weight. | ||||
|    Blossom duals are initialized to 0. | ||||
|    Therefore the proposition is initially true. | ||||
|  - All unmatched vertices have the same dual value. | ||||
|    Proof: Initially all vertices have the same dual value. | ||||
|    All unmatched vertices have been unmatched since the beginning, | ||||
|    therefore always had label S in every dual variable update, | ||||
|    therefore always got changed by the same amount. | ||||
|  - Either all duals of S-vertices are integers or all duals of S-vertices are odd multiples of 0.5. | ||||
|    Proof: The root nodes of alternating trees are unmatched vertices which all have the same dual. | ||||
|    Within an alternating tree, all edges are tight, therefore all duals of vertices in the tree | ||||
|    differ by an integer amount, therefore either all duals are integers or all duals | ||||
|    are odd multiples of 0.5. | ||||
|  - _δ_ is a multiple of 0.5. Proof: | ||||
|      - _δ<sub>1</sub> = u<sub>x</sub>_ and _u<sub>x</sub>_ is a multiple of 0.5, | ||||
|        therefore _δ<sub>1</sub>_ is a mutiple of 0.5. | ||||
|      - _δ<sub>2</sub> = π<sub>x,y</sub> = u<sub>x</sub> + u<sub>y</sub> - w<sub>x,y</sub>_ | ||||
|        where _u<sub>x</sub>_ and _u<sub>y</sub>_ and _w<sub>x,y</sub>_ are multiples of 0.5, | ||||
|        therefore _δ<sub>2</sub>_ is a multiple of 0.5. | ||||
|      - _δ<sub>3</sub> = 0.5 \* π<sub>x,y</sub> = 0.5 \* (u<sub>x</sub> + u<sub>y</sub> - w<sub>x,y</sub>)_. | ||||
|        Since _x_ and _y_ are S-vertices, either _u<sub>x</sub>_ and _u<sub>y</sub>_ are | ||||
|        both integers or both are odd multiples of 0.5. | ||||
|        In either case _u<sub>x</sub> + u<sub>y</sub>_ is an integer. | ||||
|        Therefore _δ<sub>3</sub>_ is a multiple of 0.5. | ||||
|      - _δ<sub>4</sub> = 0.5 \* z<sub>B</sub>_ where _z<sub>B</sub>_ is an integer, | ||||
|        therefore _δ<sub>4</sub>_ is a multiple of 0.5. | ||||
|  - Vertex duals increase or decrease by _δ_ which is a multiple of 0.5, | ||||
|    therefore updated vertex duals are still multiples of 0.5. | ||||
|  - Blossom duals increase or decrease by _2\*δ_, | ||||
|    therefore updated blossom duals are still integers. | ||||
| 
 | ||||
| The value of vertex dual variables and blossom dual variables never exceeds the | ||||
| greatest edge weight in the graph. | ||||
| This may be helpful for choosing an integer data type for the dual variables. | ||||
| (Alternatively, choose a programming language with unlimited integer range. | ||||
| This is perhaps the thing I love most about Python.) | ||||
| 
 | ||||
| Proof that dual variables do not exceed _max-weight_: | ||||
| 
 | ||||
|  - Vertex dual variables start at _u<sub>x</sub> = 0.5\*max-weight_. | ||||
|  - While the algorithm runs, there is at least one vertex which has been unmatched | ||||
|    since the beginning. | ||||
|    This vertex has always had label S, therefore its dual always decreased by _δ_ | ||||
|    during a dual variable update. | ||||
|    Since it started at _0.5\*max-weight_ and can not become negative, | ||||
|    the sum of _δ_ over all dual variable updates can not exceed _0.5\*max-weight_. | ||||
|  - Vertex dual variables increase by at most _δ_ per update. | ||||
|    Therefore no vertex dual can increase by more than _0.5\*max-weight_ in total. | ||||
|    Therefore no vertex dual can exceed _max-weight_. | ||||
|  - Blossom dual variables start at _z<sub>B</sub> = 0_. | ||||
|  - Blossom dual variables increase by at most _2\*δ_ per update. | ||||
|    Therefore no blossom dual can increase by more than _max-weight_ in total. | ||||
|    Therefore no blossom dual can exceed _max-weight_. | ||||
| 
 | ||||
| ### Handling floating point edge weights | ||||
| 
 | ||||
| Floating point calculations are subject to rounding errors. | ||||
| This has two consequences for the matching algorithm: | ||||
| 
 | ||||
|   - The algorithm may return a matching which has slightly lower weight than | ||||
|     the actual maximum weight. | ||||
| 
 | ||||
|   - The algorithm may not reliably recognize tight edges. | ||||
|     To check whether an edge is tight, its slack is compared to zero. | ||||
|     Rounding errors may cause the slack to appear positive even when an exact calculation | ||||
|     would show it to be zero. | ||||
|     The slack of some edges may even become slightly negative. | ||||
| 
 | ||||
|     I believe this does not affect the correctness of the algorithm. | ||||
|     An edge that should be tight but is not recognized as tight due to rounding errors, | ||||
|     can be pulled tight through an additional dual variable update. | ||||
|     As side-effect of this update, the edge will immediately be used to grow the alternating tree, | ||||
|     or construct a blossom or augmenting path. | ||||
|     This mechanism allows the algorithm to make progress, even if slack comparisons | ||||
|     are repeatedly thrown off by rounding errors. | ||||
|     Rounding errors may cause the algorithm to perform more dual variable updates | ||||
|     than strictly necessary. | ||||
|     But this will still not cause the run time of the algorithm to exceed _O(n<sup>3</sup>)_. | ||||
| 
 | ||||
| It seems to me that the matching algorithm is stable for floating point weights. | ||||
| And it seems to me that it returns a matching which is close to optimal, | ||||
| and could have been optimal if edge weights were changed by small amounts. | ||||
| 
 | ||||
| I must admit these arguments are mostly based on intuition. | ||||
| Unfortunately I don't know how to properly analyze the floating point accuracy of this algorithm. | ||||
| 
 | ||||
| ### Finding a maximum weight matching out of all maximum cardinality matchings | ||||
| 
 | ||||
| It is sometimes useful to find a maximum cardinality matching which has maximum weight | ||||
| out of all maximum cardinality matchings. | ||||
| A simple way to achieve this is to increase the weight of all edges by the same amount. | ||||
| 
 | ||||
| In general, a maximum weight matching is not necessarily a maximum cardinality matching. | ||||
| However if all edge weights are at least _n_ times the difference between | ||||
| the maximum and minimum edge weight, any maximum weight matching is guaranteed | ||||
| to have maximum cardinality. | ||||
| Proof: | ||||
| The weight of a non-maximum-cardinality matching can be increased by matching | ||||
| an additional edge. | ||||
| In order to match that extra edge, some high-weight edges may have to be removed from | ||||
| the matching. | ||||
| Their place might be taken low-weight edges. | ||||
| But even if all previously matched edges had maximum weight, and all newly matched edges | ||||
| have minimum weight, the weight of the matching will still increase. | ||||
| 
 | ||||
| Changing all edge weights by the same amount does not change the relative preference | ||||
| for a certain groups of edges over another group of the same size. | ||||
| Therefore, given that we only consider maximum-cardinality matchings, | ||||
| changing all weights by the same amount doesn't change which of these matchings has maximum weight. | ||||
| 
 | ||||
| 
 | ||||
| ## References | ||||
|  |  | |||
		Loading…
	
		Reference in New Issue