Course Content#
- Union-Find: Union sets [establishing connectivity] and find connectivity in sets [determining whether two points are connected]
Connectivity Problem#
-
Rule: Points that are already connected cannot be connected again
-
Divide all points into 2 sets: ① and ②
-
[PS] The relationship between points (merging, determining connectivity) can also be the relationship between sets
Quick-Find Algorithm#
- Core idea: Coloring
- One color corresponds to one category
- Initialization: Each individual is independent and has its own index, belonging to an independent set
- ⭐ Change the color of all points connected to itself to the color to be dyed
- Time complexity
- Connectivity check: O(1) - fast lookup, hence the name Quick-Find
- Union operation: O(N) - need to traverse all points to determine whether to merge
- 🆒 Thought-provoking: Why is it called a forest?
- The key lies in the merge operation, merging several points with several points, merging sets with sets
- Similar to the merge between subtrees or making one tree a subtree of another tree
- All sets are put together, similar to putting all trees together, which is a forest
- So, are there other ways besides coloring?
- Think: It doesn't matter what color it is dyed, just need to know which points have the same color, that is, connected
Quick-Union Algorithm#
- Inspiration from life: Find the big brother, find the leader
- Core idea: Find representative elements [root node]
- The stored value is its representative element
- Initialization: Each individual is independent and has its own index, belonging to an independent set
- ⭐ Find the representative elements of the two points and modify the value of one representative element to the value of the other representative element
- Representative element: The value inside is itself
- Association: When the big brother or his little brother betrays, they all start with the big brother and betray to the big brother of another little brother
- The result is the same as Quick-Find, merging a whole subtree into another subtree, but it is achieved through representative elements
- Can only be root node pointing to root node
- Time complexity
- Both need to find the root node, related to the height of the tree
- Connectivity operation: O(tree height)
- Union operation: O(tree height)
- ❗ 2 cases
- Normal state → both are O(logN)
- Degenerate into a linked list → both are O(N)
- How to solve the degenerate case? 👇
Weighted Quick-Union Algorithm#
[Merge by Rank]
- How to avoid degeneration? → Ensure lush branches
- Merge basis 1: Tree height, hang the short tree under the tall tree [pairwise combination]
- The tree with height h requires at least N = 2 ^ (h - 1) nodes
- That is, tree height h = log[2]N + 1 ≈ log[2]N
- [PS] Only when two trees of the same height are merged will the height increase
- Merge basis 2: Number of nodes, hang the tree with fewer nodes under the tree with more nodes
- Both optimization methods can achieve O(logN), but merge basis 2 [number of nodes] is better
- Merge basis 1: Tree height, hang the short tree under the tall tree [pairwise combination]
- ⭐ Why is merge basis 2 better
- [Example] What is average search times
- As shown in the figure below, the average search times of trees A and B are calculated
- The depth of the node is the number of search times of the node, and the average search times = total search times / total number of nodes
- In this example, the search operation of tree B is faster
- [Derivation] Merge basis 2 directly determines the average search times
- For trees A and B with SA and SB nodes, their total search times LA and LB are respectively:
- Where li represents the depth of the i-th node
- At this time, when performing merge operations, calculate the average search times of ①A→B and ②B→A
- ① When the A tree is merged into the B tree as a subtree, it is
- All nodes in the A tree need to be searched one more time
- ② When the B tree is merged into the A tree as a subtree, it is
- All nodes in the B tree need to be searched one more time
- ① When the A tree is merged into the B tree as a subtree, it is
- ❗ [Comparison of average search times of the two methods]
- The numerator of the average search times is not directly related to the tree height [LA, LB], but the number of nodes in the numerator [SA, SB] directly determines the number of search times, the smaller the better
- 👉Whoever has fewer nodes will be merged as a subtree
- ❓ Think: Does the above derivation prove that height cannot be used as a merge basis?
- ❌No, height indirectly affects the number of nodes, and in general, the lower the height, the fewer the number of nodes
- However, for special cases where the A tree is higher than the B tree, but the number of nodes in the A tree is less than the B tree, still use [number of nodes] as the merge basis and merge the A tree as a subtree into the B tree
- For trees A and B with SA and SB nodes, their total search times LA and LB are respectively:
- So using the number of nodes as the merge basis is better! The merge idea is as follows
- [Example] What is average search times
- When merging two subtrees
- If the number of nodes is the same, follow the idea of regular Quick-Union
- If not, the root node of the subtree with fewer nodes is attached to the root node of the subtree with more nodes
- [PS] In other words
- When changing the big brother
- If the number of little brothers is the same, follow the idea of regular Quick-Union
- If not, the big brother with fewer little brothers has to follow the big brother with more little brothers
+ Path Compression#
- Refer to the visualization result of weighted Quick-Union in in-class exercise 2
- The position of 0 is a bit awkward
- ❗ Whether the representative element of 0 is 1 or 3 doesn't matter, directly attach 0 to 3 can reduce the height of the tree
- 【Method】Make each non-root node directly point to the root node, flatten the structure
Efficiency comparison of the above algorithms
In-Class Exercise#
Quick-Find vs. Quick-Union#
-
【Key】Understanding Quick-Union
- 0->1->2->4->5, 3->4->5; 8->9->7->6
- Boundary for lookup and union: Stop when the representative element is itself
Quick-Union vs. Weighted Quick-Union#
-
【Key】Understanding weighted
- When the number of elements in two sets is different
- The value of the representative element of the set with fewer elements 👉 the value of the representative element of the set with more elements
- The big brother with fewer little brothers has to follow the big brother with more little brothers
-
Result visualization
- Obviously, the tree obtained by the weighted method is shorter, and the efficiency of merging and searching is higher
Code Demonstration#
HZOJ-71: Friend Circle#
Sample Input
6 5
1 1 2
2 1 3
1 2 4
1 4 3
2 1 3
Sample Output
No
Yes
- Idea
- Use data structure - Union-Find
- 1 means union operation, 2 means find operation
- Test the efficiency of Quick-Find, Quick-Union [+weighted, +path compression, -weighted] algorithms
- Code
Quick-Find#
-
Lookup and union strategy: Based on coloring
Quick-Union#
-
Based on Quick-Find
- Modify the color in the structure definition to the representative element father
- Modify the find operation: need to find the bottom
- Modify the union operation: also need to find the bottoms of the two elements before merging
-
Degeneration into a linked list is prone to occur, below is the optimization
+ Weighted#
-
Based on Quick-Union
-
Add the size attribute to record the number of nodes in the set where the node is located, and use it as the merge strategy
+ Path Compression#
-
Perform path compression every time a find operation is performed, make each non-root node point directly to the root node [root node]
- Weighted#
- Erase all size-related operations
- With path compression, not merging by rank [removing the size attribute] can still achieve good results
Time used for the problem on HZOJ platform
Method | Quick-Find | Quick-Union | + Weighted | + Path Compression | - Weighted |
---|---|---|---|---|---|
Time (ms) | 744 | 2020 | 164 | 172 | 188 |
- Quick-Union has degeneration problem
- After adding path compression, not merging by rank can still achieve good results
- 🆒The last three methods all have good performance!
Points to Ponder#
- In the weighted Quick-Union algorithm, does the derivation that the number of nodes is more optimal prove that height cannot be used as a merge basis?
- ❌No, height indirectly affects the number of nodes, and in general, the lower the height, the fewer the number of nodes
- However, for special cases where the A tree is higher than the B tree, but the number of nodes in the A tree is less than the B tree, still use [number of nodes] as the merge basis and merge the A tree as a subtree into the B tree
Tips#
- 《C++ Primer》 is recommended to be used as a reference book