Bo2SS

Bo2SS

6 Forests and Disjoint Sets

Course Content#

  • Union-Find: Union sets [establishing connectivity] and find connectivity in sets [determining whether two points are connected]

Connectivity Problem#

  • Image
  • Rule: Points that are already connected cannot be connected again

  • Divide all points into 2 sets: ① and ②

  • [PS] The relationship between points (merging, determining connectivity) can also be the relationship between sets

Quick-Find Algorithm#

  • Core idea: Coloring
    • One color corresponds to one category
    • Initialization: Each individual is independent and has its own index, belonging to an independent set
    • ⭐ Change the color of all points connected to itself to the color to be dyed
  • Time complexity
    • Connectivity check: O(1) - fast lookup, hence the name Quick-Find
    • Union operation: O(N) - need to traverse all points to determine whether to merge
  • 🆒 Thought-provoking: Why is it called a forest?
    • The key lies in the merge operation, merging several points with several points, merging sets with sets
    • Similar to the merge between subtrees or making one tree a subtree of another tree
    • All sets are put together, similar to putting all trees together, which is a forest
  • So, are there other ways besides coloring?
    • Think: It doesn't matter what color it is dyed, just need to know which points have the same color, that is, connected

Quick-Union Algorithm#

  • Inspiration from life: Find the big brother, find the leader
  • Core idea: Find representative elements [root node]
    • The stored value is its representative element
    • Initialization: Each individual is independent and has its own index, belonging to an independent set
    • ⭐ Find the representative elements of the two points and modify the value of one representative element to the value of the other representative element
    • Representative element: The value inside is itself
    • Association: When the big brother or his little brother betrays, they all start with the big brother and betray to the big brother of another little brother
      • The result is the same as Quick-Find, merging a whole subtree into another subtree, but it is achieved through representative elements
      • Can only be root node pointing to root node
  • Time complexity
    • Both need to find the root node, related to the height of the tree
    • Connectivity operation: O(tree height)
    • Union operation: O(tree height)
    • ❗ 2 cases
      • Normal state → both are O(logN)
      • Degenerate into a linked list → both are O(N)
  • How to solve the degenerate case? 👇

Weighted Quick-Union Algorithm#

[Merge by Rank]

  • How to avoid degeneration? → Ensure lush branches
    • Merge basis 1: Tree height, hang the short tree under the tall tree [pairwise combination]
      • The tree with height h requires at least N = 2 ^ (h - 1) nodes
      • That is, tree height h = log[2]N + 1 ≈ log[2]N
      • [PS] Only when two trees of the same height are merged will the height increase
    • Merge basis 2: Number of nodes, hang the tree with fewer nodes under the tree with more nodes
    • Both optimization methods can achieve O(logN), but merge basis 2 [number of nodes] is better
  • ⭐ Why is merge basis 2 better
    • [Example] What is average search times
      • As shown in the figure below, the average search times of trees A and B are calculated
      • Image
      • The depth of the node is the number of search times of the node, and the average search times = total search times / total number of nodes
      • In this example, the search operation of tree B is faster
    • [Derivation] Merge basis 2 directly determines the average search times
      • For trees A and B with SA and SB nodes, their total search times LA and LB are respectively:
        • Image
        • Where li represents the depth of the i-th node
      • At this time, when performing merge operations, calculate the average search times of ①A→B and ②B→A
        • ① When the A tree is merged into the B tree as a subtree, it is
          • Image
          • All nodes in the A tree need to be searched one more time
        • ② When the B tree is merged into the A tree as a subtree, it is
          • Image
          • All nodes in the B tree need to be searched one more time
      • ❗ [Comparison of average search times of the two methods]
        • The numerator of the average search times is not directly related to the tree height [LA, LB], but the number of nodes in the numerator [SA, SB] directly determines the number of search times, the smaller the better
        • 👉Whoever has fewer nodes will be merged as a subtree
        • ❓ Think: Does the above derivation prove that height cannot be used as a merge basis?
          • ❌No, height indirectly affects the number of nodes, and in general, the lower the height, the fewer the number of nodes
          • However, for special cases where the A tree is higher than the B tree, but the number of nodes in the A tree is less than the B tree, still use [number of nodes] as the merge basis and merge the A tree as a subtree into the B tree
    • So using the number of nodes as the merge basis is better! The merge idea is as follows
  • When merging two subtrees
    • If the number of nodes is the same, follow the idea of regular Quick-Union
    • If not, the root node of the subtree with fewer nodes is attached to the root node of the subtree with more nodes
  • [PS] In other words
    • When changing the big brother
    • If the number of little brothers is the same, follow the idea of regular Quick-Union
    • If not, the big brother with fewer little brothers has to follow the big brother with more little brothers

+ Path Compression#

  • Refer to the visualization result of weighted Quick-Union in in-class exercise 2
    • Image
    • The position of 0 is a bit awkward
    • ❗ Whether the representative element of 0 is 1 or 3 doesn't matter, directly attach 0 to 3 can reduce the height of the tree
    • 【Method】Make each non-root node directly point to the root node, flatten the structure

Efficiency comparison of the above algorithms

  • Image

In-Class Exercise#

Quick-Find vs. Quick-Union#

  • Image
  • 【Key】Understanding Quick-Union

    • 0->1->2->4->5, 3->4->5; 8->9->7->6
    • Boundary for lookup and union: Stop when the representative element is itself

Quick-Union vs. Weighted Quick-Union#

  • Image
  • 【Key】Understanding weighted

    • When the number of elements in two sets is different
    • The value of the representative element of the set with fewer elements 👉 the value of the representative element of the set with more elements
    • The big brother with fewer little brothers has to follow the big brother with more little brothers
  • Result visualization

    • Image
    • Obviously, the tree obtained by the weighted method is shorter, and the efficiency of merging and searching is higher

Code Demonstration#

HZOJ-71: Friend Circle#

Image

Sample Input

6 5
1 1 2
2 1 3
1 2 4
1 4 3
2 1 3

Sample Output

No
Yes
  • Idea
    • Use data structure - Union-Find
    • 1 means union operation, 2 means find operation
    • Test the efficiency of Quick-Find, Quick-Union [+weighted, +path compression, -weighted] algorithms
  • Code

Quick-Find#

  • Image
  • Image
  • Lookup and union strategy: Based on coloring

Quick-Union#

  • Image
  • Based on Quick-Find

    • Modify the color in the structure definition to the representative element father
    • Modify the find operation: need to find the bottom
    • Modify the union operation: also need to find the bottoms of the two elements before merging
  • Degeneration into a linked list is prone to occur, below is the optimization

+ Weighted#

  • Image
  • Based on Quick-Union

  • Add the size attribute to record the number of nodes in the set where the node is located, and use it as the merge strategy

+ Path Compression#

  • Image
  • Perform path compression every time a find operation is performed, make each non-root node point directly to the root node [root node]

- Weighted#

  • Image
  • Erase all size-related operations
  • With path compression, not merging by rank [removing the size attribute] can still achieve good results

Time used for the problem on HZOJ platform

MethodQuick-FindQuick-Union+ Weighted+ Path Compression- Weighted
Time (ms)7442020164172188
  • Quick-Union has degeneration problem
  • After adding path compression, not merging by rank can still achieve good results
  • 🆒The last three methods all have good performance!

Points to Ponder#

  • In the weighted Quick-Union algorithm, does the derivation that the number of nodes is more optimal prove that height cannot be used as a merge basis?
    • ❌No, height indirectly affects the number of nodes, and in general, the lower the height, the fewer the number of nodes
    • However, for special cases where the A tree is higher than the B tree, but the number of nodes in the A tree is less than the B tree, still use [number of nodes] as the merge basis and merge the A tree as a subtree into the B tree

Tips#

  • 《C++ Primer》 is recommended to be used as a reference book
  • Image
Loading...
Ownership of this post data is guaranteed by blockchain and smart contracts to the creator alone.