Bo2SS

Bo2SS

1 Binary Search Tree, AVL Tree

Data structure is to define a property and maintain that property.

BS Tree 👉 AVL Tree

Course Content#

Binary Search Tree#

[Also known as binary search tree, binary lookup tree]

Basic Knowledge#

  • Property: For any triplet, left subtree < root node < right subtree
  • Purpose: To solve retrieval needs related to searching and ranking
  • 【Why is the entry point for tree structure searches always the root node?】
    • First, understand the meaning of points and edges: sets and relations
    • The root node represents the whole set
  • Structural operations: insert, delete, search
    • Insert
      • Continuously compare with the root node of the subtree, recursively, until the subtree has no child nodes
      • ⭐ The new node will definitely become a leaf node
    • Delete [Node]
      • Degree 0: delete directly
      • Degree 1: attach the orphan subtree directly to its parent node
      • Degree 2: [A bit troublesome]
        • Find the predecessor or successor to replace the original node
          • Predecessor node: the node corresponding to the maximum value in its left subtree [This node definitely has no right subtree, degree at most 1]
          • Successor node: the node corresponding to the minimum value in its right subtree [This node definitely has no left subtree, degree at most 1]
          • 【Only for nodes of degree 2, otherwise need to consider searching up to the parent node, to what extent?】
        • ⭐ Thus converting to the problem of deleting a node of degree 0 or 1 [predecessor or successor]
        • [Can also be understood as the deletion operation of a one-dimensional array]
    • Search: Just search recursively based on the property
  • The order of insertion will affect the tree structure, corresponding to different average search efficiencies
    • The same sequence, but different insertion orders
    • Average search efficiency: the expected value of the number of node searches. That is, average search times: total search times / number of nodes
      • Assuming each node is searched with equal probability
  • In-order traversal → an ordered sequence from smallest to largest
  • Refer to 《Basic Data Structures》——3 Trees and Binary Trees——Binary Search Trees

Extended Content#

[See specific code in code demonstration——Binary Search Tree]

  • Delete operation optimization [Code demonstration——Binary Search Tree]
    • The delete operation for nodes of degree 0 and degree 1 can share the deletion operation code for degree 1
    • Code for degree 0
    • Image
    • Code for degree 1
    • Image
    • When the degree is 0, temp points to NULL, also destroy root, then return NULL
  • ——Find the element ranked k
    • ⭐ Add a size field to record the number of nodes in each subtree [including the root node]
      • The essence of data structure, modifying structure definition for specific problems
      • Update size during insertion and deletion
    • Search conditions
      • ① k = LS + 1, the root node is the value we are looking for
      • ② k < LS + 1, the element ranked k is in the left subtree, continue searching in the left subtree
      • ③ k > LS + 1, the element ranked k is in the right subtree, search for the element ranked k - LS - 1 in the right subtree
      • [PS] LS is the number of nodes in the left subtree
      • Image
      • Output at boundary conditions [-1 or key]
  • ——Find the elements ranked top k [all elements ranked <= k]
    • 【Based on the ranking problem】
    • Search conditions
      • ① k = LS + 1, output all values in the left subtree
      • ② k < LS + 1, all top k elements are in the left subtree
      • ③ k > LS + 1, the root node and elements in the left subtree all belong to the top k elements
      • Image
      • Equivalent to continuously reducing the size of the tree, thus transforming into the first case
      • There are actually many solutions [such as heaps], there is no standard answer
  • The relationship between binary search trees and quicksort
    • 【Binary search trees are the data structure used by quicksort at the conceptual level】
    • The Partition operation in quicksort
      • Treat the pivot value as the root node of the binary search tree
      • After each Partition operation, the relationship between the pivot value and the left and right sides is actually 👇
      • The relationship between the root node and the left and right subtrees
    • ⭐ Understanding the following two thoughts can clarify the relationship between the two
      • Thought 1: The relationship between the time complexity of quicksort and the tree construction time complexity of binary search trees
        • They are essentially the same, the tree construction process is similar to multiple Partition processes
      • Thought 2: The relationship between the quickselect algorithm and binary search trees
        • The essence of both is the same, the former manifests as an algorithm, the latter as a data structure
      • [Personal understanding]
        • Quicksort is a bunch of numbers doing a Partition once, then continuing to split
        • The tree construction process is one number at a time doing a Partition, fixing the position of that number before moving to the next

Balanced Binary Search Tree—AVL Tree#

Solving the degeneration problem in binary search trees

Basic Knowledge#

  • Background
    • Inventors: AV, L, with AV contributing more
    • Year: 1962
    • [Neural networks also emerged in this era]
  • Properties
    • Essentially also a binary search tree, possessing all properties of binary search trees
    • Additional property: The height difference between the left and right subtrees does not exceed 1 [Left and right subtrees are roughly equal]
  • ⭐ Balance condition, balance adjustment

[Further] Thoughts#

What is the range of the number of nodes contained in a BS, AVL tree of height H?

  • [General upper limit for binary trees: 2 ^ H - 1]
  • <BS> H ~ 2 ^ H - 1
  • <AVL> low(H - 2) + low(H - 1) + 1 ~ 2 ^ H - 1
    • That is, 1.5 ^ H ~ 2 ^ H - 1, low(H) represents the minimum number of nodes in an AVL tree of height H
    • The left boundary is a Fibonacci sequence
      • Its growth rate is 1.618 ^ n ≈ 1.5 ^ n 【Can be used for estimation】
    • 👉 The relationship between the number of nodes and height is logarithmic, so search efficiency is O(logN) level
  • ❓ [Further] Is the search efficiency of a regular binary search tree always worse than that of an AVL tree?
    • Not necessarily, it depends on the order of the insertion sequence, but the lower limit is high for AVL trees
    • 【Extension】
      • Tree height = lifespan, number of nodes = wealth of life, different algorithms yield different results
      • Education raises the baseline, not the upper limit; the upper limit depends on ability and luck
        • The upper limit largely depends on an uncontrollable insertion sequence, left to fate

Adjustment ⭐#

Insertion/deletion + adjustment: The insertion operation is consistent with binary search trees, the key lies in adjustment [⭐ Which node to rotate]

Rotation#

[Basic operation used when unbalanced]

  • Left Rotation
    • Image
    • Grasp the root node and rotate left around the center point
      • K3 becomes the root node
      • K1 becomes the left subtree of K3
      • K3's original left subtree A becomes the right subtree of K1 [because K3 > A > K1]
    • "Leaf node" depth changes: A remains unchanged, B - 1, K2 + 1
      • 👉 Subtree height changes: left subtree + 1, right subtree - 1
  • Right Rotation
    • Image
    • Grasp the root node and rotate right around the center point
      • K2 becomes the root node
      • K1 becomes the right subtree of K2
      • K2's original right subtree B becomes the left subtree of K1 [because K2 < B < K1]
    • "Leaf node" depth changes: A - 1, B remains unchanged, K3 + 1
      • 👉 Subtree height changes: left subtree - 1, right subtree + 1
  • Left and right rotations are symmetric operations, essentially affecting the height of the subtrees

Types of Imbalance && Adjustment Strategies#

  • Image
  • Judging scenario: During the backtracking process, adjust as soon as the first imbalance is detected
  • Symmetrical situations: LL ——RR , LR——RL

① LL

  • Image
  • ⭐ Key point ⭐: Analyze the relationship between h1, h2, h3, h4 [Write equations]
    • Analysis
      • Insert one by one, the newly inserted node must cause an LL-type imbalance in subtree A [Deletion also occurs one by one]
      • Before insertion, it must be balanced
      • After insertion, K1's left subtree is 2 higher than the right subtree
        • That is, h(K2) = h(D) + 2, while
          • h(K2) = h1 + 1
          • h(D) = max(h3, h4) + 1
    • Equation relationship=>
      • h1 = max(h3, h4) + 2 = h2 + 1
      • [PS] |h3 - h4| <= 1
  • <Adjustment Strategy> Big right rotation. Result see above, balance analysis see below:
    • ① K1: left subtree height h2= right subtree height max(h3, h4) + 1=h1 -1
    • ② K2: left subtree height h1= right subtree height h2 + 1
    • Already balanced, and can be said to be terrifyingly balanced

② LR

  • Image
  • ⭐ Key point ⭐: Analyze the relationship between h1, h2, h3, h4 [Write equations]
    • Analysis
      • Insert one by one, the newly inserted node must cause an LR-type imbalance in subtree B / C
      • Before insertion, it must be balanced
      • After insertion, K1's left subtree is 2 higher than the right subtree
        • That is, h(K2) = h(D) + 2, while
          • h(K2) = max(h2, h3) + 2
          • h(D) = h4
    • Equation relationship=>
      • max(h2, h3) = h4 = h1
      • [PS] |h2 - h3| <= 1

❗ Remember again! The judgment always occurs at the first imbalance during the backtracking process, so all previous subtrees are balanced

  • <Adjustment Strategy> Small left rotation + big right rotation. Result see below:
    • Image
    • First perform a small left rotation to convert to LL type, then a big right rotation [LL adjustment strategy]
    • Balance analysis
      • It is obvious that the heights of the left and right subtrees depend on h1, h2, h3, h4
      • And the height difference between them does not exceed 1 [one of h2 or h3 may be smaller by 1]

Summary of Adjustment Strategies#

  • Occurs at the first unbalanced node during the backtracking phase
  • Key to adjustment: the height relationships of the four subtrees ABCD in the four cases
  • Four cases
    • LL, LR: both ultimately require a big right rotation
      • LL, big right rotation
      • LR, small left rotation first, then big right rotation
    • RL, RR: both ultimately require a big left rotation
      • RL, small right rotation first, then big left rotation
      • RR, big left rotation

[Balanced Binary Search Tree—SBTree]#

  • Image
  • AVL balances through tree height, while SBTree balances through the number of nodes
  • Refer to the learning order of AVL: balance condition + balance adjustment [insertion, deletion]

In-Class Exercises#

  • Image
  • The binary search tree formed by the second sequence degenerates into a linked list, and the insertion order will affect the structure
  • Image
  • Image
  • When judging imbalance, backtrack from the inserted node step by step, and adjust immediately upon finding an imbalance

Code Demonstration#

Binary Search Tree#

  • Image
  • Image
  • Image
  • Image
  • Image
  • 5 basic operations: initialize, destroy, search, insert, delete
  • Add sorting problem [add size field], Top-k problem solution [based on sorting problem]
  • [PS]
    • Finding predecessor has a bug: the predecessor of nodes of degree 1 or 0 may not exist in the subtree, but this scenario does not need to be considered
    • Recursion must consider boundary conditions
    • Learn to use macros to make the code more beautiful
    • Analyzing binary trees generally involves discussing three situations: left, root, right
  • Partial results
  • Image

【Appendix】 Code for randomly generating input operations

  • Image
  • Generate non-proportional operation types

AVL Tree#

Image

Image

Image

Image

  • After insertion and deletion, remember to recalculate the tree height 【Always remember to maintain the structure definition】
    • During insertion / deletion of nodes + during left rotation / right rotation adjustments
  • ⭐ Introduced NIL nodes to replace NULL🆒[Red-black trees also need to use this]
    • NULL is inaccessible
    • NIL is an actual node that can be accessed
    • 【Convenient for code implementation】
  • LL, LR, and RL, RR last operations are big right rotation and big left rotation respectively

Additional Knowledge Points#

  • When analyzing binary tree situations, it is generally divided into three cases: left, root, right

Points for Thought#

  • Is the search efficiency of a regular binary search tree always worse than that of an AVL tree?
    • Not necessarily, it depends on the order of the insertion sequence, but the lower limit is high for AVL trees
    • 【Extension】
      • Tree height = lifespan, number of nodes = wealth of life, different algorithms yield different results
      • Education raises the baseline, not the upper limit; the upper limit depends on ability and luck
        • The upper limit largely depends on an uncontrollable insertion sequence, left to fate

Tips#

  • Prerequisites for learning C++: system programming, network knowledge, algorithm structure, C language
  • The so-called ability to design and analyze algorithms: the ability to classify discussions [divide into several cases] and summarize [combine multiple cases into one]
    • Program = Algorithm + Data Structure
    • Those who improve this ability by practicing problems are gifted individuals
  • Recommended Geek Column——Programming Introductory Course That Everyone Can Learn——A challenge to traditional learning methods
  • Image

Loading...
Ownership of this post data is guaranteed by blockchain and smart contracts to the creator alone.