A data structure is about defining a property and then maintaining that property.
BS Tree 👉 AVL Tree
Course Content#
Binary Search Tree#
[Also known as a binary sort tree or binary lookup tree]
Basic Knowledge#
- Property: for every node, viewed as a (left subtree, root, right subtree) triple: left subtree < root node < right subtree
- Purpose: To solve retrieval needs related to searching and ranking
- 【Why is the entry point for tree structure searches always the root node?】
- First, understand the meaning of points and edges: sets and relations
- The root node represents the whole set
- Structural operations: insert, delete, search [a code sketch follows at the end of this section]
- Insert
- Keep comparing with the root of the current subtree and recurse into the left or right child until an empty position is reached
- ⭐ The new node will definitely become a leaf node
- Delete [Node]
- Degree 0: delete directly
- Degree 1: attach the node's only subtree directly to the node's parent
- Degree 2: [A bit troublesome]
- Find the predecessor or successor to replace the original node
- Predecessor node: the node corresponding to the maximum value in its left subtree [This node definitely has no right subtree, degree at most 1]
- Successor node: the node corresponding to the minimum value in its right subtree [This node definitely has no left subtree, degree at most 1]
- 【This holds only for nodes of degree 2; otherwise the predecessor/successor would have to be searched for upward through parent nodes, and how far up?】
- ⭐ Thus converting to the problem of deleting a node of degree 0 or 1 [predecessor or successor]
- [Can also be understood by analogy with the deletion operation on a one-dimensional ordered array]
- Search: Just search recursively based on the property
- Insert
- The order of insertion will affect the tree structure, corresponding to different average search efficiencies
- The same set of keys, inserted in different orders, produces different trees
- Average search efficiency: the expected number of comparisons for a search, i.e. average comparisons = total comparisons over all nodes / number of nodes
- Assuming each node is searched with equal probability
- In-order traversal → an ordered sequence from smallest to largest
- Refer to 《Basic Data Structures》——3 Trees and Binary Trees——Binary Search Trees
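A minimal C sketch of the three structural operations described above. The node layout and function names (Node, insert, find, erase) are illustrative rather than the course's exact demonstration code; note how the degree-0 and degree-1 deletions share one branch, and how the degree-2 case is reduced to deleting the predecessor.

```c
#include <stdio.h>
#include <stdlib.h>

/* Minimal BST node; the field and function names are illustrative. */
typedef struct Node {
    int key;
    struct Node *lchild, *rchild;
} Node;

Node *new_node(int key) {
    Node *p = (Node *)malloc(sizeof(Node));
    p->key = key;
    p->lchild = p->rchild = NULL;
    return p;
}

/* Insert: compare with the root of each subtree and recurse;
 * the new node always ends up as a leaf. Duplicate keys are ignored. */
Node *insert(Node *root, int key) {
    if (root == NULL) return new_node(key);
    if (key < root->key) root->lchild = insert(root->lchild, key);
    else if (key > root->key) root->rchild = insert(root->rchild, key);
    return root;
}

/* Search: follow the BST property left or right. */
Node *find(Node *root, int key) {
    if (root == NULL || root->key == key) return root;
    return key < root->key ? find(root->lchild, key) : find(root->rchild, key);
}

/* Predecessor of a degree-2 node: the maximum of its left subtree
 * (only ever called when root->lchild is non-NULL). */
static Node *predecessor(Node *root) {
    Node *p = root->lchild;
    while (p->rchild) p = p->rchild;
    return p;
}

/* Erase: degree 0 and degree 1 share one branch (for degree 0 the returned
 * child is simply NULL); degree 2 is reduced to deleting the predecessor. */
Node *erase(Node *root, int key) {
    if (root == NULL) return NULL;
    if (key < root->key) root->lchild = erase(root->lchild, key);
    else if (key > root->key) root->rchild = erase(root->rchild, key);
    else {
        if (root->lchild == NULL || root->rchild == NULL) {
            Node *child = root->lchild ? root->lchild : root->rchild;
            free(root);
            return child;
        }
        Node *pre = predecessor(root);          /* degree 2 */
        root->key = pre->key;
        root->lchild = erase(root->lchild, pre->key);
    }
    return root;
}

/* In-order traversal prints the keys in ascending order. */
void inorder(Node *root) {
    if (root == NULL) return;
    inorder(root->lchild);
    printf("%d ", root->key);
    inorder(root->rchild);
}

int main(void) {
    int keys[] = {5, 2, 8, 1, 4, 7, 9};
    Node *root = NULL;
    for (int i = 0; i < 7; i++) root = insert(root, keys[i]);
    root = erase(root, 5);            /* degree-2 deletion via the predecessor */
    inorder(root);                    /* 1 2 4 7 8 9 */
    printf("\n");
    return 0;
}
```

Deleting 5 exercises the degree-2 path: its predecessor 4 is copied into its place and the leftover 4 is then removed as a degree-0/1 node.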
Extended Content#
[See specific code in code demonstration——Binary Search Tree]
- Delete operation optimization [Code demonstration——Binary Search Tree]
- The delete operation for nodes of degree 0 and degree 1 can share the deletion operation code for degree 1
- Code for degree 0
- Code for degree 1
- When the degree is 0, temp simply points to NULL; root is still destroyed, and the returned value is that NULL
- Find the element ranked k (the k-th smallest) [see the sketch at the end of this section]
- ⭐ Add a size field to record the number of nodes in each subtree [including the root node]
- This is the essence of data structures: modify the structure definition to fit the specific problem
- Update size during insertion and deletion
- Search conditions
- ① k = LS + 1, the root node is the value we are looking for
- ② k < LS + 1, the element ranked k is in the left subtree, continue searching in the left subtree
- ③ k > LS + 1, the element ranked k is in the right subtree, search for the element ranked k - LS - 1 in the right subtree
- [PS] LS is the number of nodes in the left subtree
- Handle the boundary conditions in the output [-1 when k is out of range, otherwise the key]
- Find the top-k elements [all elements ranked <= k]
- 【Based on the ranking problem】
- Search conditions
- ① k = LS + 1, output all values in the left subtree together with the root node
- ② k < LS + 1, all top k elements are in the left subtree
- ③ k > LS + 1, the root node and elements in the left subtree all belong to the top k elements
- This keeps shrinking the tree under consideration until the problem reduces to one of the earlier cases
- There are actually many solutions [such as heaps], there is no standard answer
- The relationship between binary search trees and quicksort
- 【Binary search trees are the data structure used by quicksort at the conceptual level】
- The Partition operation in quicksort
- Treat the pivot value as the root node of the binary search tree
- After each Partition operation, the relationship between the pivot value and the left and right sides is actually 👇
- The relationship between the root node and the left and right subtrees
- ⭐ Understanding the following two thoughts can clarify the relationship between the two
- Thought 1: The relationship between the time complexity of quicksort and the tree construction time complexity of binary search trees
- They are essentially the same; building the tree is like performing many Partition steps
- Thought 2: The relationship between the quickselect algorithm and binary search trees
- Their essence is the same: the former manifests as an algorithm, the latter as a data structure
- [Personal understanding]
- Quicksort takes the whole batch of numbers, partitions it once around a pivot, then keeps splitting the two halves
- Tree construction instead "partitions" one number at a time: each inserted key is fixed in its final position before the next one arrives
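A minimal C sketch of the two rank queries above, assuming a size field maintained on insertion as described. The names (RNode, find_kth, output_top_k) are illustrative and the insert shown ignores duplicate keys; LS in the notes corresponds to sz(root->lchild) here.

```c
#include <stdio.h>
#include <stdlib.h>

/* BST node extended with a size field (number of nodes in the subtree,
 * root included); names are illustrative. */
typedef struct RNode {
    int key, size;
    struct RNode *lchild, *rchild;
} RNode;

static int sz(RNode *p) { return p ? p->size : 0; }

/* Insert while maintaining size on the way back up (duplicates ignored). */
RNode *rinsert(RNode *root, int key) {
    if (root == NULL) {
        RNode *p = (RNode *)calloc(1, sizeof(RNode));
        p->key = key;
        p->size = 1;
        return p;
    }
    if (key < root->key) root->lchild = rinsert(root->lchild, key);
    else if (key > root->key) root->rchild = rinsert(root->rchild, key);
    root->size = sz(root->lchild) + sz(root->rchild) + 1;
    return root;
}

/* k-th smallest element (1-based); -1 signals the boundary case. */
int find_kth(RNode *root, int k) {
    if (root == NULL || k <= 0 || k > root->size) return -1;
    int ls = sz(root->lchild);                  /* LS in the notes */
    if (k == ls + 1) return root->key;          /* case ① */
    if (k < ls + 1) return find_kth(root->lchild, k);          /* case ② */
    return find_kth(root->rchild, k - ls - 1);                  /* case ③ */
}

/* Print an entire subtree in order. */
void output_all(RNode *root) {
    if (root == NULL) return;
    output_all(root->lchild);
    printf("%d ", root->key);
    output_all(root->rchild);
}

/* Top-k: print every element ranked <= k. */
void output_top_k(RNode *root, int k) {
    if (root == NULL || k <= 0) return;
    if (k >= root->size) { output_all(root); return; }
    int ls = sz(root->lchild);
    if (k <= ls) {                              /* case ②: all inside the left subtree */
        output_top_k(root->lchild, k);
    } else {                                    /* cases ① and ③: left subtree + root ... */
        output_all(root->lchild);
        printf("%d ", root->key);
        output_top_k(root->rchild, k - ls - 1); /* ... plus part of the right subtree */
    }
}

int main(void) {
    int keys[] = {5, 2, 8, 1, 4, 7, 9};
    RNode *root = NULL;
    for (int i = 0; i < 7; i++) root = rinsert(root, keys[i]);
    printf("3rd smallest: %d\n", find_kth(root, 3));   /* 4 */
    output_top_k(root, 3);                             /* 1 2 4 */
    printf("\n");
    return 0;
}
```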
Balanced Binary Search Tree—AVL Tree#
Solving the degeneration problem in binary search trees
Basic Knowledge#
- Background
- Inventors: G. M. Adelson-Velsky and E. M. Landis [the name AVL is formed from their initials]
- Year: 1962
- [Neural networks also emerged in this era]
- Properties
- Essentially also a binary search tree, possessing all properties of binary search trees
- Additional property: for every node, the height difference between its left and right subtrees does not exceed 1 [the two subtrees stay roughly equal in height]
- ⭐ Balance condition + balance adjustment [a node sketch with a height field follows this list]
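A minimal C sketch of an AVL node that carries its subtree height, together with the balance check. Field and helper names are illustrative, not the course's exact code; the later sketches in these notes build on these definitions.

```c
#include <stdlib.h>

/* AVL node: a BST node plus a height field. */
typedef struct AVLNode {
    int key, h;                        /* h = height of the subtree rooted here */
    struct AVLNode *lchild, *rchild;
} AVLNode;

int height(AVLNode *p) { return p ? p->h : 0; }

int max_int(int a, int b) { return a > b ? a : b; }

/* Recompute the height from the children; call it after every insertion,
 * deletion and rotation (always maintain the structure's definition). */
void update_height(AVLNode *p) {
    p->h = max_int(height(p->lchild), height(p->rchild)) + 1;
}

/* Balance condition: |height(left) - height(right)| <= 1 at every node. */
int is_balanced(AVLNode *p) {
    return p == NULL || abs(height(p->lchild) - height(p->rchild)) <= 1;
}
```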
[Further] Thoughts#
What is the range of the number of nodes contained in a BST or an AVL tree of height H?
- [General upper limit for binary trees: 2 ^ H - 1]
- <BST> H ~ 2 ^ H - 1
- <AVL> low(H - 2) + low(H - 1) + 1 ~ 2 ^ H - 1, where low(H) denotes the minimum number of nodes in an AVL tree of height H
- That is, roughly 1.5 ^ H ~ 2 ^ H - 1
- The lower bound satisfies a Fibonacci-style recurrence [a short derivation follows this list]
- Its actual growth rate is about 1.618 ^ H, which the course rounds to roughly 1.5 ^ H 【can be used for quick estimates】
- 👉 The height therefore grows logarithmically with the number of nodes, so search efficiency is at the O(log N) level
- ❓ [Further] Is the search efficiency of a regular binary search tree always worse than that of an AVL tree?
- Not necessarily; it depends on the order of the insertion sequence, but the AVL tree's guaranteed lower limit (its worst case) is much higher
- 【Extension】
- Tree height = lifespan, number of nodes = wealth of life, different algorithms yield different results
- Education raises the baseline, not the upper limit; the upper limit depends on ability and luck
- The upper limit largely depends on an uncontrollable insertion sequence, left to fate
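A short derivation of the lower bound, assuming the convention that a single node has height 1 (so low(1) = 1, low(2) = 2):

```latex
% Minimum node count of an AVL tree of height H
\begin{aligned}
&\mathrm{low}(1)=1,\qquad \mathrm{low}(2)=2,\qquad
  \mathrm{low}(H)=\mathrm{low}(H-1)+\mathrm{low}(H-2)+1\\[2pt]
&\text{Let } F(H)=\mathrm{low}(H)+1:\quad F(H)=F(H-1)+F(H-2)\ \text{(Fibonacci-type)}\\[2pt]
&\Rightarrow\ \mathrm{low}(H)=\Theta(\varphi^{H}),\quad
  \varphi=\tfrac{1+\sqrt{5}}{2}\approx 1.618
  \quad\Rightarrow\quad H=O(\log_{\varphi} N)=O(\log N)
\end{aligned}
```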
Adjustment ⭐#
Insertion/deletion + adjustment: the insertion operation itself is the same as in a plain binary search tree; the key lies in the adjustment [⭐ which node to rotate]
Rotation#
[Basic operation used when a node becomes unbalanced; a code sketch follows at the end of this subsection]
- Left Rotation
- Take hold of the root node and rotate the tree to the left around the pivot
- K3 becomes the root node
- K1 becomes the left subtree of K3
- K3's original left subtree A becomes the right subtree of K1 [because K3 > A > K1]
- "Leaf node" depth changes: A remains unchanged, B - 1, K2 + 1
- 👉 Subtree height changes: left subtree + 1, right subtree - 1
- Right Rotation
- Take hold of the root node and rotate the tree to the right around the pivot
- K2 becomes the root node
- K1 becomes the right subtree of K2
- K2's original right subtree B becomes the left subtree of K1 [because K2 < B < K1]
- "Leaf node" depth changes: A - 1, B remains unchanged, K3 + 1
- 👉 Subtree height changes: left subtree - 1, right subtree + 1
- Left and right rotations are symmetric operations, essentially affecting the height of the subtrees
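A minimal C sketch of the two rotations, reusing the AVLNode and update_height definitions from the earlier sketch; K1, K2, K3, A and B refer to the labels used in these notes.

```c
/* Left rotation: the right child (K3 in the notes) rises to become the new
 * root of the subtree; its old left subtree A is re-attached as the right
 * subtree of K1, which is legal because K1 < A < K3. */
AVLNode *left_rotate(AVLNode *k1) {
    AVLNode *k3 = k1->rchild;
    k1->rchild = k3->lchild;          /* A moves under K1 */
    k3->lchild = k1;                  /* K1 becomes K3's left child */
    update_height(k1);                /* K1 is now the lower node: update it first */
    update_height(k3);
    return k3;                        /* new root of this subtree */
}

/* Right rotation: symmetric. The left child (K2) rises, and its old right
 * subtree B becomes K1's left subtree because K2 < B < K1. */
AVLNode *right_rotate(AVLNode *k1) {
    AVLNode *k2 = k1->lchild;
    k1->lchild = k2->rchild;          /* B moves under K1 */
    k2->rchild = k1;
    update_height(k1);
    update_height(k2);
    return k2;
}
```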
Types of Imbalance && Adjustment Strategies#
- Judging scenario: During the backtracking process, adjust as soon as the first imbalance is detected
- Symmetric cases: LL is the mirror image of RR, and LR is the mirror image of RL
① LL
- ⭐ Key point ⭐: Analyze the relationship between h1, h2, h3, h4 [Write equations]
- Analysis
- Nodes are inserted one at a time, and the newly inserted node lands in subtree A, which is what causes the LL-type imbalance [deletions likewise happen one node at a time]
- Before insertion, it must be balanced
- After insertion, K1's left subtree is 2 higher than the right subtree
- That is, h(K2) = h(D) + 2, while
- h(K2) = h1 + 1
- h(D) = max(h3, h4) + 1
- Combining the equations ⇒
- h1 = max(h3, h4) + 2 = h2 + 1
- [PS] |h3 - h4| <= 1
- <Adjustment Strategy> Big right rotation. The result is shown above; balance analysis below:
- ① K1: left subtree height h2 = right subtree height max(h3, h4) + 1 = h1 - 1
- ② K2: left subtree height h1 = right subtree height h2 + 1
- Both nodes are balanced again; in fact the height difference is exactly 0 at K1 and at K2, so the result is, one might say, terrifyingly balanced
② LR
- ⭐ Key point ⭐: Analyze the relationship between h1, h2, h3, h4 [Write equations]
- Analysis
- Nodes are inserted one at a time, and the newly inserted node lands in subtree B or C, which is what causes the LR-type imbalance
- Before insertion, it must be balanced
- After insertion, K1's left subtree is 2 higher than the right subtree
- That is, h(K2) = h(D) + 2, while
- h(K2) = max(h2, h3) + 2
- h(D) = h4
- Combining the equations ⇒
- max(h2, h3) = h4 = h1
- [PS] |h2 - h3| <= 1
❗ Remember again: the check always fires at the first imbalance met during backtracking, so every subtree below the adjustment point is still balanced
- <Adjustment Strategy> Small left rotation + big right rotation. Result see below:
- First perform a small left rotation to convert to LL type, then a big right rotation [LL adjustment strategy]
- Balance analysis
- The heights of the resulting left and right subtrees are determined by h1, h2, h3, h4
- And their height difference does not exceed 1 [at most one of h2, h3 can be smaller by 1]
Summary of Adjustment Strategies#
- Occurs at the first unbalanced node during the backtracking phase
- Key to adjustment: the height relationships of the four subtrees ABCD in the four cases
- Four cases [a combined maintain() sketch follows this list]
- LL, LR: both ultimately require a big right rotation
- LL, big right rotation
- LR, small left rotation first, then big right rotation
- RL, RR: both ultimately require a big left rotation
- RL, small right rotation first, then big left rotation
- RR, big left rotation
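A minimal C sketch of a combined rebalancing step covering all four cases, reusing height(), left_rotate() and right_rotate() from the earlier sketches; the function name maintain is illustrative. It is meant to be called at each node on the way back up from insert/erase, so the first unbalanced node met while backtracking is adjusted immediately.

```c
/* Rebalance a single node on the way back up from insert/erase. */
AVLNode *maintain(AVLNode *root) {
    if (root == NULL) return root;
    if (abs(height(root->lchild) - height(root->rchild)) <= 1) return root;
    if (height(root->lchild) > height(root->rchild)) {
        /* Left side too tall: LL or LR */
        if (height(root->lchild->rchild) > height(root->lchild->lchild)) {
            root->lchild = left_rotate(root->lchild);   /* LR: small left rotation first */
        }
        root = right_rotate(root);                      /* LL (or converted LR): big right rotation */
    } else {
        /* Right side too tall: RR or RL */
        if (height(root->rchild->lchild) > height(root->rchild->rchild)) {
            root->rchild = right_rotate(root->rchild);  /* RL: small right rotation first */
        }
        root = left_rotate(root);                       /* RR (or converted RL): big left rotation */
    }
    return root;
}
```

After an insertion the taller grandchild is strict, which is what distinguishes LL from LR; after a deletion the two grandchild heights can be equal, and in that case the single big rotation (the LL/RR branch above) already restores balance.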
[Balanced Binary Search Tree—SBTree]#
- AVL keeps balance by tree height, while an SBTree (Size Balanced Tree) keeps balance by subtree node counts
- Refer to the learning order of AVL: balance condition + balance adjustment [insertion, deletion]
In-Class Exercises#
- The binary search tree formed by the second sequence degenerates into a linked list, and the insertion order will affect the structure
- When judging imbalance, backtrack from the inserted node step by step, and adjust immediately upon finding an imbalance
Code Demonstration#
Binary Search Tree#
- 5 basic operations: initialize, destroy, search, insert, delete
- Adds the ranking problem [add a size field] and the Top-k problem [built on the ranking problem]
- [PS]
- Finding the predecessor has a caveat: for nodes of degree 0 or 1 the predecessor may not exist within the subtree, but this scenario never needs to be handled [the helper is only called for degree-2 nodes]
- Recursion must consider boundary conditions
- Learn to use macros to make the code more beautiful
- Analyzing binary trees generally involves discussing three situations: left, root, right
- Partial results
【Appendix】 Code for randomly generating input operations
- Generate the different operation types in unequal proportions [a sketch follows below]
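A minimal C sketch of such a generator; the operation names and the 7 : 2 : 1 proportions are assumptions for illustration, not the appendix's actual code.

```c
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Generator sketch: emits random operations in unequal proportions
 * (here 70% insert, 20% erase, 10% find). */
int main(void) {
    srand((unsigned)time(NULL));
    int n = 20;                            /* number of operations to generate */
    for (int i = 0; i < n; i++) {
        int r = rand() % 10, key = rand() % 100;
        if (r < 7)      printf("insert %d\n", key);
        else if (r < 9) printf("erase %d\n", key);
        else            printf("find %d\n", key);
    }
    return 0;
}
```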
AVL Tree#
- After insertion and deletion, remember to recalculate the tree height 【Always remember to maintain the structure definition】
- During insertion / deletion of nodes + during left rotation / right rotation adjustments
- ⭐ Introduced NIL nodes to replace NULL🆒[Red-black trees also need to use this]
- NULL is inaccessible
- NIL is an actual node that can be accessed
- 【Convenient for the code implementation; a sketch follows this list】
- For LL and LR the final operation is a big right rotation; for RL and RR it is a big left rotation
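A minimal C sketch of the NIL trick, reusing the AVLNode definition from the earlier sketch; the sentinel's name and the macro are illustrative. Because NIL is a real node with height 0, helpers no longer need NULL checks.

```c
#include <stdlib.h>

/* A single shared sentinel of height 0 replaces NULL, so child pointers
 * can always be dereferenced. Its children point back at itself. */
AVLNode nil_node = { 0, 0, &nil_node, &nil_node };   /* key, h, lchild, rchild */
#define NIL (&nil_node)

AVLNode *avl_new_node(int key) {
    AVLNode *p = (AVLNode *)malloc(sizeof(AVLNode));
    p->key = key;
    p->h = 1;                          /* a real leaf has height 1 */
    p->lchild = p->rchild = NIL;       /* children point at the sentinel */
    return p;
}

/* With NIL in place, height needs no NULL branch: NIL->h is simply 0. */
int nil_height(AVLNode *p) { return p->h; }
```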
Additional Knowledge Points#
- When analyzing binary tree situations, it is generally divided into three cases: left, root, right
Points for Thought#
- Is the search efficiency of a regular binary search tree always worse than that of an AVL tree?
- Not necessarily; it depends on the order of the insertion sequence, but the AVL tree's guaranteed lower limit (its worst case) is much higher
- 【Extension】
- Tree height = lifespan, number of nodes = wealth of life, different algorithms yield different results
- Education raises the baseline, not the upper limit; the upper limit depends on ability and luck
- The upper limit largely depends on an uncontrollable insertion sequence, left to fate
Tips#
- Prerequisites for learning C++: systems programming, networking, algorithms and data structures, and the C language
- The so-called ability to design and analyze algorithms is really the ability to do case analysis [split a problem into several cases] and to generalize [merge several cases into one]
- Program = Algorithm + Data Structure
- Those who improve this ability by practicing problems are gifted individuals
- Recommended Geek Column——Programming Introductory Course That Everyone Can Learn——A challenge to traditional learning methods