5 Heaps and Priority Queues

Course Content#

Complete binary trees are numbered starting from 1 by default
- This ensures concise numbering of left and right children
- [Otherwise], the left child's number should be 2 * i + 1, and the right child's number should be 2 * i + 2
The property of being numbered → can be stored using sequential structure without the need to save the address of the next node
- Conversion from binary tree to array
- Store the nodes in the array in the order of level order traversal, with the nodes of each level stored consecutively
- Here, it is stored starting from index 0, which is okay but not recommended

The essence is a priority queue, but a heap is just one way to implement a priority queue; a heap is maintained based on a complete binary tree

The structure is the same as a complete binary tree, but an additional property is maintained👇
- Max heap: For any triple, the root node has the maximum value
- Min heap: For any triple, the root node has the minimum value

All operations need to maintain their own properties!

Insertion at the end [+ adjustment]
- Insert the element at the end, then adjust from bottom to top, up to the root node at most
  - Continuously adjust in the triple, compare with the parent node one by one, keep swapping as long as it is swapped, until the root node is reached
  - The numbers being compared are actually i and i / 2
- Time complexity: O(log[2]N), which is the height of the tree
Pop at the top [+ adjustment]
- Take the last element to the top of the heap, then adjust from top to bottom until the element has no child nodes
  - Continuously adjust in the triple, compare with the child nodes one by one, keep swapping as long as it is swapped, until there are no children
  - Why choose the last element to put at the top of the heap?
  - Because if you take an element from another position, who will fill the gap left by this element?
- The numbers being compared are actually i, 2 * i, and 2 * i + 1
- Time complexity: the same as the height of the tree
[PS]
- Max heap: Convenient for finding the first and second largest elements [root node, second layer]
  - Finding the third, fourth largest... is not that simple, it is not easy to determine the layer they are in
- ❗ The above process
  - Inserting at the end and popping at the top👉same as a queue
  - The popped element is the extreme value👉also known as a priority queue

According to its property, popping all elements will result in a sorted sequence
⭐ Change of thinking: Popping the top element of the heap ==> Swapping the top element with the last element of the heap
- [With this operation]
- All elements of the max heap are popped👉the original array stores a sorted sequence from smallest to largest
- [At this point, a special min heap is obtained from the max heap, because the popped elements are usually placed at the end of the original array]
Time complexity: O(NlogN)
- The time is consumed by the adjustment operation, the time complexity of each adjustment is O(logN), there are N elements in total, and N - 1 adjustments are needed
- The time complexity of the pop operation is O(1)
- The time efficiency will not degrade

[To use heap sort, you need to maintain a heap first, which means building a heap from a regular sequence. There are two approaches below]

Also known as the insertion method for building a heap

Adjust from bottom to top using the aforementioned insertion adjustment method
- Start from the 0th level [assuming the root node is in the 0th level by default], calculate the maximum number of adjustments for each level:
- The number of adjustments for the i-th level is i, and the number of nodes in the i-th level is 2 ^ i → the total number of adjustments for the i-th level is i * (2 ^ i)
The worst time complexity for building a heap is O(NlogN), the calculation process is as follows:
- The total number of adjustments S = (n - 1) * 2 ^ (n + 1) + 1, the process is as follows:
  - Using the method of canceling out terms
- The above n corresponds to [number of levels - 1] [starting from the 0th level to the n-th level, the number of levels is n + 1], if the total number of nodes is N, then n ≈ log[2]N
- ❗ [Conversion from number of levels n to number of nodes N] Substituting n ≈ log[2]N into S, we get S ≈ Nlog[2]N
- That is, the worst time complexity is: O(NlogN)

Also known as the linear build heap method

As shown in the figure
- Conventional approach: The deeper the level, the more adjustments are needed, that is, the greater the weight
- ❗ Can we reverse the thinking, put the larger weight at the front, and reduce the weight of the layer with more nodes below
- Linear approach: Yes! Start from the second-to-last level and adjust [from top to bottom]
🆒 The worst time complexity for building a heap is O(N)
- Similarly, using the method of canceling out terms, we get the total number of adjustments S = 2 ^ (n + 1) - 2 - n
- Substituting the number of levels n into the number of nodes N, we get S ≈ 2N - 2 - log[2]N
- That is, the worst time complexity is: O(N)
⭐ Recommended video Linear Time BuildHeap——Youtube
- Compares the conventional build heap method and the linear build heap method, and has intuitive animated demonstrations to deepen the impression

Adjustment operation, clever use of variables ind, temp, l, r
- ind stops adjusting when certain conditions are met
- Pay attention to whether ind [for insertion and deletion] and r [for deletion] exist

Here, the [heap sort] considers [building a heap] + [sorting]
- O(N) + O(NlogN) = O(NlogN)
No heap structure is defined, an array is used to simulate the heap
Line 33: The heap node numbering starts from 1, which is more convenient, so arr-1 is used to shift the array index one int size to the left
Encapsulated the top-down adjustment method, and the position that is adjusted first can be seen from the figure below, which comes from the recommended video Linear Time BuildHeap——Youtube

❌The time complexity of an n-level loop corresponds to O(N^n), we need to abandon fixed thinking and analyze the essence!