Course Content#
Definition and Initialization of Arrays#
- Definition: A sequential collection of the same type of elements of a fixed size
- Initialization
- int a[100] = {2, 3}; // a[0] = 2, a[1] = 3
- int a[100] = {0}; // Array clearing operation: initializes every element to 0
- Actually only defines the 0th element as 0, but causes other elements to be automatically initialized to 0
- Others
- Starts from index 0
- Array space size: total size of all variables
- Storage space is contiguous
- int a[3]
Address | 0123 | 4567 | 89[10][11] |
---|---|---|---|
Element | a[0] | a[1] | a[2] |
-
- The arrays defined above are all static arrays
- Not recommended operation: int n; scanf("%d", &n); int arr(n);
- Index-value mapping
- Given the index, the time efficiency of accessing the value is O(1)
- The arrays defined above are all static arrays
Prime Sieve#
Core Idea: Use primes to eliminate all composite numbers
- Learn as an algorithm framework, with many variations
- Prime: has only two factors, 1 and itself
- 1 is neither prime nor composite
- 2 is the smallest prime
- A prime must be an odd number (except for 2), but an odd number is not necessarily a prime
- Utilizing array structure. An array can represent three types of information:
- Index
- Value mapped by the index
- Sign of the value
- Algorithm measurement standards and performance
- Space complexity: O(N), an array of size N is created for N numbers
- Time complexity: O(N * logN), O(N * loglogN)--optimized version, approaching O(N)
- Not O(N) due to duplicate marking
- The base of the logarithm does not matter, base 2 is already quite small
- How to calculate: amortization idea, calculate expectation based on time complexity and probability...
- Marking: Prime, 0; Composite, 1
- Utilizing two states of the array: index → number, value → mark
- Prime sieve extension
- Find the minimum and maximum prime factors of all numbers within a range
- For example, for 12: 2, 3
Linear Sieve#
- Start with Euler's 7th problem
- Find the 10001st prime number
- Can use enumeration method O(N * sqrt(N)), prime sieve
- Estimate the size of the 10001st prime number, what is the upper limit?
- Rule: The range of primes ranked within 200,000 will not exceed 20 times the rank
- Thoughts from the prime sieve
- How many times is 6 marked? 2 times: 2, 3
- How many times is 30 marked? 3 times: 2, 3, 5
- If a number N has m different primes in its prime factorization, how many times is N marked? m times
- Can it be marked only once? 👇
- Linear sieve
- Both space and time are O(n)
- Use an integer M to mark composite number N
- The relationship between M and N satisfies four properties
- The smallest prime factor of N is p
- N = p * M
- ❗ p must be less than or equal to the smallest prime factor in M
- Similar to the strategy of capturing Meng Huo, only mark on the last occasion
- Use M * P' (the set of all primes not greater than [the smallest prime of M]) to mark N1, N2, ...
[i and iii have a somewhat equivalent meaning]
- Code
-
- Core idea: N = M (↑ as large as possible) * p (↓ as small as possible)
- Enumerate M, gradually increase p until p meets the termination condition: property iii
- Compared to M, p is easier to control!
- The key to avoiding duplication is property iii
- Note: Although there are two nested for loops, the time complexity is O(N) because the break in the second for loop means the entire array is traversed only once
- The difference between optimizing the initial condition of the prime sieve to i * i is? The prime sieve still has duplicate marking situations
-
Binary Search Method#
-
Sequential search → binary search: O(N) → O(logN)
-
Key premise: The search sequence is monotonic
-
Code
-
-
-
Mainly demonstrates three parts: ① Binary search in an array ② Binary search in a function ③
-
For the implementation part of binary search in functions
- Should be able to unify binary_search_f_i and binary_search_f_d into one function
- Focus on later learning, can you identify the return type of the passed function pointer?
- But there are differences in implementation, so the significance should not be great
-
The commented code in the main function tests the results of binary search in arrays and functions, as follows:
-
- Using functions for binary search can be more flexible
-
Double type equality: use the difference less than EPSL to judge
- Remember to #undef EPSL
-
Details see comments
- Learning order: binary_search 👉 binary_search_f_i 👉 binary_search_f_d
-
-
Recursive version code
-
-
The parameters of the function differ from the iterative method, and the search boundaries need to be determined
-
Newton's Iteration Method#
-
-
Purpose: To solve for the value of x when f(x) = 0
-
Conditions: The function is differentiable; convergent
-
Iteration formula: x2 = x1 - f(x1) / f'(x1)
-
Core idea: Each iteration brings x2 closer to the required value of x, until the absolute value of f(x) is less than EPSL
-
Code
-
-
⭐ Remember to take the absolute value for the loop termination condition!
-
Is the initial value x = n / 2.0 fixed?
- No
- This value is an imagined x that is closest to the solution, which is fine on both sides of the solution
- However, it cannot be 0, as the derivative would be 0, leading to a division by 0
-
Wrap the Newton function to get the my_sqrt function
-
This is not binary search! Binary search requires monotonicity~
-
Preprocessing Commands#
- Commands starting with #
- #include <stdio.h> → During preprocessing, the stdio.h file is copied to the current location unchanged
- Macro Definitions
- Define symbolic constants: constant → symbol.
#define PI 3.1415926
#define MAX_N 1000
- Convenient for batch modification of values
- Foolish expressions
#define MAX(a, b) (a) > (b) ? (a) : (b)
#define S(a, b) a * b
- Foolish eg. #define S(a, b) a * b
- S(2 + 3, 4 + 5) → Simple replacement → 2 + 3 * 4 + 5 is equivalent to 2 + (3 * 4) + 5
- No calculations are done during the replacement!
- Define code segments (advanced)
#define P(a) { \
printf("%d\n", a); \
}
-
Macros can only accept one line of definition; to break lines, use a backslash '' (continuation character)
-
Predefined macros (system encapsulated)
-
-
Macros are surrounded by two underscores
-
DATA, TIME: the date and time at preprocessing, which do not change if not preprocessed
- Can achieve functionality similar to fingerprint recognition? Pre-stored relationships?
-
Non-standard macros: Some environments may not define them, and different environments may have different standards, so portability is poor
- How to handle this? Conditional compilation can be used 👇
-
-
Conditional Compilation
- #ifdef DEBUG
- Description: Whether the DEBUG macro is defined
- Note: DEBUG must be a macro, not a variable!
- Macros take effect after preprocessing, while variables take effect after compilation
- #ifndef DEBUG Whether not defined
- #if MAX_N == 5 Whether macro MAX_N equals 5
- Can be used in games to determine the user's mobile version
- #elif MAX_N == 4
- #else
- Must end with #endif**!!!
- 【Coding Standards】
- Similar to va_start, va_end
- #ifdef DEBUG
-
Simplified Compilation Process
-
-
C Source .c
👇 Preprocessing stage (pre-compilation) (gcc -E): includes header files, macro replacements, and other preprocessing commands, replacing all # commands
- Compile Source .c
👇 Syntax check, compile (gcc –S)
- Assembly Source .s, .asm
👇 Assembly (gcc -c)
- Object File .o (on Linux) or .obj (on Windows)
👇 Linking (gcc)
- Executable Program a.out (on Linux) or a.exe (on Windows) (binary file)
PS: Finally, there is a loading process, mainly establishing the mapping relationship between the executable file, virtual address, and physical address
-
Strings#
- Character Arrays
- Definition: char str[size];
- Initialization
char str[] = "hello world";
char str[size] = {'h', 'e', 'l', 'l', 'o'};
-
Line1
- The size in [] can be omitted, the system will calculate automatically
- The system will automatically add a terminator '\0', the size of the character array: 11+1
-
Line2
- Size must be at least 6, considering the terminator '\0' takes one space
- Extra spaces are filled with '\0'
- If size is not added, '\0' must be added at the end of the array, the system will not add it automatically
- Size must be at least 6, considering the terminator '\0' takes one space
-
'\0'
- A termination marker
- Corresponds to 0 at the lower level: false
- '\0' can serve as a termination condition, e.g., for loops reading strings
-
Is there a '\0' after characters? No, it is a characteristic of strings
-
Related Operations
-
-
Header File: <string.h>
-
strlen(str)
- Returns the index of character '\0', length does not include '\0'
- PS: sizeof() considers the actual memory occupied by the variable, including '\0', measured in bytes
-
strcmp(str1, str2);
- Compares ASCII values in dictionary order
- Starts from the first character
- Return value is the difference (0 if equal)
- Compares ASCII values in dictionary order
-
strcpy(dest, src);
-
The end markers for comparison and copying: look for character '\0', what if '\0' is lost? Hence there are
-
Safer comparisons and copies: strncmp, strncpy
- For cases where '\0' might be accidentally lost
- Compares or copies at most n bytes
-
Memory Related
-
memset(str, c, n)
-
Not limited to string operations, str corresponds to an address
-
Here is an example of operations on arrays
-
memset(arr, 0, sizeof(arr))
- Initializes, clears to 0
-
memset(arr, 1, sizeof(arr))
- This does not set every position to 1, because it operates on each byte, and int occupies 4 bytes
- 0000...01000...001000....0001...
-
memset(arr, -1, sizeof(arr))
- It is indeed -1, because -1 in one byte is 11111111
-
-
-
-
-
Header File: <stdio.h>
-
The combination of both can do format concatenation: one for input, one for output
-
The str1 in scanf is analogous to terminal input, which is replaced with string input
In-Class Exercise#
Implement a BUG-Free MAX Macro#
-
-
Idea: ① Calculate the correct output manually 👉② First write the simplest implementation with BUGs and have friendly output display 👉③ Find and solve problems one by one for incorrect outputs
-
① Correct output as shown
-
② First write the one with BUGs and implement friendly output display
#define MAX(a, b) a > b ? a : b
#define P(func) { \
printf("%s = %d\n", #func, func); \
}
- Line2-4: Friendly output display
- #func directly outputs its string representation
- Without #, it calls its value
- ③ Find and solve problems one by one for incorrect outputs
-
(The order of solving BUGs is also important, but the order of BUGs here is good)
-
At this point, look at the error situation
-
-
Errors caused by macros can be seen in the preprocessed file
- gcc -E *.c outputs the code after preprocessing with macros replaced
-
First correction
-
- Solution: Can use parentheses in the red box to increase precedence
-
-
#define MAX(a, b) (a > b ? a : b)
-
- Second correction
-
-
The fourth result is 2, reasoning as shown in the image
-
Note the conditional operator '?:'
-
Has a very low precedence: lower than '>'
-
When multiple '?:' exist, the associativity direction is right to left
-
-
-
Solution: Use parentheses for each variable to increase precedence
-
- Second correction
#define MAX(a, b) ((a) > (b) ? (a) : (b))
-
- Third correction
-
-
a++ ran twice
- It should first take a for comparison, then perform a+1
-
Solution: Assign the pre-increment value of a to a variable; define it as an expression generated by a code segment
-
- Third correction
#define MAX(a, b) ({\
__typeof(a) _a = (a);\
__typeof(b) _b = (b);\
_a > _b ? _a : _b;\
})
-
__typeof(a++) does a++ execute?
- When used in an expression, it does not execute, it only retrieves its type
-
({}) expression
-
The value is the value of the last expression in parentheses, and the data type is that of the last expression
-
Refer to Using parentheses and braces in C language combined with && compound statements-CSDN
-
-
Removing {} or () will not work!
- () can only contain comma expressions
- Comma expressions take the value of the last expression
- {} is a code segment, which has no return value
- If defined directly as a macro function
- Directly printf output will lose the essence of MAX returning the maximum value
- ❓ Use return to return value
- Needs declaration
- But using an expression to represent the return value is smarter
- () can only contain comma expressions
-
-
Final Version Code
-
Result is all correct!
Implement LOG Macro Printing#
-
-
-
Code
-
-
When delivering work, to omit log information, conditional compilation can be used
- Use "gcc -DDEBUG" during compilation to pass in the DEBUG macro definition
- Define an empty macro replacement after #else
-
Variadic Macros
- Variadic '...' just needs a name. eg. args...
-
Using "#" and "##" in Macros
- "#" : Take as string
- "##" : Concatenate
-
Solve the situation where the log macro input is an empty character: If empty, the "," after frm will be automatically removed, refer to the preprocessing result 👇
-
-
Refer to Text Replacement Macros-cppreference
-
-
The difference between ab and a##b
- ab will not substitute a and b before concatenation
- The parameters a and b that need to be ## concatenated can be undefined
- The preprocessor will concatenate these undefined variable names, and the defined variable names after concatenation can be defined
-
-
Highlighted Notes#
Code Demonstration#
Prime Sieve#
Code
-
-
Dual logic, control indentation
- if continue
-
The front part of the prime array can simultaneously count and store (used to mark)
- Initially, prime[0] and prime[1] are not used, remaining idle
- There will be no overtaking issue: even numbers are definitely composite, and primes appear at least one apart
-
The initial condition of the second for loop can be optimized
- Initial Condition: i * 2 → i * i
- Time efficiency: O(Nlogn)→O(Nlognlogn)
- Because composite numbers smaller than i * i must have been marked by numbers smaller than i
- This reduces the number of duplicate markings in the earlier part, but not completely
- To completely avoid duplicate marking, a linear sieve must be used
- Danger point: i * i grows exponentially, more likely to overflow the maximum value that int can store
- Can a check be added in advance? But it may lead to greater time consumption
- Change to a data type with a higher upper limit to store i * i
Arrays#
Code
-
-
Output
-
- Details 👇
-
Declaration and Initialization#
- Programming Tip: Open 5 extra positions to avoid overflow, which can reduce the bug rate
#define max_n 100
int arr[max_n + 5];
- The variable space in the function is defined in the stack area
- The stack area can be understood as a badminton tube, last in first out
- Default stack size: 8MB≈8000000Byte (bytes)
- Arrays declared within a function may not be clean and need manual initialization
- = {0};
- scanf input
for (int i = 0; i < max_n; i++) {
scanf("%d",arr + i);
}
-
- arr + i is equivalent to &arr[i]& cannot be omitted!
- Arrays declared outside the function
- The system will automatically clear them to 0, similar to the = {0} operation
- The size that can be allocated is limited by the computer's memory, basically unlimited
- Arrays defined with malloc, calloc, realloc are in the heap area, even if defined within a function
Space Occupation, Address#
- The control character corresponding to sizeof() is %lu, unsigned long integer
- The array name arr represents the address of the first element of the array, i.e., &arr[0]
printf("%p %p\n", arr, &arr[0]); // Address values are the same
- Address + 1 offsets how many bytes depend on the size of the element represented by the address
Parameter Passing#
- Condition: The external and internal parameter representations must be consistent
- One-dimensional array: int arr[100];
- void func(int *num)
- Two-dimensional array: int arr2[100][100];
- void func2(int **num) will warn
- num + 1 and arr + 1 have inconsistent representations
- num is a pointer to a pointer to int
- The address difference between num and num + 1 is the size of one pointer, which is 8 bytes in a 64-bit operating system
- The address difference between arr and arr + 1 is the size of one one-dimensional array, which is 4 * 100 bytes
- void func2(int (*num)[100]) is valid
- void func2(int **num) will warn
- Three-dimensional array: int arr3[100][100][32]
- void func3(int (*num)[100][32]) is valid
- (*num)[100][32] can be written as num[][100][32]
- (*num) and num[] have the same representation, but they are essentially different
- One-dimensional array: int arr[100];
- ❓ Is q pointing to nil? That is 0x0
Additional Knowledge Points#
- An array can represent three types of information
- int arr[100]; // When arr is passed as a parameter to a function, it is the address of the first element of the array
- Is 1 negated to -2? Let's derive it
- 1: 0000...001
- Negation: 1111...110
- The sign bit remains unchanged, negating and adding one gives: 1000...010
- That is -2
- Conclusion: The bitwise negation of all positive integers is their own +1 negative number
- When including <math.h>, add -lm during compilation, using gcc *.c -lm
- Coding habit: No more than 80 characters per line
- Control character "%X": Uppercase hexadecimal
Points for Thought#
- Which is more powerful, functions or macros?
- Macros can generate functions; aren't macros the father of functions?
Tips#
- vim automatically completes the format of header files: add spaces
- Enter the ~/.vimrc file and modify the corresponding format in SetTitle()
- Explore macro expansion and nesting yourself
- Refer to the reference book chapters 8.1, 8.4, and chapter 15