Course Content#
Processes have independent memory spaces, which makes inter-process communication complex, and switching between processes is expensive [it breaks the temporal locality that caches depend on], so threads were invented.
Threads#
A branch of a process [pthread], essentially a lightweight process.
- Convenient communication because multiple threads in a process share memory.
- Low context switching cost because threads share one address space, so switching threads does not require switching page tables, and cached data is more likely to stay valid.
pthread_create#
Create a new thread.
- man pthread_create
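- Prototype: int pthread_create(pthread_t *thread, const pthread_attr_t *attr, void *(*start_routine)(void *), void *arg);
- thread: output parameter that receives the thread ID [Note: pthread_t is an opaque type, so IDs must not be compared with ==, see pthread_equal]
- attr: thread attributes
- start_routine: the function the new thread executes
- arg: the argument passed to start_routine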
- Description
- Start a new thread and execute the start_routine function.
- The start_routine function can only accept one arg parameter [multiple parameters can be encapsulated in a structure]
- There are 4 ways for a thread to terminate [as a tool, the way of death is important]
- ① Suicide: call pthread_exit by itself
- Other threads in the same process can use pthread_join to receive its exit status [similar to wait]
- ② Normal death: return from the start_routine function
- Equivalent to pthread_exit
- ③ Homicide: pthread_cancel
- ④ Mutual destruction: one thread in a process calls exit, or the main thread returns from the main function
- [PS] If a thread crashes on a memory error [for example, a segmentation fault], the fatal signal terminates the whole process, so this also results in mutual destruction: all threads in the process die.
- attr can be NULL, corresponding to default attributes.
- After a successful call, the thread ID will be saved in the thread variable, and it can be used later through this ID [similar to file descriptor].
- Return Value
- 0 on success; on failure, an error number is returned [errno is not set] and the contents of thread are undefined.
——3 Ways to Terminate a Thread——#
pthread_exit#
Thread suicide.
- man pthread_exit
- ❗ When a thread commits suicide, it passes retval to the joining thread [threads are joinable by default].
- Cleanup handlers registered with pthread_cleanup_push that have not yet been popped are executed [in the reverse of the order in which they were pushed], and then the thread-specific data is released.
- Shared resources in the process will not be released [because there are sibling threads]
- Functions registered by atexit will not be called [this belongs to the process]
- After the last thread ends, the process ends with exit(0), releasing the shared resources of the process and executing the functions registered by atexit.
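- [Note] This reflects the relationship between threads and processes: a thread's exit releases only what belongs to the thread, while process-wide resources are released when the process ends.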
pthread_cancel#
Send a cancellation request to a thread [homicide].
- man pthread_cancel
- The possibility and timing of thread cancellation depend on two attributes: state and type.
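- state
- Enabled [default, PTHREAD_CANCEL_ENABLE]
- Disabled [PTHREAD_CANCEL_DISABLE]: a received cancellation request stays queued until cancellation is enabled again.
- type
- Deferred [default, PTHREAD_CANCEL_DEFERRED]: cancellation is delayed until the thread next calls a function that is a cancellation point
- Asynchronous [PTHREAD_CANCEL_ASYNCHRONOUS]: the thread can be canceled at any time, usually immediately, but the system does not guarantee this.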
exit [Process-related]#
Terminate a regular process.
- man exit
- The value passed to the parent process is: status & 0377
- Note: 0377 is octal, corresponding to eight 1s in binary, which means only the low 8 bits of status are retained.
- Functions registered by atexit and on_exit will be called in the opposite order of registration.
- Nesting is possible: a registered function can itself register further functions, and these are added to the front of the list of functions that remain to be called.
- If a registered function does not return, for example because it calls _exit or kills itself with a signal, the remaining functions are not called, and the rest of the exit processing [in particular, flushing stdio streams] is abandoned.
- A function registered multiple times is called once for each registration.
- ⭐ On exit, all open standard I/O streams are flushed and closed.
——Monitoring Thread Status——#
pthread_join#
Wait for a thread to terminate.
- man pthread_join
- Similar to the wait function in processes.
- retval receives the thread exit status.
- If the thread called pthread_exit [or returned from start_routine], *retval receives the value it passed.
- If the thread was canceled, *retval is set to PTHREAD_CANCELED.
- [Consideration] Why is retval a double pointer here?
- Surface reason: the retval passed to pthread_exit is a void *, so by convention the receiving side takes a pointer to it, a void ** [similarly, if the received data were an int, an int * would be used here]
- Deeper reason: C passes arguments by value, so pthread_join needs the address of the caller's pointer in order to modify that pointer
- There is a blog that also mentions this: Discussion on why the second parameter of the pthread_join() function is a double pointer - CSDN
pthread_detach#
Detach a thread.
- man pthread_detach
- After a thread is detached, the system automatically reclaims its resources when it terminates, so no other thread needs to block in pthread_join waiting for it [a detached thread can no longer be joined].
- Generally used with pthread_self to detach oneself.
- pthread_self: get the ID of the calling thread [self]
- Refer to Detailed Explanation of pthread_join() and pthread_detach() - CSDN
——Additional——#
pthread_yield#
Yield the processor.
- man pthread_yield
- [Similar to the effect of sleep]
- pthread_yield is nonstandard and only available on some systems; the standard call is sched_yield.
- On cooperative systems, call this function to actively yield the CPU.
- On preemptive systems, the kernel does the scheduling, so this function has little meaning; sleep can be used directly instead.
- Cooperative and preemptive, see 4 Advanced Process Management - Scheduler Classification
pthread_equal#
Compare the IDs of two threads [cannot be directly compared with ==].
- man pthread_equal
- Returns a non-zero value if the two IDs are equal, and 0 otherwise.
Thread Pool#
Keep a set of pre-created threads waiting in a pool, ready to pick up work at any time.
Basic components👇
① Task queue: stores tasks to be processed.
- [Circular queue is better]
- Basic operations: init, push, pop
② Multiple threads: always ready, reducing the time for creation and destruction.
③ Thread function: do_work()
- while(1): wait for tasks to be added
- Task queue pop: pop a task for the thread to execute
- do_work(): execute the task on the CPU
❗ Note: Both push and pop must hold a lock to prevent data races [for example, several idle threads grabbing the same task]
- See code demonstration for details.
Kernel Threads, User Threads#
Who creates threads? Thread model
- The difference between the two is mainly in scheduling: kernel threads are scheduled by the kernel; user threads are scheduled by user processes.
- Advantages of kernel threads
- ① Each kernel thread has its own time slice, so a process with more threads receives more processor time.
- ② If a kernel thread is blocked, the remaining threads in the process can continue to run. If a user thread is blocked, the entire process will be blocked.
- PS: If a kernel thread sends a sleep signal to its own process, the thread can still continue to run.
- Advantages of user threads
- ① Low context switching cost. There is no need to switch from user mode to kernel mode.
- ② The scheduling algorithm is completely controlled by the process. User processes can use their own scheduling algorithms, so they have more autonomy; the scheduling of kernel threads is a black box to users.
- Therefore, a hybrid thread model that uses both kernel threads and user threads can be designed to combine the advantages of both.
Code Demonstration#
Simple Use of Multithreading#
- Pay attention to the use of the pthread_create function.
- If there is no usleep after creating the threads, or the usleep time is too short, the following may occur:
- The main thread returns from main, which ends the process and kills all child threads before they finish.
- In this case, some outputs may appear twice, such as ① and ② [③ is the normal output]
- Guess: this is an output-buffer problem. When a thread is cut off mid-output, the contents of the buffer are written out again [the buffer had not been updated in time]
- [PS] fflush does not solve it either, possibly because the thread ends so abruptly
- Waiting for each thread with pthread_join, rather than sleeping, removes the dependence on timing.
- Note: all threads can read and write the value at the same address.
Thread Pool#
thread_pool.h
- Definition of the task queue and basic operations
thread_pool.c
- Note: locking, signaling; checking if the queue is full/empty; checking if the pointer has reached the end of the queue
1.test.c
- The buffer that stores the data is a two-dimensional array, so each task has its own row, which prevents the main thread from changing the data through the address while a worker thread is still reading it.
- The magic of usleep: avoid consuming too much CPU time in the while(1) loop.
- pthread_detach is generally used with pthread_self.
- fgets: read a line from a file stream into a buffer
- man fgets
- Read line by line
- Output effect
- Basic output: push [task queue output]; pop [task queue output] + do_work [thread output]
Additional Knowledge#
- When compiling files that use thread-related functions, remember to link with -lpthread [or compile and link with -pthread].
Tips#
- Creating a process also creates its main thread.
- On a single CPU core, threads still take turns rather than running in parallel, so multithreading does not speed up CPU-bound work on a single-core CPU [it can still help overlap blocking I/O].
- Refer to How do multi-threaded programs work on single-core CPUs and multi-core CPUs - cnblogs