Bo2SS

6 Multithreading Programming Basics

Course Content#

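Inter-process communication is complex [each process has its own independent memory space] and process context switches are expensive [switching address spaces throws away the cached state that temporal locality relies on], so threads were introduced.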

Threads#

A thread is one execution branch of a process [pthread], essentially a lightweight process.

  • Communication is convenient because all threads in a process share the same memory.
  • Context switches are cheap because the address space is shared, so switching between threads of the same process does not require switching memory mappings.

pthread_create#

Create a new thread.

  • man pthread_create
  • Prototype
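    • int pthread_create(pthread_t *thread, const pthread_attr_t *attr, void *(*start_routine)(void *), void *arg);
    • thread: receives the thread ID [Note: pthread_t is an opaque type, so portable code should not compare IDs with ==; see pthread_equal]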
    • attr: attributes
    • arg is the argument for start_routine
  • Description
    • Start a new thread and execute the start_routine function.
      • The start_routine function can only accept one arg parameter [multiple parameters can be encapsulated in a structure]
    • A thread can terminate in 4 ways [since threads are a tool, how they die matters]
      • ① Suicide: the thread calls pthread_exit itself
        • Another thread in the same process can call pthread_join to collect its exit status [similar to wait]
      • ② Natural death: the thread returns from start_routine
        • Equivalent to calling pthread_exit with the returned value
      • ③ Homicide: another thread cancels it with pthread_cancel
      • ④ Mutual destruction: any thread in the process calls exit, or the main thread returns from main
        • [PS] If one thread crashes on a memory error [for example, a segmentation fault], the result is usually also mutual destruction: every thread in the process dies.
    • attr can be NULL, corresponding to default attributes.
    • After a successful call, the thread ID will be saved in the thread variable, and it can be used later through this ID [similar to file descriptor].
  • Return Value
    • 0 for success; otherwise, failure.
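
The points above can be sketched as follows. This is a minimal illustration, not the course's own code: the struct add_args and the functions add_worker and add_in_thread are hypothetical names invented here to show how several parameters can be packed into one arg.

```c
#include <pthread.h>

/* Hypothetical argument bundle: start_routine accepts only one void *. */
struct add_args {
    int a;
    int b;
    int sum; /* filled in by the thread */
};

static void *add_worker(void *arg) {
    struct add_args *p = arg;
    p->sum = p->a + p->b;
    return NULL; /* returning from start_routine ends the thread */
}

/* Computes a + b on a new thread. Returns 0 on success and stores the
 * result in *out; otherwise returns the pthread error number. */
int add_in_thread(int a, int b, int *out) {
    struct add_args args = { a, b, 0 };
    pthread_t tid;

    /* NULL attr means default attributes; tid receives the thread ID. */
    int err = pthread_create(&tid, NULL, add_worker, &args);
    if (err != 0)
        return err; /* pthread functions return the error number directly */

    /* Wait for the thread so that args stays valid while it is in use. */
    err = pthread_join(tid, NULL);
    if (err != 0)
        return err;

    *out = args.sum;
    return 0;
}
```

Calling add_in_thread(2, 3, &r) from main leaves 5 in r. Because args lives on the caller's stack, the join before returning is what keeps the pointer handed to the thread valid.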

——3 Ways to Terminate a Thread——#

pthread_exit#

Thread suicide.

  • man pthread_exit
  • void pthread_exit(void *retval);
  • ❗ When a thread commits suicide, it passes retval to whichever thread joins it [threads are joinable by default].
  • Cleanup handlers registered with pthread_cleanup_push that have not yet been popped are run first [in reverse order of registration]; after that, the destructors for the thread's thread-specific data are called.
    • Resources shared by the process are not released [the sibling threads still need them]
    • Functions registered with atexit are not called [they belong to the process]
  • After the last thread in the process terminates, the process ends as if by exit(0), which releases the process-shared resources and runs the functions registered with atexit.
  • [Note] Relationship between threads and processes
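
A small sketch of thread suicide with a cleanup handler, under the assumption that the thread owns a heap buffer it must free. The names release_buffer, worker, exit_value_demo and the flag cleanup_ran are hypothetical and exist only for this illustration.

```c
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>

static int cleanup_ran = 0; /* hypothetical flag, set by the cleanup handler */

static void release_buffer(void *p) {
    free(p);
    cleanup_ran = 1;
}

static void *worker(void *arg) {
    (void)arg;
    char *buf = malloc(64);

    pthread_cleanup_push(release_buffer, buf);
    /* Ends only this thread: release_buffer runs, then retval is handed
     * to whoever joins. atexit handlers are NOT run here. */
    pthread_exit((void *)(intptr_t)42);
    pthread_cleanup_pop(0); /* never reached, but push and pop must be paired */

    return NULL;
}

/* Returns the value the worker passed to pthread_exit, or -1 on error. */
long exit_value_demo(void) {
    pthread_t tid;
    void *ret = NULL;

    if (pthread_create(&tid, NULL, worker, NULL) != 0)
        return -1;
    if (pthread_join(tid, &ret) != 0)
        return -1;
    return (long)(intptr_t)ret;
}
```

Here exit_value_demo() returns 42 and cleanup_ran ends up as 1, which shows that the handler ran before the joiner received retval. Packing a small integer into the void * is a common shortcut; returning a pointer to heap memory is the more general option.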

pthread_cancel#

Send a cancellation request to a thread [homicide].

  • man pthread_cancel
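  • int pthread_cancel(pthread_t thread);
  • Whether and when the target thread is actually canceled depends on two of its attributes: cancelability state and type.
  • state
    • Enabled [default]: the thread can be killed
    • Disabled: a cancellation request stays queued until the thread enables cancellation again.
  • type
    • Deferred [default]: cancellation is postponed until the thread next calls a function that is a cancellation point
    • Asynchronous: the thread can be canceled at any time [usually immediately, but the system does not guarantee it].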
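
A minimal sketch of the default case [state enabled, type deferred]. The names sleeper and cancel_demo are hypothetical; the example relies on sleep being a cancellation point.

```c
#include <pthread.h>
#include <unistd.h>

static void *sleeper(void *arg) {
    (void)arg;
    for (;;) {
        /* sleep() is a cancellation point, so with the default deferred
         * type a pending cancellation request takes effect here. */
        sleep(1);
    }
    return NULL;
}

/* Returns 1 if the joined status is PTHREAD_CANCELED, 0 if it is not,
 * and -1 on error. */
int cancel_demo(void) {
    pthread_t tid;
    void *status = NULL;

    if (pthread_create(&tid, NULL, sleeper, NULL) != 0)
        return -1;
    if (pthread_cancel(tid) != 0) /* only sends the request */
        return -1;
    if (pthread_join(tid, &status) != 0) /* wait until it is acted on */
        return -1;
    return status == PTHREAD_CANCELED;
}
```

pthread_cancel returns as soon as the request is queued, so the join is what confirms that the thread really died. A thread that never reaches a cancellation point would keep running under the deferred type.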

exit#

Terminate the calling process normally.

  • man exit
  • void exit(int status);
  • The value passed to the parent process is: status & 0377
    • Note: 0377 is octal, which is eight 1 bits in binary, so only the low 8 bits of status are kept.
  • Functions registered with atexit and on_exit are called in the reverse order of their registration.
    • Nesting is possible: a registered function may itself register further functions, which are placed at the front of the list of functions still to be called.
    • If a registered function does not return, for example because it calls _exit or kills itself with a signal, the remaining functions are not called and the rest of exit processing [in particular, flushing the stdio streams] is abandoned.
    • A function that was registered several times is called once per registration.
  • ⭐ During exit, all open standard I/O streams are flushed and closed.
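
The status truncation and the reverse call order can be observed with a small sketch like the one below. It forks a child so that the calling process survives; the names exit_demo, first_registered, second_registered and out_fd are hypothetical, and the pipe is just one convenient way to see the order.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int out_fd = -1; /* write end of the pipe, used only in the child */

static void first_registered(void) {
    if (write(out_fd, "1", 1) != 1) { /* nothing useful to do while exiting */ }
}

static void second_registered(void) {
    if (write(out_fd, "2", 1) != 1) { /* nothing useful to do while exiting */ }
}

/* Forks a child that registers two handlers and calls exit(status).
 * Stores the order in which the handlers ran in order[] and returns the
 * exit status seen by the parent, or -1 on error. */
int exit_demo(int status, char order[3]) {
    int fds[2];
    if (pipe(fds) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) { /* child */
        close(fds[0]);
        out_fd = fds[1];
        atexit(first_registered);
        atexit(second_registered);
        exit(status); /* runs the handlers, last registered first */
    }

    close(fds[1]);
    size_t got = 0;
    while (got < 2) {
        ssize_t n = read(fds[0], order + got, 2 - got);
        if (n <= 0)
            break;
        got += (size_t)n;
    }
    order[got] = '\0';
    close(fds[0]);

    int wstatus = 0;
    if (waitpid(pid, &wstatus, 0) < 0 || !WIFEXITED(wstatus))
        return -1;
    return WEXITSTATUS(wstatus); /* only the low 8 bits arrive */
}
```

With status 256 + 7 the parent sees 7, because 263 & 0377 is 7, and order holds "21": the handler registered second runs first.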

——Monitoring Thread Status——#

pthread_join#

Wait for a thread to terminate.

  • man pthread_join
  • int pthread_join(pthread_t thread, void **retval);
  • Similar to the wait function for processes.
  • retval receives the exit status of the target thread.
    • If the thread committed suicide, the value it passed to pthread_exit is copied into *retval.
    • If the thread was canceled, PTHREAD_CANCELED is stored in *retval.
  • [Consideration] Why is retval a double pointer here?
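
One way to approach the question: a thread's exit status is itself a void *, and pthread_join has to write that value into a variable owned by the caller, so it needs the address of a void *, that is, a void **. The sketch below illustrates this; make_answer and join_result_demo are hypothetical names.

```c
#include <pthread.h>
#include <stdlib.h>

static void *make_answer(void *arg) {
    (void)arg;
    /* Heap memory, not a local variable: the result must outlive the thread. */
    int *result = malloc(sizeof *result);
    if (result != NULL)
        *result = 42;
    return result; /* the exit status is this pointer */
}

/* Returns the int produced by the thread, or -1 on error. */
int join_result_demo(void) {
    pthread_t tid;
    void *status = NULL; /* will receive the thread's void * exit status */

    if (pthread_create(&tid, NULL, make_answer, NULL) != 0)
        return -1;
    /* pthread_join stores the status into our variable, so we pass its
     * address: a pointer to a void *, hence the void ** parameter. */
    if (pthread_join(tid, &status) != 0)
        return -1;
    if (status == NULL)
        return -1;

    int value = *(int *)status;
    free(status);
    return value;
}
```

This is the same pattern as any C out-parameter: to let a function set an int you pass int *, and to let it set a void * you pass void **.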

pthread_detach#

Detach a thread.

  • man pthread_detach
  • int pthread_detach(pthread_t thread);
  • Once a thread is detached, the system reclaims its resources automatically when it terminates, so no other thread has to block in pthread_join waiting for it. [A detached thread can no longer be joined.]
  • Usually combined with pthread_self so that a thread detaches itself.
    • pthread_self: get the ID of the calling thread [self]
  • Refer to Detailed Explanation of pthread_join() and pthread_detach() - CSDN
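
A sketch of a thread detaching itself. Since a detached thread cannot be joined, the example uses a mutex and a condition variable to learn when the worker is done. That signalling, along with the names detached_worker and detach_demo, is an assumption made for this illustration only.

```c
#include <pthread.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t finished = PTHREAD_COND_INITIALIZER;
static int done = 0;           /* set once the worker has reported */
static int detach_result = -1; /* return value of pthread_detach */

static void *detached_worker(void *arg) {
    (void)arg;
    /* Detach myself: my resources are reclaimed automatically on exit. */
    int rc = pthread_detach(pthread_self());

    pthread_mutex_lock(&lock);
    detach_result = rc;
    done = 1;
    pthread_cond_signal(&finished);
    pthread_mutex_unlock(&lock);
    return NULL; /* nobody joins this thread */
}

/* Returns what pthread_detach returned in the worker (0 on success),
 * or -1 if the thread could not be created. */
int detach_demo(void) {
    pthread_t tid;
    if (pthread_create(&tid, NULL, detached_worker, NULL) != 0)
        return -1;

    pthread_mutex_lock(&lock);
    while (!done) /* loop guards against spurious wakeups */
        pthread_cond_wait(&finished, &lock);
    int rc = detach_result;
    pthread_mutex_unlock(&lock);
    return rc;
}
```

In a real program the creator often does not wait at all; the waiting here only exists so that the result of pthread_detach can be reported.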

——Additional——#

pthread_yield#

Yield the processor.

  • man pthread_yield
  • [The effect resembles a very short sleep: the calling thread gives up the CPU and goes to the back of the run queue]
  • pthread_yield is nonstandard and only available on some systems; the standard call is sched_yield.
    • On cooperative systems, a thread calls this function to give up the CPU voluntarily.
    • On preemptive systems, the kernel does the scheduling, so this function matters much less. Calling sleep directly also works.
    • Cooperative and preemptive, see 4 Advanced Process Management - Scheduler Classification

pthread_equal#

Compare the IDs of two threads [cannot be directly compared with ==].

  • man pthread_equal
  • int pthread_equal(pthread_t t1, pthread_t t2);
  • Returns a non-zero value if they are equal.
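
A short sketch of comparing thread IDs. The struct id_check and the functions compare_ids and id_check_demo are hypothetical names; the new thread compares its own ID with the creator's and with itself.

```c
#include <pthread.h>

/* Hypothetical record of the comparison results. */
struct id_check {
    pthread_t creator;   /* ID of the thread that called pthread_create */
    int same_as_creator; /* 1 if the new thread's ID equals creator */
    int same_as_self;    /* 1 if the new thread's ID equals itself */
};

static void *compare_ids(void *arg) {
    struct id_check *c = arg;
    pthread_t me = pthread_self();

    /* pthread_equal returns non-zero when the IDs match; normalize to 0/1. */
    c->same_as_creator = pthread_equal(me, c->creator) != 0;
    c->same_as_self = pthread_equal(me, pthread_self()) != 0;
    return NULL;
}

/* Fills *out from inside a new thread. Returns 0 on success, -1 on error. */
int id_check_demo(struct id_check *out) {
    pthread_t tid;
    out->creator = pthread_self();

    if (pthread_create(&tid, NULL, compare_ids, out) != 0)
        return -1;
    return pthread_join(tid, NULL) == 0 ? 0 : -1;
}
```

After id_check_demo succeeds, same_as_creator is 0 and same_as_self is 1. The comparison is done while both threads are still alive, because a thread ID is only meaningful during the thread's lifetime.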

Thread Pool#

Let a bunch of threads wait in a pool and work at any time.

Basic components👇

① Task queue: stores tasks to be processed.

  • [Circular queue is better]
  • Basic operations: init, push, pop

② Multiple threads: always ready, reducing the time for creation and destruction.

③ Thread function: do_work()

  • while(1): wait for tasks to be added
      1. Task queue pop: take one task off the queue for this thread
      2. do_work(): execute the task on the CPU

❗ Note: both push and pop must hold a lock to prevent data races [several idle threads may go after the same task at once]

  • See code demonstration for details.
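
The three components can be sketched roughly as below. This is not the course's thread_pool.c: the tasks are plain integers that the workers add up, and QUEUE_CAP, N_WORKERS, STOP_TASK, struct pool and pool_sum_demo are all hypothetical choices made to keep the example short.

```c
#include <pthread.h>

#define QUEUE_CAP 8    /* hypothetical capacity of the circular queue */
#define N_WORKERS 4    /* hypothetical pool size */
#define STOP_TASK (-1) /* hypothetical marker telling a worker to quit */

struct task_queue {
    int items[QUEUE_CAP];
    int head, tail, count;
    pthread_mutex_t lock;
    pthread_cond_t not_empty, not_full;
};

struct pool {
    struct task_queue q;
    long total; /* sum of all executed tasks */
    pthread_mutex_t total_lock;
};

static void queue_init(struct task_queue *q) {
    q->head = q->tail = q->count = 0;
    pthread_mutex_init(&q->lock, NULL);
    pthread_cond_init(&q->not_empty, NULL);
    pthread_cond_init(&q->not_full, NULL);
}

static void queue_push(struct task_queue *q, int task) {
    pthread_mutex_lock(&q->lock);
    while (q->count == QUEUE_CAP) /* full: wait for a pop */
        pthread_cond_wait(&q->not_full, &q->lock);
    q->items[q->tail] = task;
    q->tail = (q->tail + 1) % QUEUE_CAP; /* wrap around at the end */
    q->count++;
    pthread_cond_signal(&q->not_empty);
    pthread_mutex_unlock(&q->lock);
}

static int queue_pop(struct task_queue *q) {
    pthread_mutex_lock(&q->lock);
    while (q->count == 0) /* empty: wait for a push */
        pthread_cond_wait(&q->not_empty, &q->lock);
    int task = q->items[q->head];
    q->head = (q->head + 1) % QUEUE_CAP;
    q->count--;
    pthread_cond_signal(&q->not_full);
    pthread_mutex_unlock(&q->lock);
    return task;
}

static void *do_work(void *arg) {
    struct pool *p = arg;
    for (;;) {
        int task = queue_pop(&p->q); /* blocks until a task arrives */
        if (task == STOP_TASK)
            break;
        pthread_mutex_lock(&p->total_lock);
        p->total += task; /* "executing" a task here means adding it up */
        pthread_mutex_unlock(&p->total_lock);
    }
    return NULL;
}

/* Pushes tasks 1..n_tasks through the pool and returns their sum, or -1. */
long pool_sum_demo(int n_tasks) {
    struct pool p;
    pthread_t workers[N_WORKERS];
    int started = 0;

    queue_init(&p.q);
    p.total = 0;
    pthread_mutex_init(&p.total_lock, NULL);

    while (started < N_WORKERS &&
           pthread_create(&workers[started], NULL, do_work, &p) == 0)
        started++;
    if (started == 0)
        return -1;

    for (int i = 1; i <= n_tasks; i++)
        queue_push(&p.q, i);
    for (int i = 0; i < started; i++) /* one stop marker per worker */
        queue_push(&p.q, STOP_TASK);
    for (int i = 0; i < started; i++)
        pthread_join(workers[i], NULL);
    return p.total;
}
```

pool_sum_demo(100) returns 5050. The condition variables let idle workers sleep instead of polling, and the while loops around pthread_cond_wait recheck the queue state after every wakeup.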

Kernel Threads, User Threads#

Who creates threads? Thread model

  • The difference between the two is mainly in scheduling: kernel threads are scheduled by the kernel; user threads are scheduled by user processes.
  • Advantages of kernel threads
    • ① Each kernel thread gets its own time slice, so a process with more threads receives more processor time.
    • ② If a kernel thread blocks, the other threads in the process can keep running. If a user thread blocks in the kernel [for example, in a blocking system call], the entire process is blocked, because the kernel does not know about the other user threads.
      • PS: If a kernel thread sends a sleep signal to its own process, the thread can still continue to run.
  • Advantages of user threads
    • ① Low context switching cost. There is no need to switch from user mode to kernel mode.
    • ② The scheduling algorithm is completely controlled by the process. User processes can use their own scheduling algorithms, so they have more autonomy; the scheduling of kernel threads is a black box to users.
  • A hybrid threading model that uses both kernel threads and user threads can therefore be designed to combine the advantages of both.

Code Demonstration#

Simple Use of Multithreading#

  • Image
  • Image
  • Pay attention to the use of the pthread_create function.
  • If the main thread does not call usleep after creating the threads, or the usleep time is too short, the following may happen:
    • The main thread returns from main, which kills all child threads.
    • In that case some output may appear twice, as in ① and ② [③ is the normal output]
    • Image
    • Guess: this is an output-buffer problem. When the process ends while a thread is in the middle of printing, the buffer contents are written out a second time [the buffer state had not been updated yet]
    • [PS] Calling fflush does not fix it either, probably because the thread is cut off so abruptly. Waiting for the threads with pthread_join instead of sleeping avoids the situation altogether.
  • Note: all threads can operate on the same address value.
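
As a hedged alternative to the usleep approach, the sketch below waits for the threads with pthread_join and has them all update one value through the same address. It is not the code from the screenshots; N_THREADS, N_INCREMENTS, struct shared, bump and shared_counter_demo are hypothetical names.

```c
#include <pthread.h>

#define N_THREADS 4        /* hypothetical number of threads */
#define N_INCREMENTS 10000 /* hypothetical work per thread */

struct shared {
    long counter; /* every thread updates this same address */
    pthread_mutex_t lock;
};

static void *bump(void *arg) {
    struct shared *s = arg;
    for (int i = 0; i < N_INCREMENTS; i++) {
        pthread_mutex_lock(&s->lock); /* without the lock, updates can be lost */
        s->counter++;
        pthread_mutex_unlock(&s->lock);
    }
    return NULL;
}

/* Returns the final counter value, or -1 if a thread could not be created. */
long shared_counter_demo(void) {
    struct shared s;
    pthread_t tids[N_THREADS];
    int started = 0;

    s.counter = 0;
    pthread_mutex_init(&s.lock, NULL);

    while (started < N_THREADS &&
           pthread_create(&tids[started], NULL, bump, &s) == 0)
        started++;

    /* Joining replaces the usleep: the caller cannot return early and
     * take the child threads down with it. */
    for (int i = 0; i < started; i++)
        pthread_join(tids[i], NULL);

    pthread_mutex_destroy(&s.lock);
    return started == N_THREADS ? s.counter : -1;
}
```

shared_counter_demo() returns 40000, that is, 4 threads times 10000 increments. Unlike a fixed usleep, the join waits exactly as long as the threads need.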

Thread Pool#

thread_pool.h

  • Image
  • Definition of the task queue and basic operations

thread_pool.c

  • Image
  • Image
  • Note: locking, signaling; checking if the queue is full/empty; checking if the pointer has reached the end of the queue

1.test.c

  • Image
  • Image
  • The buff that stores the data is a two-dimensional array, so each piece of data gets its own row and the main thread does not overwrite, through the same address, data that a worker thread is still reading.
  • The magic of usleep: it keeps the while(1) loop from burning CPU time.
  • pthread_detach is generally used together with pthread_self.
  • fgets: read one line from a file into a buffer
    • man fgets
    • Image
    • Read line by line
  • Output effect
    • Image
    • Basic output: push [task queue output]; pop [task queue output] + do_work [thread output]

Additional Knowledge#

  • When compiling files that use thread-related functions, remember to link with -lpthread [with GCC or Clang, passing -pthread at both compile and link time is the usual alternative].

Tips#

