Bo2SS

File-based inter-process communication: 100 processes competing to compute a cumulative sum

Requirement Description#

  1. Set a concurrency level INS, representing the number of processes to be opened.
  2. Use these INS processes to calculate the sum of numbers from start to end.
  3. start and end are obtained by parsing command line arguments using getopt.
./a.out -s 12 -e 24
  4. Output an integer result: sum
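As a sanity check on the final output, the expected value has a closed form. The helper below is a minimal sketch, not part of the original program: the name expected_sum is introduced here for illustration, and it assumes start <= end and that the result fits in a long long.

```c
#include <assert.h>

// Hypothetical helper (not in the original program): closed-form value of
// start + (start + 1) + ... + end, used only to cross-check the result.
long long expected_sum(long long start, long long end) {
    assert(start <= end);                 // Assumption: non-empty range.
    long long count = end - start + 1;    // Number of terms.
    return (start + end) * count / 2;     // Arithmetic series formula.
}
```

For ./a.out -s 12 -e 24 this gives 234, and for the 1 to 1000 run it gives 500500.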

[Note]

  • Mainly involves file and process-related operations.
  • Using files for data sharing requires consideration of data races.
  • Try using file locks between processes in the role that mutex locks play between threads.
  • Use file locks to synchronize access to critical data (data modified by multiple processes or threads).
  • Need to learn about flock: man 2 flock
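The basic flock pattern from the notes above can be sketched as a pair of helpers. This is a minimal sketch under assumptions: lock_acquire and lock_release are names introduced here (they are not in the original program), and the lock file is opened with open rather than fopen.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

// Hypothetical helper: open (or create) path and take an exclusive lock on it.
// Returns the locked fd, or -1 on failure.
int lock_acquire(const char *path) {
    int fd = open(path, O_RDWR | O_CREAT, 0644);
    if (fd < 0) return -1;
    // flock blocks until the lock is free; retry if a signal interrupts it.
    while (flock(fd, LOCK_EX) < 0) {
        if (errno != EINTR) {
            close(fd);
            return -1;
        }
    }
    return fd;
}

// Hypothetical helper: release the lock and close the descriptor.
// Returns 0 on success, -1 on failure.
int lock_release(int fd) {
    assert(fd >= 0);
    int rc = flock(fd, LOCK_UN);
    if (close(fd) < 0) rc = -1;
    return rc;
}
```

Note that flock locks are advisory: they only protect the data if every process that touches it takes the lock first.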

Final Result#

  • Summing 1 to 1000 with 100 processes gives the following output:
    • Image
    • Image
    • The processes successfully compete to compute the sum over the same shared data.

Implementation Process#

Flowchart#

  • Image
  • Be clear about what the parent process does and what each child process does.
  • Key: Locking around every access to the shared file makes the read + write sequence behave "atomically" [as the smallest indivisible unit] with respect to the other processes.
    • It is not truly atomic; the lock only guarantees that the data is read and written as a consistent whole.
    • A process may still be interrupted when its time slice runs out, but while it holds the lock, no other process that also takes the lock can access the data.

Obtaining Command Line Arguments#

Parse the -s and -e options, each of which requires an argument.

#include "head.h"
int main(int argc, char **argv) {
    int opt, start = 0, end = 0;
    while ((opt = getopt(argc, argv, "s:e:")) != -1) {
        switch (opt) {
            case 's':
                start = atoi(optarg);  // atoi: string -> integer
                break;
            case 'e':
                end = atoi(optarg);
                break;
            default:
                fprintf(stderr, "Usage : %s -s start_num -e end_num\n", argv[0]);
                exit(1);
        }
    }
    printf("start = %d\nend = %d\n", start, end);
    return 0;
}
  • The header file "head.h" is listed at the end of this post.
  • atoi: string 👉 integer; optarg is a char * that points to the current option's argument.
  • The effect is as follows:
    • Image
    • 🆗
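One caveat with atoi is that it returns 0 for input such as "abc" and gives no way to detect the error. A stricter alternative can be sketched with strtol; the helper name parse_int is an assumption made here for illustration and does not appear in the original code.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

// Hypothetical helper: parse a decimal int from s into *out.
// Returns 0 on success, -1 if s has no digits, has trailing junk,
// or does not fit in an int.
int parse_int(const char *s, int *out) {
    assert(s != NULL && out != NULL);
    char *endp = NULL;
    errno = 0;
    long v = strtol(s, &endp, 10);
    if (endp == s || *endp != '\0') return -1;   // No digits, or junk after them.
    if (errno == ERANGE || v < INT_MIN || v > INT_MAX) return -1;
    *out = (int)v;
    return 0;
}
```

In the switch above, a call such as parse_int(optarg, &start) could replace atoi(optarg), printing the usage message when it returns -1.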

Creating INS Processes#

Use fork to create INS processes, and be careful to use wait to prevent zombie processes.

#define INS 100
pid_t pid;
int x = 0;       // x: process number
for (int i = 1; i <= INS; i++) {
    if ((pid = fork()) < 0) {
        perror("fork");
        exit(1);  // Just for convenience, not recommended in practice.
    }
    if (pid == 0) {
        x = i;   // Assign number to child process.
        break;   // Key point: without this, the child would stay in the loop and fork children of its own.
    }
}
if (pid != 0) {
    // Prevent zombie processes [wait for all child processes to finish].
    for (int i = 1; i <= INS; i++) {
        wait(NULL);
    }
    // Parent process
    printf("I'm parent!\n");  
} else {
    printf("I'm %dth child!\n", x);
}
  • This code segment is placed in the main function after obtaining command line arguments.
  • INS is defined as a macro.
  • If creating a child process fails, the code simply calls exit(1) for convenience, which is not recommended in practice.
  • The effect is as follows:
    • Image
    • Successfully created 100 child processes.
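The fork-then-wait pattern above can also check how each child ended. The following is a minimal sketch under assumptions: spawn_and_reap is a hypothetical name, and the children do no real work and exit immediately.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: fork n children, reap them all, and return how many
// exited normally with status 0. Returns -1 if any fork failed.
int spawn_and_reap(int n) {
    assert(n >= 0);
    int forked = 0;
    for (int i = 1; i <= n; i++) {
        pid_t pid = fork();
        if (pid < 0) break;       // Stop forking, but still reap what was started.
        if (pid == 0) _exit(0);   // Child: real work would go here; it never re-enters the loop.
        forked++;
    }
    int ok = 0, status;
    // wait returns -1 once no children remain.
    while (wait(&status) > 0) {
        if (WIFEXITED(status) && WEXITSTATUS(status) == 0) ok++;
    }
    return forked == n ? ok : -1;
}
```

Using _exit in the child avoids flushing stdio buffers inherited from the parent a second time.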

File-based Data Read/Write Interface#

Use files as a medium for sharing data between processes.

  • How to store data in files? ASCII code [character], int [low 16 bits + high 16 bits]...
  • Here, a structure is used to store data for clarity.
    • Image
    • Store addends and sums.
char data_file[] = "./.data";
char lock_file[] = "./.lock";  // [Optional] Set a dedicated lock.
struct Msg {
    int now;                     // Addend
    int sum;                     // Sum
};
struct Msg data;                // Structure data.
// Write structure data. Returns the number of bytes written, or -1 if the file cannot be opened.
ssize_t set_data(struct Msg *msg) {                       // ssize_t, not size_t: an unsigned type can never compare < 0.
    FILE *f = fopen(data_file, "w");                     // "w": truncate the file, then write.
    if (f == NULL) {
        perror("fopen");
        return -1;                                         // Report the error to the caller; exiting inside a small helper is too abrupt.
    }
    size_t nwrite = fwrite(msg, 1, sizeof(struct Msg), f); // Write sizeof(struct Msg) items of 1 byte each.
    fclose(f);
    return (ssize_t)nwrite;                                // Bytes written; a short count is left for the caller to handle.
}
// Read structure data. Returns the number of bytes read, or -1 if the file cannot be opened.
ssize_t get_data(struct Msg *msg) {
    FILE *f = fopen(data_file, "r");
    if (f == NULL) {
        perror("fopen");
        return -1;
    }
    size_t nread = fread(msg, 1, sizeof(struct Msg), f); // Read structure data into msg.
    fclose(f);
    return (ssize_t)nread;
}
  • Create a global variable data for data manipulation in processes.
  • Use standard file operations; low-level file operations are also feasible.
  • Return values can be used by callers to check whether read/write was successful.
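For comparison, the low-level route mentioned above could look like the following. This is a sketch under assumptions: set_data_fd and get_data_fd are hypothetical names, the path is passed as a parameter instead of using the global data_file, and the record is overwritten in place at offset 0 with pwrite rather than truncating the file.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

struct Msg {
    int now;   // Addend
    int sum;   // Sum
};

// Hypothetical low-level variant of set_data: write msg at offset 0 of path.
// Returns 0 on success, -1 on failure or short write.
int set_data_fd(const char *path, const struct Msg *msg) {
    assert(path != NULL && msg != NULL);
    int fd = open(path, O_WRONLY | O_CREAT, 0644);   // No O_TRUNC: overwrite in place.
    if (fd < 0) return -1;
    ssize_t n = pwrite(fd, msg, sizeof(*msg), 0);
    close(fd);
    return n == (ssize_t)sizeof(*msg) ? 0 : -1;
}

// Hypothetical low-level variant of get_data: read msg from offset 0 of path.
// Returns 0 on success, -1 on failure or short read.
int get_data_fd(const char *path, struct Msg *msg) {
    assert(path != NULL && msg != NULL);
    int fd = open(path, O_RDONLY);
    if (fd < 0) return -1;
    ssize_t n = pread(fd, msg, sizeof(*msg), 0);
    close(fd);
    return n == (ssize_t)sizeof(*msg) ? 0 : -1;
}
```

Because the file is never truncated, a reader sees either the old record or the new one, not an empty file.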

Adding Locks⭐#

Let the processes compete to update the shared data while protecting the data file from concurrent access.

[Two Approaches] Lock the data file itself (one file), or use a separate lock file (two files).

  • Approach One: Directly lock the data file.
char data_file[] = "./.data";
// Perform addition [atomic operation: read + write]; end: addition stop condition; id: child number [lets us observe which child did each step].
void do_add(int end, int id) {
    // The child keeps adding inside this loop.
    while (1) {
        /*
         * Approach One: One file, directly lock the data file.
         */
        // Open data_file for locking.
        FILE *f = fopen(data_file, "r");
        if (f == NULL) {
            perror("fopen");
            exit(1);
        }
        // Take an exclusive lock; this blocks until the current holder releases it.
        flock(fileno(f), LOCK_EX);
        // Read data from file [get_data opens data_file again and gets a new fd, which does not hold the lock].
        if (get_data(&data) < 0) {
            fclose(f);  // Closing releases the lock; a leaked lock would block our own next flock.
            continue;
        }
        // Addend +1, and check if the addend exceeds the range.
        if (++data.now > end) {
            fclose(f);
            break;
        }
        // Perform addition.
        data.sum += data.now;
        printf("The <%d>th Child : now = %d, sum = %d\n", id, data.now, data.sum);
        // Write data to file.
        if (set_data(&data) < 0) {
            fclose(f);
            continue;
        }
        // Unlock [the fclose below would also release the lock].
        flock(fileno(f), LOCK_UN);
        fclose(f);
    }
}
  • Function parameters: end as the reference for the stopping condition of addition, id can be used to observe which child is doing the addition each time.
  • Everything between locking 👉 unlocking acts as one atomic operation [the smallest indivisible unit].
    • It wraps reading the data, doing the calculation, and writing the data back; no other process that takes the lock can touch the data in between.
  • Obtain the file descriptor fd from the file pointer FILE* f.
    • ① f->_fileno [a glibc-internal field, not portable]
    • ② fileno(f) [the standard POSIX interface, preferred]
  • [PS]
    • Each time a file is opened, a different file descriptor is returned, and a flock lock belongs only to the open that took it; the other descriptors do not hold it.
    • Closing a file will automatically release the lock.
    • After each call to read/write interfaces, use the return value to check whether the operation was successful.
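The [PS] point about separate opens can be checked directly. The sketch below is an illustration under assumptions: second_open_is_blocked is a hypothetical name, and the behavior shown is that of flock on Linux with a local filesystem (network filesystems may differ). It opens the same file twice in one process, locks the first descriptor, and then tries a non-blocking lock on the second.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

// Hypothetical demo: returns 1 if a second, independent open of path is
// refused the lock while the first descriptor holds it, 0 if it is granted,
// and -1 on setup errors.
int second_open_is_blocked(const char *path) {
    assert(path != NULL);
    int fd1 = open(path, O_RDWR | O_CREAT, 0644);
    if (fd1 < 0) return -1;
    int fd2 = open(path, O_RDWR);
    if (fd2 < 0) {
        close(fd1);
        return -1;
    }
    int result = -1;
    if (flock(fd1, LOCK_EX) == 0) {
        // LOCK_NB: fail with EWOULDBLOCK instead of waiting for the lock.
        if (flock(fd2, LOCK_EX | LOCK_NB) == 0) result = 0;
        else if (errno == EWOULDBLOCK) result = 1;
    }
    close(fd2);
    close(fd1);   // Closing fd1 also releases its lock.
    return result;
}
```

This is also why leaving a locked FILE* open and then calling fopen and flock again in the same process can block forever.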
  • Approach Two: Set a dedicated file for locking.
char data_file[] = "./.data";
char lock_file[] = "./.lock";  // Set a dedicated lock.
void do_add(int end, int id) {
    while (1) {
        /*
         * Approach Two: Two files, use a separate file as a lock [easier to understand].
         */
        // Open or create the lock file.
        FILE *lock = fopen(lock_file, "w");  // "w": if the file does not exist, it will create one.
        if (lock == NULL) {
            perror("fopen");
            exit(1);
        }
        // Lock; if another process holds the lock, block here until it is released.
        flock(fileno(lock), LOCK_EX);
        // Read data from file.
        if (get_data(&data) < 0) {
            fclose(lock);               // Close the lock file and release the lock.
            continue;
        }
        // Addend +1, and check if it meets the stopping condition for addition.
        if (++data.now > end) {
            fclose(lock);
            break;
        }
        // Perform addition.
        data.sum += data.now;
        printf("The <%d>th Child : now = %d, sum = %d\n", id, data.now, data.sum);
        // Write data to file.
        if (set_data(&data) < 0) {
            fclose(lock);               // Release the lock before retrying; a leaked lock would block our own next flock.
            continue;
        }
        // Unlock.
        flock(fileno(lock), LOCK_UN);
        fclose(lock);
    }
}
  • lock_file is purely for locking purposes.
  • The effect is as follows: [single-core, 5 processes, calculating 1~100].
    • Image
    • Image
    • The output is more orderly on a single core than on multiple cores.
      • A single core can only run one process at a time.
      • You can call usleep() to make a process give up the CPU early, so that no single process computes for too long and the order becomes more interleaved.
    • If the output is piped to more, it appears grouped by process: stdout is fully buffered when it is not a terminal, so each process's lines are flushed in large chunks.
  • [Note]
    • In the main function, write the initial value of data to the file first, otherwise the file will be empty [see complete code for details].
    • In the main function, call the do_add() function in the child process logic, and in the parent process logic, wait for all child processes to finish, then retrieve and output the final result from the data file.
  • ❗ Without any locks, the results still come out correct in these tests.
    • The addend and the sum are stored together in one record, so any record a process reads is self-consistent and continuing from it still gives the right sum.
    • However, each process ends up computing the full result by itself. Is this due to buffering? No.
      • Even with fflush added after every write, although a process sometimes picks up where another left off, each process still computes all the way to the correct final result.
      • In effect, one process finishes a step and writes the data to the file, but another process reads data that is not yet the latest and computes the same step again.
    • Explanation:
      • When multiple processes open the same file, each process gets its own file table entry (file object) with its own file offset.
      • Therefore, multiple processes reading the same file work correctly, but multiple processes writing the same file may produce unexpected results. Consider using pread and pwrite.
      • See also "Multiple Processes Operating Files Simultaneously in Linux" (cnblogs).
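Putting the notes above together with the pread/pwrite suggestion, a compact variant of the whole exercise might look like this. It is a sketch under assumptions, not the author's program: locked_sum is a hypothetical name, the data file doubles as the lock file (Approach One), and the initial addend is set to start - 1 so that the first increment yields start.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/file.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

struct Msg {
    int now;   // Last addend that was applied
    int sum;   // Running sum
};

// Hypothetical compact variant: nproc children add start..end under flock.
// Returns the final sum, or -1 on failure.
int locked_sum(const char *path, int start, int end, int nproc) {
    assert(path != NULL && nproc > 0);
    struct Msg m = { start - 1, 0 };   // First increment yields start.
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) return -1;
    ssize_t n = pwrite(fd, &m, sizeof(m), 0);
    close(fd);
    if (n != (ssize_t)sizeof(m)) return -1;

    int forked = 0;
    for (int i = 0; i < nproc; i++) {
        pid_t pid = fork();
        if (pid < 0) break;
        if (pid == 0) {
            // Each child opens the file itself, so it owns a separate lock.
            int cfd = open(path, O_RDWR);
            if (cfd < 0) _exit(1);
            for (;;) {
                if (flock(cfd, LOCK_EX) < 0) _exit(1);
                if (pread(cfd, &m, sizeof(m), 0) != (ssize_t)sizeof(m)) _exit(1);
                if (m.now >= end) break;   // Done; exiting releases the lock.
                m.now++;
                m.sum += m.now;
                if (pwrite(cfd, &m, sizeof(m), 0) != (ssize_t)sizeof(m)) _exit(1);
                flock(cfd, LOCK_UN);
            }
            _exit(0);
        }
        forked++;
    }
    while (wait(NULL) > 0) { }   // Reap every child.
    if (forked == 0) return -1;

    fd = open(path, O_RDONLY);
    if (fd < 0) return -1;
    n = pread(fd, &m, sizeof(m), 0);
    close(fd);
    return n == (ssize_t)sizeof(m) ? m.sum : -1;
}
```

With the lock in place, each addend is applied exactly once, so the work is shared among the children instead of being repeated by each of them.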

Complete Code#

sum.c#

#include "head.h"
#define INS 100
char data_file[] = "./.data";
char lock_file[] = "./.lock";  // [Optional] Set a dedicated lock.
// Data to be passed.
struct Msg {
    int now;                     // Addend
    int sum;                     // Sum
};
struct Msg data;                // Structure data.
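// Write structure data. Returns the number of bytes written, or -1 if the file cannot be opened.
ssize_t set_data(struct Msg *msg) {                       // ssize_t, not size_t: an unsigned type can never compare < 0.
    FILE *f = fopen(data_file, "w");                     // "w": truncate the file, then write.
    if (f == NULL) {
        perror("fopen");
        return -1;                                         // Report the error to the caller; exiting inside a small helper is too abrupt.
    }
    size_t nwrite = fwrite(msg, 1, sizeof(struct Msg), f); // Write sizeof(struct Msg) items of 1 byte each.
    fclose(f);
    return (ssize_t)nwrite;                                // Bytes written; a short count is left for the caller to handle.
}
// Read structure data. Returns the number of bytes read, or -1 if the file cannot be opened.
ssize_t get_data(struct Msg *msg) {
    FILE *f = fopen(data_file, "r");
    if (f == NULL) {
        perror("fopen");
        return -1;
    }
    size_t nread = fread(msg, 1, sizeof(struct Msg), f); // Read structure data into msg.
    fclose(f);                                             // Close the stream; otherwise every call leaks a file descriptor.
    return (ssize_t)nread;
}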
// Perform addition [atomic operation: read + write]; end: addition stop condition; id: child number [can monitor from a god's perspective].
void do_add(int end, int id) {
    // Child keeps adding inside.
    while (1) {
        /*
         * Approach Two: Two files, use a separate file as a lock [easier to understand].
         */
        // Open or create the lock file.
        FILE *lock = fopen(lock_file, "w");  // "w": if the file does not exist, it will create one.
        if (lock == NULL) {
            perror("fopen");
            exit(1);
        }
        // Lock; if another process holds the lock, block here until it is released.
        flock(fileno(lock), LOCK_EX);
        // Read data from file.
        if (get_data(&data) < 0) {
            fclose(lock);               // Close the lock file and release the lock.
            continue;
        }
        // Addend +1, and check if it meets the stopping condition for addition.
        if (++data.now > end) {
            fclose(lock);
            break;
        }
        // Perform addition.
        data.sum += data.now;
        printf("The <%d>th Child : now = %d, sum = %d\n", id, data.now, data.sum);
        // Write data to file.
        if (set_data(&data) < 0) {
            fclose(lock);               // Release the lock before retrying; a leaked lock would block our own next flock.
            continue;
        }
        // Unlock.
        flock(fileno(lock), LOCK_UN);
        fclose(lock);
        /*
         * Approach One: One file, directly lock the data file.
         */
        /*
         * // Open data_file for locking.
         * FILE *f = fopen(data_file, "r");
         * // Add mutex lock.
         * flock(f->_fileno, LOCK_EX);
         * // Read data from file [the get_data function will open the data_file again, corresponding to a new fd, the lock is not shared].
         * if (get_data(&data) < 0) continue;
         * // Addend +1, and check if the addend exceeds the range.
         * if (++data.now > end) {
         *     fclose(f);
         *     break;
         * }
         * // Perform addition.
         * data.sum += data.now;
         * printf("The <%d>th Child : now = %d, sum = %d\n", id, data.now, data.sum);
         * // Write data to file.
         * if (set_data(&data) < 0) continue;
         * // Unlock [closing later will also automatically release the lock].
         * flock(fileno(f), LOCK_UN);
         * fclose(f);
         */
    }
}
int main(int argc, char **argv) {
    int opt, start = 0, end = 0;
    while ((opt = getopt(argc, argv, "s:e:")) != -1) {
        switch (opt) {
            case 's':
                start = atoi(optarg);      // atoi: string -> integer
                break;
            case 'e':
                end = atoi(optarg);
                break;
            default:
                fprintf(stderr, "Usage : %s -s start_num -e end_num\n", argv[0]);
                exit(1);
        }
    }
    printf("start = %d\nend = %d\n", start, end);
    // Write initial data to file first: now = start - 1, so that the first ++data.now yields start.
    data.now = start - 1;
    data.sum = 0;
    if (set_data(&data) < 0) return -1;     // data is a global variable.
    pid_t pid;
    int x = 0;                               // x: process number.
    for (int i = 1; i <= INS; i++) {
        if ((pid = fork()) < 0) {
            perror("fork");
            exit(1);                         // Just for convenience, not recommended in practice.
        }
        if (pid == 0) {
            x = i;                           // Assign number to child process.
            break;                           // Key point, otherwise it will keep nesting.
        }
    }
    if (pid != 0) {
        // Prevent zombie processes [wait for all child processes to finish].
        for (int i = 1; i <= INS; i++) {
            wait(NULL);
        }
        if (get_data(&data) < 0) return -1; // Obtain final result.
        printf("sum = %d\n", data.sum);
    } else {
        do_add(end, x);                      // The only task of the child process.
    }
    return 0;
}

head.h#

#ifndef _HEAD_H
#define _HEAD_H
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <sys/time.h>
#include <sys/wait.h>
#include <sys/file.h>
#endif
  • Some of these headers may be unnecessary; they are not the focus here.
