Requirement Description#
- Set a concurrency level INS, representing the number of processes to be opened.
- Use these INS processes to calculate the sum of numbers from start to end.
- start and end are obtained by parsing command line arguments using getopt.
./a.out -s 12 -e 24
- Output an integer result: sum
[Note]
- Mainly involves file and process-related operations.
- Using files for data sharing requires consideration of data races.
- Attempt to use file locks to simulate mutex locks between threads.
- Achieve synchronized access to critical data (data modified by multiple processes or threads) through file locks.
- Need to learn about flock: man 2 flock
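The lock/unlock pattern that these notes point to can be sketched as follows. This is only a minimal sketch assuming Linux flock(2); the helper names lock_exclusive and unlock_and_close are hypothetical and are not part of the original program.

```c
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

// Hypothetical helper: open (or create) path and block until an exclusive
// lock is held. Returns the locked file descriptor, or -1 on error.
int lock_exclusive(const char *path) {
    int fd = open(path, O_RDWR | O_CREAT, 0644);
    if (fd < 0) return -1;
    // LOCK_EX blocks while another open file description holds the lock.
    if (flock(fd, LOCK_EX) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

// Hypothetical helper: release the lock and close the descriptor.
// Returns 0 on success, -1 if either step fails.
int unlock_and_close(int fd) {
    int rc = flock(fd, LOCK_UN);
    if (close(fd) < 0) rc = -1;
    return rc;
}
```

A critical section then sits between the two calls: fd = lock_exclusive(path), then read and write the shared data, then unlock_and_close(fd). Note that flock is advisory: it only protects against other processes that also take the lock.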
Final Result#
- Calculating the sum from 1 to 1000 using 100 processes, the effect is as follows:
- Successfully allowed processes to compete for calculating the sum on the same data.
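To check the program's output independently, the expected value can be computed with the closed-form formula for an arithmetic series. The function name expected_sum is a hypothetical helper added here for illustration only.

```c
// Closed-form sum of the integers start..end (inclusive).
// Returns 0 for an empty range (start > end).
long expected_sum(long start, long end) {
    if (start > end) return 0;
    return (start + end) * (end - start + 1) / 2;
}
```

For the run above, expected_sum(1, 1000) is 500500, and for the example ./a.out -s 12 -e 24 it is 234.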
Implementation Process#
Flowchart#
- Grasp the tasks of the parent and child processes.
- Key: The locking operation for multiple processes accessing the same file makes read and write operations "atomic" [the smallest indivisible unit].
- Can be understood as atomic operations, but essentially just ensures the integrity of data reading and writing.
- Processes may be interrupted due to time slice exhaustion, but because of the lock, other processes cannot access this data at that time.
Obtaining Command Line Arguments#
Capture the -s and -e options. The colon after each letter in the option string "s:e:" means that the option must be accompanied by an argument.
#include "head.h"
int main(int argc, char **argv) {
    int opt, start = 0, end = 0;
    while ((opt = getopt(argc, argv, "s:e:")) != -1) {
        switch (opt) {
            case 's':
                start = atoi(optarg); // atoi: string -> integer
                break;
            case 'e':
                end = atoi(optarg);
                break;
            default:
                fprintf(stderr, "Usage : %s -s start_num -e end_num\n", argv[0]);
                exit(1);
        }
    }
    printf("start = %d\nend = %d\n", start, end);
    return 0;
}
- The header file "head.h" is at the end.
- atoi: string 👉 integer; optarg is a char * that points to the argument of the option just parsed. Note that atoi cannot report invalid input.
- The effect is as follows:
- 🆗
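Because atoi gives no way to detect bad input, a stricter conversion can be sketched with strtol. This is an optional variation, not what the original code does; parse_int is a hypothetical helper name.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

// Hypothetical helper: parse s as a base-10 int.
// Returns 0 and stores the value in *out on success,
// -1 on invalid or out-of-range input (*out is left untouched).
int parse_int(const char *s, int *out) {
    char *endp = NULL;
    errno = 0;
    long v = strtol(s, &endp, 10);
    if (endp == s || *endp != '\0') return -1; // No digits, or trailing characters.
    if (errno == ERANGE || v < INT_MIN || v > INT_MAX) return -1; // Does not fit in int.
    *out = (int)v;
    return 0;
}
```

In the switch above, case 's' could then call parse_int(optarg, &start) and print the usage message when it returns -1.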
Creating INS Processes#
Use fork to create INS processes, and be careful to use wait to prevent zombie processes.
#define INS 100
pid_t pid;
int x = 0; // x: process number
for (int i = 1; i <= INS; i++) {
    if ((pid = fork()) < 0) {
        perror("fork");
        exit(1); // Just for convenience, not recommended in practice.
    }
    if (pid == 0) {
        x = i; // Assign number to child process.
        break; // Key point, otherwise the child would keep forking as well.
    }
}
if (pid != 0) {
    // Prevent zombie processes [wait for all child processes to finish].
    for (int i = 1; i <= INS; i++) {
        wait(NULL);
    }
    // Parent process
    printf("I'm parent!\n");
} else {
    printf("I'm %dth child!\n", x);
}
- This code segment is placed in the main function after obtaining command line arguments.
- INS is defined as a macro.
- If child process creation fails, the code calls exit(1) directly for convenience, which is not recommended in practice.
- The effect is as follows:
- Successfully created 100 child processes.
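The fork-then-wait pattern can also be checked programmatically by inspecting each child's exit status. The sketch below is an illustration under the assumption of a POSIX system; spawn_and_reap is a hypothetical helper and the children do no real work.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: fork n children that exit immediately, then reap them.
// Returns the number of children that exited with status 0,
// or -1 if any fork() call failed.
int spawn_and_reap(int n) {
    int failed = 0;
    for (int i = 0; i < n; i++) {
        pid_t pid = fork();
        if (pid < 0) {
            failed = 1;
            break;
        }
        if (pid == 0) _exit(0); // Child: leave at once, so it never continues the loop.
    }
    int ok = 0, status;
    // wait() returns -1 once no children remain, which ends the loop.
    while (wait(&status) > 0) {
        if (WIFEXITED(status) && WEXITSTATUS(status) == 0) ok++;
    }
    return failed ? -1 : ok;
}
```

Looping until wait() fails, instead of counting to INS, also reaps correctly when fewer children were created than planned.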
File-based Data Read/Write Interface#
Use files as a medium for sharing data between processes.
- How to store data in files? ASCII code [character], int [low 16 bits + high 16 bits]...
- Here, a structure is used to store data for clarity.
- Store addends and sums.
char data_file[] = "./.data";
char lock_file[] = "./.lock"; // [Optional] Set a dedicated lock.
struct Msg {
    int now; // Addend
    int sum; // Sum
};
struct Msg data; // Structure data.
// Write structure data. Returns the number of bytes written, or -1 if the file cannot be opened.
// The return type is int, not size_t: size_t is unsigned, so a caller's "< 0" check could never be true.
int set_data(struct Msg *msg) {
    FILE *f = fopen(data_file, "w"); // Write
    if (f == NULL) {
        perror("fopen");
        return -1; // Exiting in a small function is too rude.
    }
    size_t nwrite = fwrite(msg, 1, sizeof(struct Msg), f); // Element size is 1 byte, so nwrite is a byte count.
    fclose(f);
    return (int)nwrite; // Return the number of bytes successfully written; errors are also reported to the upper layer.
}
// Read structure data. Returns the number of bytes read, or -1 if the file cannot be opened.
int get_data(struct Msg *msg) {
    FILE *f = fopen(data_file, "r");
    if (f == NULL) {
        perror("fopen");
        return -1;
    }
    size_t nread = fread(msg, 1, sizeof(struct Msg), f); // Read structure data into msg.
    fclose(f);
    return (int)nread;
}
- Create a global variable data for data manipulation in processes.
- Use standard file operations; low-level file operations are also feasible.
- Return values can be used by callers to check whether read/write was successful.
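One way to make the success check unambiguous is to return 0 only when the whole structure was transferred. The following is a sketch of that idea, not the article's interface: write_msg and read_msg are hypothetical names, and they take the path as a parameter so the example is self-contained.

```c
#include <stdio.h>

struct Msg {
    int now; // Addend
    int sum; // Sum
};

// Hypothetical variant of set_data: returns 0 only if the whole struct was written.
int write_msg(const char *path, const struct Msg *msg) {
    FILE *f = fopen(path, "w");
    if (f == NULL) return -1;
    size_t n = fwrite(msg, sizeof(*msg), 1, f); // One element of sizeof(struct Msg) bytes.
    if (fclose(f) != 0) return -1;              // fclose flushes; a failed flush is a failed write.
    return n == 1 ? 0 : -1;
}

// Hypothetical variant of get_data: returns 0 only if the whole struct was read.
int read_msg(const char *path, struct Msg *msg) {
    FILE *f = fopen(path, "r");
    if (f == NULL) return -1;
    size_t n = fread(msg, sizeof(*msg), 1, f);
    fclose(f);
    return n == 1 ? 0 : -1;
}
```

With this shape, a short read (for example from an empty file) is reported as a failure instead of silently leaving msg unchanged.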
Adding Locks⭐#
Allow processes to compete to maintain shared data and protect the data file from simultaneous operations.
[Two Approaches] Use one file; use two files.
- Approach One: Directly lock the data file.
char data_file[] = "./.data";
// Perform addition [atomic operation: read + write]; end: addition stop condition; id: child number [can monitor from a god's perspective].
void do_add(int end, int id) {
    // Child keeps adding inside.
    while (1) {
        /*
         * Approach One: One file, directly lock the data file.
         */
        // Open data_file for locking.
        FILE *f = fopen(data_file, "r");
        if (f == NULL) {
            perror("fopen");
            exit(1);
        }
        // Add mutex lock [blocks until the lock is available].
        flock(f->_fileno, LOCK_EX);
        // Read data from file [get_data opens data_file again and gets a new fd; flock is advisory, so that read is not blocked].
        if (get_data(&data) < 0) {
            fclose(f); // Release the lock before retrying; a leaked f would keep it and block the next flock forever.
            continue;
        }
        // Addend +1, and check if the addend exceeds the range.
        if (++data.now > end) {
            fclose(f);
            break;
        }
        // Perform addition.
        data.sum += data.now;
        printf("The <%d>th Child : now = %d, sum = %d\n", id, data.now, data.sum);
        // Write data to file.
        if (set_data(&data) < 0) {
            fclose(f); // Same as above: never retry while still holding the lock.
            continue;
        }
        // Unlock [closing later will also automatically release the lock].
        flock(fileno(f), LOCK_UN);
        fclose(f);
    }
}
- Function parameters: end as the reference for the stopping condition of addition, id can be used to observe which child is doing the addition each time.
- The code between locking and unlocking behaves as an atomic operation [the smallest indivisible unit].
- It encapsulates reading data, performing the calculation, and writing data; during this span no other process that also takes the lock can touch the data.
- Obtain the file descriptor fd from the file pointer FILE* f.
- ① f->_fileno [a glibc-internal field, not portable]
- ② fileno(f) [the POSIX function, preferred]
- [PS]
- Repeatedly opening a file will get different file descriptors, and the locks are independent of each other.
- Closing a file will automatically release the lock.
- After each call to read/write interfaces, use the return value to check whether the operation was successful.
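The two [PS] points can be observed directly: a second open() of the same file is refused the lock while the first descriptor holds it, and obtains it once the first is closed. This is a small sketch assuming Linux flock(2) semantics on a local filesystem; second_fd_is_independent is a hypothetical name.

```c
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

// Hypothetical demo. Returns 1 if a second descriptor for the same file is
// refused the lock while the first holds it and gets it after the first is
// closed, 0 if the behaviour differs, -1 on setup errors.
int second_fd_is_independent(const char *path) {
    int fd1 = open(path, O_RDWR | O_CREAT, 0644);
    if (fd1 < 0) return -1;
    int fd2 = open(path, O_RDWR); // A separate open(): a separate open file description.
    if (fd2 < 0) {
        close(fd1);
        return -1;
    }
    if (flock(fd1, LOCK_EX) < 0) {
        close(fd1);
        close(fd2);
        return -1;
    }
    // LOCK_NB makes flock fail immediately instead of blocking.
    int refused = (flock(fd2, LOCK_EX | LOCK_NB) < 0);
    close(fd1); // Closing fd1 releases its lock.
    int granted = (flock(fd2, LOCK_EX | LOCK_NB) == 0);
    close(fd2);
    return (refused && granted) ? 1 : 0;
}
```

This is also why a FILE* that is opened for locking must be closed on every path out of the loop body: a forgotten descriptor keeps the lock, even against the same process.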
- Approach Two: Set a dedicated file for locking.
char data_file[] = "./.data";
char lock_file[] = "./.lock"; // Set a dedicated lock.
void do_add(int end, int id) {
    while (1) {
        /*
         * Approach Two: Two files, use a separate file as a lock [easier to understand].
         */
        // Open or create the lock file [fopen itself does not wait; the flock call below does].
        FILE *lock = fopen(lock_file, "w"); // "w": if the file does not exist, it will create one.
        if (lock == NULL) {
            perror("fopen");
            exit(1);
        }
        // Lock [blocks until the current holder unlocks].
        flock(lock->_fileno, LOCK_EX);
        // Read data from file.
        if (get_data(&data) < 0) {
            fclose(lock); // Close the lock file and release the lock.
            continue;
        }
        // Addend +1, and check if it meets the stopping condition for addition.
        if (++data.now > end) {
            fclose(lock);
            break;
        }
        // Perform addition.
        data.sum += data.now;
        printf("The <%d>th Child : now = %d, sum = %d\n", id, data.now, data.sum);
        // Write data to file.
        if (set_data(&data) < 0) {
            fclose(lock); // Release the lock before retrying; otherwise the next flock would block forever.
            continue;
        }
        // Unlock.
        flock(lock->_fileno, LOCK_UN);
        fclose(lock);
    }
}
- lock_file is purely for locking purposes.
- The effect is as follows: [single-core, 5 processes, calculating 1~100].
- The single-core effect is more orderly than multi-core.
- A single core can only run one process at a time.
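- You can call usleep() after each addition so that a process gives up the CPU early; this prevents one process from calculating for too long and makes the order more interleaved.
- If the output is piped to more, stdout is no longer line-buffered, so each process's output is flushed in large chunks and appears grouped by process.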
- [Note]
- In the main function, write the initial value of data to the file first, otherwise the file will be empty [see complete code for details].
- In the main function, call the do_add() function in the child process logic, and in the parent process logic, wait for all child processes to finish, then retrieve and output the final result from the data file.
- ❗ In testing, the final result was still correct even without locks, but this is not guaranteed: an unlocked read + write is a data race.
- The addend and sum are packaged together, so in these runs the addition itself did not go wrong.
- However, each process ends up calculating the whole result by itself. Is this due to buffering? No.
- Adding fflush after every write does not change this: in some cases a process continues from another's data, but each process still reaches the correct final result on its own.
- It is as if one process finishes a calculation and writes data to the file, but another process does not read the latest data and calculates the sum again.
- Explanation:
- When multiple processes open the same file, each process has its own file table entry (file object), containing its own file offset.
- Therefore, multiple processes reading the same file can work correctly, but writing to the same file without synchronization may produce unexpected results. Consider using pread, pwrite.
- Also refer to "Multiple Processes Operating Files Simultaneously in Linux" (cnblogs).
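The pread/pwrite suggestion can be combined with flock so that the data file is updated in place and never truncated. The sketch below is one possible shape under the assumption of Linux flock(2), not the article's implementation; add_step and locked_sum are hypothetical names.

```c
#include <fcntl.h>
#include <sys/file.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

struct Msg {
    int now; // Addend
    int sum; // Sum
};

// One locked read-modify-write step on an already open descriptor.
// Returns 1 if a number was added, 0 if the range is finished, -1 on error.
static int add_step(int fd, int end) {
    struct Msg m;
    int rc = -1;
    if (flock(fd, LOCK_EX) < 0) return -1;
    if (pread(fd, &m, sizeof(m), 0) == (ssize_t)sizeof(m)) {
        if (m.now >= end) {
            rc = 0;
        } else {
            m.now++;
            m.sum += m.now;
            // pwrite at offset 0 overwrites in place; the file is never empty.
            rc = (pwrite(fd, &m, sizeof(m), 0) == (ssize_t)sizeof(m)) ? 1 : -1;
        }
    }
    flock(fd, LOCK_UN);
    return rc;
}

// Hypothetical driver: nproc children share the work.
// Returns the final sum, or -1 on error.
int locked_sum(const char *path, int start, int end, int nproc) {
    struct Msg m = { start - 1, 0 }; // The first addend will be start.
    int failed = 0, status;
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) return -1;
    if (pwrite(fd, &m, sizeof(m), 0) != (ssize_t)sizeof(m)) {
        close(fd);
        return -1;
    }
    for (int i = 0; i < nproc; i++) {
        pid_t pid = fork();
        if (pid < 0) {
            failed = 1;
            break;
        }
        if (pid == 0) {
            // Each child opens the file itself: a descriptor inherited across
            // fork shares the parent's open file description and its lock.
            int cfd = open(path, O_RDWR);
            if (cfd < 0) _exit(1);
            int rc;
            while ((rc = add_step(cfd, end)) == 1) { }
            close(cfd);
            _exit(rc == 0 ? 0 : 1);
        }
    }
    while (wait(&status) > 0) {
        if (!WIFEXITED(status) || WEXITSTATUS(status) != 0) failed = 1;
    }
    ssize_t n = pread(fd, &m, sizeof(m), 0);
    close(fd);
    return (!failed && n == (ssize_t)sizeof(m)) ? m.sum : -1;
}
```

Because pread and pwrite take an explicit offset, no process depends on a shared or stale file position, and a reader never sees the empty file that fopen with "w" creates for a moment.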
Complete Code#
sum.c#
#include "head.h"
#define INS 100
char data_file[] = "./.data";
char lock_file[] = "./.lock"; // [Optional] Set a dedicated lock.
// Data to be passed.
struct Msg {
    int now; // Addend
    int sum; // Sum
};
struct Msg data; // Structure data.
// Write structure data. Returns the number of bytes written, or -1 if the file cannot be opened.
// The return type is int, not size_t: size_t is unsigned, so a caller's "< 0" check could never be true.
int set_data(struct Msg *msg) {
    FILE *f = fopen(data_file, "w"); // Write
    if (f == NULL) {
        perror("fopen");
        return -1; // Exiting in a small function is too rude.
    }
    size_t nwrite = fwrite(msg, 1, sizeof(struct Msg), f); // Element size is 1 byte, so nwrite is a byte count.
    fclose(f);
    return (int)nwrite; // Return the number of bytes successfully written; errors are also reported to the upper layer.
}
// Read structure data. Returns the number of bytes read, or -1 if the file cannot be opened.
int get_data(struct Msg *msg) {
    FILE *f = fopen(data_file, "r");
    if (f == NULL) {
        perror("fopen");
        return -1;
    }
    size_t nread = fread(msg, 1, sizeof(struct Msg), f); // Read structure data into msg.
    fclose(f); // Without this, every call would leak one open file.
    return (int)nread;
}
// Perform addition [atomic operation: read + write]; end: addition stop condition; id: child number [can monitor from a god's perspective].
void do_add(int end, int id) {
    // Child keeps adding inside.
    while (1) {
        /*
         * Approach Two: Two files, use a separate file as a lock [easier to understand].
         */
        // Open or create the lock file [fopen itself does not wait; the flock call below does].
        FILE *lock = fopen(lock_file, "w"); // "w": if the file does not exist, it will create one.
        if (lock == NULL) {
            perror("fopen");
            exit(1);
        }
        // Lock [blocks until the current holder unlocks].
        flock(lock->_fileno, LOCK_EX);
        // Read data from file.
        if (get_data(&data) < 0) {
            fclose(lock); // Close the lock file and release the lock.
            continue;
        }
        // Addend +1, and check if it meets the stopping condition for addition.
        if (++data.now > end) {
            fclose(lock);
            break;
        }
        // Perform addition.
        data.sum += data.now;
        printf("The <%d>th Child : now = %d, sum = %d\n", id, data.now, data.sum);
        // Write data to file.
        if (set_data(&data) < 0) {
            fclose(lock); // Release the lock before retrying; otherwise the next flock would block forever.
            continue;
        }
        // Unlock.
        flock(lock->_fileno, LOCK_UN);
        fclose(lock);
        /*
         * Approach One: One file, directly lock the data file.
         */
        /*
         * // Open data_file for locking.
         * FILE *f = fopen(data_file, "r");
         * if (f == NULL) {
         *     perror("fopen");
         *     exit(1);
         * }
         * // Add mutex lock.
         * flock(f->_fileno, LOCK_EX);
         * // Read data from file [get_data opens data_file again and gets a new fd; flock is advisory, so that read is not blocked].
         * if (get_data(&data) < 0) {
         *     fclose(f);
         *     continue;
         * }
         * // Addend +1, and check if the addend exceeds the range.
         * if (++data.now > end) {
         *     fclose(f);
         *     break;
         * }
         * // Perform addition.
         * data.sum += data.now;
         * printf("The <%d>th Child : now = %d, sum = %d\n", id, data.now, data.sum);
         * // Write data to file.
         * if (set_data(&data) < 0) {
         *     fclose(f);
         *     continue;
         * }
         * // Unlock [closing later will also automatically release the lock].
         * flock(fileno(f), LOCK_UN);
         * fclose(f);
         */
    }
}
int main(int argc, char **argv) {
    int opt, start = 0, end = 0;
    while ((opt = getopt(argc, argv, "s:e:")) != -1) {
        switch (opt) {
            case 's':
                start = atoi(optarg); // atoi: string -> integer
                break;
            case 'e':
                end = atoi(optarg);
                break;
            default:
                fprintf(stderr, "Usage : %s -s start_num -e end_num\n", argv[0]);
                exit(1);
        }
    }
    printf("start = %d\nend = %d\n", start, end);
    fflush(stdout); // Flush before fork so that children do not inherit buffered output.
    // Write initial data to file first. do_add increments now before adding, so the first addend is start.
    data.now = start - 1;
    data.sum = 0;
    if (set_data(&data) < 0) return -1;
    pid_t pid;
    int x = 0; // x: process number.
    for (int i = 1; i <= INS; i++) {
        if ((pid = fork()) < 0) {
            perror("fork");
            exit(1); // Just for convenience, not recommended in practice.
        }
        if (pid == 0) {
            x = i; // Assign number to child process.
            break; // Key point, otherwise the child would keep forking as well.
        }
    }
    if (pid != 0) {
        // Prevent zombie processes [wait for all child processes to finish].
        for (int i = 1; i <= INS; i++) {
            wait(NULL);
        }
        if (get_data(&data) < 0) return -1; // Obtain final result.
        printf("sum = %d\n", data.sum);
    } else {
        do_add(end, x); // The only task of the child process.
    }
    return 0;
}
head.h#
#ifndef _HEAD_H
#define _HEAD_H
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <sys/time.h>
#include <sys/wait.h>
#include <sys/file.h>
#endif
- There may be extra header files, which are not the focus.
References#
- Main knowledge points refer to "Network and System Programming".
- 0 Course Introduction and Command Line Parsing Functions——getopt
- 1 File, Directory Operations and Implementation of ls Ideas——fopen, fread, fwrite
- 3 Multi-Process——fork, wait, flock⭐