Chapter 15. POSIX IPC

The classical UNIX interprocess communication (IPC) mechanisms of shared memory, message queues and semaphore sets are standardized in the POSIX:XSI Extension. These mechanisms, which allow unrelated processes to exchange information in a reasonably efficient way, use a key to identify, create or access the corresponding entity. The entities may persist in the system beyond the lifetime of the process that creates them, but conveniently, POSIX:XSI also provides shell commands to list and remove them.

POSIX:XSI Interprocess Communication

POSIX interprocess communication (IPC) is part of the POSIX:XSI Extension and originated in UNIX System V interprocess communication. IPC, which includes message queues, semaphore sets and shared memory, provides mechanisms for sharing information among processes on the same system. These three communication mechanisms have a similar structure, and this chapter emphasizes the common elements of their use. Table 15.1 summarizes the POSIX:XSI interprocess communication functions.

Identifying and accessing IPC objects

POSIX:XSI identifies each IPC object by a unique integer that is greater than or equal to zero and is returned from the get function for the object in much the same way as the open function returns an integer representing a file descriptor. For example, msgget returns an integer identifier for message queue objects. Similarly, semget returns an integer identifier for a specified semaphore set, and shmget returns an integer identifier for a shared memory segment. These identifiers are associated with additional data structures that are defined in sys/msg.h, sys/sem.h or sys/shm.h, respectively. The integer identifiers within each IPC object type are unique, but you might well have an integer identifier 1 for two different types of objects, say, a semaphore set and a message queue.

When creating or accessing an IPC object, you must specify a key to designate the particular object to be created or accessed. Pick a key in one of these three ways.

  • Let the system pick a key (IPC_PRIVATE).

  • Pick a key directly.

  • Ask the system to generate a key from a specified path by calling ftok.

The ftok function allows independent processes to derive the same key based on a known pathname. The file corresponding to the pathname must exist and be accessible to the processes that want to access an IPC object. The combination of path and id uniquely identifies the IPC object. The id parameter allows several IPC objects of the same type to be keyed from a single pathname.


   #include <sys/ipc.h>

   key_t ftok(const char *path, int id);

If successful, ftok returns a key. If unsuccessful, ftok returns (key_t)-1 and sets errno. The following table lists the mandatory errors for ftok.

   errno           cause
   EACCES          search permission on a path component denied
   ELOOP           a loop exists in resolution of path
   ENAMETOOLONG    length of path exceeds PATH_MAX, or length of a pathname component exceeds NAME_MAX
   ENOENT          a component of path is not a file or is empty
   ENOTDIR         a component of path's prefix is not a directory


Example 15.1. 

The following code segment derives a key from the filename /tmp/trouble.c.

if ((thekey = ftok("/tmp/trouble.c", 1)) == (key_t)-1)
   perror("Failed to derive key from /tmp/trouble.c");
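
The role of the id parameter can be sketched as follows. The helper name derivekeys is hypothetical and not part of POSIX:XSI; it simply calls ftok once per id so that several IPC objects can be keyed from one pathname. Only the low-order 8 bits of id are significant, so the sketch assumes that n is small and that the ids start at 1.

```c
#include <sys/ipc.h>

/* Derive one key per id (1..n) from the same pathname.
 * Returns 0 on success, or -1 with errno set by ftok on failure.
 * (derivekeys is an illustrative name, not a standard function.) */
int derivekeys(const char *path, key_t *keys, int n) {
   int id;

   for (id = 1; id <= n; id++)
      if ((keys[id - 1] = ftok(path, id)) == (key_t)-1)
         return -1;
   return 0;
}
```

Two cooperating programs that call derivekeys with the same existing pathname should obtain the same keys, provided the file is not removed and re-created between the calls.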

Accessing POSIX:XSI IPC resources from the shell

The POSIX:XSI Extension for shells and utilities defines shell commands for examining and deleting IPC resources, a convenient feature that is missing for the POSIX:SEM semaphores.

The ipcs command displays information about POSIX:XSI interprocess communication resources. If you forget which ones you created, you can list them from the shell command line.


  ipcs [-qms][-a | -bcopt]
                                        POSIX:XSI,Shell and Utilities

If no options are given, ipcs outputs, in an abbreviated format, information about message queues, shared memory segments and semaphore sets. You can restrict the display to specific types of IPC resources with the -q, -m and -s options for message queues, shared memory and semaphores, respectively. The -a option displays a long format giving all information available. The -bcopt options specify which components of the available information to print.

Example 15.2. 

The following command displays all the available information about the semaphores currently allocated on the system.

ipcs -s -a

You can remove an individual resource by giving either an ID or a key. Use the ipcrm command to remove POSIX:XSI interprocess communication resources.


  ipcrm [-q msgid | -Q msgkey | -s semid | -S semkey |
         -m shmid | -M shmkey] ....
                                        POSIX:XSI,Shell and Utilities

The lowercase -q, -s and -m options use the object ID to specify the removal of a message queue, semaphore set or shared memory segment, respectively. The uppercase options use the original creation key.

POSIX:XSI Semaphore Sets

A POSIX:XSI semaphore consists of an array of semaphore elements. The semaphore elements are similar, but not identical, to the classical integer semaphores proposed by Dijkstra, as described in Chapter 14. A process can perform operations on the entire set in a single call. Thus, POSIX:XSI semaphores are capable of AND synchronization, as described in Section 14.2. We refer to POSIX:XSI semaphores as semaphore sets to distinguish them from the POSIX:SEM semaphores described in Chapter 14.

Each semaphore element includes at least the following information.

  • A nonnegative integer representing the value of the semaphore element (semval)

  • The process ID of the last process to manipulate the semaphore element (sempid)

  • The number of processes waiting for the semaphore element value to increase (semncnt)

  • The number of processes waiting for the semaphore element value to equal 0 (semzcnt)

The major data structure for semaphores is semid_ds, which is defined in sys/sem.h and has the following members.

struct ipc_perm sem_perm; /* operation permission structure */
unsigned short sem_nsems; /* number of semaphores in the set */
time_t sem_otime;         /* time of last semop */
time_t sem_ctime;         /* time of last semctl */

Each semaphore element has two queues associated with it—a queue of processes waiting for the value to equal 0 and a queue of processes waiting for the value to increase. The semaphore element operations allow a process to block until a semaphore element value is 0 or until it increases to a specific value greater than zero.

Semaphore creation

The semget function returns the semaphore identifier associated with the key parameter. The semget function creates the identifier and its associated semaphore set if either the key is IPC_PRIVATE or semflg & IPC_CREAT is nonzero and no semaphore set or identifier is already associated with key. The nsems parameter specifies the number of semaphore elements in the set. The individual semaphore elements within a semaphore set are referenced by the integers 0 through nsems - 1. Semaphores have permissions specified by the semflg argument of semget. Set permission values in the same way as described in Section 4.3 for files, and change the permissions by calling semctl. Semaphore elements should be initialized with semctl before they are used.


  #include <sys/sem.h>

  int semget(key_t key, int nsems, int semflg);

If successful, semget returns a nonnegative integer corresponding to the semaphore identifier. If unsuccessful, the semget function returns –1 and sets errno. The following table lists the mandatory errors for semget.

   errno     cause
   EACCES    semaphore exists for key but permission not granted
   EEXIST    semaphore exists for key but ( (semflg & IPC_CREAT) && (semflg & IPC_EXCL) ) != 0
   EINVAL    nsems <= 0 or greater than system limit, or nsems doesn't agree with semaphore set size
   ENOENT    semaphore does not exist for key and (semflg & IPC_CREAT) == 0
   ENOSPC    systemwide limit on semaphores would be exceeded


If a process attempts to create a semaphore that already exists, it receives a handle to the existing semaphore unless the semflg value includes both IPC_CREAT and IPC_EXCL. In the latter case, semget fails and sets errno equal to EEXIST.

Example 15.3. 

The following code segment creates a new semaphore set containing three semaphore elements.


#define PERMS (S_IRUSR | S_IWUSR)

int semid;
if ((semid = semget(IPC_PRIVATE, 3, PERMS)) == -1)
   perror("Failed to create new private semaphore");

This semaphore can only be read or written by the owner.

The IPC_PRIVATE key guarantees that semget creates a new semaphore. To get a new semaphore set from a made-up key or a key derived from a pathname, the process must specify by using the IPC_CREAT flag that it is creating a new semaphore. If both IPC_CREAT and IPC_EXCL are specified, semget returns an error if the semaphore already exists.

Example 15.4. 

The following code segment accesses a semaphore set with a single element identified by the key value 99887.

#define PERMS (S_IRUSR | S_IWUSR | S_IRGRP | S_IWGRP | S_IROTH | S_IWOTH)
#define KEY ((key_t)99887)

int semid;
if ((semid = semget(KEY, 1, PERMS | IPC_CREAT)) == -1)
   perror("Failed to access semaphore with key 99887");

The IPC_CREAT flag ensures that if the semaphore set doesn’t exist, semget creates it. The permissions allow all users to access the semaphore set.

Giving a specific key value allows cooperating processes to agree on a common semaphore set. If the semaphore already exists, semget returns a handle to the existing semaphore. If you replace the semflg argument of semget with PERMS | IPC_CREAT | IPC_EXCL, semget returns an error when the semaphore already exists.

Program 15.1 demonstrates how to identify a semaphore set by using a key generated from a pathname and an ID, which are passed as command-line arguments. If semfrompath executes successfully, the semaphores will exist after the program exits. You will need to call the ipcrm command to get rid of them.

Program 15.1. semfrompath.c

A program that creates a semaphore from a pathname key.

#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/sem.h>
#include <sys/stat.h>
#define PERMS (S_IRUSR | S_IWUSR)       /* assumed: owner read and write */
#define SET_SIZE 2

int main(int argc, char *argv[]) {
   key_t mykey;
   int semid;

   if (argc != 3) {
      fprintf(stderr, "Usage: %s pathname id\n", argv[0]);
      return 1;
   }
                                  /* derive the key from pathname and id */
   if ((mykey = ftok(argv[1], atoi(argv[2]))) == (key_t)-1) {
      fprintf(stderr, "Failed to derive key from filename %s:%s\n",
             argv[1], strerror(errno));
      return 1;
   }
                         /* create the set, or access it if it exists */
   if ((semid = semget(mykey, SET_SIZE, PERMS | IPC_CREAT)) == -1) {
      fprintf(stderr, "Failed to create semaphore with key %d:%s\n",
             (int)mykey, strerror(errno));
      return 1;
   }
   printf("semid = %d\n", semid);
   return 0;
}

Semaphore control

Each element of a semaphore set must be initialized with semctl before it is used. The semctl function provides control operations in element semnum for the semaphore set semid. The cmd parameter specifies the type of operation. The optional fourth parameter, arg, depends on the value of cmd.


  #include <sys/sem.h>

  int semctl(int semid, int semnum, int cmd, ...);


If successful, semctl returns a nonnegative value whose interpretation depends on cmd. The GETVAL, GETPID, GETNCNT and GETZCNT values of cmd cause semctl to return the value associated with cmd. All other values of cmd cause semctl to return 0 if successful. If unsuccessful, semctl returns –1 and sets errno. The following table lists the mandatory errors for semctl.

   errno     cause
   EACCES    operation is denied to the caller
   EINVAL    value of semid or of cmd is invalid, or value of semnum is negative or too large
   EPERM     value of cmd is IPC_RMID or IPC_SET and caller does not have required privileges
   ERANGE    cmd is SETVAL or SETALL and value to be set is out of range

Table 15.2 gives the POSIX:XSI values for the cmd parameter of semctl.

Table 15.2. POSIX:XSI values for the cmd parameter of semctl.

   cmd         description
   GETALL      return values of the semaphore set in arg.array
   GETVAL      return value of a specific semaphore element
   GETPID      return process ID of last process to manipulate element
   GETNCNT     return number of processes waiting for element to increment
   GETZCNT     return number of processes waiting for element to become 0
   IPC_RMID    remove semaphore set identified by semid
   IPC_SET     set permissions of the semaphore set from arg.buf
   IPC_STAT    copy members of semid_ds of semaphore set semid into arg.buf
   SETALL      set values of semaphore set from arg.array
   SETVAL      set value of a specific semaphore element to arg.val

Several of these commands, such as GETALL and SETALL, require an arg parameter to read or store results. The arg parameter is of type union semun, which must be defined in programs that use it, as follows.

union semun {
   int val;
   struct semid_ds *buf;
   unsigned short *array;
} arg;
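
The three members of union semun correspond to three groups of commands: val for SETVAL, buf for IPC_STAT and IPC_SET, and array for GETALL and SETALL. The following sketch illustrates the array and buf members. The function names setallelements, getallelements and getsetsize are hypothetical helpers introduced here for illustration, and the sketch assumes that sys/sem.h does not already define union semun (on systems where it does, omit the definition).

```c
#include <sys/sem.h>

/* Assumption: the system header leaves this definition to the program. */
union semun {
   int val;
   struct semid_ds *buf;
   unsigned short *array;
};

/* Set every element of the set from values[0..nsems-1].
 * Returns 0 on success, or -1 with errno set by semctl. */
int setallelements(int semid, unsigned short *values) {
   union semun arg;

   arg.array = values;
   return semctl(semid, 0, SETALL, arg);   /* semnum is ignored for SETALL */
}

/* Copy the current value of every element into values, which must have
 * room for as many elements as the set contains. */
int getallelements(int semid, unsigned short *values) {
   union semun arg;

   arg.array = values;
   return semctl(semid, 0, GETALL, arg);
}

/* Return the number of elements in the set, or -1 with errno set. */
int getsetsize(int semid) {
   struct semid_ds info;
   union semun arg;

   arg.buf = &info;
   if (semctl(semid, 0, IPC_STAT, arg) == -1)
      return -1;
   return (int)info.sem_nsems;
}
```

A caller that does not know the size of a set in advance could call getsetsize first and allocate the array for getallelements accordingly.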

Example 15.5. initelement.c

The initelement function sets the value of the specified semaphore element to semvalue.

#include <sys/sem.h>

int initelement(int semid, int semnum, int semvalue) {
   union semun {
      int val;
      struct semid_ds *buf;
      unsigned short *array;
   } arg;
   arg.val = semvalue;
   return semctl(semid, semnum, SETVAL, arg);
}

The semid and semnum parameters identify the semaphore set and the element within the set whose value is to be set to semvalue.

If successful, initelement returns 0. If unsuccessful, initelement returns –1 with errno set (since semctl sets errno).
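
The GETVAL, GETPID, GETNCNT and GETZCNT commands of Table 15.2 return their result directly and need no fourth argument. The following sketch uses them to display the state of one element. The function name showelement is hypothetical. Because the four values come from four separate calls, they do not form an atomic snapshot; treat the output as a debugging aid only.

```c
#include <stdio.h>
#include <sys/sem.h>

/* Print semval, sempid, semncnt and semzcnt for one element.
 * Returns 0 on success, or -1 with errno set by semctl. */
int showelement(int semid, int semnum) {
   int val, pid, ncnt, zcnt;

   if (((val = semctl(semid, semnum, GETVAL)) == -1) ||
       ((pid = semctl(semid, semnum, GETPID)) == -1) ||
       ((ncnt = semctl(semid, semnum, GETNCNT)) == -1) ||
       ((zcnt = semctl(semid, semnum, GETZCNT)) == -1))
      return -1;
   printf("element %d: semval=%d sempid=%d semncnt=%d semzcnt=%d\n",
          semnum, val, pid, ncnt, zcnt);
   return 0;
}
```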

Example 15.6. removesem.c

The removesem function deletes the semaphore specified by semid.

#include <sys/sem.h>

int removesem(int semid) {
   return semctl(semid, 0, IPC_RMID);
}

If successful, removesem returns 0. If unsuccessful, removesem returns –1 with errno set (since semctl sets errno).

POSIX semaphore set operations

The semop function atomically performs a user-defined collection of semaphore operations on the semaphore set associated with identifier semid. The sops parameter points to an array of element operations, and the nsops parameter specifies the number of element operations in the sops array.

  #include <sys/sem.h>

  int semop(int semid, struct sembuf *sops, size_t nsops);


If successful, semop returns 0. If unsuccessful, semop returns –1 and sets errno. The following table lists the mandatory errors for semop.

   errno     cause
   E2BIG     value of nsops is too big
   EACCES    operation is denied to the caller
   EAGAIN    operation would block the process but (sem_flg & IPC_NOWAIT) != 0
   EFBIG     value of sem_num for one of the sops entries is less than 0 or greater than or equal to the number of elements in the semaphore set
   EIDRM     semaphore identifier semid has been removed from the system
   EINTR     semop was interrupted by a signal
   EINVAL    value of semid is invalid, or number of individual semaphores for a SEM_UNDO has exceeded limit
   ENOSPC    limit on processes requesting SEM_UNDO has been exceeded
   ERANGE    operation would cause an overflow of a semval or semadj value

The semop function performs all the operations specified in the sops array atomically on a single semaphore set. If any of the individual element operations would cause the process to block, the process blocks and none of the operations are performed.
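
The all-or-nothing behavior can be observed without blocking by setting the IPC_NOWAIT flag on each element operation. The following sketch assumes a set with at least two elements; the function name trygetboth is hypothetical. If either element is 0, semop fails with errno set to EAGAIN and the other element keeps its value.

```c
#include <errno.h>
#include <sys/sem.h>

/* Try to decrement elements 0 and 1 of semid together without blocking.
 * Returns 0 if both were decremented, or -1 with errno set. When errno is
 * EAGAIN, at least one element was 0 and neither element was changed. */
int trygetboth(int semid) {
   struct sembuf ops[2];

   ops[0].sem_num = 0;
   ops[0].sem_op = -1;
   ops[0].sem_flg = IPC_NOWAIT;
   ops[1].sem_num = 1;
   ops[1].sem_op = -1;
   ops[1].sem_flg = IPC_NOWAIT;
   return semop(semid, ops, 2);
}
```

A caller can distinguish "resources busy" (errno equal to EAGAIN) from a real failure and retry later or do other work in the meantime.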

The struct sembuf structure, which specifies a semaphore element operation, includes the following members.

   unsigned short sem_num    number of the semaphore element
   short sem_op              particular element operation to be performed
   short sem_flg             flags to specify options for the operation

The sem_op element operations are values specifying the amount by which the semaphore value is to be changed.

  • If sem_op is an integer greater than zero, semop adds the value to the corresponding semaphore element value and awakens all processes that are waiting for the element to increase.

  • If sem_op is 0 and the semaphore element value is not 0, semop blocks the calling process (waiting for 0) and increments the count of processes waiting for a zero value of that element.

  • If sem_op is a negative number, semop adds the sem_op value to the corresponding semaphore element value provided that the result would not be negative. If the operation would make the element value negative, semop blocks the process on the event that the semaphore element value increases. If the resulting value is 0, semop wakes the processes waiting for 0.

The description of semop assumes that sem_flg is 0 for all the element operations. If sem_flg & IPC_NOWAIT is true, the element operation never causes the semop call to block. If a semop returns because it would have blocked on that element operation, it returns –1 with errno set to EAGAIN. If sem_flg & SEM_UNDO is true, the function also modifies the semaphore adjustment value for the process. This adjustment value allows the process to undo its effect on the semaphore when it exits. You should read the man page carefully regarding the interaction of semop with various settings of the flags.
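
The effect of SEM_UNDO can be sketched with a child that decrements an element and then exits without incrementing it again. The function name undodemo is hypothetical, and the sketch assumes that element 0 of semid is at least 1 on entry (otherwise the child would block). Because the child's adjustment is applied when it exits, the parent should see the original value again.

```c
#include <unistd.h>
#include <sys/sem.h>
#include <sys/wait.h>

/* Fork a child that decrements element 0 of semid with SEM_UNDO and then
 * exits without incrementing it again. Returns the value of element 0
 * after the child has been waited for, or -1 with errno set on failure. */
int undodemo(int semid) {
   pid_t child;
   struct sembuf op;

   op.sem_num = 0;
   op.sem_op = -1;
   op.sem_flg = SEM_UNDO;
   if ((child = fork()) == -1)
      return -1;
   if (child == 0) {              /* child: lock and never unlock */
      if (semop(semid, &op, 1) == -1)
         _exit(1);
      _exit(0);                   /* the system applies the adjustment */
   }
   if (waitpid(child, NULL, 0) == -1)
      return -1;
   return semctl(semid, 0, GETVAL);
}
```

Without SEM_UNDO in sem_flg, the same experiment would leave the element decremented, and any other process waiting for it would stay blocked.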

Example 15.7. 

What is wrong with the following code to declare myopbuf and initialize it so that sem_num is 1, sem_op is –1, and sem_flg is 0?

struct sembuf myopbuf = {1, -1, 0};


The direct assignment assumes that the members of struct sembuf appear in the order sem_num, sem_op and sem_flg. You may see this type of initialization in legacy code and it may work on your system, but try to avoid it. Although the POSIX:XSI Extension specifies that the struct sembuf structure has sem_num, sem_op and sem_flg members, the standard does not specify the order in which these members appear in the definition nor does the standard restrict struct sembuf to contain only these members.
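
If a C99 or later compiler is available, designated initializers offer one way to keep the brevity of an initializer without depending on member order. This is a sketch of an alternative, not something the POSIX:XSI Extension prescribes; the function name makewaitop is hypothetical.

```c
#include <sys/sem.h>

/* Build an operation that decrements element 1, naming each member so the
 * result does not depend on the order of the members in struct sembuf.
 * Any additional members an implementation defines are set to zero. */
struct sembuf makewaitop(void) {
   struct sembuf op = { .sem_num = 1, .sem_op = -1, .sem_flg = 0 };

   return op;
}
```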

Example 15.8. setsembuf.c

The function setsembuf initializes the struct sembuf structure members sem_num, sem_op and sem_flg in an implementation-independent manner.

#include <sys/sem.h>

void setsembuf(struct sembuf *s, int num, int op, int flg) {
   s->sem_num = (unsigned short)num;
   s->sem_op = (short)op;
   s->sem_flg = (short)flg;
}

Example 15.9. 

The following code segment atomically increments element zero of semid by 1 and element one of semid by 2, using setsembuf of Example 15.8.

struct sembuf myop[2];

setsembuf(myop, 0, 1, 0);
setsembuf(myop + 1, 1, 2, 0);
if (semop(semid, myop, 2) == -1)
   perror("Failed to perform semaphore operation");

Example 15.10. 

Suppose a two-element semaphore set, S, represents a tape drive system in which Process 1 uses Tape A, Process 2 uses Tapes A and B, and Process 3 uses Tape B. The following pseudocode segment defines semaphore operations that allow the processes to access one or both tape drives in a mutually exclusive manner.

struct sembuf get_tapes[2];
struct sembuf release_tapes[2];

setsembuf(&(get_tapes[0]), 0, -1, 0);
setsembuf(&(get_tapes[1]), 1, -1, 0);
setsembuf(&(release_tapes[0]), 0, 1, 0);
setsembuf(&(release_tapes[1]), 1, 1, 0);

Process 1: semop(S, get_tapes, 1);
           <use tape A>
           semop(S, release_tapes, 1);

Process 2: semop(S, get_tapes, 2);
           <use tapes A and B>
           semop(S, release_tapes, 2);

Process 3: semop(S, get_tapes + 1, 1);
           <use tape B>
           semop(S, release_tapes + 1, 1);

S[0] represents tape A, and S[1] represents tape B. We assume that both elements of S have been initialized to 1.

If semop is interrupted by a signal, it returns –1 and sets errno to EINTR. Program 15.2 shows a function that restarts semop if it is interrupted by a signal.

Program 15.2. r_semop.c

A function that restarts semop after a signal.

#include <errno.h>
#include <sys/sem.h>

int r_semop(int semid, struct sembuf *sops, int nsops) {
   while (semop(semid, sops, nsops) == -1)
      if (errno != EINTR)              /* a real error, so give up */
         return -1;
   return 0;
}

Program 15.3 modifies Program 14.1 to use POSIX:XSI semaphore sets to protect a critical section. Program 15.3 calls setsembuf (Example 15.8) and removesem (Example 15.6). It restarts semop operations if interrupted by a signal, even though the program does not catch any signals. You should get into the habit of restarting functions that can set errno equal to EINTR.

Once the semaphore of Program 15.3 is created, it persists until it is removed. If a child process generates an error, it just exits. If the parent generates an error, it falls through to the wait call and then removes the semaphore. A program that creates a semaphore for its own use should be sure to remove the semaphore before the program terminates. Be careful to remove the semaphore exactly once.

Program 15.3. chainsemset.c

A modification of Program 14.1 that uses semaphore sets to protect the critical section.

#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/sem.h>
#include <sys/stat.h>
#include <sys/wait.h>
#include "restart.h"
#define BUFSIZE 1024
#define PERMS (S_IRUSR | S_IWUSR)       /* assumed: owner read and write */

int initelement(int semid, int semnum, int semvalue);
int r_semop(int semid, struct sembuf *sops, int nsops);
int removesem(int semid);
void setsembuf(struct sembuf *s, int num, int op, int flg);

void printerror(char *msg, int error) {
   fprintf(stderr, "[%ld] %s: %s\n", (long)getpid(), msg, strerror(error));
}

int main (int argc, char *argv[]) {
   char buffer[BUFSIZE];
   char *c;
   pid_t childpid = 0;
   int delay;
   int error;
   int i, j, n;
   int semid;
   struct sembuf semsignal[1];
   struct sembuf semwait[1];

   if ((argc != 3) || ((n = atoi(argv[1])) <= 0) ||
        ((delay = atoi(argv[2])) < 0))  {
      fprintf (stderr, "Usage: %s processes delay\n", argv[0]);
      return 1;
   }
                        /* create a semaphore containing a single element */
   if ((semid = semget(IPC_PRIVATE, 1, PERMS)) == -1) {
      perror("Failed to create a private semaphore");
      return 1;
   }
   setsembuf(semwait, 0, -1, 0);                   /* decrement element 0 */
   setsembuf(semsignal, 0, 1, 0);                  /* increment element 0 */
   if (initelement(semid, 0, 1) == -1) {
      perror("Failed to initialize semaphore element to 1");
      if (removesem(semid) == -1)
         perror("Failed to remove failed semaphore");
      return 1;
   }
   for (i = 1; i < n; i++)
      if ((childpid = fork()))          /* parent (or failed fork) leaves */
         break;
   snprintf(buffer, BUFSIZE, "i:%d PID:%ld  parent PID:%ld  child PID:%ld\n",
           i, (long)getpid(), (long)getppid(), (long)childpid);
   c = buffer;
   /******************** entry section ************************************/
   if (((error = r_semop(semid, semwait, 1)) == -1) && (i > 1)) {
      printerror("Child failed to lock semid", errno);
      return 1;
   }
   else if (!error) {
      /***************** start of critical section ************************/
      while (*c != '\0') {
         fputc(*c, stderr);
         c++;
         for (j = 0; j < delay; j++) ;
      }
      /***************** exit section ************************************/
      if (r_semop(semid, semsignal, 1) == -1)
         printerror("Failed to unlock semid", errno);
   }
   /******************** remainder section *******************************/
   if ((r_wait(NULL) == -1) && (errno != ECHILD))
      printerror("Failed to wait", errno);
   if ((i == 1) && (removesem(semid) == -1)) {
      printerror("Failed to clean up", errno);
      return 1;
   }
   return 0;
}

A program calls semget to create or access a semaphore set and calls semctl to initialize it. If one process creates and initializes a semaphore and another process calls semop between the creation and initialization, the results of the execution are unpredictable. This unpredictability is an example of a race condition because the occurrence of the error depends on the precise timing between instructions in different processes. Program 15.3 does not have a race condition because the original parent creates and initializes the semaphore before doing a fork. The program avoids a race condition because only the original process can access the semaphore at the time of creation. One of the major problems with semaphore sets is that the creation and initialization are separate operations and therefore not atomic. Recall that POSIX:SEM named and unnamed semaphores are initialized at the time of creation and do not have this problem.
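
One commonly described workaround for this race, distinct from the approach used later in this chapter, relies on the sem_otime member of semid_ds. The semget function sets sem_otime to 0 when it creates a set, and the member becomes nonzero only after a successful semop. If the creator agrees to perform a semop after it finishes initializing the set, the other processes can poll sem_otime before their first operation. The following is a minimal sketch under those assumptions; the function name waitforfirstsemop, the 10-millisecond polling interval and the use of ETIMEDOUT are choices made for this illustration.

```c
#include <errno.h>
#include <time.h>
#include <sys/sem.h>

union semun {                 /* omit if <sys/sem.h> already defines it */
   int val;
   struct semid_ds *buf;
   unsigned short *array;
};

/* Poll until some process has completed a semop on the set, which is
 * taken as the sign that the creator has finished initializing it.
 * Returns 0 when sem_otime is nonzero, or -1 with errno set (ETIMEDOUT
 * after maxtries polls spaced 10 milliseconds apart). */
int waitforfirstsemop(int semid, int maxtries) {
   union semun arg;
   struct semid_ds info;
   struct timespec delay;
   int i;

   delay.tv_sec = 0;
   delay.tv_nsec = 10000000L;
   arg.buf = &info;
   for (i = 0; i < maxtries; i++) {
      if (semctl(semid, 0, IPC_STAT, arg) == -1)
         return -1;
      if (info.sem_otime != 0)          /* creator has done its semop */
         return 0;
      nanosleep(&delay, NULL);
   }
   errno = ETIMEDOUT;
   return -1;
}
```

The technique works only if every creator follows the convention of calling semop after initialization, so it is a protocol among cooperating programs rather than a guarantee of the interface.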

Program 15.4 can be used to create or access a semaphore set containing a single semaphore element. It takes three parameters, a semaphore key, an initial value and a pointer to a variable of type sig_atomic_t that is initialized to 0 and shared among all processes and threads that call this function. If this function is used among threads of a single process, the sig_atomic_t variable could be defined outside a block and statically initialized. Using initsemset among processes requires shared memory. We use Program 15.4 later in the chapter to protect a shared memory segment. The busy-waiting used in initsemset is not as inefficient as it may seem, since it is only used when the thread that creates the semaphore set loses the CPU before it can initialize it.

Program 15.4. initsemset.c

A function that creates and initializes a semaphore set containing a single semaphore.

#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <time.h>
#include <sys/sem.h>
#include <sys/stat.h>
#define PERMS (S_IRUSR | S_IWUSR)       /* assumed: owner read and write */
#define TEN_MILLION 10000000L
int initelement(int semid, int semnum, int semvalue);

int initsemset(key_t mykey, int value, sig_atomic_t *readyp) {
   int semid;
   struct timespec sleeptime;

   sleeptime.tv_sec = 0;
   sleeptime.tv_nsec = TEN_MILLION;
   semid = semget(mykey, 1, PERMS | IPC_CREAT | IPC_EXCL);
   if ((semid == -1) && (errno != EEXIST))         /* real error, so return */
      return -1;
   if (semid >= 0) {          /* we created the semaphore, so initialize it */
      if (initelement(semid, 0, value) == -1)
         return -1;
      *readyp = 1;
      return semid;
   }
   if ((semid = semget(mykey, 1, PERMS)) == -1)           /* just access it */
      return -1;
   while (*readyp == 0)                          /* wait for initialization */
      nanosleep(&sleeptime, NULL);
   return semid;
}

POSIX:XSI Shared Memory

Shared memory allows processes to read and write from the same memory segment. The sys/shm.h header file defines the data structures for shared memory, including shmid_ds, which has the following members.

struct ipc_perm shm_perm; /* operation permission structure */
size_t shm_segsz;         /* size of segment in bytes */
pid_t shm_lpid;           /* process ID of last operation */
pid_t shm_cpid;           /* process ID of creator */
shmatt_t shm_nattch;      /* number of current attaches */
time_t shm_atime;         /* time of last shmat */
time_t shm_dtime;         /* time of last shmdt */
time_t shm_ctime;         /* time of last shmctl */

The shmatt_t data type is an unsigned integer data type used to hold the number of times the memory segment is attached. This type must be at least as large as an unsigned short.

Accessing a shared memory segment

The shmget function returns an identifier for the shared memory segment associated with the key parameter. It creates the segment if either the key is IPC_PRIVATE or shmflg & IPC_CREAT is nonzero and no shared memory segment or identifier is already associated with key. Shared memory segments are initialized to zero.


  #include <sys/shm.h>

  int shmget(key_t key, size_t size, int shmflg);

If successful, shmget returns a nonnegative integer corresponding to the shared memory segment identifier. If unsuccessful, shmget returns –1 and sets errno. The following table lists the mandatory errors for shmget.

   errno     cause
   EACCES    shared memory identifier exists for key but permissions are not granted
   EEXIST    shared memory identifier exists for key but ((shmflg & IPC_CREAT) && (shmflg & IPC_EXCL)) != 0
   EINVAL    shared memory segment is to be created but size is invalid
   EINVAL    no shared memory segment is to be created but size is inconsistent with system-imposed limits or with the segment size of key
   ENOENT    shared memory identifier does not exist for key but (shmflg & IPC_CREAT) == 0
   ENOMEM    not enough memory to create the specified shared memory segment
   ENOSPC    systemwide limit on shared memory identifiers would be exceeded

Attaching and detaching a shared memory segment

The shmat function attaches the shared memory segment specified by shmid to the address space of the calling process and increments the value of shm_nattch for shmid. The shmat function returns a void * pointer, so a program can use the return value like an ordinary memory pointer obtained from malloc. Use a shmaddr value of NULL to let the system choose the address at which the segment is attached. On some systems it may be necessary to set shmflg so that the memory segment is properly aligned.


  #include <sys/shm.h>

  void *shmat(int shmid, const void *shmaddr, int shmflg);

If successful, shmat returns the starting address of the segment. If unsuccessful, shmat returns (void *)-1 and sets errno. The following table lists the mandatory errors for shmat.

   errno     cause
   EACCES    operation permission denied to caller
   EINVAL    value of shmid or shmaddr is invalid
   EMFILE    number of shared memory segments attached to process would exceed limit
   ENOMEM    process data space is not large enough to accommodate the shared memory segment

When finished with a shared memory segment, a program calls shmdt to detach the shared memory segment and to decrement shm_nattch. The shmaddr parameter is the starting address of the shared memory segment.


  #include <sys/shm.h>

  int shmdt(const void *shmaddr);

If successful, shmdt returns 0. If unsuccessful, shmdt returns –1 and sets errno. The shmdt function sets errno to EINVAL when shmaddr does not correspond to the starting address of a shared memory segment.

The last process to detach the segment should deallocate the shared memory segment by calling shmctl.
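
The calls described so far can be combined into a small helper. The following sketch, with the hypothetical name createandattach, creates a private segment and attaches it, taking care to compare the return value of shmat with (void *)-1 rather than NULL and to remove the segment if the attach fails. The owner-only permissions are an assumption of the sketch.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/shm.h>
#include <sys/stat.h>

/* Create a private segment of size bytes and attach it.
 * On success, stores the identifier in *idp and returns the address.
 * On failure, returns NULL with errno set and leaves no segment behind. */
void *createandattach(size_t size, int *idp) {
   int id;
   void *addr;

   if ((id = shmget(IPC_PRIVATE, size, S_IRUSR | S_IWUSR)) == -1)
      return NULL;
   if ((addr = shmat(id, NULL, 0)) == (void *)-1) {
      int error = errno;               /* shmctl may overwrite errno */
      shmctl(id, IPC_RMID, NULL);
      errno = error;
      return NULL;
   }
   *idp = id;
   return addr;
}
```

The caller remains responsible for calling shmdt on the returned address and for removing the segment with shmctl when it is no longer needed.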

Controlling shared memory

The shmctl function provides a variety of control operations on the shared memory segment shmid as specified by the cmd parameter. The interpretation of the buf parameter depends on the value of cmd, as described below.


  #include <sys/shm.h>

  int shmctl(int shmid, int cmd, struct shmid_ds *buf);

If successful, shmctl returns 0. If unsuccessful, shmctl returns –1 and sets errno. The following table lists the mandatory errors for shmctl.

   errno     cause
   EACCES    cmd is IPC_STAT and caller does not have read permission
   EINVAL    value of shmid or cmd is invalid
   EPERM     cmd is IPC_RMID or IPC_SET and caller does not have correct permissions

Table 15.3 gives the POSIX:XSI values of cmd for shmctl.

Table 15.3. POSIX:XSI values of cmd for shmctl.

   cmd         description
   IPC_RMID    remove shared memory segment shmid and destroy corresponding shmid_ds
   IPC_SET     set values of fields for shared memory segment shmid from values found in buf
   IPC_STAT    copy current values for shared memory segment shmid into buf
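
As an illustration of IPC_STAT, the following sketch reads the shm_nattch member of shmid_ds to report how many attaches a segment currently has. The function name attachcount is hypothetical. The count can change immediately after the call returns, so it is useful for diagnostics but should not be relied on to decide which process is the last to detach.

```c
#include <sys/shm.h>

/* Return the number of current attaches for shmid, or -1 with errno set. */
long attachcount(int shmid) {
   struct shmid_ds info;

   if (shmctl(shmid, IPC_STAT, &info) == -1)
      return -1;
   return (long)info.shm_nattch;
}
```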

Example 15.11. detachandremove.c

The detachandremove function detaches the shared memory segment shmaddr and then removes the shared memory segment specified by shmid.

#include <stdio.h>
#include <errno.h>
#include <sys/shm.h>

int detachandremove(int shmid, void *shmaddr) {
   int error = 0;

   if (shmdt(shmaddr) == -1)
      error = errno;
   if ((shmctl(shmid, IPC_RMID, NULL) == -1) && !error)
      error = errno;
   if (!error)
      return 0;
   errno = error;
   return -1;
}

Shared memory examples

Program 4.11 on page 108 monitors two file descriptors by using a parent and a child. Each process echoes the contents of the files to standard output and then writes to standard error the total number of bytes received. There is no simple way for this program to report the total number of bytes received by the two processes without using a communication mechanism such as a pipe.

Program 15.5 modifies Program 4.11 so that the parent and child share a small memory segment. The child stores its byte count in the shared memory. The parent waits for the child to finish and then outputs the number of bytes received by each process along with the sum of these values. The parent creates the shared memory segment by using the key IPC_PRIVATE, which allows the memory to be shared among its children. The synchronization of the shared memory is provided by the wait function. The parent does not access the shared memory until it has detected the termination of the child. Program 15.5 calls detachandremove of Example 15.11 when it must both detach and remove the shared memory segment.

Example 15.5. monitorshared.c

A program to monitor two file descriptors and keep information in shared memory. The parent waits for the child, to ensure mutual exclusion.

#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/shm.h>
#include <sys/stat.h>
#include <sys/wait.h>
#include "restart.h"
#define PERM (S_IRUSR | S_IWUSR)

int detachandremove(int shmid, void *shmaddr);

int main(int argc, char *argv[]) {
   int bytesread;
   int childpid;
   int fd, fd1, fd2;
   int id;
   int *sharedtotal;
   int totalbytes = 0;

   if (argc != 3) {
      fprintf(stderr, "Usage: %s file1 file2\n", argv[0]);
      return 1;
   }
   if (((fd1 = open(argv[1], O_RDONLY)) == -1) ||
       ((fd2 = open(argv[2], O_RDONLY)) == -1)) {
      perror("Failed to open file");
      return 1;
   }
   if ((id = shmget(IPC_PRIVATE, sizeof(int), PERM)) == -1) {
      perror("Failed to create shared memory segment");
      return 1;
   }
   if ((sharedtotal = (int *)shmat(id, NULL, 0)) == (void *)-1) {
      perror("Failed to attach shared memory segment");
      if (shmctl(id, IPC_RMID, NULL) == -1)
         perror("Failed to remove memory segment");
      return 1;
   }
   if ((childpid = fork()) == -1) {
      perror("Failed to create child process");
      if (detachandremove(id, sharedtotal) == -1)
         perror("Failed to destroy shared memory segment");
      return 1;
   }
   if (childpid > 0)                                         /* parent code */
      fd = fd1;
   else
      fd = fd2;
   while ((bytesread = readwrite(fd, STDOUT_FILENO)) > 0)
      totalbytes += bytesread;
   if (childpid == 0) {                                      /* child code */
      *sharedtotal = totalbytes;
      return 0;
   }
   if (r_wait(NULL) == -1)
      perror("Failed to wait for child");
   else {
      fprintf(stderr, "Bytes copied: %8d by parent\n", totalbytes);
      fprintf(stderr, "              %8d by child\n", *sharedtotal);
      fprintf(stderr, "              %8d total\n", totalbytes + *sharedtotal);
   }
   if (detachandremove(id, sharedtotal) == -1) {
      perror("Failed to destroy shared memory segment");
      return 1;
   }
   return 0;
}

Using shared memory between processes that do not have a common ancestor requires the processes to agree on a key, either directly or with ftok and a pathname.
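The ftok route can be sketched as follows. The helper name key_from_path is hypothetical and is not part of the chapter's programs. The sketch assumes that the cooperating processes agree on an existing pathname and a project ID. Only the low-order 8 bits of the project ID are significant to ftok, so the helper rejects an ID whose low byte is zero.

```c
#include <errno.h>
#include <sys/ipc.h>
#include <sys/types.h>

/* Hypothetical helper: derive an IPC key from an existing file and a
   project ID. Returns (key_t)-1 with errno set on failure. */
key_t key_from_path(const char *path, int projid) {
   if ((projid & 0xff) == 0) {      /* ftok uses only the low-order 8 bits */
      errno = EINVAL;
      return (key_t)-1;
   }
   return ftok(path, projid);       /* fails if path cannot be accessed */
}
```

Two unrelated processes that call key_from_path with the same pathname and project ID obtain the same key, which they can then pass to shmget, semget or msgget.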

Program 13.5 on page 456 used mutex locks to keep a sum and count for threads of a given process. This was particularly simple because the threads automatically share the mutex and the mutex could be initialized statically. Implementing synchronized shared memory for independent processes is more difficult because you must set up the sharing of the synchronization mechanism as well as the memory for the sum and the count.

Program 15.6 uses a semaphore and a small shared memory segment to keep a sum and count. Each process must first call the initshared function with an agreed-on key. This function first tries to create a shared memory segment with the given key. If successful, initshared initializes the sum and count. Otherwise, initshared just accesses the shared memory segment. In either case, initshared calls initsemset with the ready flag in shared memory to access a semaphore set containing a single semaphore initialized to 1. This semaphore element protects the shared memory segment. The add and getcountandsum functions behave as in Program 13.5, this time using the semaphore, rather than a mutex, for protection.

Example 15.6. sharedmemsum.c

A function that keeps a synchronized sum and count in shared memory.

#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <sys/sem.h>
#include <sys/shm.h>
#include <sys/stat.h>
#define PERM (S_IRUSR | S_IWUSR)

int initsemset(key_t mykey, int value, sig_atomic_t *readyp);
void setsembuf(struct sembuf *s, int num, int op, int flg);

typedef struct {
   int count;
   double sum;
   sig_atomic_t ready;
} shared_sum_t;

static int semid;
static struct sembuf semlock;
static struct sembuf semunlock;
static shared_sum_t *sharedsum;

int initshared(int key) {              /* initialize shared memory segment */
   int shid;

   setsembuf(&semlock, 0, -1, 0);         /* setting for locking semaphore */
   setsembuf(&semunlock, 0, 1, 0);      /* setting for unlocking semaphore */
                          /* get attached memory, creating it if necessary */
   shid = shmget(key, sizeof(shared_sum_t), PERM | IPC_CREAT | IPC_EXCL);
   if ((shid == -1) && (errno != EEXIST))                    /* real error */
      return -1;
   if (shid == -1) {              /* already created, access and attach it */
      if (((shid = shmget(key, sizeof(shared_sum_t), PERM)) == -1) ||
          ((sharedsum = (shared_sum_t *)shmat(shid, NULL, 0)) == (void *)-1) )
         return -1;
   }
   else {    /* successfully created, must attach and initialize variables */
      sharedsum = (shared_sum_t *)shmat(shid, NULL, 0);
      if (sharedsum == (void *)-1)
         return -1;
      sharedsum -> count = 0;
      sharedsum -> sum = 0.0;
   }
   semid = initsemset(key, 1, &sharedsum->ready);
   if (semid == -1)
      return -1;
   return 0;
}

int add(double x) {                                       /* add x to sum */
   if (semop(semid, &semlock, 1) == -1)
      return -1;
   sharedsum -> sum += x;
   sharedsum -> count++;
   if (semop(semid, &semunlock, 1) == -1)
      return -1;
   return 0;
}

int getcountandsum(int *countp, double *sum) {    /* return sum and count */
   if (semop(semid, &semlock, 1) == -1)
      return -1;
   *countp = sharedsum -> count;
   *sum = sharedsum -> sum;
   if (semop(semid, &semunlock, 1) == -1)
      return -1;
   return 0;
}

Each process must call initshared at least once before calling add or getcountandsum. A process may call initshared more than once, but one thread of the process should not call initshared while another thread of the same process is calling add or getcountandsum.

Example 15.12. 

In Program 15.6, the three fields of the shared memory segment are treated differently. The sum and count are explicitly initialized to 0 whereas the function relies on the fact that ready is initialized to 0 when the shared memory segment is created. Why is it done this way?


Answer: All three fields are initialized to 0 when the shared memory segment is created, so in this case the explicit initialization is not necessary. The program relies on the atomic nature of the creation and initialization of ready to 0, but sum and count can be initialized to any values.

Program 15.7 displays the shared count and sum when it receives a SIGUSR1 signal. The signal handler is allowed to use printf for output, even though printf is not guaranteed to be async-signal safe, since the main program does no output after the signal handler is set up and the signal is unblocked.

Program 15.8 modifies Program 15.5 by copying information from a single file to standard output and saving the number of bytes copied in a shared sum implemented by Program 15.6. Program 15.8 has two command-line arguments: the name of the file; and the key identifying the shared memory and its protecting semaphore. You can run multiple copies of Program 15.8 simultaneously with different filenames and the same key. The common shared memory stores the total number of bytes copied.

Example 15.7. showshared.c

A program to display the shared count and sum when it receives a SIGUSR1 signal.

#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int getcountandsum(int *countp, double *sump);
int initshared(int key);

static void showit(int signo) {
   int count;
   double sum;
   if (getcountandsum(&count, &sum) == -1)
      printf("Failed to get count and sum\n");
   else
      printf("Sum is %f and count is %d\n", sum, count);
}

int main(int argc, char *argv[]) {
   struct sigaction act;
   int key;
   sigset_t mask, oldmask;

   if (argc != 2) {
      fprintf(stderr, "Usage: %s key\n", argv[0]);
      return 1;
   }
   key = atoi(argv[1]);
   if (initshared(key) == -1) {
      perror("Failed to initialize shared memory");
      return 1;
   }
   if ((sigfillset(&mask) == -1) ||
       (sigprocmask(SIG_SETMASK, &mask, &oldmask) == -1)) {
      perror("Failed to block signals to set up handlers");
      return 1;
   }
   printf("This is process %ld waiting for SIGUSR1 (%d)\n",
           (long)getpid(), SIGUSR1);

   act.sa_handler = showit;
   act.sa_flags = 0;
   if ((sigemptyset(&act.sa_mask) == -1) ||
       (sigaction(SIGUSR1, &act, NULL) == -1)) {
      perror("Failed to set up signal handler");
      return 1;
   }
   if (sigprocmask(SIG_SETMASK, &oldmask, NULL) == -1) {
      perror("Failed to unblock signals");
      return 1;
   }
   for ( ; ; )                       /* wait for signals until terminated */
      pause();
}

Example 15.8. monitoroneshared.c

A program to monitor one file and send the output to standard output. It keeps track of the number of bytes received by calling add from Program 15.6.

#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include "restart.h"

int add(double x);
int initshared(int key);

int main(int argc, char *argv[]) {
    int bytesread;
    int fd;
    int key;

    if (argc != 3) {
        fprintf(stderr,"Usage: %s file key\n", argv[0]);
        return 1;
    }
    if ((fd = open(argv[1],O_RDONLY)) == -1) {
        perror("Failed to open file");
        return 1;
    }
    key = atoi(argv[2]);
    if (initshared(key) == -1) {
        perror("Failed to initialize shared sum");
        return 1;
    }
    while ((bytesread = readwrite(fd, STDOUT_FILENO)) > 0)
        if (add((double)bytesread) == -1) {
            perror("Failed to add to count");
            return 1;
        }
    return 0;
}

Example 15.13. 

Start Program 15.7 in one window, using key 12345, with the following command.

showshared 12345

Create a few named pipes, say, pipe1 and pipe2. Start copies of monitoroneshared in different windows with the following commands.

monitoroneshared pipe1 12345
monitoroneshared pipe2 12345

In other windows, send characters to the pipes (e.g., cat > pipe1). Periodically send SIGUSR1 signals to showshared to monitor the progress.

POSIX:XSI Message Queues

The message queue is a POSIX:XSI interprocess communication mechanism that allows a process to send and receive messages from other processes. The data structures for message queues are defined in sys/msg.h. The major data structure for message queues is msqid_ds, which has the following members.

struct ipc_perm msg_perm; /* operation permission structure */
msgqnum_t msg_qnum;       /* number of messages currently in queue */
msglen_t msg_qbytes;      /* maximum bytes allowed in queue */
pid_t msg_lspid;          /* process ID of last msgsnd */
pid_t msg_lrpid;          /* process ID of last msgrcv */
time_t msg_stime;         /* time of last msgsnd */
time_t msg_rtime;         /* time of last msgrcv */
time_t msg_ctime;         /* time of last msgctl */

The msgqnum_t data type holds the number of messages in the message queue; the msglen_t type holds the number of bytes allowed in a message queue. Both types must be at least as large as an unsigned short.

Accessing a message queue

The msgget function returns the message queue identifier associated with the key parameter. It creates the identifier if either the key is IPC_PRIVATE or msgflg & IPC_CREAT is nonzero and no message queue or identifier is already associated with key.


    #include <sys/msg.h>

    int msgget(key_t key, int msgflg);

If successful, msgget returns a nonnegative integer corresponding to the message queue identifier. If unsuccessful, msgget returns –1 and sets errno. The following table lists the mandatory errors for msgget.




errno    cause
EACCES   message queue exists for key, but permission denied
EEXIST   message queue exists for key, but ((msgflg & IPC_CREAT) && (msgflg & IPC_EXCL)) != 0
ENOENT   message queue does not exist for key, but (msgflg & IPC_CREAT) == 0
ENOSPC   systemwide limit on message queues would be exceeded

Example 15.14. 

Create a new message queue.


#define PERMS (S_IRUSR | S_IWUSR)

int msqid;
if ((msqid = msgget(IPC_PRIVATE, PERMS)) == -1)
   perror("Failed to create new private message queue");

After obtaining access to a message queue with msgget, a program inserts messages into the queue with msgsnd. The msqid parameter identifies the message queue, and the msgp parameter points to a user-defined buffer that contains the message to be sent, as described below. The msgsz parameter specifies the actual size of the message text. The msgflg parameter specifies actions to be taken under various conditions.


   #include <sys/msg.h>

   int msgsnd(int msqid, const void *msgp, size_t msgsz, int msgflg);

If successful, msgsnd returns 0. If unsuccessful, msgsnd returns –1 and sets errno. The following table lists the mandatory errors for msgsnd.




errno    cause
EACCES   operation is denied to the caller
EAGAIN   operation would block the process, but (msgflg & IPC_NOWAIT) != 0
EIDRM    msqid has been removed from the system
EINTR    msgsnd was interrupted by a signal
EINVAL   msqid is invalid, the message type is < 1, or msgsz is out of range

The msgp parameter points to a user-defined buffer whose first member must be a long specifying the type of message, followed by space for the text of the message. The structure might be defined as follows.

typedef struct mymsg {
   long mtype;    /* message type */
   char mtext[1]; /* message text */
} mymsg_t;

The message type must be greater than 0. The user can assign message types in any way appropriate to the application.

Here are the steps needed to send the string mymessage to a message queue.

  1. Allocate a buffer, mbuf, which is of type mymsg_t and size sizeof(mymsg_t) + strlen(mymessage).

  2. Copy mymessage into the mbuf->mtext member.

  3. Set the message type in the mbuf->mtype member.

  4. Send the message.

  5. Free mbuf.

Remember to check for errors and to free mbuf if an error occurs. Code for this is provided in Program 15.9, discussed later.
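The five steps can be sketched as a single function. The name sendstring and its parameters are hypothetical, and the function is only an illustration of the steps, not the code of Program 15.9. It assumes the mymsg_t layout shown above, in which the one byte of mtext already in the structure provides room for the string terminator.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/msg.h>

typedef struct {
   long mtype;                      /* message type, must be greater than 0 */
   char mtext[1];                       /* first byte of the message text */
} mymsg_t;

/* Hypothetical helper: send the string mymessage, including its terminator,
   to the queue msqid. Returns 0 on success or -1 with errno set. */
int sendstring(int msqid, long mtype, const char *mymessage) {
   int error = 0;
   size_t len = strlen(mymessage);
   mymsg_t *mbuf;

   if ((mbuf = (mymsg_t *)malloc(sizeof(mymsg_t) + len)) == NULL) /* step 1 */
      return -1;
   memcpy(mbuf->mtext, mymessage, len + 1);                       /* step 2 */
   mbuf->mtype = mtype;                                           /* step 3 */
   if (msgsnd(msqid, mbuf, len + 1, 0) == -1)                     /* step 4 */
      error = errno;
   free(mbuf);                                     /* step 5, even on error */
   if (error) {
      errno = error;
      return -1;
   }
   return 0;
}
```

Saving errno before calling free follows the same pattern as detachandremove: the cleanup call should not hide the error that the caller needs to see.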

A program can remove a message from a message queue with msgrcv. The msqid parameter identifies the message queue, and the msgp parameter points to a user-defined buffer for holding the message to be retrieved. The format of msgp is as described above for msgsnd. The msgsz parameter specifies the maximum number of bytes of message text that the buffer can hold. The msgtyp parameter can be used by the receiver for message selection. The msgflg parameter specifies actions to be taken under various conditions.


   #include <sys/msg.h>

   ssize_t msgrcv(int msqid, void *msgp, size_t msgsz,
                  long msgtyp, int msgflg);

If successful, msgrcv returns the number of bytes in the text of the message. If unsuccessful, msgrcv returns (ssize_t) –1 and sets errno. The following table lists the mandatory errors for msgrcv.




errno    cause
E2BIG    length of the message text is greater than msgsz and (msgflg & MSG_NOERROR) == 0
EACCES   operation is denied to the caller
EIDRM    msqid has been removed from the system
EINTR    msgrcv was interrupted by a signal
EINVAL   value of msqid is invalid
ENOMSG   queue does not contain a message of requested type and (msgflg & IPC_NOWAIT) != 0

Table 15.4 shows how msgrcv uses the msgtyp parameter to determine the order in which it removes messages from the queue.

Use msgctl to deallocate or change permissions for the message queue identified by msqid. The cmd parameter specifies the action to be taken as listed in Table 15.5. The msgctl function uses its buf parameter to write or read state information, depending on cmd.

Table 15.4. The POSIX:XSI values for the msgtyp parameter determine the order in which msgrcv removes messages from the queue.

msgtyp   action
0        remove first message from queue
> 0      remove first message of type msgtyp from the queue
< 0      remove first message of lowest type that is less than or equal to the absolute value of msgtyp
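The selection rule of Table 15.4 can be illustrated without a real queue. The following sketch models the queue as an array of message types, oldest first, and reports which entry msgrcv would remove for a given msgtyp. The function selectmessage is hypothetical and exists only to make the rule concrete. It is a simulation and does not call msgrcv.

```c
#include <stddef.h>

/* Hypothetical simulation of the msgtyp rule. types[0..n-1] holds the types
   of the queued messages, oldest first. Returns the index of the message
   that would be removed, or -1 if no message qualifies. */
int selectmessage(const long *types, size_t n, long msgtyp) {
   size_t i;
   int best = -1;

   if (msgtyp == 0)                            /* first message of any type */
      return (n > 0) ? 0 : -1;
   if (msgtyp > 0) {                       /* first message of type msgtyp */
      for (i = 0; i < n; i++)
         if (types[i] == msgtyp)
            return (int)i;
      return -1;
   }
   for (i = 0; i < n; i++)          /* msgtyp < 0: lowest type <= |msgtyp| */
      if ((types[i] <= -msgtyp) &&
          ((best == -1) || (types[i] < types[best])))
         best = (int)i;          /* strict < keeps the oldest of equal types */
   return best;
}
```

A negative msgtyp therefore gives a simple priority scheme: if lower type numbers mean more urgent messages, a receiver that passes -k always gets the most urgent message whose type does not exceed k.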

Table 15.5. POSIX:XSI values for the cmd parameter of msgctl.

cmd        description
IPC_RMID   remove the message queue msqid and destroy the corresponding msqid_ds
IPC_SET    set members of the msqid_ds data structure from buf
IPC_STAT   copy members of the msqid_ds data structure into buf


    #include <sys/msg.h>

    int msgctl(int msqid, int cmd, struct msqid_ds *buf);

If successful, msgctl returns 0. If unsuccessful, msgctl returns –1 and sets errno. The following table lists the mandatory errors for msgctl.




errno    cause
EACCES   cmd is IPC_STAT and the caller does not have read permission
EINVAL   msqid or cmd is invalid
EPERM    cmd is IPC_RMID or IPC_SET and caller does not have privileges
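As a small illustration of IPC_STAT, the following sketch reads the msg_qnum member of msqid_ds to find out how many messages are waiting in a queue. The function name queuedmessages is hypothetical. The count is converted to unsigned long on the assumption that msgqnum_t is an unsigned integer type no wider than that.

```c
#include <sys/msg.h>

/* Hypothetical helper: store the number of messages currently in the queue
   msqid in *countp. Returns 0 on success or -1 with errno set. */
int queuedmessages(int msqid, unsigned long *countp) {
   struct msqid_ds buf;

   if (msgctl(msqid, IPC_STAT, &buf) == -1)   /* copy queue state into buf */
      return -1;
   *countp = (unsigned long)buf.msg_qnum;
   return 0;
}
```

The value is only a snapshot. Other processes may send or receive messages between the call to msgctl and the use of the count.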

Program 15.9 contains utilities for accessing a message queue similar to that of Program 15.6, but simpler because no initialization or synchronization is needed. Each process should call the initqueue function before accessing the message queue. The msgprintf function has syntax similar to printf for putting formatted messages in the queue. The msgwrite function is for unformatted messages. Both msgprintf and msgwrite allocate memory for each message and free this memory after calling msgsnd. The removequeue function removes the message queue and its associated data structures. The msgqueuelog.h header file contains the prototypes for these functions. If successful, these functions return 0. If unsuccessful, these functions return –1 and set errno.

Example 15.9. msgqueuelog.c

Utility functions that access and output to a message queue.

#include <errno.h>
#include <stdio.h>
#include <stdarg.h>
#include <stdlib.h>
#include <string.h>
#include <sys/msg.h>
#include <sys/stat.h>
#include "msgqueuelog.h"
#define PERM (S_IRUSR | S_IWUSR)

typedef struct {
   long mtype;
   char mtext[1];
} mymsg_t;
static int queueid;

int initqueue(int key) {                    /* initialize the message queue */
   queueid = msgget(key, PERM | IPC_CREAT);
   if (queueid == -1)
      return -1;
   return 0;
}

int msgprintf(char *fmt, ...) {               /* output a formatted message */
   va_list ap;
   char ch;
   int error = 0;
   int len;
   mymsg_t *mymsg;

   va_start(ap, fmt);                       /* set up the format for output */
   len = vsnprintf(&ch, 1, fmt, ap);              /* how long would it be ? */
   va_end(ap);
   if (len < 0)
      return -1;
   if ((mymsg = (mymsg_t *)malloc(sizeof(mymsg_t) + len)) == NULL)
      return -1;
   va_start(ap, fmt);         /* restart: ap is indeterminate after one use */
   vsprintf(mymsg->mtext, fmt, ap);                 /* copy into the buffer */
   va_end(ap);
   mymsg->mtype = 1;                            /* message type is always 1 */
   if (msgsnd(queueid, mymsg, len + 1, 0) == -1)
      error = errno;
   free(mymsg);
   if (error) {
      errno = error;
      return -1;
   }
   return 0;
}

int msgwrite(void *buf, int len) {     /* output buffer of specified length */
   int error = 0;
   mymsg_t *mymsg;

   if ((mymsg = (mymsg_t *)malloc(sizeof(mymsg_t) + len - 1)) == NULL)
      return -1;
   memcpy(mymsg->mtext, buf, len);
   mymsg->mtype = 1;                            /* message type is always 1 */
   if (msgsnd(queueid, mymsg, len, 0) == -1)
      error = errno;
   free(mymsg);
   if (error) {
      errno = error;
      return -1;
   }
   return 0;
}

int removequeue(void) {                         /* remove the message queue */
   return msgctl(queueid, IPC_RMID, NULL);
}

Example 15.15. 

Why does the msgprintf function of Program 15.9 use len in malloc and len+1 in msgsnd?


Answer: The vsnprintf function returns the number of bytes to be formatted, not including the string terminator, so len is the string length. We need one extra byte for the string terminator. One byte is already included in mymsg_t.
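The same length arithmetic appears whenever a formatted string is built in dynamically allocated memory. The following sketch uses a hypothetical function, formatalloc, that measures the output with vsnprintf and then allocates exactly len + 1 bytes. It assumes a C99 library, in which a NULL buffer of size 0 is allowed for the measuring call, and it restarts the argument list before the second pass because a va_list cannot be reused after it has been passed to vsnprintf.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: return a malloc'ed string holding the formatted
   output, or NULL on failure. The caller must free the result. */
char *formatalloc(const char *fmt, ...) {
   va_list ap;
   char *buf;
   int len;

   va_start(ap, fmt);
   len = vsnprintf(NULL, 0, fmt, ap);   /* length without the terminator */
   va_end(ap);
   if (len < 0)
      return NULL;
   if ((buf = (char *)malloc((size_t)len + 1)) == NULL)  /* +1 for '\0' */
      return NULL;
   va_start(ap, fmt);                    /* start over for the real output */
   vsnprintf(buf, (size_t)len + 1, fmt, ap);
   va_end(ap);
   return buf;
}
```

Here the extra byte must be added explicitly. In msgprintf the extra byte comes from the one-character mtext array that is already counted in sizeof(mymsg_t).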

Program 15.10, which outputs the contents of a message queue to standard output, can save the contents of a message queue to a file through redirection. The msgqueuesave program takes a key that identifies the message queue as a command-line argument and calls the initqueue function of Program 15.9 to access the queue. The program then outputs the contents of the queue to standard output until an error occurs. Program 15.10 does not deallocate the message queue when it completes.

Program 15.11 reads lines from standard input and sends each to the message queue. The program takes a key as a command-line argument and calls initqueue to access the corresponding message queue. Program 15.11 sends an informative message containing its process ID before starting to copy from standard input.

You should be able to run multiple copies of Program 15.11 along with a single copy of Program 15.10. Since none of the programs call removequeue, be sure to execute the ipcrm command when you finish.

Example 15.16. 

Why does Program 15.10 use r_write from the restart library even though the program does not catch any signals?


Answer: In addition to restarting when interrupted by a signal (which is not necessary here), r_write continues writing if write did not output all of the requested bytes.

Example 15.17. 

How would you modify these programs so that messages from different processes could be distinguished?


Answer: Modify the functions in Program 15.9 to send the process ID as the message type. Modify Program 15.10 to output the message type along with the message.

Example 15.10. msgqueuesave.c

A program that copies messages from a message queue to standard output.

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/msg.h>
#include "msgqueuelog.h"
#include "restart.h"
#define MAXSIZE 4096
typedef struct {
   long mtype;
   char mtext[MAXSIZE];
} mymsg_t;

int main(int argc, char *argv[]) {
   int id;
   int key;
   mymsg_t mymsg;
   int size;

   if (argc != 2) {
      fprintf(stderr, "Usage: %s key\n", argv[0]);
      return 1;
   }
   key = atoi(argv[1]);
   if (initqueue(key) == -1) {
      perror("Failed to initialize message queue");
      return 1;
   }
   if ((id = msgget(key, 0)) == -1) {  /* initqueue returns 0, not the ID */
      perror("Failed to get message queue identifier");
      return 1;
   }
   for ( ; ; ) {
      if ((size = msgrcv(id, &mymsg, MAXSIZE, 0, 0)) == -1) {
         perror("Failed to read message queue");
         break;
      }
      if (r_write(STDOUT_FILENO, mymsg.mtext, size) == -1) {
         perror("Failed to write to standard output");
         break;
      }
   }
   return 1;
}

Example 15.11. msgqueuein.c

A program that sends standard input to a message queue.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/msg.h>
#include <unistd.h>
#include "msgqueuelog.h"
#include "restart.h"
#define MAXLINE 1024

int main(int argc, char *argv[]) {
   char buf[MAXLINE];
   int key;
   int size;

   if (argc != 2) {
      fprintf(stderr, "Usage: %s key\n", argv[0]);
      return 1;
   }
   key = atoi(argv[1]);
   if (initqueue(key) == -1) {
      perror("Failed to initialize message queue");
      return 1;
   }
   if (msgprintf("This is process %ld\n", (long)getpid()) == -1) {
      perror("Failed to write header to message queue");
      return 1;
   }
   for ( ; ; ) {
      if ((size = readline(STDIN_FILENO, buf, MAXLINE)) == -1) {
         perror("Failed to read from standard input");
         break;
      }
      if (size == 0)                       /* end-of-file on standard input */
         break;
      if (msgwrite(buf, size) == -1) {
         perror("Failed to write message to message queue");
         break;
      }
   }
   return 0;
}

Exercise: POSIX Unnamed Semaphores

This exercise describes an implementation of POSIX:SEM-like unnamed semaphores in terms of semaphore sets. Represent the unnamed semaphore by a data structure of type mysem_t, which for this exercise is simply an int. The mysem.h header file should contain the definition of mysem_t and the prototypes for the semaphore functions.

int mysem_init(mysem_t *sem, int pshared, unsigned int value);
int mysem_destroy(mysem_t *sem);
int mysem_wait(mysem_t *sem);
int mysem_post(mysem_t *sem);

All these functions return 0 if successful. On error, they return –1 and set errno appropriately. Actually, the last point is a little subtle. It will probably turn out that the only statements that can cause an error are the semaphore set calls and they set errno. If that is the case, the functions return the correct errno value as long as there are no intervening functions that might set errno.

Assume that applications call mysem_init before creating any threads. The mysem_t value is the semaphore ID of a semaphore set. Ignore the value of pshared, since semaphore sets are sharable among processes. Use a key of IPC_PRIVATE.

Implement mysem_wait and mysem_post directly with calls to semop. The details will depend on how mysem_init initializes the semaphore. Implement mysem_destroy with a call to semctl.
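One possible starting point for the two semop-based functions is sketched below. It assumes that mysem_init has created a semaphore set with a single element, set that element to the initial value, and stored the set's ID in *sem. The static helper mysem_op is hypothetical. The sketch does not restart semop if a signal interrupts it, which a complete solution may need to handle.

```c
#include <sys/sem.h>

typedef int mysem_t;         /* as in the exercise: the semaphore set ID */

/* Hypothetical helper: apply one operation to element 0 of the set *sem. */
static int mysem_op(mysem_t *sem, short delta) {
   struct sembuf op;

   op.sem_num = 0;                    /* the set has a single element */
   op.sem_op = delta;                 /* -1 to wait, +1 to post */
   op.sem_flg = 0;                    /* block if the operation cannot proceed */
   return semop(*sem, &op, 1);
}

int mysem_wait(mysem_t *sem) {        /* blocks while the value is 0 */
   return mysem_op(sem, -1);
}

int mysem_post(mysem_t *sem) {
   return mysem_op(sem, 1);
}
```

Because semop already returns -1 and sets errno on failure, these functions satisfy the error convention of the exercise without any extra code.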

Test your implementation with Programs 14.5 and 14.6 to see that it enforces mutual exclusion.

Before logging out, use ipcs -s from the command line. If semaphores still exist (because of a program bug), delete each of them, using the following command.

ipcrm -s n

This command deletes the semaphore with ID n. The semaphore should be created only once by the test program. It should also be deleted only once, not by all the children in the process chain.

Exercise: POSIX Named Semaphores

This exercise describes an implementation of POSIX:SEM-like named semaphores in terms of semaphore sets. Represent the named semaphore by a structure of type mysem_t. The mysemn.h file should include the definition of mysem_t and the prototypes of the following functions.

mysem_t *mysem_open(const char *name, int oflag, mode_t mode,
                     unsigned int value);
int mysem_close(mysem_t *sem);
int mysem_unlink(const char *name);
int mysem_wait(mysem_t *sem);
int mysem_post(mysem_t *sem);

The mysem_open function returns NULL and sets errno when there is an error. All the other functions return –1 and set errno when there is an error. To simplify the interface, always call mysem_open with four parameters.

Represent the named semaphore by an ordinary file that contains the semaphore ID of the semaphore set used to implement the POSIX semaphore. First try to open the file with open, using O_CREAT | O_EXCL. If you created the file, use fdopen to get a FILE pointer for the file. Allocate the semaphore set and store the ID in the file. If the file already exists, open the file for reading with fopen. In either case, return the file pointer. The mysem_t data type will just be the type FILE.

The mysem_close function makes the semaphore inaccessible to the caller by closing the file. The mysem_unlink function deletes the semaphore and its corresponding file. The mysem_wait function decrements the semaphore, and the mysem_post function increments the semaphore. Each function reads the semaphore ID from the file by first calling rewind and then reading an integer. It is possible to get an end-of-file if the process that created the semaphore has not yet written to the file. In this case, try again.
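The rewind-and-read step, including the end-of-file case, might look like the following sketch. The function name readsemid and its three-way return value are assumptions made for illustration. The exercise does not prescribe them. The sketch also assumes that the creator stores the ID as a decimal integer in text form.

```c
#include <stdio.h>

/* Hypothetical helper: read the semaphore set ID stored in the file fp.
   Returns 0 and stores the ID in *semidp on success, 1 if the file is still
   empty (the creator has not written the ID yet, so the caller should try
   again), or -1 on a read error or malformed contents. */
int readsemid(FILE *fp, int *semidp) {
   int n;

   rewind(fp);                   /* also clears the end-of-file indicator */
   n = fscanf(fp, "%d", semidp);
   if (n == 1)
      return 0;
   if ((n == EOF) && !ferror(fp))
      return 1;
   return -1;
}
```

A caller such as mysem_wait could loop while readsemid returns 1, perhaps sleeping briefly between attempts so that the retry does not spin.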

Put all the semaphore functions in a separate library and treat this as an object in which the only items with external linkage are the five functions listed above. Do not worry about race conditions in using mysem_open to create the file until a rudimentary version of the test program works. Devise a mechanism that frees the semaphore set after the last mysem_unlink but only after the last process closes this semaphore. The mysem_unlink cannot directly do the freeing because other processes may still have the semaphore open. One possibility is to have mysem_close check the link count in the inode and free the semaphore set if the link count becomes 0.

Try to handle the various race conditions by using an additional semaphore set to protect the critical sections for semaphore initialization and access. What happens when two threads try to access the semaphore concurrently? Use the same semaphore for all copies of your library to protect against interaction between unrelated processes. Refer to this semaphore by a filename, which you can convert to a key with ftok.

Exercise: Implementing Pipes with Shared Memory

This section develops a specification for a software pipe consisting of a semaphore set to protect access to the pipe and a shared memory segment to hold the pipe data and state information. The pipe state information includes the number of bytes of data in the pipe, the position of next byte to be read and status information. The pipe can hold at most one message of maximum size _POSIX_PIPE_BUF. Represent the pipe by the following pipe_t structure allocated in shared memory.

typedef struct pipe {
   int semid;                    /* ID of protecting semaphore set */
   int shmid;                   /* ID of the shared memory segment */
   char data[_POSIX_PIPE_BUF];         /* buffer for the pipe data */
   int data_size;                   /* bytes currently in the pipe */
   void *current_start;        /* pointer to current start of data */
   int end_of_file;          /* true after pipe closed for writing */
} pipe_t;

A program creates and references the pipe by using a pointer to pipe_t as a handle. For simplicity, assume that only one process can read from the pipe and one process can write to the pipe. The reader must clean up the pipe when it closes the pipe. When the writer closes the pipe, it sets the end_of_file member of pipe_t so that the reader can detect end-of-file.

The semaphore set protects the pipe_t data structure during shared access by the reader and the writer. Element zero of the semaphore set controls exclusive access to data. It is initially 1. Readers and writers acquire access to the pipe by decrementing this semaphore element, and they release access by incrementing it. Element one of the semaphore set controls synchronization of writes so that data contains only one message, that is, the output of a single write operation. When this semaphore element is 1, the pipe is empty. When it is 0, the pipe has data or an end-of-file has been encountered. Initially, element one is 1. The writer decrements element one before writing any data. The reader waits until element one is 0 before reading. When it has read all the data from the pipe, the reader increments element one to indicate that the pipe is now available for writing. Write the following functions.

pipe_t *pipe_open(void);

creates a software pipe and returns a pointer of type pipe_t * to be used as a handle in the other calls. The algorithm for pipe_open is as follows.

  1. Create a shared memory segment to hold a pipe_t data structure by calling shmget. Use a key of IPC_PRIVATE and owner read/write permissions.

  2. Attach the segment by calling shmat. Cast the return value of shmat to a pipe_t * and assign it to a local variable p.

  3. Set p->shmid to the ID of the shared memory segment returned by the shmget.

  4. Set p->data_size and p->end_of_file to 0.

  5. Create a semaphore set containing two elements by calling semget with IPC_PRIVATE key and owner read, write, execute permissions.

  6. Initialize both semaphore elements to 1, and put the resulting semaphore ID value in p->semid.

  7. If all the calls were successful, return p.

  8. If an error occurs, deallocate all resources, set errno, and return a NULL pointer.

int pipe_read(pipe_t *p, char *buf, int bytes);

behaves like an ordinary blocking read function. The algorithm for pipe_read is as follows.

  1. Perform semop on p->semid to atomically decrement semaphore element zero, and test semaphore element one for 0. Element zero provides mutual exclusion. Element one is only 0 if there is something in the buffer.

  2. If p->data_size is greater than 0 do the following.

    1. Copy at most bytes bytes of information starting at position p->current_start of the software pipe into buf. Take into account the number of bytes in the pipe.

    2. Update the p->current_start and p->data_size members of the pipe data structure.

    3. If successful, set the return value to the number of bytes actually read.

  3. Otherwise, if p->data_size is 0 and p->end_of_file is true, set the return value to 0 to indicate end-of-file.

  4. Perform another semop operation to release access to the pipe. Increment element zero. If no more data is in the pipe, also increment element one unless p->end_of_file is true. Perform these operations atomically by a single semop call.

  5. If an error occurs, return –1 with errno set.
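Steps 1 and 4 of pipe_read each require one semop call that acts on both semaphore elements, because semop applies all the operations in its array atomically or not at all. The following sketch shows how the two operation arrays might be filled in. The helper names setreadacquire and setreadrelease are hypothetical. The helpers only prepare the arrays and leave the semop calls to pipe_read.

```c
#include <stddef.h>
#include <sys/sem.h>

/* Step 1: lock the pipe (element zero) and wait for element one to be 0. */
void setreadacquire(struct sembuf ops[2]) {
   ops[0].sem_num = 0;
   ops[0].sem_op = -1;                 /* decrement: mutual exclusion */
   ops[0].sem_flg = 0;
   ops[1].sem_num = 1;
   ops[1].sem_op = 0;                  /* wait-for-zero: pipe has data */
   ops[1].sem_flg = 0;
}

/* Step 4: unlock the pipe and, if the pipe is now empty and the writer has
   not closed it, mark it available for writing. Returns the number of
   operations to pass to semop. */
size_t setreadrelease(struct sembuf ops[2], int pipeempty, int endoffile) {
   ops[0].sem_num = 0;
   ops[0].sem_op = 1;                  /* increment: release the lock */
   ops[0].sem_flg = 0;
   if (!pipeempty || endoffile)
      return 1;
   ops[1].sem_num = 1;
   ops[1].sem_op = 1;                  /* increment: pipe is empty again */
   ops[1].sem_flg = 0;
   return 2;
}
```

In pipe_read, step 1 would then be semop(p->semid, ops, 2) after setreadacquire(ops), and step 4 would be semop(p->semid, ops, setreadrelease(ops, p->data_size == 0, p->end_of_file)).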

int pipe_write(pipe_t *p, char *buf, int bytes);

behaves like an ordinary blocking write function. The algorithm for pipe_write is as follows.

  1. Perform a semop on p->semid to atomically decrement both semaphore elements zero and one.

  2. Copy at most _POSIX_PIPE_BUF bytes from buf into the pipe buffer.

  3. Set p->data_size to the number of bytes actually copied, and set p->current_start to 0.

  4. Perform another semop call to atomically increment semaphore element zero of the semaphore set.

  5. If successful, return the number of bytes copied.

  6. If an error occurs, return –1 with errno set.

int pipe_close(pipe_t *p, int how);

closes the pipe. The how parameter determines whether the pipe is closed for reading or writing. Its possible values are O_RDONLY and O_WRONLY. The algorithm for pipe_close is as follows.

  1. Use the semop function to atomically decrement element zero of p->semid. If the semop fails, return –1 with errno set.

  2. If how is O_WRONLY, do the following.

    1. Set p->end_of_file to true.

    2. Perform a semctl to set element one of p->semid to 0.

    3. Copy p->semid into a local variable, semid_temp.

    4. Perform a shmdt to detach p.

    5. Perform a semop to atomically increment element zero of semid_temp.

    If any of the semop, semctl, or shmdt calls fail, return –1 immediately with errno set.

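  3. If how is O_RDONLY, do the following. (Test with an equality comparison. On many systems O_RDONLY is 0, so the expression how & O_RDONLY is never true.)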

    1. Perform a semctl to remove the semaphore p->semid. (If the writer is waiting on the semaphore set, its semop returns an error when this happens.)

    2. Copy p->shmid into a local variable, shmid_temp.

    3. Call shmdt to detach p.

    4. Call shmctl to deallocate the shared memory segment identified by shmid_temp.

    If any of the semctl, shmdt, or shmctl calls fail, return –1 immediately with errno set.
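
The pipe_close steps can be sketched as shown below. The pipe_t layout and the union name semun_arg are assumptions made for illustration. The sketch compares how with O_WRONLY and O_RDONLY by equality rather than with a bitwise AND, because O_RDONLY is 0 on many systems, and it adds a final branch (not part of the specification) that releases the lock and reports EINVAL for any other value of how.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* request the XSI IPC declarations */
#endif
#include <errno.h>
#include <fcntl.h>
#include <limits.h>
#include <stddef.h>
#include <sys/ipc.h>
#include <sys/sem.h>
#include <sys/shm.h>

#ifndef _POSIX_PIPE_BUF
#define _POSIX_PIPE_BUF 512 /* fallback if limits.h does not expose it */
#endif

/* Hypothetical layout; the pipe_t used with this exercise may differ. */
typedef struct pipe {
    int semid;
    int shmid;
    char data[_POSIX_PIPE_BUF];
    int data_size;
    int current_start;
    int end_of_file;
} pipe_t;

/* Caller-defined argument for semctl; the name semun_arg is hypothetical. */
union semun_arg {
    int val;
    struct semid_ds *buf;
    unsigned short *array;
};

int pipe_close(pipe_t *p, int how) {
    struct sembuf lock = {.sem_num = 0, .sem_op = -1, .sem_flg = 0};
    struct sembuf unlock = {.sem_num = 0, .sem_op = 1, .sem_flg = 0};
    union semun_arg arg;

    if (semop(p->semid, &lock, 1) == -1)
        return -1;
    if (how == O_WRONLY) {
        int semid_temp;
        p->end_of_file = 1;
        arg.val = 0;                       /* let a waiting reader proceed */
        if (semctl(p->semid, 1, SETVAL, arg) == -1)
            return -1;
        semid_temp = p->semid;             /* p is unusable after shmdt */
        if (shmdt(p) == -1)
            return -1;
        if (semop(semid_temp, &unlock, 1) == -1)
            return -1;
    } else if (how == O_RDONLY) {
        int shmid_temp;
        if (semctl(p->semid, 0, IPC_RMID) == -1)
            return -1;
        shmid_temp = p->shmid;             /* p is unusable after shmdt */
        if (shmdt(p) == -1)
            return -1;
        if (shmctl(shmid_temp, IPC_RMID, NULL) == -1)
            return -1;
    } else {                               /* added: unknown value of how */
        semop(p->semid, &unlock, 1);
        errno = EINVAL;
        return -1;
    }
    return 0;
}
```

Copying the identifiers into local variables before calling shmdt matters because p points into the shared memory segment and must not be dereferenced once the segment is detached.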

Test the software pipe by writing a main program that is similar to Program 6.4. The program creates a software pipe and then forks a child. The child reads from standard input and writes to the pipe. The parent reads what the child has written to the pipe and outputs it to standard output. When the child detects end-of-file on standard input, it closes the pipe for writing. The parent then detects end-of-file on the pipe, closes the pipe for reading (which destroys the pipe), and exits. Execute the ipcs command to check that everything was properly destroyed.

The above specification describes blocking versions of the functions pipe_read and pipe_write. Also implement and test nonblocking versions of these two functions.

Exercise: Implementing Pipes with Message Queues

Formulate a specification of a software pipe implementation in terms of message queues. Implement the following functions.

pipe_t *pipe_open(void);
int pipe_read(pipe_t *p, char *buf, int chars);
int pipe_write(pipe_t *p, char *buf, int chars);
int pipe_close(pipe_t *p);

Design a pipe_t structure to fit the implementation. Test the implementation as described in Section 15.7.

Additional Reading

Most books on operating systems [107, 122] discuss the classical semaphore abstraction. UNIX Network Programming by Stevens [116] has an extensive discussion of System V Interprocess Communication, including semaphores, shared memory and message queues.

