Processes are the primitive units for allocation of system resources. Each process has its own address space and (usually) one thread of control. A process executes a program; you can have multiple processes executing the same program, but each process has its own copy of the program within its own address space and executes it independently of the other copies.
This chapter explains what your program should do to handle the startup of a process, to terminate its process, and to receive information (arguments and the environment) from the parent process.
The system starts a C program by calling the function main. It is up to you to write a function named main; otherwise, you won't even be able to link your program without errors.
In ISO C you can define main either to take no arguments, or to take two arguments that represent the command line arguments to the program, like this:
int main (int argc, char *argv[])
The command line arguments are the whitespace-separated tokens given in the shell command used to invoke the program; thus, in `cat foo bar', the arguments are `foo' and `bar'. The only way a program can look at its command line arguments is via the arguments of main. If main doesn't take arguments, then you cannot get at the command line.
The value of the argc argument is the number of command line arguments. The argv argument is a vector of C strings; its elements are the individual command line argument strings. The file name of the program being run is also included in the vector as the first element; the value of argc counts this element. A null pointer always follows the last element: argv[argc] is this null pointer.
For the command `cat foo bar', argc is 3 and argv has three elements, "cat", "foo" and "bar".
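The layout of argv described above can be sketched in a few lines of C. This is only an illustration: count_arguments and print_arguments are hypothetical helper names, not library functions. The first relies on the null pointer stored in argv[argc] to find the end of the vector.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Count the elements of an argument vector by walking it up to the
   terminating null pointer.  For the vector received by main, the
   result equals argc.  (count_arguments is a hypothetical helper.)  */
int
count_arguments (char **argv)
{
  int count = 0;

  assert (argv != NULL);
  while (argv[count] != NULL)
    count++;
  return count;
}

/* Print each element of the vector with its index, one per line.  */
void
print_arguments (char **argv)
{
  assert (argv != NULL);
  for (int i = 0; argv[i] != NULL; i++)
    printf ("argv[%d] = \"%s\"\n", i, argv[i]);
}
```

Calling print_arguments (argv) from main for the command `cat foo bar' would list the three elements "cat", "foo" and "bar".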
If the syntax for the command line arguments to your program is simple enough, you can simply pick the arguments off from argv by hand. But unless your program takes a fixed number of arguments, or all of the arguments are interpreted in the same way (as file names, for example), you are usually better off using getopt to do the parsing.
In Unix systems you can define main a third way, using three arguments:
int main (int argc, char *argv[], char *envp[])
The first two arguments are just the same. The third argument envp gives the process's environment; it is the same as the value of environ. See section Environment Variables. POSIX.1 does not allow this three-argument form, so to be portable it is best to write main to take two arguments, and use the value of environ.
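POSIX recommends these conventions for command line arguments. getopt (see section Parsing Program Options) makes it easy to implement them.
Arguments are options if they begin with a hyphen delimiter (`-').
Multiple options may follow a hyphen delimiter in a single token if the options do not take arguments. Thus, `-abc' is equivalent to `-a -b -c'.
Option names are single alphanumeric characters (as for isalnum; see section Classification of Characters).
Certain options require an argument. For example, the `-o' option of the ld command requires an argument--an output file name.
An option and its argument may or may not appear as separate tokens. (In other words, the whitespace separating them is optional.) Thus, `-o foo' and `-ofoo' are equivalent.
Options typically precede other non-option arguments. The implementation of getopt in the GNU C library normally makes it appear as if all the option arguments were specified before all the non-option arguments for the purposes of parsing, even if the user of your program intermixed option and non-option arguments. It does this by reordering the elements of the argv array. This behavior is nonstandard; if you want to suppress it, define the _POSIX_OPTION_ORDER environment variable. See section Standard Environment Variables.
The argument `--' terminates all options; any following arguments are treated as non-option arguments, even if they begin with a hyphen.
A token consisting of a single hyphen character is interpreted as an ordinary non-option argument. By convention, it is used to specify input from or output to the standard input and output streams.
Options may be supplied in any order, or appear multiple times. The interpretation is left up to the particular application program.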
GNU adds long options to these conventions. Long options consist of `--' followed by a name made of alphanumeric characters and dashes. Option names are typically one to three words long, with hyphens to separate words. Users can abbreviate the option names as long as the abbreviations are unique.
To specify an argument for a long option, write `--name=value'. This syntax enables a long option to accept an argument that is itself optional.
Eventually, the GNU system will provide completion for long option names in the shell.
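To make the `--name=value' syntax concrete, here is a minimal sketch of how a single long-option token divides into a name and an optional value. The function split_long_option is a hypothetical helper written for this illustration; real programs should leave this work to getopt_long, which also handles abbreviations.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Split a long-option token of the form `--name' or `--name=value'.
   On success, return a pointer to the start of the name, store the
   length of the name in *name_length, and store a pointer to the
   value in *value (a null pointer if the token contains no `=').
   Return a null pointer if TOKEN is not a long option.
   (split_long_option is a hypothetical helper.)  */
const char *
split_long_option (const char *token, size_t *name_length,
                   const char **value)
{
  const char *name;
  const char *equals;

  assert (token != NULL && name_length != NULL && value != NULL);
  if (strncmp (token, "--", 2) != 0 || token[2] == '\0')
    return NULL;                /* Not a long option, or a bare `--'.  */

  name = token + 2;
  equals = strchr (name, '=');
  if (equals != NULL)
    {
      *name_length = (size_t) (equals - name);
      *value = equals + 1;
    }
  else
    {
      *name_length = strlen (name);
      *value = NULL;
    }
  return name;
}
```

Because the value is attached to the name with `=', a token such as `--output' with no `=' can be told apart from `--output=file'; this is what allows a long option to take an argument that is itself optional.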
Here are the details about how to call the getopt function. To use this facility, your program must include the header file `unistd.h'.
Variable: int opterr
If the value of this variable is nonzero, then getopt prints an error message to the standard error stream if it encounters an unknown option character or an option with a missing required argument. This is the default behavior. If you set this variable to zero, getopt does not print any messages, but it still returns the character ? to indicate an error.
Variable: int optopt
When getopt encounters an unknown option character or an option with a missing required argument, it stores that option character in this variable. You can use this for providing your own diagnostic messages.
Variable: int optind
This variable is set by getopt to the index of the next element of the argv array to be processed. Once getopt has found all of the option arguments, you can use this variable to determine where the remaining non-option arguments begin. The initial value of this variable is 1.
Variable: char * optarg
This variable is set by getopt to point at the value of the option argument, for those options that accept arguments.
Function: int getopt (int argc, char **argv, const char *options)
The getopt function gets the next option argument from the argument list specified by the argv and argc arguments. Normally these values come directly from the arguments received by main.
The options argument is a string that specifies the option characters that are valid for this program. An option character in this string can be followed by a colon (`:') to indicate that it takes a required argument.
If the options argument string begins with a hyphen (`-'), this is treated specially. It permits arguments that are not options to be returned as if they were associated with an option whose character code is 1 (`\1').
The getopt function returns the option character for the next command line option. When no more option arguments are available, it returns -1. There may still be more non-option arguments; you must compare the external variable optind against the argc parameter to check this.
If the option has an argument, getopt returns the argument by storing it in the variable optarg. You don't ordinarily need to copy the optarg string, since it is a pointer into the original argv array, not into a static area that might be overwritten.
If getopt finds an option character in argv that was not included in options, or a missing option argument, it returns `?' and sets the external variable optopt to the actual option character. If the first character of options is a colon (`:'), then getopt returns `:' instead of `?' to indicate a missing option argument. In addition, if the external variable opterr is nonzero (which is the default), getopt prints an error message.
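The difference between the `?' and `:' return values can be sketched as follows. This is an illustration under assumptions: scan_options, its result codes, and the option letters `-a' and `-o' are all hypothetical. The options string begins with a colon, so a missing argument is reported as `:' rather than `?'.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Result codes for scan_options (hypothetical names).  */
enum { SCAN_OK = 0, SCAN_UNKNOWN_OPTION = 1, SCAN_MISSING_ARGUMENT = 2 };

/* Scan argv for the options -a (no argument) and -o FILE.
   The leading `:' in the options string makes getopt return `:'
   for a missing argument and `?' for an unknown option.
   On error, the offending option character is stored in
   *bad_option, taken from optopt.  */
int
scan_options (int argc, char **argv, int *aflag,
              const char **outfile, int *bad_option)
{
  int c;

  assert (argv != NULL && aflag != NULL);
  assert (outfile != NULL && bad_option != NULL);
  optind = 1;                   /* Start scanning at argv[1].  */
  *aflag = 0;
  *outfile = NULL;
  *bad_option = 0;

  while ((c = getopt (argc, argv, ":ao:")) != -1)
    switch (c)
      {
      case 'a':
        *aflag = 1;
        break;
      case 'o':
        *outfile = optarg;      /* Points into argv; no copy needed.  */
        break;
      case ':':
        *bad_option = optopt;
        return SCAN_MISSING_ARGUMENT;
      default:                  /* `?' */
        *bad_option = optopt;
        return SCAN_UNKNOWN_OPTION;
      }
  return SCAN_OK;
}
```

After a successful scan, optind is the index of the first remaining non-option argument, so a caller would compare it against argc to see whether any are left.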
Here is an example showing how getopt is typically used. The key points to notice are:
Normally, getopt is called in a loop. When getopt returns -1, indicating no more options are present, the loop terminates.
A switch statement is used to dispatch on the return value from getopt. In typical use, each case just sets a variable that is used later in the program.
A second loop is used to process the remaining non-option arguments.
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int
main (int argc, char **argv)
{
  int aflag = 0;
  int bflag = 0;
  char *cvalue = NULL;
  int index;
  int c;

  /* Suppress getopt's own messages; we print our own below.  */
  opterr = 0;

  while ((c = getopt (argc, argv, "abc:")) != -1)
    switch (c)
      {
      case 'a':
        aflag = 1;
        break;
      case 'b':
        bflag = 1;
        break;
      case 'c':
        cvalue = optarg;
        break;
      case '?':
        /* optopt holds the offending option character.  */
        if (optopt == 'c')
          fprintf (stderr, "Option -%c requires an argument.\n", optopt);
        else if (isprint (optopt))
          fprintf (stderr, "Unknown option `-%c'.\n", optopt);
        else
          fprintf (stderr, "Unknown option character `\\x%x'.\n", optopt);
        return 1;
      default:
        abort ();
      }

  printf ("aflag = %d, bflag = %d, cvalue = %s\n",
          aflag, bflag, cvalue ? cvalue : "(null)");

  /* Everything from optind onward is a non-option argument.  */
  for (index = optind; index < argc; index++)
    printf ("Non-option argument %s\n", argv[index]);
  return 0;
}
Here are some examples showing what this program prints with different combinations of arguments:
% testopt
aflag = 0, bflag = 0, cvalue = (null)

% testopt -a -b
aflag = 1, bflag = 1, cvalue = (null)

% testopt -ab
aflag = 1, bflag = 1, cvalue = (null)

% testopt -c foo
aflag = 0, bflag = 0, cvalue = foo

% testopt -cfoo
aflag = 0, bflag = 0, cvalue = foo

% testopt arg1
aflag = 0, bflag = 0, cvalue = (null)
Non-option argument arg1

% testopt -a arg1
aflag = 1, bflag = 0, cvalue = (null)
Non-option argument arg1

% testopt -c foo arg1
aflag = 0, bflag = 0, cvalue = foo
Non-option argument arg1

% testopt -a -- -b
aflag = 1, bflag = 0, cvalue = (null)
Non-option argument -b

% testopt -a -
aflag = 1, bflag = 0, cvalue = (null)
Non-option argument -
To accept GNU-style long options as well as single-character options, use getopt_long instead of getopt. This function is declared in `getopt.h', not `unistd.h'. You should make every program accept long options if it uses any options, for this takes little extra work and helps beginners remember how to use the program.
Data Type: struct option
This structure describes a single long option name for the sake of getopt_long. The argument longopts must be an array of these structures, one for each long option. Terminate the array with an element containing all zeros.
The struct option structure has these fields:
const char *name
This field is the name of the option. It is a string.
int has_arg
This field says whether the option takes an argument. It is an integer, and there are three legitimate values: no_argument, required_argument and optional_argument.
int *flag
int val
These fields control how to report or act on the option when it occurs.
If flag is a null pointer, then the val is a value which identifies this option. Often these values are chosen to uniquely identify particular long options.
If flag is not a null pointer, it should be the address of an int variable which is the flag for this option. The value in val is the value to store in the flag to indicate that the option was seen.
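Function: int getopt_long (int argc, char *const *argv, const char *shortopts, const struct option *longopts, int *indexptr)
Decode options from the vector argv (whose length is argc). The argument shortopts describes the short options to accept, just as it does in getopt. The argument longopts describes the long options to accept (see above).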
When getopt_long encounters a short option, it does the same thing that getopt would do: it returns the character code for the option, and stores the option's argument (if it has one) in optarg.
When getopt_long encounters a long option, it takes actions based on the flag and val fields of the definition of that option.
If flag is a null pointer, then getopt_long returns the contents of val to indicate which option it found. You should arrange distinct values in the val field for options with different meanings, so you can decode these values after getopt_long returns. If the long option is equivalent to a short option, you can use the short option's character code in val.
If flag is not a null pointer, that means this option should just set a flag in the program. The flag is a variable of type int that you define. Put the address of the flag in the flag field. Put in the val field the value you would like this option to store in the flag. In this case, getopt_long returns 0.
For any long option, getopt_long tells you the index in the array longopts of the option's definition, by storing it into *indexptr. You can get the name of the option with longopts[*indexptr].name. So you can distinguish among long options either by the values in their val fields or by their indices. You can also distinguish in this way among long options that set flags.
When a long option has an argument, getopt_long puts the argument value in the variable optarg before returning. When the option has no argument, the value in optarg is a null pointer. This is how you can tell whether an optional argument was supplied.
When getopt_long has no more options to handle, it returns -1, and leaves in the variable optind the index in argv of the next remaining argument.
#include <stdio.h>
#include <stdlib.h>
#include <getopt.h>

/* Flag set by `--verbose'. */
static int verbose_flag;

int
main (int argc, char **argv)
{
  int c;

  while (1)
    {
      static struct option long_options[] =
        {
          /* These options set a flag. */
          {"verbose", no_argument, &verbose_flag, 1},
          {"brief", no_argument, &verbose_flag, 0},
          /* These options don't set a flag.
             We distinguish them by their indices. */
          {"add", required_argument, 0, 0},
          {"append", no_argument, 0, 0},
          {"delete", required_argument, 0, 0},
          {"create", no_argument, 0, 0},
          {"file", required_argument, 0, 0},
          {0, 0, 0, 0}
        };
      /* getopt_long stores the option index here. */
      int option_index = 0;

      c = getopt_long (argc, argv, "abc:d:", long_options, &option_index);

      /* Detect the end of the options. */
      if (c == -1)
        break;

      switch (c)
        {
        case 0:
          /* If this option set a flag, do nothing else now. */
          if (long_options[option_index].flag != 0)
            break;
          printf ("option %s", long_options[option_index].name);
          if (optarg)
            printf (" with arg %s", optarg);
          printf ("\n");
          break;

        case 'a':
          puts ("option -a");
          break;

        case 'b':
          puts ("option -b");
          break;

        case 'c':
          printf ("option -c with value `%s'\n", optarg);
          break;

        case 'd':
          printf ("option -d with value `%s'\n", optarg);
          break;

        case '?':
          /* getopt_long already printed an error message. */
          break;

        default:
          abort ();
        }
    }

  /* Instead of reporting `--verbose' and `--brief' as they are
     encountered, we report the final status resulting from them. */
  if (verbose_flag)
    puts ("verbose flag is set");

  /* Print any remaining command line arguments (not options). */
  if (optind < argc)
    {
      printf ("non-option ARGV-elements: ");
      while (optind < argc)
        printf ("%s ", argv[optind++]);
      putchar ('\n');
    }

  exit (0);
}
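The rule that optarg is a null pointer when an optional argument was not supplied can be seen in isolation in the following sketch. The option name `--color' and the function parse_color_option are hypothetical, chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <getopt.h>

/* Parse a hypothetical `--color[=WHEN]' option.
   Return 1 if the option was seen, 0 if it was not, and -1 on a
   usage error.  *when is set to the supplied argument, or to a
   null pointer when the option appeared without one.  */
int
parse_color_option (int argc, char **argv, const char **when)
{
  static const struct option long_options[] =
    {
      { "color", optional_argument, NULL, 'c' },
      { NULL, 0, NULL, 0 }
    };
  int seen = 0;
  int c;

  assert (argv != NULL && when != NULL);
  optind = 1;                   /* Start scanning at argv[1].  */
  *when = NULL;

  while ((c = getopt_long (argc, argv, "", long_options, NULL)) != -1)
    {
      if (c != 'c')
        return -1;              /* Unknown or malformed option.  */
      seen = 1;
      *when = optarg;           /* Null if no `=WHEN' was given.  */
    }
  return seen;
}
```

Note that an optional argument must be attached with `='; in `--color always', the word `always' would be treated as a separate non-option argument.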
Having a single level of options is sometimes not enough. There might be too many options that have to be available, or a set of options might be closely related.
For this case some programs use suboptions. One of the most prominent programs is certainly mount(8). The -o option takes one argument which itself is a comma-separated list of options. To ease the programming of code like this, the function getsubopt is available.
Function: int getsubopt (char **optionp, char *const *tokens, char **valuep)
The optionp parameter must be a pointer to a variable containing the address of the string to process. When the function returns, the reference is updated to point to the next suboption, or to the terminating `\0' character if there are no more suboptions available.
The tokens parameter references an array of strings containing the known suboptions. All strings must be `\0' terminated, and to mark the end of the array a null pointer must be stored. When getsubopt finds a possible legal suboption, it compares it with all strings available in the tokens array and returns the index of the matching string in the array as the indicator.
In case the suboption has an associated value introduced by a `=' character, a pointer to the value is returned in valuep. The string is `\0' terminated. If no argument is available valuep is set to the null pointer. By doing this the caller can check whether a necessary value is given or whether no unexpected value is present.
In case the next suboption in the string is not mentioned in the tokens array the starting address of the suboption including a possible value is returned in valuep and the return value of the function is `-1'.
The code which might appear in the mount(8) program is a perfect example of the use of getsubopt:
/* Some systems need this for the declaration of getsubopt. */
#define _XOPEN_SOURCE 500
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int do_all;
const char *type;
int read_size;
int write_size;
int read_only;

enum
{
  RO_OPTION = 0,
  RW_OPTION,
  READ_SIZE_OPTION,
  WRITE_SIZE_OPTION,
  THE_END
};

/* The array must end with a null pointer. */
char *const mount_opts[] =
{
  [RO_OPTION] = "ro",
  [RW_OPTION] = "rw",
  [READ_SIZE_OPTION] = "rsize",
  [WRITE_SIZE_OPTION] = "wsize",
  [THE_END] = NULL
};

int
main (int argc, char *argv[])
{
  char *subopts, *value;
  int opt;

  while ((opt = getopt (argc, argv, "at:o:")) != -1)
    switch (opt)
      {
      case 'a':
        do_all = 1;
        break;
      case 't':
        type = optarg;
        break;
      case 'o':
        subopts = optarg;
        while (*subopts != '\0')
          switch (getsubopt (&subopts, mount_opts, &value))
            {
            case RO_OPTION:
              read_only = 1;
              break;
            case RW_OPTION:
              read_only = 0;
              break;
            case READ_SIZE_OPTION:
              if (value == NULL)
                abort ();
              read_size = atoi (value);
              break;
            case WRITE_SIZE_OPTION:
              if (value == NULL)
                abort ();
              write_size = atoi (value);
              break;
            default:
              /* Unknown suboption. */
              printf ("Unknown suboption `%s'\n", value);
              break;
            }
        break;
      default:
        abort ();
      }

  /* Do the real work. */

  return 0;
}
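The suboption handling can also be separated from getopt, which makes it easier to try out on its own. The following is a sketch under assumptions: parse_mount_suboptions, the enumeration constants, and the reduced set of suboption names are hypothetical and serve only to illustrate the calling pattern.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stdlib.h>

enum { OPT_RO = 0, OPT_RW, OPT_RSIZE };

/* Known suboption names; the array ends with a null pointer.  */
static char *const known_suboptions[] =
  {
    [OPT_RO] = "ro",
    [OPT_RW] = "rw",
    [OPT_RSIZE] = "rsize",
    NULL
  };

/* Parse a comma-separated suboption list such as "ro,rsize=4096".
   getsubopt overwrites separators in LIST, so LIST must be writable.
   Return 0 on success, or -1 for an unknown suboption, a missing
   value, or an unexpected value.  */
int
parse_mount_suboptions (char *list, int *read_only, long *read_size)
{
  char *value;

  assert (list != NULL && read_only != NULL && read_size != NULL);
  while (*list != '\0')
    switch (getsubopt (&list, known_suboptions, &value))
      {
      case OPT_RO:
        if (value != NULL)
          return -1;            /* `ro' takes no value.  */
        *read_only = 1;
        break;
      case OPT_RW:
        if (value != NULL)
          return -1;            /* `rw' takes no value.  */
        *read_only = 0;
        break;
      case OPT_RSIZE:
        if (value == NULL)
          return -1;            /* `rsize' needs a value.  */
        *read_size = strtol (value, NULL, 10);
        break;
      default:
        return -1;              /* Not in known_suboptions.  */
      }
  return 0;
}
```

Checking valuep in both directions, as here, is what lets the caller reject both a missing required value and a value given to a suboption that does not accept one.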
When a program is executed, it receives information about the context in which it was invoked in two ways. The first mechanism uses the argv and argc arguments to its main function, and is discussed in section Program Arguments. The second mechanism uses environment variables and is discussed in this section.
The argv mechanism is typically used to pass command-line arguments specific to the particular program being invoked. The environment, on the other hand, keeps track of information that is shared by many programs, changes infrequently, and that is less frequently used.
The environment variables discussed in this section are the same environment variables that you set using assignments and the export command in the shell. Programs executed from the shell inherit all of the environment variables from the shell.
Standard environment variables are used for information about the user's home directory, terminal type, current locale, and so on; you can define additional variables for other purposes. The set of all environment variables that have values is collectively known as the environment.
Names of environment variables are case-sensitive and must not contain the character `='. System-defined environment variables are invariably uppercase.
The values of environment variables can be anything that can be represented as a string. A value must not contain an embedded null character, since this is assumed to terminate the string.
The value of an environment variable can be accessed with the getenv function. This is declared in the header file `stdlib.h'.
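Function: char * getenv (const char *name)
This function returns a string that is the value of the environment variable name. You must not modify this string. In some non-Unix systems not using the GNU library, it might be overwritten by subsequent calls to getenv (but not by any other library function). If the environment variable name is not defined, the value is a null pointer.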
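Because getenv returns a null pointer for an undefined variable, callers normally check the result before using it. A minimal sketch of that pattern follows; getenv_or_default is a hypothetical helper name, not part of the library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Return the value of the environment variable NAME, or FALLBACK
   when the variable is not defined.  The result must not be
   modified.  (getenv_or_default is a hypothetical helper.)  */
const char *
getenv_or_default (const char *name, const char *fallback)
{
  const char *value;

  assert (name != NULL);
  value = getenv (name);
  return value != NULL ? value : fallback;
}
```

A program might call getenv_or_default ("TERM", "dumb"), for example, to obtain a usable terminal type even when TERM is not set.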
Function: int putenv (char *string)
The putenv function adds or removes definitions from the environment. If the string is of the form `name=value', the definition is added to the environment. Otherwise, the string is interpreted as the name of an environment variable, and any definition for this variable in the environment is removed.
The GNU library provides this function for compatibility with SVID; it may not be available in other systems.
You can deal directly with the underlying representation of environment objects to add more variables to the environment (for example, to communicate with another program you are about to execute; see section Executing a File).
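Variable: char ** environ
The environment is represented as an array of strings. Each string is of the format `name=value'. The order in which strings appear in the environment is not significant, but the same name must not appear more than once. The last element of the array is a null pointer.
This variable is declared in the header file `unistd.h'.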
If you just want to get the value of an environment variable, use getenv.
Unix systems, and the GNU system, pass the initial value of environ as the third argument to main. See section Program Arguments.
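To illustrate the underlying representation, here is a sketch of a lookup over an environment vector, that is, an array of `name=value' strings ending with a null pointer. The function lookup_in_environment is a hypothetical helper; ordinary programs should simply call getenv.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Search an environment vector (an array of "name=value" strings
   ending with a null pointer) for NAME.  Return a pointer to the
   value, or a null pointer if NAME is not defined.
   (lookup_in_environment is a hypothetical helper.)  */
const char *
lookup_in_environment (char **envp, const char *name)
{
  size_t length;

  assert (envp != NULL && name != NULL);
  length = strlen (name);
  for (; *envp != NULL; envp++)
    /* The name must match in full and be followed by `='.  */
    if (strncmp (*envp, name, length) == 0 && (*envp)[length] == '=')
      return *envp + length + 1;
  return NULL;
}
```

The vector passed in could be environ itself or the envp argument of a three-argument main; both have the same layout.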
These environment variables have standard meanings. This doesn't mean that they are always present in the environment; but if these variables are present, they have these meanings. You shouldn't try to use these environment variable names for some other purpose.
HOME
This is a string representing the user's home directory, or initial default working directory.
The user can set HOME to any value. If you need to make sure to obtain the proper home directory for a particular user, you should not use HOME; instead, look up the user's name in the user database (see section User Database).
For most purposes, it is better to use HOME, precisely because this lets the user specify the value.
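LOGNAME
This is the name that the user used to log in. Since the value in the environment can be tweaked arbitrarily, this is not a reliable way to identify the user who is running a process; a function like getlogin (see section Identifying Who Logged In) is better for that purpose.
For most purposes, it is better to use LOGNAME, precisely because this lets the user specify the value.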
PATH
A path is a sequence of directory names which is used for searching for a file. The variable PATH holds a path used for searching for programs to be run.
The execlp and execvp functions (see section Executing a File) use this environment variable, as do many shells and other utilities which are implemented in terms of those functions.
The syntax of a path is a sequence of directory names separated by colons. An empty string instead of a directory name stands for the current directory (see section Working Directory).
A typical value for this environment variable might be a string like:
:/bin:/etc:/usr/bin:/usr/new/X11:/usr/new:/usr/local/bin
This means that if the user tries to execute a program named foo, the system will look for files named `foo', `/bin/foo', `/etc/foo', and so on. The first of these files that exists is the one that is executed.
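The path syntax described above, including the rule that an empty element stands for the current directory, can be sketched as follows. The function path_element is a hypothetical helper for illustration; it does not search for files, it only extracts one directory name from a path string.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy element number INDEX (counting from 0) of the
   colon-separated search path PATH into OUT.  An empty element is
   reported as ".", the current directory.  Return 0 on success, or
   -1 if PATH has no such element or OUT is too small.
   (path_element is a hypothetical helper.)  */
int
path_element (const char *path, size_t index, char *out, size_t outsize)
{
  const char *start = path;
  const char *end;
  size_t length;

  assert (path != NULL && out != NULL);
  for (;;)
    {
      end = strchr (start, ':');
      length = end != NULL ? (size_t) (end - start) : strlen (start);
      if (index == 0)
        break;
      if (end == NULL)
        return -1;              /* Ran out of elements.  */
      start = end + 1;
      index--;
    }

  if (length == 0)
    {
      start = ".";              /* Empty element: current directory.  */
      length = 1;
    }
  if (length >= outsize)
    return -1;
  memcpy (out, start, length);
  out[length] = '\0';
  return 0;
}
```

For the path `:/bin:/usr/bin', element 0 is reported as `.', element 1 as `/bin', and element 2 as `/usr/bin'.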
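TERM
This specifies the kind of terminal that is receiving program output. Some programs can make use of this information to take advantage of special escape sequences or terminal modes supported by particular kinds of terminals. Many programs which use the termcap library use the TERM environment variable, for example.
TZ
This specifies the time zone. See section Specifying the Time Zone with TZ, for information about the format of this string and how it is used.
LANG
This specifies the default locale to use for attribute categories where neither LC_ALL nor the specific environment variable for that category is set. See section Locales and Internationalization, for more information about locales.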
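LC_COLLATE
This specifies what locale to use for string sorting.
LC_CTYPE
This specifies what locale to use for character sets and character classification.
LC_MONETARY
This specifies what locale to use for formatting monetary values.
LC_NUMERIC
This specifies what locale to use for formatting numbers.
LC_TIME
This specifies what locale to use for formatting date and time values.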
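_POSIX_OPTION_ORDER
If this environment variable is defined, it suppresses the usual reordering of command line arguments by getopt. See section Program Argument Syntax Conventions.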
The usual way for a program to terminate is simply for its main function to return. The exit status value returned from the main function is used to report information back to the process's parent process or shell.
A program can also terminate normally by calling the exit function.
In addition, programs can be terminated by signals; this is discussed in more detail in section Signal Handling. The abort function causes a signal that kills the program.
A process terminates normally when the program calls exit. Returning from main is equivalent to calling exit, and the value that main returns is used as the argument to exit.
Function: void exit (int status)
The exit function terminates the process with status status. This function does not return.
Normal termination causes the following actions:
Functions that were registered with the atexit or on_exit functions are called in the reverse order of their registration. This mechanism allows your application to specify its own "cleanup" actions to be performed at program termination. Typically, this is used to do things like saving program state information in a file, or unlocking locks in shared data bases.
All open streams are closed, writing out any buffered output data. In addition, temporary files opened with the tmpfile function are removed; see section Temporary Files.
_exit is called, terminating the program. See section Termination Internals.
When a program exits, it can return to the parent process a small amount of information about the cause of termination, using the exit status. This is a value between 0 and 255 that the exiting process passes as an argument to exit.
Normally you should use the exit status to report very broad information about success or failure. You can't provide a lot of detail about the reasons for the failure, and most parent processes would not want much detail anyway.
There are conventions for what sorts of status values certain programs should return. The most common convention is simply 0 for success and 1 for failure. Programs that perform comparison use a different convention: they use status 1 to indicate a mismatch, and status 2 to indicate an inability to compare. Your program should follow an existing convention if an existing convention makes sense for it.
A general convention reserves status values 128 and up for special purposes. In particular, the value 128 is used to indicate failure to execute another program in a subprocess. This convention is not universally obeyed, but it is a good idea to follow it in your programs.
Warning: Don't try to use the number of errors as the exit status. This is actually not very useful; a parent process would generally not care how many errors occurred. Worse than that, it does not work, because the status value is truncated to eight bits. Thus, if the program tried to report 256 errors, the parent would receive a report of 0 errors--that is, success.
For the same reason, it does not work to use the value of errno as the exit status--these can exceed 255.
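The truncation described in the warning above amounts to keeping the low-order eight bits of the value passed to exit. The following sketch expresses that arithmetic; reported_exit_status is a hypothetical helper that models what a parent on a POSIX system would observe, and does not itself terminate anything.

```c
#include <assert.h>

/* Return the exit status a parent process would observe on a POSIX
   system when a child passes STATUS to exit: only the low-order
   eight bits survive.  (reported_exit_status is a hypothetical
   helper.)  */
int
reported_exit_status (int status)
{
  int result = status & 0377;

  assert (result >= 0 && result <= 255);
  return result;
}
```

With this model, a program that tried to report 256 errors would appear to its parent to have exited with status 0, and one that reported 257 would appear to have exited with status 1.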
Portability note: Some non-POSIX systems use different conventions for exit status values. For greater portability, you can use the macros EXIT_SUCCESS and EXIT_FAILURE for the conventional status value for success and failure, respectively. They are declared in the file `stdlib.h'.
Macro: int EXIT_SUCCESS
This macro can be used with the exit function to indicate successful program completion.
On POSIX systems, the value of this macro is 0. On other systems, the value might be some other (possibly non-constant) integer expression.
Macro: int EXIT_FAILURE
This macro can be used with the exit function to indicate unsuccessful program completion in a general sense.
On POSIX systems, the value of this macro is 1. On other systems, the value might be some other (possibly non-constant) integer expression. Other nonzero status values also indicate failures. Certain programs use different nonzero status values to indicate particular kinds of "non-success". For example, diff uses status value 1 to mean that the files are different, and 2 or more to mean that there was difficulty in opening the files.
Your program can arrange to run its own cleanup functions if normal termination happens. If you are writing a library for use in various application programs, then it is unreliable to insist that all applications call the library's cleanup functions explicitly before exiting. It is much more robust to make the cleanup invisible to the application, by setting up a cleanup function in the library itself using atexit or on_exit.
Function: int atexit (void (*function) (void))
The atexit function registers the function function to be called at normal program termination. The function is called with no arguments.
The return value from atexit is zero on success and nonzero if the function cannot be registered.
Function: int on_exit (void (*function)(int status, void *arg), void *arg)
This function is a somewhat more powerful variant of atexit. It accepts two arguments, a function function and an arbitrary pointer arg. At normal program termination, the function is called with two arguments: the status value passed to exit, and the arg.
This function is included in the GNU C library only for compatibility with SunOS, and may not be supported by other implementations.
Here's a trivial program that illustrates the use of exit and atexit:
#include <stdio.h>
#include <stdlib.h>

void
bye (void)
{
  puts ("Goodbye, cruel world....");
}

int
main (void)
{
  atexit (bye);
  exit (EXIT_SUCCESS);
}
When this program is executed, it just prints the message and exits.
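The reverse order in which registered functions run can be made visible with a slightly larger sketch. All of the names here (record, cleanup_a, cleanup_b, report_order, register_cleanups) are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Tags of the cleanup functions, in the order they actually ran.  */
static char run_order[8];
static size_t run_count;

static void
record (char tag)
{
  assert (run_count < sizeof run_order - 1);
  run_order[run_count++] = tag;
}

static void
cleanup_a (void)
{
  record ('A');
}

static void
cleanup_b (void)
{
  record ('B');
}

/* Registered first, so it runs last and sees the complete order.  */
static void
report_order (void)
{
  printf ("cleanup order: %s\n", run_order);
}

/* Register the three functions in the order report_order,
   cleanup_a, cleanup_b.  Return 0 on success, -1 if any
   registration failed.  */
int
register_cleanups (void)
{
  if (atexit (report_order) != 0
      || atexit (cleanup_a) != 0
      || atexit (cleanup_b) != 0)
    return -1;
  return 0;
}
```

A program that calls register_cleanups and then terminates normally prints `cleanup order: BA', because cleanup_b was registered last and therefore runs first.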
You can abort your program using the abort function. The prototype for this function is in `stdlib.h'.
Function: void abort (void)
The abort function causes abnormal program termination. This does not execute cleanup functions registered with atexit or on_exit.
This function actually terminates the process by raising a SIGABRT signal, and your program can include a handler to intercept this signal; see section Signal Handling.
Future Change Warning: Proposed Federal censorship regulations may prohibit us from giving you information about the possibility of calling this function. We would be required to say that this is not an acceptable way of terminating a program.
The _exit function is the primitive used for process termination by exit. It is declared in the header file `unistd.h'.
Function: void _exit (int status)
The _exit function is the primitive for causing a process to terminate with status status. Calling this function does not execute cleanup functions registered with atexit or on_exit.
When a process terminates for any reason--either by an explicit termination call, or termination as a result of a signal--the following things happen:
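All open file descriptors in the process are closed.
The low-order 8 bits of the return status code are saved to be reported back to the parent process via wait or waitpid; see section Process Completion.
Any child processes of the process being terminated are assigned a new parent process. (This is the init process, with process ID 1.)
A SIGCHLD signal is sent to the parent process.
If the process is a session leader that has a controlling terminal, then a SIGHUP signal is sent to each process in the foreground job, and the controlling terminal is disassociated from that session. See section Job Control.
If termination of a process causes a process group to become orphaned, and any member of that process group is stopped, then a SIGHUP signal and a SIGCONT signal are sent to each process in the group. See section Job Control.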