Introduction to Linux
Under Linux, "programs" are "executed" by "processes", either by the process interpreting information, or by a process replacing its own image in memory with the binary program image stored in a file. In Unix(tm), a search path is used to find the program to be executed. The search path is stored in an environment variable called PATH, which consists of a sequence of directory names separated by ':'s. Each of the directories named in the sequence is searched in order for an executable file with the right name. Once found, the file is executed. To see the PATH from an X11 window, type:
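A minimal sketch, assuming a bash prompt in an xterm; 'ls' is used here only as an example program name:

```shell
# Show the search path as one colon-separated string.
echo "$PATH"

# Print each directory on its own line, in the order it is searched.
printf '%s\n' "$PATH" | tr ':' '\n'

# Ask the shell which file it would execute for a given name.
ls_location=$(command -v ls)
echo "ls resolves to: $ls_location"
```

The first directory in the list that contains an executable file named 'ls' is the one reported on the last line.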
A process is a single sequence of instructions being executed in an order defined by the input, output, and state of the process storage areas. In Linux, processes are generated in a tree structure by a parent process forking a child process. The parent process is essentially duplicated in the child process, and the child process can then replace its memory image with the image of a desired program. Each process has a process identifier (Pid) used by the system and other processes to identify it. The processes with Pid=0 and Pid=1 are created at system bootup, and the process with Pid=1 is the ancestor of all other processes in the system. By using the Proc selection from the System Status menu under Administrator in X11, the following listing is shown:
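The fork-and-replace behavior described above can also be observed from a shell. This is a minimal sketch, assuming 'sh' is available on the PATH; the variable names are chosen only for illustration:

```shell
# $$ is the process ID of the shell running these commands.
parent_pid=$$
echo "this shell: pid $parent_pid"

# Running a program forks a child, which gets a pid of its own.
child_pid=$(sh -c 'echo "$$"')
echo "child sh:   pid $child_pid"

# exec replaces the program image without creating a new process,
# so the pid printed before and after the exec is the same.
exec_report=$(sh -c 'echo "$$"; exec sh -c "echo \$\$"')
pid_before_exec=$(printf '%s\n' "$exec_report" | head -n 1)
pid_after_exec=$(printf '%s\n' "$exec_report" | tail -n 1)
echo "before exec: $pid_before_exec, after exec: $pid_after_exec"
```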
In this case, we are looking at the process tree displayed with subprocesses indented under their parent processes. The columns represent the User ID that owns the process, the process ID number, the parent process ID number, various other parameters, and finally, the command line that generated the process. Entries shown in square brackets (e.g., [kswapd]) were not started from command lines; they correspond to system processes created by the operating system directly. This listing can also be generated from the xterm bash command line using the Process Status (ps) command with various options:
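A minimal sketch of producing a similar listing from the bash command line. The column selection here is an assumption, since the exact options used for the original listing are not shown:

```shell
# List every process with its owner, pid, parent pid, and command
# line. Only the first few lines are kept to keep the output short.
ps -e -o user,pid,ppid,args | head -n 5

# Pick out the entry for this shell alone.
my_entry=$(ps -o user,pid,ppid,args -p "$$" | tail -n 1)
echo "$my_entry"
```

With the procps version of ps found on most Linux systems, adding the --forest option indents each subprocess under its parent, much like the listing described above.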
The program that displays this system status information is also shown as a subprocess of the program 'perl', which is the language used to display the System Status window on the screen. Note that process 1 is called 'init' and under 'init' there are a number of other processes, including at a lower level, 'blackbox'. This is the process within X11 that provides for menus on the screen and other facets of the graphical user interface. The 'xterm' entries correspond to xterms running on the screen and within those xterms are other programs such as bash (the command interpreter) and under one of those you see 'top'. This corresponds to the window below:
The 'top' program runs in an xterm or other terminal and displays the processes currently running in the computer, sorted by the percentage of Central Processing Unit (CPU) time they are using at the moment. This program redisplays every few seconds so you get an updated picture of the situation over time. Note that top itself takes some amount of space and time.
At the top of this listing we see the system date and time, the number of user terminals on the system, the load averages (as a rule of thumb, a single-CPU system slows down when the load averages exceed 1), the total number of processes on the system, the number of them waiting for something to do (sleeping), the number running at the moment, the number in a 'zombie' state (a zombie is a process that has exited but whose parent has not yet collected its exit status), and the number stopped. Next we see the percentage of the time spent in user programs, system programs, low priority (nice) programs, and the idle time when the CPU is doing nothing. Then we see the total memory available, the amount in use, the amount free to use, the amount shared between processes, and the amount used for buffering input and output. Finally, we see the amount of 'swap' space available, used, and free. Swap space is used to store the memory associated with processes when they can no longer fit in RAM. As you run more programs, there is less and less RAM space, and if you run out, then something has to give. If there is no swap space on disk, then a process has to fail or a program has to refuse to run. This is true on all Unix-like systems.
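On Linux, several of these figures can also be read directly from the kernel's /proc file system. This is a minimal sketch, assuming /proc is mounted as it is on most Linux systems:

```shell
# The 1, 5, and 15 minute load averages are the first three fields
# of /proc/loadavg.
read -r load1 load5 load15 rest < /proc/loadavg
echo "load averages: $load1 $load5 $load15"

# Total and free RAM and swap, as reported in /proc/meminfo.
grep -E '^(MemTotal|MemFree|SwapTotal|SwapFree):' /proc/meminfo
```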
In the main body of the listing, each process is identified by process ID, process owner, the priority with which requests from this process are run, the 'niceness' level that influences which runnable process runs first (processes with lower niceness levels are favored over those with higher niceness levels), the memory size of the process, the size of the portion of the process used most often, the amount of shared memory in the process, the status of the process (Runnable, Sleeping, Sleeping Waiting, and non-zero niceness), its percentage of CPU usage and memory usage, the amount of CPU time it has consumed so far, and the command run.
Each process has a process identifier created by the operating system at the creation of the process. This is normally an integer in the range from 1 to a system defined maximum. As old processes "die", their Pids become available to newly generated processes. Each process also has a process group that may be shared with other processes to allow them to be manipulated together, and a parent process that created it. If a parent process dies, its child processes are often terminated as well (for example, when a terminal closes and the processes attached to it receive a hangup signal), but in some cases, a parent process dies without the child process dying. In such cases, the orphan process is adopted by the process with Pid=1. As an example, we can kill process 1303 in the earlier listing by typing:
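A minimal sketch of the same operation on a disposable process. A background 'sleep' stands in for process 1303, which exists only in the session shown earlier, and the variable names are illustrative:

```shell
# Start a long-running process in the background; $! holds its pid.
sleep 300 &
victim=$!

# Send the default signal (TERM) to ask the process to terminate.
kill "$victim"

# Collect its exit status. The shell reports a process ended by a
# signal as 128 plus the signal number (15 for TERM).
status=0
wait "$victim" || status=$?
echo "process $victim ended with status $status"
```

In the session described above, the equivalent command would presumably be a form of 'kill 1303'.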
The result will be that process 1303 dies along with 1304 and 1350. The 'top' program will stop running and the 'bash' command interpreter running in that xterm window will also stop. The window will also disappear from the screen. After typing that command into one of the open xterm windows, press the proc button to refresh the listing:
The 'kill' program is actually a bit more general than simply being able to stop processes from running. It can also be used to send other 'signals' to a process. The kill signal (signal 9, also called KILL) cannot be caught or ignored by a process and is thus a very forceful way to kill a process. Other signals, like 'HUP', are commonly used to tell programs to reload their configuration files or do other similar things.
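A minimal sketch of sending and handling a signal other than KILL, assuming the commands are run as a script or at a shell prompt. The 'reloads' counter is a hypothetical stand-in for rereading a configuration file:

```shell
# Install a handler for HUP in this shell.
reloads=0
trap 'reloads=$((reloads + 1)); echo "HUP received, reloading"' HUP

# Send HUP to this very shell. The handler runs in place of the
# default action, which would be to terminate.
kill -HUP "$$"
echo "handler has run $reloads time(s)"

# List the signal names known to this system.
kill -l

# Restore the default handling of HUP.
trap - HUP
```

No such handler can be installed for signal 9, which is why it always succeeds in ending the process.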
In an environment with multiple processes, there are almost always processes awaiting execution. In order to control the order of execution and assure to as high a degree as possible that each process gets a fair chance to make progress, the system has a "scheduler" that determines which process to grant execution time to at each "time slice". In Linux, the scheduler is part of the kernel rather than an ordinary process; Pid=0 belongs to the idle task that runs when no other process is runnable.
The Process Control item under the Administrator X11 menu allows you to view and control processes graphically. You can select a process with your mouse by clicking on the line in the window and then kill the process, change its scheduling priority (i.e., niceness), and show process details.
Rather than encode every system service into the operating system itself, Linux uses system generated processes to perform processing associated with system services. This has the advantage of keeping the size of the operating system relatively small, while making it extensible to meet the needs of many environments. For example, the program that controls a printer is a process that only becomes ready to execute when a file is being sent to a printer. For two printers, you simply use two printer processes. For different sorts of printers, you change the parameters sent to the appropriate printer process. This is one of the reasons there are so many processes running at one time in most Linux systems.
Processes can communicate with each other through the use of interprocess communications facilities, shared memory, interprocess files called "pipes", signals, or regular files. Processes can be created, deleted, or modified by other processes, and a number of facilities exist for performing these operations. Using the Proc selection from the System Status display, we can see some examples of active pipes in the system. The 'lsof' program used to generate this listing uses pipes of its own, as you can see from the listing below:
By using the mouse to grab text from the system status window and paste it into the text editor window, I was able to edit the listing below to annotate one of the things going on:
lsof 1596 root cwd DIR 0,10 0 1672 /root
--- current working directory
lsof 1596 root rtd DIR 0,10 0 34 /
--- root directory
lsof 1596 root txt REG 240,0 97070 5137946 /cdrom/local/sbin/lsof
--- the running program
lsof 1596 root mem REG 0,10 107243 1647 /lib/ld-2.2.1.so
lsof 1596 root mem REG 0,10 1307173 1652 /lib/libc-2.2.1.so
--- libraries used by this program
lsof 1596 root 0u CHR 4,1 18 /dev/vc/1
lsof 1596 root 1w FIFO 0,6 354629 pipe
lsof 1596 root 2w FIFO 0,6 354629 pipe
--- the FIFO (first in first out) pipe connection to the display window
lsof 1596 root 3r DIR 0,3 0 1 /proc
lsof 1596 root 4r DIR 0,3 0 104595464 /proc/1596/fd
--- the process table entries for this process
lsof 1596 root 5w FIFO 0,6 354634 pipe
lsof 1596 root 6r FIFO 0,6 354635 pipe
-- The lsof file has two processes - 1596 and 1597
-- It appears to use the FIFO pipe numbered 354634 to send from 1596 to 1597
-- and the pipe numbered 354635 to send from 1597 to 1596
-- this is revealed by the '4r' (read) and '5w' (write) on pipe 354634 and the
-- '6r' (read) and '7w' (write) on pipe 354635
lsof 1597 root cwd DIR 0,10 0 1672 /root
--- the working directory for the second process
lsof 1597 root rtd DIR 0,10 0 34 /
--- the root directory for the second process
lsof 1597 root txt REG 240,0 97070 5137946 /cdrom/local/sbin/lsof
lsof 1597 root mem REG 0,10 107243 1647 /lib/ld-2.2.1.so
lsof 1597 root mem REG 0,10 1307173 1652 /lib/libc-2.2.1.so
-- the program and its libraries
lsof 1597 root 4r FIFO 0,6 354634 pipe
lsof 1597 root 7w FIFO 0,6 354635 pipe
--- the pipes from and to the other process
The editing program I used to do this is called 'me' and it is one of many editors available on the White Glove and most Unix-like systems. Another popular editor is 'vim'.
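To make the pipe mechanism discussed above more concrete, here is a minimal sketch of an anonymous pipe and a named pipe (FIFO). The temporary directory and the name 'demo' are placeholders chosen for illustration:

```shell
# An anonymous pipe: the shell connects the output of the first
# command to the input of the second.
upper=$(echo "hello from a pipe" | tr 'a-z' 'A-Z')
echo "$upper"

# A named pipe (FIFO) appears in the file system, so unrelated
# processes can open it by name.
fifo_dir=$(mktemp -d)
mkfifo "$fifo_dir/demo"

# The writer runs in the background because opening a FIFO blocks
# until the other end has been opened as well.
echo "sent through the FIFO" > "$fifo_dir/demo" &
read -r received < "$fifo_dir/demo"
wait
echo "$received"

# Remove the FIFO and its directory.
rm -r "$fifo_dir"
```

While both ends are open, a program such as lsof would show the FIFO as an open file in each process, much like the paired entries in the annotated listing.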
The login program identifies and authenticates a user, and creates a process for that user by running a system specified program. The terminal being used by the user for login is normally attached to that process for input and output, and the default program run at login is normally the bash command interpreter. This was demonstrated above.
The bash program is designed as an interactive interface between the user and the operating system (commonly called a "command interpreter"). This program allows the user to specify programs, "command line" arguments, inputs to those programs, and output from those programs. It then takes the specifications and translates them into appropriate sequences of system calls. bash also has a substantial programming capability and can interpret commands from files, thus making it a powerful language for performing many of the tasks that would require programs in other operating environments. This program is used throughout our examples.
The ps program produces a list of processes on the system and their status. Depending on the command line parameters passed to ps, it can list processes from the current user or all users, and list various subsets of the available information on those processes. The available information includes the Pid, Gid, parent, system time, user time, IO time, name of the calling program and command line parameters, and process scheduler status. This was demonstrated above.
The newgrp program changes the group of the current process.
The kill program sends a specified signal to a specified process or set of processes. It is most commonly used to terminate a process, but is occasionally used to send other signals. This was demonstrated above.
The at program schedules a process for a later time and date. This is most often used to perform "batch" processing, schedule processing for the evening hours, or (on occasion) to do something at a particular date and time in the future. The cron facility is normally used for periodic tasks.
The cron program wakes up at regular intervals to run programs scheduled for particular times and dates. cron can be used to schedule specific events, but more often is used for periodic processing like remote mail delivery and reminding the operator to do backups.
The nice program requests that the system scheduler change the priority with which a command and all of its subprocesses are scheduled to run. For normal users, nice can only lower the execution priority, but for the superuser, nice can be used to schedule higher priorities as well. Priorities on most Unix systems range from -127 to 128, with -127 being the "highest" priority. In Linux, niceness values range from -20 (the highest priority) to 19 (the lowest priority). User programs normally run at priority 0, while device drivers tend to run at negative priorities, with DMA priorities more negative than sequential devices. The nice program is used by the process control program shown earlier.
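A minimal sketch of nice from the command line, assuming the common behavior in which nice prints the current niceness when run with no arguments:

```shell
# With no arguments, nice prints the niceness of the current shell.
base=$(nice)
echo "current niceness: $base"

# Run a command with its niceness raised by 5 (a lower priority).
# The command run here is nice itself, so it reports the new value.
raised=$(nice -n 5 nice)
echo "niceness inside 'nice -n 5': $raised"
```

A normal user can raise the niceness in this way but generally cannot pass a negative adjustment; that requires superuser privileges.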
The time program times a subprocess, returning the user CPU time, system CPU time, and real time. As an example, run the program 'ps' using 'time' as follows:
One of the problems with this particular result is that the total time is so small (well under 1/10 of a second on a slow computer) that the timing isn't very accurate. To get a more accurate timing on fast events, we do them many times and compare the runtime with the event we are trying to time to the runtime without the event.
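A minimal sketch of this technique, assuming bash (whose built-in 'time' keyword can time a whole loop) and 100 repetitions. The timings printed will differ from machine to machine:

```shell
runs=100

# Baseline: time the loop with nothing in it.
time for i in $(seq "$runs"); do :; done

# The same loop with the event under test, here the ps command.
time for i in $(seq "$runs"); do ps > /dev/null 2>&1; done
```

Each 'time' reports real, user, and system time for its loop; subtracting the first set of figures from the second and dividing by the number of runs gives the per-run cost.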
In this example, the only difference between the two runs is the use of the 'ps' command. We take (3.667 - 0.018) / 100 to get the time to run 'ps', about 0.0365 seconds of real time, of which about 0.0167 seconds is user time and 0.0197 seconds is system time. The rest of the 'real time' is consumed in other things going on in the computer.
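The arithmetic can be checked from the command line. This is a minimal sketch using awk for the floating-point division; the two real-time figures are the ones quoted in the text, and the variable names are illustrative:

```shell
# Real seconds for 100 runs with ps, and for the same loop without it.
with_ps=3.667
without_ps=0.018
runs=100

# Per-run cost = (time with the event - time without it) / runs.
per_run=$(awk -v a="$with_ps" -v b="$without_ps" -v n="$runs" \
    'BEGIN { printf "%.4f", (a - b) / n }')
echo "ps takes about $per_run seconds of real time per run"
```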
When memory runs out, processes cannot be started, and some processes that allocate memory may be unable to continue. As described above, virtual memory can put off this problem, but in the end, it is always possible to ask for more memory than the total available. When this happens, things start to break. Similarly, processes are kept in a process table, and the process table can fill up if too many processes are run at once. This can be a large number of processes and hard to reach by hand, but automation comes to our aid once again. In this example, we are going to write a simple process virus that runs the system out of resources.
This prepares the virus to run. In this case, the '&' in the command lines tells the command interpreter to start this program running but not wait for it to finish before doing the next command. The "echo -n '.'" will print a '.' every time the program is run but not print any [RETURN] characters. This is a handy way to keep track of how many of these programs have been run. When we run this program, each copy will try to run 5 more copies, and each of those will run 5 more, and so on. Run it like this:
On the computer we use to test these programs, we got about 2 lines of '.'s to print out and the computer then slowed to the point where we could not move the mouse very often or very far. Within about 10 seconds we could no longer use the computer and nothing we did or typed had any effect. We had to press the reset button and restart the computer. Fortunately, under White Glove this is not a big problem, so restart your computer now and we will move on to the next part of the class.
Don't do this on other versions of Linux or you might lose some of your files and have problems rebooting.
In this section we have reviewed basic information about programs and processes and how programs and processes work. We have written a few simple bash scripts, learned how to start and kill processes, investigated internal structures, and seen some of the limitations associated with processes in a Linux environment.