Chapter VII. Learning More


Covered: man setenv whatis apropos history grep chmod rehash cc

One of the major difficulties inherent in dealing with Unix is that it was obviously designed for people who already know how to use it. Probably the best way to learn about Unix is to know someone who knows more than you do. One thing you can do by yourself, though, is to use the online manuals. These are accessed using the "man" command. After "man", you type the name of the command you want information about. For example, to see the manual for "ls", type
% man ls
What you see then is called the "manual page" or "man page" for ls, even though it is usually more than one page long. As is common, "man" sends its output through "more", so you see the first screen after you type the command, and hitting <space> lets you see later screens. First, you should see something like the following. (I will surround the output of man with lines of asterisks. Note that the man pages on your system may look very different from those on mine.)
************************************************************************
LS(1V)                   USER COMMANDS                     LS(1V)

NAME
     ls - list the contents of a directory
************************************************************************
This tells you that this is the man page for "ls", which is in section "(1v)" of the manual, entitled "User Commands". The name of the command is "ls" and what it does is to list the contents of a directory.

Next:

************************************************************************
SYNOPSIS
     ls [ -aAcCdfFgilLqrRstu1 ] filename ...
************************************************************************
The SYNOPSIS section gives the syntax of the command. Brackets enclose optional parameters, and ellipses indicate parameters that may be repeated. Thus, in this case, the command is ls, followed by optional hyphen-letter combinations, then one or more filenames. Note that, since ls also works with no filenames at all, a precisely correct synopsis would have brackets around "filename ...".

Now, you can see that ls has many options. You can find out what these options do by reading the explanations in a later section, "OPTIONS".

man pages can have many sections. A very useful one is the following.

************************************************************************
DESCRIPTION
     For each filename which is a directory, ls  lists  the  con-
************************************************************************
The DESCRIPTION section tells what the program does, what its output means, etc. It is a short description of how to use the program.

Now, "NAME", "SYNOPSIS" and "DESCRIPTION" are present in almost every man page. There may or may not be other sections, such as the following.

USAGE - This section tells how to use the program. It often goes into more detail than DESCRIPTION.

OPTIONS - A very useful section that describes all the options listed in the SYNOPSIS section.

EXAMPLES - Gives examples of how to use the program.

ENVIRONMENT - Lists the "environment variables" used by the program. Environment variables are special types of shell variables. Normal shell variables are set using the "set" command. Environment variables work in exactly the same way except that (1) you set their value using "setenv", which has the same syntax as "set", but without the "=" (don't ask me why), and (2) their values are accessible to programs you run. This last feature makes them useful for setting up your "environment", that is, the way you want commands to work.

For example, in the man page for "more" on my system, it says that the more command looks in the environment variable "MORE" for default command line options. Since I like the option "-c", I have the following line in my .cshrc:

setenv MORE -c
FILES - Tells what files the program uses. Looking in this section of "man mail" was how I found out where my mailbox was.

SEE ALSO - Gives the names of related programs. This section can be extremely useful. Once you know the Unix command you need, getting information about it is generally rather easy; the hard part is finding out about the existence of the command. The SEE ALSO section tells you about other commands related to the current one.

BUGS - This section tells about known bugs in the command. You may wonder why, if they know about bugs, they don't just fix them. The answer has to do with the fact that what is considered a bug is different from person to person. More importantly, many Unix programs are written to work with a buggy operating system. If the bugs were fixed, the programs wouldn't work any more. In fact, sometimes, when a bug is fixed, an option is added to the command to allow the old, buggy behavior, for exactly this reason.
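
To see the difference between ordinary shell variables and environment variables described under ENVIRONMENT above, here is a small sketch. It is written in sh (Bourne shell) syntax rather than csh, so the comments give the csh spellings. The variable names "local_note" and "SHARED_NOTE" are made up for the example.

```shell
# Ordinary shell variable: only this shell knows about it.
# (csh spelling: set local_note = hello)
local_note=hello

# Environment variable: passed along to programs you run.
# (csh spelling: setenv SHARED_NOTE world)
SHARED_NOTE=world
export SHARED_NOTE

# Run another program (here, a second copy of sh) and ask what it can see.
sh -c 'echo "local_note is ${local_note:-not set}"'
sh -c 'echo "SHARED_NOTE is ${SHARED_NOTE:-not set}"'
```

The second sh should report that local_note is not set but that SHARED_NOTE is "world". This is why a setting like MORE has to be made with "setenv": the more program can only read values that are passed to it through the environment.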

At this point, it may be instructive to look at the man page for man, by typing "man man". The synopsis section will probably say that man has three versions:

************************************************************************
SYNOPSIS
     man [-] [-t] [-M path] [-T macro-package]  [[section]  title
     ...] ...
     man [-M path] -k keyword ...
     man [-M path] -f filename ...
************************************************************************
I have been telling you about the first version, which accesses man pages. The other two versions deal with a list of summaries of commands called the "whatis" database.

For example, type "man -f ls". This gives you all information in the whatis database about the file ls (and ls is a file, since it is a program that you run). Specifically, it will tell you that ls will "list the contents of a directory".

On many systems you can also use the command "whatis" for this; "whatis" is just a synonym for "man -f", almost as if the line "alias whatis man -f" were in your .cshrc file, although, if you check, you will find that whatis is not an alias. Thus, "whatis ls" quite literally answers the question "What is 'ls'?"

The other version of man is "man -k", which allows keyword search of the whatis database. For example, "man -k ls" prints every line of the whatis database that has "ls" in it. If you try it, you'll find that the result is not very useful; not only do you get the line describing "ls", but also the lines describing "lsearch", "lseek", "lsw", and every line with words like "signals" or "protocols" in it. You'll also notice that the output doesn't stop at each page. Thus, using longer words and piping the result through "more" is generally a good idea. For example, "man -k directory | more" will give you a list of all directory-related commands.

Just as with "man -f", "man -k" often has a sort of alias. This is usually "apropos" (although on some systems it is "help"). Whether or not this exists, however, you can almost always use "man -k". If you would like to access the whatis database directly, its filename is listed at the end of the man page for man.

Moving back to the first version of man, man pages come in sections. You might have noticed that every man page you have seen so far has been in section (1), or perhaps (1v). This is because section 1 is for user commands. The sections are as follows:

  1. User Commands
  2. System Calls
  3. Subroutines
  4. Devices
  5. File Formats
  6. Games
  7. Miscellaneous
  8. System Administration
There may also be sections called "local", "new", "old", and "public".

Clearly, if you don't do any programming and don't play any games, you'll have little reason to look at any section besides 1, or perhaps 7.

On many systems, each section has two special man pages called "intro" and "list". "intro" gives you an introduction to that section, while "list" gives a complete list of all the man pages in that section. You specify the section by putting the number just after "man", e.g., "man 1 intro" will give you an introduction to man section 1, while "man 1 list" will give you a list of every command you're ever likely to use. In the list page, the left column is the command, the middle column is the man page and section, and the right column is a short explanation.

Sometimes two man pages will have the same name, but be in different sections. For example, if you type "man 7 list", you'll probably see that section 7 also has a man page called "man", but if you type "man man", you get the section 1 page. In this case, you would have to type "man 7 man" to see the page in section 7.

Further, two man pages in the same numbered section may have the same name. For example, there may be a command in section (1) with a command of the same name in (1V). To see the second, you would type "man 1v command". Note: The section letters are always given to you in upper case, but you must type them in lower case: "man 1V command" will *not* work.

I have found it useful to read the entire man page of programs I use often. For example, csh has many useful features and options that few people know about. mail is similar. All of these features are documented in the man pages.


Here are some comments on a few miscellaneous topics:

Filename Completion

This is a nice feature of the C-shell that I use often. Find out more about it in "man csh".

If you put the line "set filec" in your .cshrc, then (once you have logged out and in again) "file completion" is enabled. This lets you type only part of a filename, then hit <escape> and let csh figure out the rest.

Say you have a file called "biglongfilename" that you want to see. You could type "more biglongfilename", but with file completion, all you have to type is something like "more big", then hit <escape> and <return>. You have to type enough to uniquely identify the file, though. If you had another file called "biglocomotive" and you typed "more big" and <escape>, the system would fill in as far as "more biglo", then beep at you to let you know you hadn't uniquely identified the file. Then you could type "n", or maybe "ngf", and hit <escape> again, and csh would type in the rest of the filename for you.

History Substitution

"History" refers to recently typed shell commands. Find out about these in "man csh". Typing "history" gives you a list of all the commands you have typed recently. Each will have a number before it. To re-execute that command, type "!" followed immediately (no space) by the number, e.g., "!67" repeats command number 67. Other possibilities are "!!" to repeat the last command you typed, "!-2" to repeat the command before that, "!-3" to repeat the one before that, etc. Getting more complex, "!!:s/qx/wsa" repeats the last command you typed, changing the first occurrence of "qx" to "wsa". !vi repeats the last command you typed that began with the letters "vi".

Regular Expressions & grep

Many of the Unix programs that do some sort of searching use what are called "regular expressions". These began with the ancient editor "ed", and so you can find out about them in "man ed". A regular expression is a pattern that text can either match or not match. Typically a program that uses regular expressions will tell you about the first match it finds; for example, the vi "/" command does this for the file currently being edited.

A regular expression is given as a string of characters. Most of these have no special meaning, i.e., they match only themselves. For example, "a" matches "a", and "abcd" matches only "abcd". ".", however, will match any single character. So, if, in vi, you do a search for "a.b", you might find, say, "aXb" or "a3b". If you want to search for a period, precede it with a backslash: "a\.b" matches only "a.b". So, a backslash has a special meaning, too. What if you want to search for a backslash? Use "\\". The backslash is called the "escape" character, and preceding a character with a backslash is known as "escaping" it.

"^" matches the beginning of a line and "$" matches the end (does this look familiar?), so "^a..b$" matches only a line which is 4 characters long, begins with an "a" and ends with a "b". As before, if you want to match "^" or "$", use a backslash. "*" placed after a character matches zero or more occurrences of that character, so "ab*c" matches "ac", "abc", "abbbbbc", etc. Putting them all together, "^a.*b$" matches any line that begins with "a" and ends with "b".

A number of other characters have special meaning in regular expressions; see "man ed" for details.

Many programs use regular expressions. Foremost among them is "grep", which also has the distinction of having the most incomprehensible name of any Unix command. "grep" actually derives from a description of an "ed" command, and stands for "Global Regular Expression Print".

grep is given a regular expression and a filename on the command line. It prints all lines of the file that contain text matching the regular expression. So, e.g.,

% grep 'a.b' myfile
will print all the lines of myfile containing "aQb" or "a%b", etc. grep has many useful command line options, which are, of course, described in the man page.
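
Here is a small worked example of the patterns discussed above, written as sh commands. The file name "sample.txt" and its contents are made up for the example; the grep patterns are the ones from the text.

```shell
# Create a small sample file (name and contents are made up).
cat > sample.txt <<'EOF'
a.b
aXb
a3b and more
ac
abbbbbc
axyzb
a12b
EOF

# "." matches any single character.
# Prints: a.b, aXb, "a3b and more", and abbbbbc (its "abb" fits the pattern).
grep 'a.b' sample.txt

# An escaped period matches only a real period. Prints: a.b
grep 'a\.b' sample.txt

# A line exactly 4 characters long, beginning with a and ending with b.
# Prints: a12b
grep '^a..b$' sample.txt

# Any line beginning with a and ending with b.
# Prints: a.b, aXb, axyzb, a12b
grep '^a.*b$' sample.txt

# a, then zero or more b's, then c. Prints: ac, abbbbbc
grep 'ab*c' sample.txt
```

Note the single quotes around each pattern. They keep the shell from interpreting characters like "*" and "$" itself before grep gets to see them.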

Shell Scripts

A shell script is a text file of shell commands, which can be executed just as if you typed them in. Find out about these in "man csh".

The simplest form of script is just an ordinary text file of commands, which can be executed by typing "source filename".

You can also execute shell scripts by typing their name, as long as three conditions are met:

  1. The file is executable. To make a file executable, use "chmod" ("Change MODe"). E.g., "chmod u+x filename" will make a file executable by you, while "chmod a+x filename" will make it executable by anyone. To execute a file it also needs to be readable; read permissions are also set by the chmod command, using "r" instead of "x". See "man chmod" for more information.
  2. The file is in a directory listed in your "path" variable. You can move the file into the correct directory using "mv", and add directories to "path" using "set path = ($path newdirectoryname)".
  3. The shell knows about the file. When you log in, csh looks around for executable files. When you make a new one, type "rehash" to tell csh to update its knowledge of which executable files are available.
Scripts that can be executed by typing their name are a bit tricky to write. Normally, they are executed by "sh", which is an older Unix shell. If the first line of the script is "#", then they are executed by csh, but a new copy of csh is started just to execute the script, and it reads through your ".cshrc" file, which can take some time. Perhaps the best solution is to make the first line "#!/bin/csh -fb" which tells csh to do a "fast start" without reading ".cshrc". In this last case, however, none of your aliases are available for use in the file, since those are set up in ".cshrc".
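
As a sketch of the first two conditions, the following sh commands create a tiny script and then run it by name. The directory "demo-bin" and the script name "greet_demo" are made up for the example, and the path is set using the sh spelling rather than the csh "set path" shown above. Condition 3 does not appear, since "rehash" is a csh command and sh should not need it here.

```shell
# Make a directory to hold the script (hypothetical name).
mkdir -p demo-bin

# Write a two-line script; the first line asks for sh to run it.
cat > demo-bin/greet_demo <<'EOF'
#!/bin/sh
echo "hello from a script"
EOF

# Condition 1: make the file executable by you.
chmod u+x demo-bin/greet_demo

# Condition 2: add the directory to the search path.
# (csh spelling: set path = ($path ~/demo-bin), assuming demo-bin is in your home directory)
PATH="$PATH:$PWD/demo-bin"
export PATH

# The script can now be run by typing its name.
greet_demo
```

If all goes well, the last command prints "hello from a script". Without the chmod step you would most likely get a "Permission denied" style complaint instead.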

Shell scripts are more useful than you might think, because shell commands constitute a full-fledged language, complete with if/then/else statements, loops, variables and comments. For example, try putting these lines in a file:

@ a = 1
while ($a <= 20)
  echo $a
  @ a = $a + 1
end
then execute the file using "source filename". Note: for serious programming, csh scripts are not recommended, due to bugs in csh. There are a number of other shells around, sh (the "Bourne Shell") being the most common, that are more reliable. Better yet is a wonderful language called "perl". See the appropriate man pages for more information.
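
For comparison, here is a sketch of the same counting loop written for sh. It assumes a reasonably modern sh that understands "$(( ))" arithmetic; on a very old Bourne shell you would probably have to call the "expr" program instead.

```shell
# Count from 1 to 20, one number per line.
a=1
while [ "$a" -le 20 ]; do
  echo "$a"
  a=$((a + 1))   # add one to a
done
```

To try it, put these lines in a file and run "sh filename". Note that "source filename" would not work from csh, since csh would try to read the lines as csh commands.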

Compilers

Compilers are generally very straightforward - assuming you know the language they compile. If you have a "C" program in a text file called "prog.c" then "cc prog.c" compiles it. The executable result is placed in a file called "a.out", which you can execute by typing "a.out". You can send the output of the compiler to another file using the "-o" parameter. For example, "cc prog.c -o prog" compiles "prog.c", putting the executable output in "prog", which you can execute by typing "prog". For Pascal programs, the compiler is called "pc" (and the standard source file suffix is ".p").

If you create many executable files, whether they are shell scripts or compiled code, you may want to create a special directory for them. If you do so, you will need to put the name of that directory in your "path" variable so that csh will know to look in it when you type the name of one of the executable files. For example, if your directory is called "bin" (a standard Unix name for a directory full of executable files), then you can add the line

set path=($path ~/bin)
to the end of your .cshrc file. Whenever you make a new executable file, type "rehash" to tell the system to update its list of executables.

Games

There are many standard Unix text games. Your system may also have graphic games (which you won't be able to use on a text-only terminal). Games are usually kept in the directory /usr/games. Find out about these in section 6 of the online manual.

... and that's the end of the tutorial. Happy Unix'ing!


Unix Tutorial (September 1994 version) by Glenn Chappell <ggc@uiuc.edu> (Feel free to distribute this document however you want.)