Beginning Bash GNU

Why bash?

Because bash is everywhere. It may not be the newest, and it’s arguably not the fanciest or the most powerful (though if not, it comes close), nor is it the only shell that’s distributed as open source software, but it is ubiquitous. The reason has to do with history.

The first shells were fairly good programming tools, but not very convenient for users. The C shell added a lot of user conveniences (like the ability to repeat a command you just typed), but as a programming language it was quirky. The Korn shell, which came along next (in the early 80s), added a lot of user conveniences, improved the programming language, and looked like it was on the path to widespread adoption.

But ksh wasn’t open source software at first; it was a proprietary software product, and was therefore difficult to ship with a free operating system like Linux. (The Korn shell’s license was changed in 2000, and again in 2005.) In the late 1980s, the Unix community decided standardization was a good thing, and the POSIX working groups (organized by the IEEE) were formed. POSIX standardized the Unix libraries and utilities, including the shell.

The bash Shell

bash is a shell: a command interpreter. The main purpose of bash (or of any shell) is to allow you to interact with the computer’s operating system so that you can accomplish whatever you need to do. Usually that involves launching programs, so the shell takes the commands you type, determines from that input what programs need to be run, and launches them for you.

You will also encounter tasks that involve a sequence of actions that is recurring, or very complicated, or both. Shell programming, usually referred to as shell scripting, allows you to automate these tasks for ease of use, reliability, and reproducibility.

In case you’re new to bash, we’ll start with some basics. If you’ve used Unix or Linux at all, you probably aren’t new to bash—but you may not have known you were using it. bash is really just a language for executing commands—so the commands you’ve been typing all along (e.g., ls, cd, grep, cat) are, in a sense, bash commands. Some of these commands are built into bash itself; others are separate programs. For now, it doesn’t make a difference which is which.
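If you are curious which is which, bash can tell you. The following is a minimal sketch, assuming it is run as a script (so interactive aliases are not in play) and that ls, grep, and cat are installed as ordinary programs on your PATH. It uses the type builtin with the -t option, which reports how bash would resolve a command name.

```shell
#!/usr/bin/env bash
# Ask bash how it would resolve each command name.
# type -t prints one word such as "builtin", "file", "alias",
# "function", or "keyword".
for cmd in cd ls grep cat; do
    printf '%s: %s\n' "$cmd" "$(type -t "$cmd")"
done
```

On a typical Linux system we would expect cd to be reported as a builtin and the others as files, but the result depends on how your system is set up.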

GNU

bash is a product of the GNU Project (http://www.gnu.org/). GNU (pronounced guh-noo, like canoe) is a recursive acronym for “GNU’s Not Unix,” and the project dates back to 1984. Its goal is to develop a free (as in freedom) Unix-like operating system. Without getting into too much detail, what is commonly referred to as Linux is, in fact, a kernel with various supporting software as a core.

The GNU tools are wrapped around it and it has a vast array of other software possibly included, depending on your distribution. However, the Linux kernel itself is not GNU software. The GNU project argues that Linux should in fact be called “GNU/Linux” and they have a good point, so some distributions, notably Debian, do this.

Therefore GNU’s goal has arguably been achieved, though the result is not exclusively GNU. The GNU project has contributed a vast amount of superior software, notably including bash, and there are GNU versions of practically every tool we discuss in this book. And while the GNU tools are richer in terms of features and (usually) friendliness, they are also sometimes a little different.

A Note About Code Examples

When we show an executable piece of shell scripting in this book, we typically show it in an offset area like this:

$ ls
a.out  cong.txt  def.conf  file.txt  more.txt  zebra.list
$

The first character is often a dollar sign ($) to indicate that this command has been typed at the bash shell prompt. (Remember that you can change the prompt.) The prompt is printed by the shell; you type the remainder of the line. Similarly, the last line in such an example is often a prompt (the $ again), to show that the command has ended execution and control has returned to the shell.

The pound or hash sign (#) is a little trickier. In many Unix or Linux files, including bash shell scripts, a leading # denotes a comment, and we have used it that way in some of our code examples. But as the trailing symbol in a bash command prompt (instead of $), # means you are logged in as root. We have only one example that runs anything as root, so that shouldn’t be confusing, but it’s important to understand.
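To make the comment case concrete, here is a small sketch. The variable names count and label are ours, chosen only for illustration. It shows that bash treats # as the start of a comment when it begins a word, but not when it appears inside quotes.

```shell
#!/usr/bin/env bash
# A line that starts with # is a comment.
count=3   # a # that begins a word mid-line also starts a comment
echo "count is $count"

# Inside quotes, # is just an ordinary character.
label="item #$count"
echo "$label"
```

Run as a script, this prints "count is 3" followed by "item #3".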


When you see an example without the prompt string, we are showing the contents of a shell script. For several large examples we will number the lines of the script, though the numbers are not part of the script. We may also occasionally show an example as a session log or a series of commands. In some cases, we may cat one or more files so you can see the script and/or data files we’ll be using in the example or in the results of our operation.

$ cat data_file
static header line1
static header line2
1 foo
2 bar
3 baz

Many of the longer scripts and functions are available to download as well. See the end of this Preface for details. We have chosen to use #!/usr/bin/env bash for these examples, where applicable, as that is more portable than the #!/bin/bash you will see on Linux or a Mac.
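As a sketch of what a script with that first line looks like, here is a minimal example. The file name hello.sh and the variable greeting are hypothetical, chosen only for illustration.

```shell
#!/usr/bin/env bash
# hello.sh (hypothetical name): env looks up bash on your PATH,
# so the script does not depend on where bash is installed.
greeting="Hello from bash version $BASH_VERSION"
echo "$greeting"
```

To try it, make the file executable with chmod +x hello.sh and run it as ./hello.sh.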

Useless Use of cat

Certain Unix users take a positively giddy delight in pointing out inefficiencies in other people’s code. Most of the time this is constructive criticism gently given and gratefully received. Probably the most common case is the so-called “useless use of cat award” bestowed when someone does something like cat file | grep foo instead of simply grep foo file.

In this case, cat is unnecessary and incurs some system overhead since it runs in a subshell. Another common case would be cat file | tr '[A-Z]' '[a-z]' instead of tr '[A-Z]' '[a-z]' < file. Sometimes using cat can even cause your script to fail. But… (you knew that was coming, didn’t you?) sometimes unnecessarily using cat actually does serve a purpose. It might be a placeholder to demonstrate the fragment of a pipeline, with other commands later replacing it (perhaps even cat -n).

Or it might be that placing the file near the left side of the code draws the eye to it more clearly than hiding it behind a < on the far right side of the page. While we applaud efficiency and agree it is a goal to strive for, it isn’t as critical as it once was. We are not advocating carelessness and code-bloat; we’re just saying that processors aren’t getting any slower any time soon. So if you like cat, use it.
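If you want to convince yourself that the two forms really are interchangeable, the following sketch compares them. It assumes mktemp is available, and the scratch file contents and variable names are ours, made up for illustration.

```shell
#!/usr/bin/env bash
# Build a small scratch file so the example is self-contained.
tmpfile=$(mktemp)
printf '%s\n' 'foo one' 'BAR two' 'foo three' > "$tmpfile"

# Pipeline form: an extra cat process feeds grep through a pipe.
with_cat=$(cat "$tmpfile" | grep foo)
# Direct form: grep opens the file itself.
without_cat=$(grep foo "$tmpfile")

# tr reads only standard input, so the direct form uses a redirect.
tr_with_cat=$(cat "$tmpfile" | tr '[A-Z]' '[a-z]')
tr_without_cat=$(tr '[A-Z]' '[a-z]' < "$tmpfile")

# Report whether each pair produced identical output.
[ "$with_cat" = "$without_cat" ] && echo "grep: same output"
[ "$tr_with_cat" = "$tr_without_cat" ] && echo "tr: same output"

rm -f "$tmpfile"
```

Both forms give the same result here; the difference is only the extra process and, as noted above, where your eye lands when reading the line.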