
Back to basics with Sed and Awk

Sometimes it is easy to forget that it is the basics and not the flashy graphical interfaces that make Linux so appealing.
By Alastair Otter, Journalist, Tectonic
Johannesburg, 20 Jun 2002

Sometimes I spend so much time thinking about Linux on the desktop that I forget about the true basics that make Linux my operating system of choice.

In the past week, I have happily re-discovered this attraction thanks to two rather peculiarly named utilities called Sed and Awk. Included with all Linux and Unix versions, Sed and Awk are simple but incredibly powerful text-editing tools that illustrate the power inherent in a system such as Linux or FreeBSD.

Sed stands for "stream editor" and Awk is named after its makers (Aho, Weinberger and Kernighan). While Awk is the more powerful of the two, both take in streams of text, manipulate the input and then output the result. Doesn't sound very impressive, does it? After all, a simple word processor or text editor can do the same task interactively. True, but imagine that you need to edit hundreds of files. Or even just one log file that grows every day. Of course you could edit it by hand, but if the text has a regular structure, then why not automate the process? This is where these two tools come in.

Sed is probably the easier of the two to learn because it is the more basic, and it is easy enough to use as soon as you get your head around the syntax. For example, sed '/^$/d' removes all the blank lines from a text file. Similarly, sed 's/^ *//' removes the leading spaces from every line. Simple enough if you just want to clean up files of blank lines and spaces. But, typical of Linux tools, the options are almost limitless, so sorting, cleaning and formatting text files becomes a simple script-based task.
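As a minimal sketch, the two Sed commands above can be chained into one small helper. The function name clean_text and the sample input are made up for illustration; the substitution runs first so that lines containing only spaces are removed as well.

```shell
# Hypothetical helper: strip leading spaces, then delete any line left empty.
# -e lets sed apply several commands, in order, to each line.
clean_text() {
    sed -e 's/^ *//' -e '/^$/d'
}

# Sample input (made up): indented lines, one empty line, one line of spaces.
printf '   first line\n\n   \n  second line\n' | clean_text
```

To clean a real file, redirect it into the function, for example clean_text < notes.txt > notes.clean.txt (both file names are placeholders).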

The big easy

Awk, on the other hand, is more of a programming language and is really handy if you need to manipulate structured text much as you would a database. For example, suppose you have a file of first and last names, one pair per line and separated by a space, and you want each line to read "last name first name" instead of "first name last name". A simple awk '{ print $2, $1 }' will do the job; the comma tells Awk to put a space between the two fields in the output.
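A minimal sketch of the name swap, assuming each input line holds exactly two whitespace-separated fields. The function name swap_names and the sample names are illustrative only, and this variant writes an explicit ", " between the fields.

```shell
# Hypothetical helper: turn "First Last" into "Last, First".
# $1 and $2 are awk's first and second whitespace-separated fields.
swap_names() {
    awk '{ print $2 ", " $1 }'
}

# Sample input (made up for illustration).
printf 'Alastair Otter\nJane Doe\n' | swap_names
```

Names with more than two parts would need extra handling, since anything past the second field is dropped here.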

Why would you want to do any of this? My current use is to strip an Evolution address book of its formatting tags, re-order the file and then merge it with another comma-separated list of names and addresses. It seems a simple enough task, and perhaps a bit of time with a word processor could do the job, but because I add new contacts every day I need to update my complete list just about as often. And I don't have the time or energy to do it manually every day.

Another example is formatting text files for online publishing, something I do every day with a Sed script. Or, if you're on the administration side of the business, perhaps you need to build coherent and intelligent performance reports out of hundreds of smaller log files. Sed and Awk to the rescue again.
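As a rough sketch of the log-report idea, the following counts lines per severity level. It assumes a hypothetical log format of "date time LEVEL message", so the level is the third field; the function name count_levels and the sample lines are invented for illustration.

```shell
# Hypothetical helper: tally log lines by the third field (assumed to be the level).
# "$@" passes any file names through to awk; with none, awk reads standard input.
# awk's for-in loop has no defined order, so the result is piped through sort.
count_levels() {
    awk '{ count[$3]++ } END { for (level in count) print level, count[level] }' "$@" | sort
}

# Sample log lines (made up) in the assumed format.
printf '%s\n' \
    '2002-06-20 10:00:01 INFO service started' \
    '2002-06-20 10:00:05 ERROR disk full' \
    '2002-06-20 10:01:12 INFO request served' | count_levels
```

Because awk accepts several input files at once, the same function could summarise a whole directory of logs in one run, for example count_levels *.log.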

It is not so much the abilities of Sed and Awk that make me such a fan, but rather the fact that they are included in Linux by default. Because of this I now have, at hand and for free, a set of tools that lets me build, without extensive experience, utilities that handle day-to-day text jobs, making my life simpler and Linux even more appealing.
