Learning GNU Emacs, 3rd Edition

11.1.2 Defining Functions

Now it's time for an example of a simple function definition. Start Emacs without any arguments; this puts you into the *scratch* buffer, an empty buffer in Lisp interaction mode (see Chapter 9), so that you can actually try this and subsequent examples.

Before we get to the example, however, some more comments on Lisp syntax are necessary. First, you will notice that the dash (-) is used as a "break" character to separate words in names of variables, functions, and so on. This practice is simply a widely used Lisp programming convention; thus the dash takes the place of the underscore (_) in languages like C and Ada. A more important issue has to do with all of the parentheses in Lisp code. Lisp is an old language that was designed before anyone gave much thought to language syntax (it was still considered amazing that you could use any language other than the native processor's binary instruction set), so its syntax is not exactly programmer-friendly. Yet Lisp's heavy use of lists (and thus its heavy use of parentheses) has its advantages, as we'll see toward the end of this chapter.

The main problem a programmer faces is how to keep all the parentheses balanced properly. Compounding this problem is the usual programming convention of putting multiple right parentheses at the end of a line, rather than the more readable technique of placing each right parenthesis directly below its matching left parenthesis. Your best defense against this is the support the Emacs Lisp modes give you, particularly the Tab key for proper indentation and the flash-matching-parenthesis feature.

Now we're ready for our example function. Suppose you are a student or journalist who needs to keep track of the number of words in a paper or story you are writing. Emacs has no built-in way of counting the number of words in a buffer, so we'll write a Lisp function that does the job:

1  (defun count-words-buffer ()
2    (let ((count 0))
3      (save-excursion
4        (goto-char (point-min))
5        (while (< (point) (point-max))
6          (forward-word 1)
7          (setq count (1+ count)))
8        (message "buffer contains %d words." count))))

Let's go through this function line by line and see what it does. (Of course, if you are trying this in Emacs, don't type the line numbers in.)

The defun on line 1 defines the function by its name and arguments. Notice that defun is itself a function: one that, when called, defines a new function. (defun returns the name of the function defined, as a symbol.) The function's arguments appear as a list of names inside parentheses; in this case, the function has no arguments. Arguments can be made optional by preceding them with the keyword &optional. If an argument is optional and not supplied when the function is called, its value is assumed to be nil.
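To see &optional at work, here is a minimal sketch. The function my-describe-count and its argument names are hypothetical, invented only for this illustration; they are not part of the chapter's example.

```elisp
;; Hypothetical helper, used only to illustrate &optional.
(defun my-describe-count (count &optional unit)
  "Return a string describing COUNT; UNIT defaults to \"words\"."
  ;; UNIT is nil when the caller leaves it out, so `or' supplies a default.
  (format "%d %s" count (or unit "words")))

(my-describe-count 12)           ; UNIT is nil, so the default is used
(my-describe-count 12 "lines")   ; UNIT is "lines"
```

Typing C-j after either call in the *scratch* buffer should show the returned string.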

Line 2 contains a let construct, whose general form is:

(let ((var1 value1) (var2 value2) ... )
  statement-block)

The first thing let does is define the variables var1, var2, etc., and set them to the initial values value1, value2, etc. Then let executes the statement block, which is a sequence of function calls or values, just like the body of a function.

It is useful to think of let as doing three things:

Defining (or declaring) a list of variables

Setting the variables to initial values, as if with setq

Creating a block in which the variables are known; the let block is known as the scope of the variables

If a let is used to define a variable, its value can be reset later within the let block with setq. Furthermore, a variable defined with let can have the same name as a global variable; all setqs on that variable within the let block act on the local variable, leaving the global variable undisturbed. However, a setq on a variable that is not defined with a let affects the global environment. It is advisable to avoid using global variables as much as possible because their names might conflict with those of existing global variables and therefore your changes might have unexpected and inexplicable side effects later on.
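The following sketch shows a let variable hiding a global variable of the same name. Both my-total and my-shadow-demo are hypothetical names chosen for this illustration.

```elisp
;; `my-total' is a hypothetical global variable.
(defvar my-total 10)

(defun my-shadow-demo ()
  "Return the value of `my-total' as seen inside the let block."
  (let ((my-total 0))                 ; local variable with the same name
    (setq my-total (+ my-total 5))    ; changes only the local variable
    my-total))

(my-shadow-demo)   ; returns 5, the local value
my-total           ; still 10; the global variable is undisturbed
```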

So, in our example function, we use let to define the local variable count and initialize it to 0. As we will see, this variable is used as a loop counter.

Lines 3 through 8 are the statements within the let block. The first of these calls the built-in Emacs function save-excursion, which is a way of being polite. The function is going to move the cursor around the buffer, so we don't want to disorient the user by jumping them to a strange place in their file just because they asked for a word count. Calling save-excursion tells Emacs to remember the location of the cursor at the beginning of the function, and go back there after executing any statements in its body. Notice how save-excursion is providing us with capability similar to let; you can think of it as a way of making the cursor location itself a local variable.
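You can watch save-excursion restore the cursor location with a small experiment. This is only a sketch: my-save-excursion-demo is a hypothetical name, and it works in a temporary buffer (created with with-temp-buffer) so that it doesn't touch any of your files.

```elisp
(defun my-save-excursion-demo ()
  "Return the cursor position after a save-excursion that moved it."
  (with-temp-buffer
    (insert "one two three")
    (goto-char 5)                  ; put the cursor in the middle of the text
    (save-excursion
      (goto-char (point-min)))     ; move it inside the body
    (point)))                      ; back where it started

(my-save-excursion-demo)   ; returns 5
```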

Line 4 calls goto-char. The argument to goto-char is a (nested) function call to the built-in function point-min. As we have mentioned before, point is Emacs's internal name for the position of the cursor, and we'll refer to the cursor as point throughout the remainder of this chapter. point-min returns the value of the first character position in the current buffer, which is almost always 1; then, goto-char is called with the value 1, which has the effect of moving point to the beginning of the buffer.

The next line sets up a while loop; Java and Perl have a similar construct. The while construct has the general form

(while condition
  statement-block)

Like let and save-excursion, while sets up another statement block. condition is a value (an atom, a variable, or a function returning a value). This value is tested; if it is nil, the condition is considered to be false, and the while loop terminates. If the value is other than nil, the condition is considered to be true, the statement block gets executed, the condition is tested again, and the process repeats.

Of course, it is possible to write an infinite loop. If you write a Lisp function with a while loop and try running it, and your Emacs session hangs, chances are that you have made this all-too-common mistake; just type C-g to abort it.
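As a small illustration of the while form, the sketch below adds up the integers from 1 to n. The function name my-sum-to is hypothetical. Note the statement that advances the counter; leave it out and the condition never becomes nil, which is exactly the infinite loop described above.

```elisp
(defun my-sum-to (n)
  "Return the sum of the integers from 1 to N, using a while loop."
  (let ((i 1)
        (sum 0))
    (while (<= i n)           ; condition: nil once i passes n
      (setq sum (+ sum i))
      (setq i (1+ i)))        ; advance the counter so the loop ends
    sum))

(my-sum-to 4)   ; returns 10
```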

In our sample function, the condition is a call to the function <, which is a less-than function with two arguments, analogous to the < operator in Java or Perl. The first argument is another function call that returns the current character position of point; the second returns the maximum character position in the buffer, which in an unnarrowed buffer is one more than the number of characters it contains. The function < (like the other relational functions) returns a Boolean value, t or nil.
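If you want to check what these functions return, try something like the following sketch in the *scratch* buffer. The helper my-buffer-positions is a hypothetical name; it reports the values for a temporary buffer rather than for the buffer you are editing.

```elisp
(< 1 2)   ; returns t
(< 2 1)   ; returns nil

(defun my-buffer-positions (text)
  "Return a list (point-min point-max buffer-size) for a buffer holding TEXT."
  (with-temp-buffer
    (insert text)
    (list (point-min) (point-max) (buffer-size))))

(my-buffer-positions "abc")   ; returns (1 4 3)
```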

The loop's statement block consists of two statements. Line 6 moves point forward one word (i.e., as if you had typed M-f). Line 7 increments the loop counter by 1; the function 1+ is shorthand for (+ 1 variable-name). Notice that the third right parenthesis on line 7 matches the left parenthesis preceding while. So, the while loop causes Emacs to go through the current buffer a word at a time while counting the words.
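To experiment with the loop without displaying a message, you might wrap it in a function that returns the count instead. The version below is only a sketch under that assumption: my-count-words-in-string is a hypothetical name, and it counts the words of a string by inserting it into a temporary buffer.

```elisp
(defun my-count-words-in-string (text)
  "Return the number of words in TEXT, using the same loop as above."
  (with-temp-buffer
    (insert text)
    (let ((count 0))
      (goto-char (point-min))
      (while (< (point) (point-max))
        (forward-word 1)
        ;; (1+ count) only computes a new value; setq is what stores it.
        (setq count (1+ count)))
      count)))

(my-count-words-in-string "one two three")   ; returns 3
```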

The final statement in the function uses the built-in function message to print a message in the minibuffer saying how many words the buffer contains. The form of the message function will be familiar to C programmers. The first argument to message is a format string, which contains text and special formatting instructions of the form %x, where x is one of a few possible letters. For each of these instructions, in the order in which they appear in the format string, message reads the next argument and tries to interpret it according to the letter after the percent sign. Table 11-1 lists meanings for the letters in the format string.

Table 11-1. Message format strings

Format string   Meaning
%s              String or symbol
%c              Character
%d              Integer
%e              Floating point in scientific notation
%f              Floating point in decimal-point notation
%g              Floating point in whichever format yields the shortest string

For example:

(message "\"%s\" is a string, %d is a number, and %c is a character"
         "hi there" 142 ?q)

causes the message:

"hi there" is a string, 142 is a number, and q is a character

to appear in the minibuffer. This is analogous to the C code:

printf ("\"%s\" is a string, %d is a number, and %c is a character\n",
        "hi there", 142, 'q');

The floating-point-format characters are a bit more complicated. They assume a certain number of significant digits unless you tell them otherwise. For example, the following:

(message "This book was printed in %f, also known as %e." 2004 2004)

yields this:

This book was printed in 2004.000000, also known as 2.004000e+03.

But you can control the number of digits after the decimal point by inserting a period and the number of digits desired between the % and the e, f, or g. For example, this:

(message "This book was printed in %.3e, also known as %.0f." 2004 2004)

prints in the minibuffer:

This book was printed in 2.004e+03, also known as 2004.
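If you would rather inspect the formatted text than watch it flash by in the minibuffer, the built-in function format accepts the same format string and returns the result as a string. A brief sketch (the value 3.14159 is just an arbitrary sample):

```elisp
;; `format' builds the string instead of displaying it.
(format "%.2f" 3.14159)   ; returns "3.14"
(format "%.3e" 2004)      ; returns "2.004e+03"
(format "%.0f" 2004)      ; returns "2004"
```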
