Learning GNU Emacs, 3rd Edition

11.3.2.2 Grouping and alternation

If you want to get *, +, or ? to operate on more than one character, you can use the \( and \) operators for grouping. Notice that, in this case (and others to follow), the backslashes are part of the operator. (All of the nonbasic regular expression operators include backslashes so as to avoid making too many characters "special." This is the most profound way in which Emacs regular expressions differ from those used in other environments, like Perl, so it's something to which you'll need to pay careful attention.) As we saw before, when a regexp appears in Lisp code these backslashes must themselves be doubled ("double-backslash-escaped") so that Emacs decodes them properly. If one of the basic operators immediately follows \), it works on the entire group inside the \( and \). For example, \(read\)* matches the empty string, "read," "readread," and so on, and read\(file\)? matches "read" or "readfile." Now we can handle Example 1, the first of the examples given at the beginning of this section, with the following Lisp code:

(replace-regexp "read\\(file\\)?" "get")
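The grouping examples above can be tried without touching a buffer by evaluating a small sketch like the following, for instance in the *scratch* buffer. The helper name my-whole-match-p is hypothetical (it is not part of Emacs); it wraps the regexp in a non-capturing group and anchors it with \` and \' so that the entire string has to match. The last form applies the Example 1 regexp to a made-up sample string using the standard replace-regexp-in-string function.

```elisp
;; Hypothetical helper (not part of Emacs): non-nil if REGEXP matches
;; STRING from its first character to its last.
(defun my-whole-match-p (regexp string)
  "Return non-nil if REGEXP matches all of STRING."
  (let ((case-fold-search nil))
    ;; \` and \' anchor to the ends of the string; \(?: ... \) groups
    ;; REGEXP without capturing, so the anchors apply to all of it.
    (string-match (concat "\\`\\(?:" regexp "\\)\\'") string)))

;; \(read\)* : zero or more repetitions of the whole group "read".
(my-whole-match-p "\\(read\\)*" "")          ; non-nil
(my-whole-match-p "\\(read\\)*" "readread")  ; non-nil
(my-whole-match-p "\\(read\\)*" "rea")       ; nil

;; read\(file\)? : "read" followed by an optional "file".
(my-whole-match-p "read\\(file\\)?" "read")      ; non-nil
(my-whole-match-p "read\\(file\\)?" "readfile")  ; non-nil
(my-whole-match-p "read\\(file\\)?" "readfi")    ; nil

;; The Example 1 regexp applied to a sample string (the sample is made up).
;; The fourth argument t keeps the replacement's case exactly as written.
(replace-regexp-in-string "read\\(file\\)?" "get" "readfile(); read();" t)
;; => "get(); get();"
```

Note that every backslash in the regexp is written twice here, because the regexp is inside a Lisp string.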

The alternation operator \| is a "one or the other" operator; it matches either whatever precedes it or whatever comes after it. \| treats parenthesized groups differently from the basic operators. Instead of requiring parenthesized groups to work with subexpressions of more than one character, its "power" goes out to the left and right as far as possible, until it reaches the beginning or end of the regexp, a \(, a \), or another \|. Some examples should make this clearer:

read\|get matches "read" or "get"

readfile\|read\|get matches "readfile," "read," or "get"

\(read\|get\)file matches "readfile" or "getfile"

In the first example, the effect of the \| extends to both ends of the regular expression. In the second, the effect of the first \| extends to the beginning of the regexp on the left and to the second \| on the right. In the third, it extends to the backslash-parentheses.
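To see how far the alternation operator reaches, the following sketch tests each of the three regexps against a short list of candidate strings. The names my-full-matches and my-candidates are hypothetical, chosen only for this illustration; the helper keeps the strings that a regexp matches from beginning to end. A fourth regexp, not among the examples above, is added to show what happens when the grouping is left out.

```elisp
;; Hypothetical helper (not part of Emacs): return the members of
;; STRINGS that REGEXP matches from beginning to end.
(defun my-full-matches (regexp strings)
  "Return the elements of STRINGS that REGEXP matches in full."
  (let ((case-fold-search nil)
        ;; The non-capturing group \(?: ... \) keeps any \| inside
        ;; REGEXP from swallowing the \` and \' anchors.
        (anchored (concat "\\`\\(?:" regexp "\\)\\'"))
        (result '()))
    (dolist (s strings (nreverse result))
      (when (string-match anchored s)
        (push s result)))))

;; Made-up candidate strings for the illustration.
(defvar my-candidates '("read" "get" "readfile" "getfile" "file"))

(my-full-matches "read\\|get" my-candidates)
;; => ("read" "get")
(my-full-matches "readfile\\|read\\|get" my-candidates)
;; => ("read" "get" "readfile")
(my-full-matches "\\(read\\|get\\)file" my-candidates)
;; => ("readfile" "getfile")

;; Without the group, \| splits the whole regexp into "read" and "getfile".
(my-full-matches "read\\|getfile" my-candidates)
;; => ("read" "getfile")
```

The helper itself relies on the same rule: without the enclosing group, a \| inside the regexp would extend all the way out to the anchors added around it.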

