Book: Practical Common Lisp

Character Escaping

The first bit of the foundation you'll need to lay is the code that knows how to escape characters with a special meaning in HTML. There are three such characters, and they must not appear in the text of an element or in an attribute value; they are <, >, and &. In element text or attribute values, these characters must be replaced with the character reference entities &lt;, &gt;, and &amp;. Similarly, in attribute values, the quotation marks used to delimit the value must be escaped, ' with &apos; and " with &quot;. Additionally, any character can be represented by a numeric character reference entity consisting of an ampersand, followed by a sharp sign, followed by the numeric code as a base 10 integer, and followed by a semicolon. These numeric escapes are sometimes used to embed non-ASCII characters in HTML.
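To make the numeric form concrete, here is a minimal sketch of building a decimal character reference. The name numeric-entity is hypothetical and not part of FOO, and the results shown assume an implementation whose character codes follow ASCII and Unicode, as most current ones do.

```lisp
;; Hypothetical helper, not part of FOO: build a decimal numeric
;; character reference (ampersand, sharp sign, code, semicolon) for CHAR.
(defun numeric-entity (char)
  (format nil "&#~d;" (char-code char)))

;; Assuming CHAR-CODE follows ASCII/Unicode in your implementation:
(numeric-entity #\A)              ; => "&#65;"
(numeric-entity (code-char 233))  ; => "&#233;" (e with an acute accent)
```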

The Package

Since FOO is a low-level library, the package you develop it in doesn't rely on much external code—just the usual dependency on names from the COMMON-LISP package and, almost as usual, on the names of the macro-writing macros from COM.GIGAMONKEYS.MACRO-UTILITIES. On the other hand, the package needs to export all the names needed by code that uses FOO. Here's the DEFPACKAGE from the source that you can download from the book's Web site:

(defpackage :com.gigamonkeys.html
  (:use :common-lisp :com.gigamonkeys.macro-utilities)
  (:export :with-html-output

The following function accepts a single character and returns a string containing a character reference entity for that character:

(defun escape-char (char)
  (case char
    (#\& "&amp;")
    (#\< "&lt;")
    (#\> "&gt;")
    (#\' "&apos;")
    (#\" "&quot;")
    (t (format nil "&#~d;" (char-code char)))))

You can use this function as the basis for a function, escape, that takes a string and a sequence of characters and returns a copy of the first argument with all occurrences of the characters in the second argument replaced with the corresponding character entity returned by escape-char.

(defun escape (in to-escape)
  (flet ((needs-escape-p (char) (find char to-escape)))
    (with-output-to-string (out)
      (loop for start = 0 then (1+ pos)
            for pos = (position-if #'needs-escape-p in :start start)
            do (write-sequence in out :start start :end pos)
            when pos do (write-sequence (escape-char (char in pos)) out)
            while pos))))
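The loop relies on two standard behaviors: POSITION-IF returns NIL when no matching character remains, and WRITE-SEQUENCE treats an :end of NIL as the end of the sequence. The sketch below shows both in isolation on the string "a<b"; the function special-p is a hypothetical stand-in for the local needs-escape-p.

```lisp
;; Illustration only: a fixed escape set instead of the TO-ESCAPE argument.
(defun special-p (char)   ; hypothetical stand-in for NEEDS-ESCAPE-P
  (find char "<>&"))

(position-if #'special-p "a<b" :start 0)  ; => 1, the position of #\<
(position-if #'special-p "a<b" :start 2)  ; => NIL, nothing left to escape

;; With :END NIL, WRITE-SEQUENCE writes through the end of the string,
;; which is how the final pass of the loop emits the unescaped tail.
(with-output-to-string (out)
  (write-sequence "a<b" out :start 2 :end nil))  ; => "b"
```

So on the last iteration pos is NIL, the rest of the input is copied, no entity is written, and the while clause ends the loop.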

You can also define two parameters: *element-escapes*, which contains the characters you need to escape in normal element data, and *attribute-escapes*, which contains the set of characters to be escaped in attribute values.

(defparameter *element-escapes* "<>&")
(defparameter *attribute-escapes* "<>&\"'")

Here are some examples:

HTML> (escape "foo & bar" *element-escapes*)
"foo &amp; bar"
HTML> (escape "foo & 'bar'" *element-escapes*)
"foo &amp; 'bar'"
HTML> (escape "foo & 'bar'" *attribute-escapes*)
"foo &amp; &apos;bar&apos;"

Finally, you'll need a variable, *escapes*, that will be bound to the set of characters that need to be escaped. It's initially set to the value of *element-escapes*, but when generating attributes, it will, as you'll see, be rebound to the value of *attribute-escapes*.

(defvar *escapes* *element-escapes*)
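Because *escapes* is a special variable, rebinding it with LET changes what any code called inside the LET sees, and the old value returns when the LET exits. The following is only a sketch of that mechanism, not the code FOO uses for attributes; current-escapes and demo-rebinding are hypothetical names, and the three definitions are repeated so the snippet loads on its own.

```lisp
;; Repeated from the text so this snippet can be loaded by itself.
(defparameter *element-escapes* "<>&")
(defparameter *attribute-escapes* "<>&\"'")
(defvar *escapes* *element-escapes*)

;; Hypothetical helper: reports which characters would be escaped right now.
(defun current-escapes ()
  *escapes*)

;; Hypothetical demo: read the value before, during, and after a rebinding.
(defun demo-rebinding ()
  (list (current-escapes)
        (let ((*escapes* *attribute-escapes*))
          (current-escapes))
        (current-escapes)))

(demo-rebinding)  ; => ("<>&" "<>&\"'" "<>&")
```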
