Книга: Practical Common Lisp
What's Next
What's Next
Obviously, you could do a lot more with this code. To turn it into a real spam-filtering application, you'd need to find a way to integrate it into your normal e-mail infrastructure. One approach that would make it easy to integrate with almost any e-mail client is to write a bit of code to act as a POP3 proxy—that's the protocol most e-mail clients use to fetch mail from mail servers. Such a proxy would fetch mail from your real POP3 server and serve it to your mail client after either tagging spam with a header that your e-mail client's filters can easily recognize or simply putting it aside. Of course, you'd also need a way to communicate with the filter about misclassifications—as long as you're setting it up as a server, you could also provide a Web interface. I'll talk about how to write Web interfaces in Chapter 26, and you'll build one, for a different application, in Chapter 29.
Or you might want to work on improving the basic classification—a likely place to start is to make extract-features
more sophisticated. In particular, you could make the tokenizer smarter about the internal structure of e-mail—you could extract different kinds of features for words appearing in the body versus the message headers. And you could decode various kinds of message encoding such as base 64 and quoted printable since spammers often try to obfuscate their message with those encodings.
But I'll leave those improvements to you. Now you're ready to head down the path of building a streaming MP3 server, starting by writing a general-purpose library for parsing binary files.
- Next attachment ID
- Chapter 5. Preparations
- Chapter 6. Traversing of tables and chains
- Chapter 7. The state machine
- Chapter 8. Saving and restoring large rule-sets
- Chapter 9. How a rule is built
- Chapter 10. Iptables matches
- Chapter 11. Iptables targets and jumps
- Chapter 12. Debugging your scripts
- Chapter 14. Example scripts
- Chapter 15. Graphical User Interfaces for Iptables
- Chapter 16. Commercial products based on Linux, iptables and netfilter