Revision 7 as of 2005-06-10 18:30:33


Escaping HTML

The cgi module that comes with Python has an escape function:

    import cgi

    s = cgi.escape("""& < >""")   # s == "&amp; &lt; &gt;"

However, by default it escapes only &, <, and >. Passing quote=True also escapes the double quote ("), but the single quote (') is never escaped.
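On newer Pythons the same job is done by html.escape (added in Python 3.2); cgi.escape itself was removed in Python 3.8. A minimal sketch of the equivalent call, with a made-up sample string:

```python
import html

# quote defaults to True, so both kinds of quote are escaped as well.
s = html.escape("""& < > " '""")

# With quote=False only &, < and > are replaced.
t = html.escape("""& < > " '""", quote=False)

print(s)
print(t)
```

Note that html.escape writes the single quote as the numeric reference &#x27; rather than a named entity.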

Here's a small snippet that escapes double and single quotes as well:

    html_escape_table = {
        "&": "&amp;",
        '"': "&quot;",
        "'": "&apos;",  # defined in XML and HTML5, but not in HTML 4
        ">": "&gt;",
        "<": "&lt;",
    }

    def html_escape(text):
        """Replace each special character in text with its entity."""
        # Characters without an entry in the table pass through unchanged.
        return "".join(html_escape_table.get(c, c) for c in text)
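The same table-driven idea can also be expressed with str.translate, which avoids the explicit per-character loop. This is only a sketch: the names _ESCAPES and html_escape_translate are hypothetical, and the sample string is made up.

```python
# Hypothetical alternative to the loop-based snippet: build a translation
# table once and let str.translate do the per-character lookup.
_ESCAPES = str.maketrans({
    "&": "&amp;",
    '"': "&quot;",
    "'": "&apos;",
    ">": "&gt;",
    "<": "&lt;",
})

def html_escape_translate(text):
    """Replace each special character in text with its entity."""
    return text.translate(_ESCAPES)

escaped = html_escape_translate("""<a href="x">Tom & 'Jerry'</a>""")
print(escaped)
```

Because each character is looked up exactly once, the "&" in an inserted entity is never escaped a second time.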

Discussion

LionKimbro: Is there anything in the standard library for going the other way? Is there something where you can give it "&amp;" and get back "&"? Perhaps in the XML libraries? I looked, but did not see anything. I assumed it wouldn't be in DOM or SAX, and not exactly in XML-RPC either. Anyone know? (Answer needed for XML; HTML would be nice as well.) (2005-06-10T16:35:16Z)
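One possible answer, as a sketch: xml.sax.saxutils has an unescape function that reverses &amp;, &lt;, and &gt; and accepts a dictionary of extra entities, and on Python 3.4 and later html.unescape decodes HTML named and numeric character references. The sample strings below are made up for illustration.

```python
import html
from xml.sax.saxutils import unescape

# XML: &amp;, &lt; and &gt; are handled by default.
xml_text = unescape("&amp; &lt; &gt;")

# Other entities must be passed explicitly as a mapping.
quoted = unescape("&quot;hi&apos;", {"&quot;": '"', "&apos;": "'"})

# HTML (Python 3.4+): named and numeric references are both decoded.
html_text = html.unescape("&amp; &eacute; &#39;")

print(xml_text, quoted, html_text)
```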
