Differences between revisions 1 and 2
Revision 1 as of 2005-01-13 21:56:42
Size: 800
Editor: av9361
Comment:
Revision 2 as of 2005-10-31 00:52:26
Size: 806
Editor: 68-189-248-26
Comment: Fixed non-WikiWords.

Python users who are new to Unicode are sometimes drawn to the default encoding returned by sys.getdefaultencoding(). The first thing you should know about the default encoding is that you don't need to care about it. Its value should be 'ascii', and it is used when byte strings (see ["StrIsNotAString"]) are implicitly converted to unicode strings, as in this example:

   a = "abc" + u"bcd"

When you concatenate the byte string "abc" with the unicode string u"bcd", Python first converts "abc" into u"abc" by calling "abc".decode(sys.getdefaultencoding()). If the byte string contains non-ASCII characters, the .decode(sys.getdefaultencoding()) call fails with UnicodeDecodeError, so byte strings that get mixed with unicode strings should not contain non-ASCII characters. In ["Python3.0"] this implicit conversion is removed: byte strings and text strings can no longer be concatenated without an explicit decode, and sys.getdefaultencoding() returns 'utf-8' there.
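
The conversion described above can also be written out explicitly, which avoids relying on the default encoding at all. The following is only a minimal sketch, assuming a Python version that accepts both b"..." and u"..." literals; the helper name concat and its encoding parameter are hypothetical and not part of Python.

```python
# Explicit form of the conversion that is otherwise done implicitly
# when a byte string is concatenated with a unicode string.

def concat(byte_string, unicode_string, encoding="ascii"):
    """Decode byte_string with a named encoding, then concatenate.

    Hypothetical helper for illustration only.
    """
    return byte_string.decode(encoding) + unicode_string


# Pure ASCII bytes decode cleanly with the 'ascii' codec.
ok = concat(b"abc", u"bcd")

# Non-ASCII bytes (here: UTF-8 for an accented letter) cannot be
# decoded as ASCII, so the decode step raises UnicodeDecodeError.
try:
    concat(b"caf\xc3\xa9", u"bcd")
    failed = False
except UnicodeDecodeError:
    failed = True

# Naming the encoding the bytes were actually written in fixes it.
fixed = concat(b"caf\xc3\xa9", u"bcd", encoding="utf-8")
```

Decoding at the point where bytes enter the program, with the encoding named explicitly, keeps the result independent of whatever sys.getdefaultencoding() happens to return.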

DefaultEncoding (last edited 2008-11-15 14:00:50 by localhost)
