== Introduction ==
Python 3 supports non-ASCII identifiers as specified in [[http://www.python.org/dev/peps/pep-3131/|PEP 3131]]. However, this support is incomplete for languages that make extensive use of special characters such as ZWJ (zero width joiner) and ZWNJ (zero width non-joiner). Examples of such languages are Malayalam, Kannada, Sinhala and Farsi.

== Unicode Standard on Using ZWJ/ZWNJ in Identifiers ==
ZWJ and ZWNJ are format control characters. Unicode defines how they may be used in identifiers in [[http://unicode.org/reports/tr31/#Layout_and_Format_Control_Characters|TR31, section 2.3, Layout and Format Control Characters]].
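
As a quick illustration, the two characters can be inspected with Python's standard `unicodedata` module. This is only a sketch; the helper name `describe` is made up for this page and is not part of any library.

```python
import unicodedata

ZWNJ = "\u200c"  # ZERO WIDTH NON-JOINER
ZWJ = "\u200d"   # ZERO WIDTH JOINER


def describe(ch):
    """Return (code point, Unicode name, general category) for one character."""
    return ("U+%04X" % ord(ch), unicodedata.name(ch), unicodedata.category(ch))


for ch in (ZWNJ, ZWJ):
    # Both characters report the general category "Cf" (Format).
    print(describe(ch))
```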

Unicode recommends allowing ZWJ and ZWNJ (the "Join_Control characters") in identifiers only in the following three contexts.

 * Allow ZWNJ where it breaks a cursive connection. That is, in the context based on the Joining_Type property, consisting of:

  * A Left-Joining or Dual-Joining character, followed by zero or more Transparent characters, followed by a ZWNJ, followed by zero or more Transparent characters, followed by a Right-Joining or Dual-Joining character
  * This corresponds to the following regular expression (in Perl-style syntax): /$LJ $T* ZWNJ $T* $RJ/
      where:

          $T = [:Joining_Type=Transparent:]

          $RJ = [ [:Joining_Type=Dual_Joining:][:Joining_Type=Right_Joining:] ]

          $LJ = [ [:Joining_Type=Dual_Joining:][:Joining_Type=Left_Joining:] ] 

 * Allow ZWNJ in a conjunct context. That is, a sequence of the form:

   * A Letter, followed by a Virama, followed by a ZWNJ
   * This corresponds to the following regular expression (in Perl-style syntax): /$L $V ZWNJ/
      where:

          $L = [:General_Category=Letter:]

          $V = [:Canonical_Combining_Class=Virama:] 

 * Allow ZWJ in a conjunct context. That is, a sequence of the form:

  * A Letter, followed by a Virama, followed by a ZWJ
  * This corresponds to the following regular expression (in Perl-style syntax): /$L $V ZWJ/
      where:

          $L = [:General_Category=Letter:]

          $V = [:Canonical_Combining_Class=Virama:]
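
The three contexts above can be sketched in Python roughly as follows. This is an illustration under stated assumptions, not how CPython checks identifiers. The standard `unicodedata` module covers General_Category and Canonical_Combining_Class (Virama has the value 9), but it does not expose Joining_Type, so the `JOINING_TYPE` table below is a tiny hand-written excerpt for a few Arabic-script characters. The names `is_letter`, `is_virama`, `joiner_allowed` and `JOINING_TYPE` are hypothetical.

```python
import unicodedata

ZWNJ = "\u200c"
ZWJ = "\u200d"

# Assumption: a hand-picked excerpt of the Joining_Type property, only
# large enough for the examples below. A real implementation would load
# the full property from the Unicode Character Database.
# "D" = Dual_Joining, "R" = Right_Joining, "T" = Transparent.
JOINING_TYPE = {
    "\u0645": "D",  # ARABIC LETTER MEEM
    "\u06cc": "D",  # ARABIC LETTER FARSI YEH
    "\u062e": "D",  # ARABIC LETTER KHAH
    "\u0627": "R",  # ARABIC LETTER ALEF
    "\u064e": "T",  # ARABIC FATHA
}


def is_letter(ch):
    """$L: General_Category=Letter (Lu, Ll, Lt, Lm, Lo)."""
    return unicodedata.category(ch).startswith("L")


def is_virama(ch):
    """$V: Canonical_Combining_Class=Virama, numeric value 9."""
    return unicodedata.combining(ch) == 9


def joiner_allowed(text, i):
    """Return True if the ZWJ/ZWNJ at text[i] is in one of the three contexts."""
    ch = text[i]
    if ch not in (ZWNJ, ZWJ):
        raise ValueError("text[i] is not ZWJ or ZWNJ")
    # Conjunct contexts: /$L $V ZWNJ/ and /$L $V ZWJ/
    if i >= 2 and is_virama(text[i - 1]) and is_letter(text[i - 2]):
        return True
    # Cursive context (ZWNJ only): /$LJ $T* ZWNJ $T* $RJ/
    if ch == ZWNJ:
        left = i - 1
        while left >= 0 and JOINING_TYPE.get(text[left]) == "T":
            left -= 1  # skip Transparent characters before the ZWNJ
        right = i + 1
        while right < len(text) and JOINING_TYPE.get(text[right]) == "T":
            right += 1  # skip Transparent characters after the ZWNJ
        return (left >= 0
                and JOINING_TYPE.get(text[left]) in ("D", "L")
                and right < len(text)
                and JOINING_TYPE.get(text[right]) in ("D", "R"))
    return False


# Malayalam NA + virama + ZWJ (conjunct context with ZWJ)
print(joiner_allowed("\u0d28\u0d4d\u200d", 2))
# Devanagari KA + virama + ZWNJ (conjunct context with ZWNJ)
print(joiner_allowed("\u0915\u094d\u200c", 2))
# Arabic-script MEEM, FARSI YEH, ZWNJ, KHAH (cursive context)
print(joiner_allowed("\u0645\u06cc\u200c\u062e", 2))
# ZWJ between two Latin letters matches none of the contexts
print(joiner_allowed("a\u200db", 1))
```

The conjunct check is tried first because it needs only data available in the standard library; the cursive check is only as complete as the joining-type table it is given.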



== Affected Languages ==
 * Malayalam
 * Kannada
 * Bengali
 * Languages that use the Devanagari script (Hindi, Marathi, etc.)
 * Telugu
 * Farsi
 * Sinhala
 * Arabic
 * Khmer

== References ==

 * http://bugs.python.org/issue5358 -- Rejected issue about control characters
 * http://www.python.org/dev/peps/pep-3131/
 * http://unicode.org/review/pr-96.html
 * http://unicode.org/reports/tr31/#Layout_and_Format_Control_Characters
 * http://www.unicode.org/reports/tr36/
 * [[http://www.reddit.com/r/Python/comments/dgf1q/how_to_approach_a_complex_issue_where_python_core/|Suggestions from /r/Python community]]