Python Variables – Using Prime Symbol in Variable Names

python unicode variables

So I'm a terrible person and I want to name a variable in my mathy-python3 code s′ (that's a prime symbol).

I was under the impression that Unicode characters work in identifiers in Python 3, which is why my ɣ, α, and ε are working fine.

Is the prime symbol specifically forbidden because of how terrible it would be to use?

Best Answer

You cannot just use any Unicode character in Python identifiers. There are rules governing what can be used:

identifier   ::=  xid_start xid_continue*
id_start     ::=  <all characters in general categories Lu, Ll, Lt, Lm, Lo, Nl, the underscore, and characters with the Other_ID_Start property>
id_continue  ::=  <all characters in id_start, plus characters in the categories Mn, Mc, Nd, Pc and others with the Other_ID_Continue property>
xid_start    ::=  <all characters in id_start whose NFKC normalization is in "id_start xid_continue*">
xid_continue ::=  <all characters in id_continue whose NFKC normalization is in "id_continue*">

The Unicode category codes mentioned above stand for:

Lu - uppercase letters
Ll - lowercase letters
Lt - titlecase letters
Lm - modifier letters
Lo - other letters
Nl - letter numbers
Mn - nonspacing marks
Mc - spacing combining marks
Nd - decimal numbers
Pc - connector punctuations
Other_ID_Start - explicit list of characters in PropList.txt to support backwards compatibility
Other_ID_Continue - likewise

The above is the Unicode equivalent of the familiar ASCII rule: any number of letters, digits and underscores, as long as the name doesn't start with a digit.
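A quick way to test a candidate name against these rules is `str.isidentifier()`, which returns True if the string is a valid identifier according to the language definition. The snippet below is a minimal sketch; the list of candidate names is made up for illustration.

```python
# Candidate names; "s\u2032" is the letter s followed by U+2032 PRIME.
candidates = ["s_prime", "α", "ε", "ɣ", "s\u2032", "2x", "_tmp"]

# str.isidentifier() checks each string against the identifier grammar.
results = {name: name.isidentifier() for name in candidates}

for name, ok in results.items():
    print(f"{name!r}: {'valid' if ok else 'not valid'}")
```

Note that `isidentifier()` also returns True for reserved words such as `class`, so it only answers the "which characters are allowed" part of the question.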

The U+2032 PRIME codepoint doesn't fall in any of those classes; it is considered Po - other punctuation instead.
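You can confirm the category yourself with the standard library's `unicodedata` module. The sketch below uses a hypothetical helper, `describe`, to report the code point, name and general category of a character, then shows that compiling an assignment to s′ fails.

```python
import unicodedata


def describe(ch):
    """Return (code point, Unicode name, general category) for one character."""
    return f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.category(ch)


prime_info = describe("\u2032")
alpha_info = describe("α")

print(prime_info)   # ('U+2032', 'PRIME', 'Po')
print(alpha_info)   # ('U+03B1', 'GREEK SMALL LETTER ALPHA', 'Ll')

# Compiling "s′ = 1" raises SyntaxError, because PRIME is not an identifier character.
try:
    compile("s\u2032 = 1", "<example>", "exec")
    compiled = True
except SyntaxError:
    compiled = False

print(compiled)     # False
```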

You can look up each of these Unicode categories on codepoint.net to see what kind of characters are permitted. Your other identifier choices are all Ll lowercase letter characters, by the way.

You do want to be careful and use some common sense in what you use as identifiers. Keep them readable and recognisable. Preferably they should be easy to type too, so personally I'd steer away from Greek lowercase letters. I'd rather use alpha, gamma, epsilon and s_prime here.