Python – UnicodeEncodeError: ‘ascii’ codec can’t encode character u’\xa3′

character encodingpython

I have an Excel spreadsheet that I'm reading in that contains some £ signs.

When I try to read it in using the xlrd module, I get the following error:

x = table.cell_value(row, col)
x = x.decode("ISO-8859-1")
UnicodeEncodeError: 'ascii' codec can't encode character u'\xa3' in position 0: ordinal not in range(128)

If I rewrite this to x.encode('utf-8') it stops throwing an error, but unfortunately when I then write the data out somewhere else (as latin-1), the £ signs have all become garbled.

How can I fix this, and read the £ signs in correctly?

— UPDATE —

Some kind readers have suggested that I don't need to decode it at all, or that I can just encode it to Latin-1 when I need to. The problem with this is that I need to write the data to a CSV file eventually, and it seems to object to the raw strings.

If I don't encode or decode the data at all, then this happens (after I've added the string to an array called items):

for item in items:
    #item = [x.encode('latin-1') for x in item]
    cleancsv.writerow(item)
File "clean_up_barnet.py", line 104, in <module>
 cleancsv.writerow(item)
UnicodeEncodeError: 'ascii' codec can't encode character u'\u2022' in position 43: ordinal not in range(128)

I get the same error even if I uncomment the Latin-1 line.

Best Answer

A very easy way around all the "'ascii' codec can't encode character…" issues with csvwriter is to instead use unicodecsv, a drop-in replacement for csvwriter.

Install unicodecsv with pip and then you can use it in the exact same way, eg:

import unicodecsv
file = open('users.csv', 'w')
w = unicodecsv.writer(file)
for user in User.objects.all().values_list('first_name', 'last_name', 'email', 'last_login'):
    w.writerow(user)