Python – ‘ascii’ codec can’t decode byte 0xef in position

python python-2.7

I am getting this pesky error on this line:

    r += '\n<Placemark><name>'+row[3].encode('utf-8','xmlcharrefreplace')+'</name>' \
         '\n<description>'+desc.encode('utf-8','xmlcharrefreplace')+'</description>\n' \
         '<Point><coordinates>'+row[clat].encode('utf-8','xmlcharrefreplace')+ \
         ','+row[clongitude].encode('utf-8','xmlcharrefreplace')+'</coordinates></Point>\n' \
         '<address>'+row[4].encode('utf-8','xmlcharrefreplace')+'</address>\n' \
         '<styleUrl>'+row[cstyleID].encode('utf-8','xmlcharrefreplace')+'</styleUrl>\n' \
         '</Placemark>'    

Here is the error:

Traceback (most recent call last):
  File "<pyshell#38>", line 1, in <module>
    doStuff()
  File "C:\Python27\work\GenerateKML.py", line 5, in doStuff
    createFiles('together.csv')
  File "C:\Python27\work\GenerateKML.py", line 55, in createFiles
    '<styleUrl>'+row[cstyleID].encode('utf-8','xmlcharrefreplace')+'</styleUrl>\n' \
UnicodeDecodeError: 'ascii' codec can't decode byte 0xef in position 60: ordinal not in range(128)

What am I doing wrong?

Thank you for your help.

Here's the full source:

import hashlib
import csv

def doStuff():
  createFiles('together.csv')

def readFile(fileName):
  a=open(fileName)
  fileContents=a.read()
  a.close()
  return fileContents

def readCSVFile(fileName):
  return list(csv.reader(open(fileName, 'rb'), delimiter=',', quotechar='"'))

def GetDistinctValues(theFile, theColumn):
    with open(theFile, "rb") as fp:
        reader = csv.reader(fp)
        return list(set(line[theColumn] for line in reader))

def createFiles(inputFile):
  cNAME=3
  clat=0
  clongitude=1
  caddress1=4
  caddress2=5
  cplace=6
  ccity=7
  cstate=8
  czip=9
  cphone=10
  cwebsite=11
  cstyleID=18
  inputFileText=readCSVFile(inputFile)
  headerFile = readFile('header.txt')
  footerFile = readFile('footer.txt')
  r=headerFile
  DISTINCTCOLUMN=12
  dValues = GetDistinctValues(inputFile,DISTINCTCOLUMN)
  counter=0

  for uniqueValue in dValues:
    counter+=1
    print uniqueValue
    theHash=hashlib.sha224(uniqueValue).hexdigest()
    for row in inputFileText:
      if uniqueValue==row[DISTINCTCOLUMN]:
        for eachElement in row:
          eachElement=eachElement.replace('&','&amp;')            
        desc = ' '.join(row[3:])
        r += '\n<Placemark><name>'+row[3].encode('utf-8','xmlcharrefreplace')+'</name>' \
             '\n<description>'+desc.encode('utf-8','xmlcharrefreplace')+'</description>\n' \
             '<Point><coordinates>'+row[clat].encode('utf-8','xmlcharrefreplace')+ \
             ','+row[clongitude].encode('utf-8','xmlcharrefreplace')+'</coordinates></Point>\n' \
             '<address>'+row[4].encode('utf-8','xmlcharrefreplace')+'</address>\n' \
             '<styleUrl>'+row[cstyleID].encode('utf-8','xmlcharrefreplace')+'</styleUrl>\n' \
             '</Placemark>'      
    r += footerFile

    f = open(theHash+'.kml','w')
    f.write(r)
    f.close()
    r=headerFile

Best Answer

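You are probably calling encode() on byte strings: in Python 2, csv.reader returns each cell as a byte string (str), not as unicode. When you call encode() on a byte string, Python 2 first decodes the bytes using the default encoding (ASCII) and only then encodes the resulting Unicode string with the encoding you supplied. That implicit ASCII decode is the step that fails as soon as it meets a non-ASCII byte such as 0xef, which is why an encode() call raises a UnicodeDecodeError.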
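A minimal sketch of that hidden two-step, with the implicit decode written out explicitly so that it behaves the same on Python 2 and Python 3. The sample value `raw` is made up for illustration; it is not taken from the asker's CSV.

```python
# Hypothetical sample cell: UTF-8 bytes, as Python 2's csv.reader would return them.
raw = b'caf\xc3\xa9'

# In Python 2, raw.encode('utf-8', 'xmlcharrefreplace') behaves like the
# expression below: an implicit decode with the default codec (ASCII) first,
# and only then the encode that was actually requested.
try:
    raw.decode('ascii').encode('utf-8', 'xmlcharrefreplace')
    error = None
except UnicodeDecodeError as exc:
    # The decode step fails before the encode step is ever reached.
    error = exc

print(error)
```

The exception comes from the decode step, so the error handler passed to encode() ('xmlcharrefreplace') never gets a chance to run.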

The solution: don't encode bytes. Call encode() only on Unicode strings. In your case the values are already byte strings, so don't call it at all.
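As a rough sketch, here are two ways to apply this, assuming the CSV file is already UTF-8 encoded (an assumption; the question does not say). The sample `row` and the helper name `to_text` are hypothetical.

```python
from xml.sax.saxutils import escape

# Hypothetical sample row: byte strings, as csv.reader yields them in Python 2.
row = [b'40.0', b'-75.0', b'Caf\xc3\xa9 & Bar']

# Option A: keep everything as bytes and never call encode().
# Only the XML escaping of '&' is done by hand here.
fragment_a = b'<name>' + row[2].replace(b'&', b'&amp;') + b'</name>'

# Option B: decode once on input, work with text, encode once on output.
def to_text(cell, encoding='utf-8'):
    """Return a Unicode string for a cell that may still be raw bytes."""
    if isinstance(cell, bytes):
        return cell.decode(encoding)
    return cell

cells = [to_text(cell) for cell in row]
fragment_b = (u'<name>' + escape(cells[2]) + u'</name>').encode('utf-8')

print(fragment_a == fragment_b)
```

Both options avoid mixing byte strings and Unicode strings in one expression, which is what triggers the implicit ASCII decode.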

To create a valid XML document, you could use the xml.etree.ElementTree module, which escapes special characters such as & for you.
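A minimal sketch of building one Placemark with ElementTree. The function name `make_placemark`, its parameters, and the sample values are made up for illustration; the element names follow the question's code, and the input is assumed to be UTF-8.

```python
import xml.etree.ElementTree as ET

def make_placemark(name, description, coordinates):
    """Build a <Placemark> element from Unicode strings (hypothetical helper)."""
    placemark = ET.Element('Placemark')
    ET.SubElement(placemark, 'name').text = name
    ET.SubElement(placemark, 'description').text = description
    point = ET.SubElement(placemark, 'Point')
    ET.SubElement(point, 'coordinates').text = coordinates
    return placemark

# CSV cells read in binary mode are bytes, so decode them before handing
# them to ElementTree (UTF-8 is an assumption about the input file).
raw_name = b'Caf\xc3\xa9 & Bar'
placemark = make_placemark(raw_name.decode('utf-8'), u'A sample place', u'40.0,-75.0')

# tostring() escapes '&' in the text and returns UTF-8 encoded bytes.
kml_bytes = ET.tostring(placemark, encoding='utf-8')
print(kml_bytes)
```

With this approach the manual replace('&','&amp;') loop and the per-field encode() calls are no longer needed; the encoding happens once, when the tree is serialized.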