Problem:

In Sublime Text 3, my Python3 script failed at processing a list of names and affiliations. The failure was due to French characters like these:

Olivier,Bonaventure,[email protected],Universit̩ catholique de Louvain

The processing script is as follows:

#!/usr/bin/python3.5

with codecs.open(CSV_FILE, 'r', encoding='utf-8') as csvfile:
    csv_reader = csv.reader(csvfile, delimiter=',')
    for i, row in enumerate(csv_reader):
        try:
            pre_st = ""
            suf_st = ""
            print(pre_st+"    "+row[0]+" "+row[1]+" ("+row[3]+")"+suf_st)
        except UnicodeEncodeError:
            print(i)

The error is UnicodeEncodeError, which says “‘ascii’ codec can’t encode character ‘\u0329’ in position 34: ordinal not in range(128)

Solution:

I modified my script as follows:

print(row[3].encode('utf-8'))
print(bytes(row[3], encoding='utf-8'))
print(row[3].encode('utf-8').decode('utf-8'))

The error remained after I ran the script in Sublime. It might be resulted from my incorrect configuration of Sublime’s build system for Python 3. This guess was verified after I successfuly executed in iTerm.

With modification on Python 3 build system like below:

{
	"cmd": ["/usr/local/bin/python3", "-u", "$file"],
    "env": {"PYTHONIOENCODING": "utf-8"},
	"file_regex": "^[ ]*File \"(...*?)\", line ([0-9]*)",
	"selector": "source.python"
}

"env": {"PYTHONIOENCODING": "utf-8"}, is the key to save me from the failure. The correct execution result should like this:

b'Universit\xcc\xa9 catholique de Louvain'
b'Universit\xcc\xa9 catholique de Louvain'
Universit̩ catholique de Louvain

Reference: