If the reason you're checking is so you can do something like if file_exists: open_it(), it's safer to use a try around the attempt to open it. Checking and then opening risks the file being deleted or moved between the check and the attempt to open it.
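A minimal sketch of the try approach, assuming Python 3. The helper name read_text_or_none, the UTF-8 encoding, and the temporary file used for the demonstration are illustrative assumptions, not part of the original answer:

```python
import os
import tempfile


def read_text_or_none(path):
    """Return the file's text, or None if the file does not exist.

    read_text_or_none is a hypothetical helper used only for illustration.
    """
    try:
        # Attempt the open directly instead of checking first, so there is
        # no window between a check and the open in which the file can vanish.
        with open(path, encoding="utf-8") as f:
            return f.read()
    except FileNotFoundError:
        return None


# Demonstration on a throwaway directory (hypothetical file names).
demo_dir = tempfile.mkdtemp()
present_path = os.path.join(demo_dir, "present.txt")
with open(present_path, "w", encoding="utf-8") as f:
    f.write("hello")

present_text = read_text_or_none(present_path)
absent_text = read_text_or_none(os.path.join(demo_dir, "absent.txt"))
```

Other failures, such as a permission error, still propagate here. Whether to catch the broader OSError instead depends on how the caller wants to treat them.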
If you're not planning to open the file immediately, you can use os.path.isfile if you need to be sure it's a file. The documentation describes it as:
Return True if path is an existing regular file. This follows symbolic links, so both islink() and isfile() can be true for the same path.
import os.path
os.path.isfile(fname)
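As a rough illustration of what "sure it's a file" means in practice, the sketch below compares os.path.isfile on a regular file, a directory, and a missing path. The temporary directory and file names are assumptions made for the demonstration:

```python
import os.path
import tempfile

# Create a throwaway directory containing one regular file.
tmp_dir = tempfile.mkdtemp()
file_path = os.path.join(tmp_dir, "example.txt")  # hypothetical file name
with open(file_path, "w") as f:
    f.write("data")
missing_path = os.path.join(tmp_dir, "missing.txt")  # never created

# isfile() is true only for an existing regular file.
isfile_on_file = os.path.isfile(file_path)
isfile_on_dir = os.path.isfile(tmp_dir)
isfile_on_missing = os.path.isfile(missing_path)

# exists() does not distinguish files from directories.
exists_on_dir = os.path.exists(tmp_dir)
```

Here only isfile_on_file is True among the isfile results, while exists_on_dir is also True because os.path.exists accepts directories.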
Starting with Python 3.4, the pathlib module offers an object-oriented approach (backported to pathlib2 in Python 2.7):
from pathlib import Path
my_file = Path("/path/to/file")
if my_file.is_file():
    # file exists
    pass
To check a directory, do:
if my_file.is_dir():
    # directory exists
    pass
To check whether a Path object exists independently of whether it is a file or directory, use exists():
if my_file.exists():
    # path exists
    pass
You can also use resolve(strict=True) in a try block:
try:
    my_abs_path = my_file.resolve(strict=True)
except FileNotFoundError:
    # doesn't exist
    pass
else:
    # exists
    pass
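The pathlib checks above can be combined into one small runnable sketch. The describe function, the temporary directory, and the file names are hypothetical and only serve the demonstration. Note that the strict parameter of resolve() was added in Python 3.6, so this assumes 3.6 or later:

```python
import tempfile
from pathlib import Path


def describe(path):
    """Classify a Path as 'missing', 'file', 'directory', or 'other'.

    describe is a hypothetical helper, not part of pathlib.
    """
    try:
        # strict=True raises FileNotFoundError for a nonexistent path.
        path.resolve(strict=True)
    except FileNotFoundError:
        return "missing"
    if path.is_file():
        return "file"
    if path.is_dir():
        return "directory"
    return "other"


# Throwaway directory with one regular file in it.
tmp_dir = Path(tempfile.mkdtemp())
existing = tmp_dir / "example.txt"
existing.write_text("data")
missing = tmp_dir / "missing.txt"

kind_of_file = describe(existing)
kind_of_dir = describe(tmp_dir)
kind_of_missing = describe(missing)
```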
Your code works when run in a script because Python encodes the output to whatever encoding your terminal application is using. If you are piping, you must encode it yourself.
A rule of thumb is: Always use Unicode internally. Decode what you receive, and encode what you send.
# -*- coding: utf-8 -*-
print u"åäö".encode('utf-8')
Another didactic example is a Python program to convert between ISO-8859-1 and UTF-8, making everything uppercase in between.
import sys
for line in sys.stdin:
    # Decode what you receive:
    line = line.decode('iso8859-1')
    # Work with Unicode internally:
    line = line.upper()
    # Encode what you send:
    line = line.encode('utf-8')
    sys.stdout.write(line)
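The snippet above is Python 2, where lines read from sys.stdin are byte strings. A rough Python 3 sketch of the same decode, process, encode pattern is shown below. It works on binary streams so that no implicit encoding is applied. The function name convert_stream and the in-memory streams are assumptions made for illustration:

```python
import io


def convert_stream(src, dst, src_encoding="iso8859-1", dst_encoding="utf-8"):
    """Copy lines from binary stream src to binary stream dst, uppercased.

    convert_stream is a hypothetical helper used only for illustration.
    """
    for raw_line in src:
        # Decode what you receive:
        text = raw_line.decode(src_encoding)
        # Work with Unicode internally:
        text = text.upper()
        # Encode what you send:
        dst.write(text.encode(dst_encoding))


# Demonstration with in-memory byte streams instead of real pipes.
source = io.BytesIO("åäö\n".encode("iso8859-1"))
target = io.BytesIO()
convert_stream(source, target)
result = target.getvalue()
```

In a real pipeline, the binary streams sys.stdin.buffer and sys.stdout.buffer could be passed in place of the BytesIO objects.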
Setting the system default encoding is a bad idea, because some modules and libraries you use may rely on the fact that it is ASCII. Don't do it.
To delimit by a tab, you can use the sep argument of to_csv.
To use a specific encoding (e.g. 'utf-8'), use the encoding argument.
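A minimal sketch of both arguments together, assuming pandas is installed and using a small made-up DataFrame. The column names, values, output file name, and the index=False choice are illustrative assumptions, not taken from the original answer:

```python
import os
import tempfile

import pandas as pd

# Hypothetical data, for illustration only.
df = pd.DataFrame({"name": ["åäö", "abc"], "value": [1, 2]})

# Write to a throwaway location so the example does not touch real files.
out_path = os.path.join(tempfile.mkdtemp(), "example.tsv")

# sep="\t" makes the output tab-delimited; encoding sets the file encoding.
df.to_csv(out_path, sep="\t", encoding="utf-8", index=False)

# Read the file back as text to inspect what was written.
with open(out_path, encoding="utf-8") as f:
    tsv_lines = f.read().splitlines()
```

The encoding argument only matters when to_csv writes to a file path, which is why the sketch writes to disk rather than returning a string.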