Notebook 8 - Files, Errors, and Exceptions¶
Make a copy of this notebook by selecting File->Save a copy in Drive from the menu bar above.
Things you'll learn in this lesson:
- reading and writing files
- exceptions and error handling strategies
Reading and Writing Files¶
Our programs have amnesia¶
- Program variables reside in memory, and main memory is not persistent, so when you close a notebook, or terminate a program running locally, your data disappears.
- Imagine having to re-enter your contact list every time you use your phone.
- We'll need a way to store and retrieve data.
Storage Tradeoffs¶
- There are two kinds of storage in your computer:
- main memory is fast, but transient (like human memory)
- disk storage is slow(er), but permanent (like a notebook) and higher capacity
- All the things we've worked with so far (variables, functions, program statements) reside in main memory.
- We'll save information across program executions using disk storage in units we call files.
What is a file anyway?¶
- A named chunk of stored data is called a file.
- Files are organized into hierarchical structures, called directories or folders.
- Examples:
- Windows: c:\Users\marccohen\my_fave_movies.md
- Mac/Linux: /Users/marccohen/my_fave_movies.md
- path is the file's location, e.g. c:\Users\marccohen\
- it's the "where"
- filename is the file's name, e.g. my_fave_movies.md
- it's the "which"
King Charles -----> the which
Buckingham Palace \
London, UK |---> the where
SW1A 1AA. /
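As a rough illustration of the "where" and the "which", the sketch below splits the two example file specifications into their parts. It assumes Python's standard pathlib module; the variable names mac_file and win_file are made up for this example.

```python
from pathlib import PurePosixPath, PureWindowsPath

# The "Pure" path classes only manipulate text; they never touch the disk,
# so this runs the same way on any operating system.
mac_file = PurePosixPath('/Users/marccohen/my_fave_movies.md')
win_file = PureWindowsPath(r'c:\Users\marccohen\my_fave_movies.md')

print(mac_file.parent)  # the "where" (path)
print(mac_file.name)    # the "which" (filename)
print(win_file.parent)
print(win_file.name)
```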
Opening a File¶
- Before you can read or write a file, you need to open it.
- Use the open() function to open a file.
- prototype: variable = open(filename, mode)
- example: file = open("myfile", "r")
- The first argument is a file specification, which can include a path or not.
- If no path is provided, the filename is assumed to reside in the current directory/folder.
- We'll cover the second argument, the mode, in the next cell.
- open() returns a special type of value, called a file object, which is used for subsequent operations on the file.
File Access Modes¶
Mode | Description | access | if file exists... | if file doesn't exist... |
---|---|---|---|---|
"r" | read from a file | read | open file | generate error |
"w" | write to a file | write | overwrite & open | create file & open |
"a" | append to a file | write | open for append | create file & open |
"r+" | read/write from/to a file | read/write | open file | generate error |
"w+" | write/read from/to a file | read/write | overwrite & open | create file & open |
"a+" | append/read a text file | read/write | open for append | create file & open |
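The difference between "w" (overwrite) and "a" (append) is easiest to see by running both. This is a minimal sketch, not part of the original lesson; the scratch file name modes_demo.txt and the variable names are assumptions chosen for illustration.

```python
import os
import tempfile

# Hypothetical scratch file in the system's temporary folder.
path = os.path.join(tempfile.gettempdir(), 'modes_demo.txt')

f = open(path, 'w')   # "w": create the file, or wipe it if it already exists
f.write('first\n')
f.close()

f = open(path, 'a')   # "a": keep the existing contents and add to the end
f.write('second\n')
f.close()

f = open(path, 'r')
after_append = f.read()
f.close()

f = open(path, 'w')   # "w" again: the old contents are discarded
f.write('third\n')
f.close()

f = open(path, 'r')
after_overwrite = f.read()
f.close()

print(repr(after_append))
print(repr(after_overwrite))
os.remove(path)       # clean up the scratch file
```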
Closing a File¶
- the opposite of open() is close()
- when you're done working with a file, you should close it
- closing a file writes any buffered data to disk and releases the file so other programs can use it
- close() is a method of the file object
- example: file.close()
Writing to a File¶
file.write('this is a line of text\n')
- file must have been opened with write or append access
- writes the passed string into the file
- you have to include newline characters where you want them, otherwise subsequent write calls will build one long line
- writes may not be visible until you close the file
f = open('test.txt', 'w')
f.write('This is my test file.\n')
for i in range(10):
    f.write('line number ' + str(i) + '\n')
f.close()
Reading From a File¶
mystr = file.read()
- file must have been opened with read access
- reads the entire file into memory
- the result is returned in a string
- you can pass an argument to limit how many characters are read
f = open('test.txt', 'r')
s = f.read()
f.close()
print(s)
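To illustrate the optional argument to read(), here is a small sketch that reads a file in pieces. The file name read_demo.txt and the variable names are assumptions for this example only.

```python
import os
import tempfile

# Hypothetical scratch file in the system's temporary folder.
path = os.path.join(tempfile.gettempdir(), 'read_demo.txt')

f = open(path, 'w')
f.write('abcdefghij')
f.close()

f = open(path, 'r')
first = f.read(4)   # read at most 4 characters
rest = f.read()     # continue from the current position to the end
at_end = f.read()   # nothing left: read() returns an empty string
f.close()

print(repr(first), repr(rest), repr(at_end))
os.remove(path)
```

Each call to read() picks up where the previous one stopped, which is why the second call does not return the first four characters again.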
Reading a file iteratively¶
for line in file:
- this iterates over the lines in a file
- each iteration of the loop reads a line from the file and sets the loop variable (line in this case) to the string value of that line
- the string includes the trailing newline
- this is a very handy way of processing a text file one line at a time
- also space-efficient because it only needs to store one line at a time in main memory
! cat test.txt # Display current contents of test.txt file.
myfile = open('test.txt', 'r')
for text in myfile:
    print(text, end='')
myfile.close()
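Because each line keeps its trailing newline, a common follow-up step is to strip it before storing or comparing the text. The sketch below shows one way to do that with the string method rstrip(); the file name lines_demo.txt and the list names are illustrative assumptions.

```python
import os
import tempfile

# Hypothetical scratch file in the system's temporary folder.
path = os.path.join(tempfile.gettempdir(), 'lines_demo.txt')

f = open(path, 'w')
f.write('red\ngreen\nblue\n')
f.close()

raw = []
cleaned = []
f = open(path, 'r')
for line in f:
    raw.append(line)                   # still ends with '\n'
    cleaned.append(line.rstrip('\n'))  # trailing newline removed
f.close()

print(raw)
print(cleaned)
os.remove(path)
```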
The with Statement¶
- automatically ensures files get closed (and any other resources get cleaned up)
- without the with statement...
file = open('file_path', 'w')
file.write('Hello world!')
file.close()
- using the with statement...
with open('file_path', 'w') as file:
    file.write('Hello world!')
with open('test.txt', 'r') as f:
    for line in f:
        print(line, end='')
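One way to check that with really does close the file is to inspect the file object's closed attribute afterwards. The sketch below also shows the file being closed when an error interrupts the block; it uses try/except, which is covered in the Exceptions section of this notebook. The file name with_demo.txt is an assumption.

```python
import os
import tempfile

# Hypothetical scratch file in the system's temporary folder.
path = os.path.join(tempfile.gettempdir(), 'with_demo.txt')

with open(path, 'w') as f:
    f.write('Hello world!')
print(f.closed)   # the file was closed when the with block ended

try:
    with open(path, 'r') as g:
        int(g.read())   # fails: the text is not a number
except ValueError:
    pass
closed_after_error = g.closed   # closed even though the block raised an error
print(closed_after_error)
os.remove(path)
```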
Summary of File Functions and Methods¶
- open() - open a file
- close() - close a file
- read(n) - read up to n chars from the current position and return them in a string. If n is not provided, read all chars from the current position to the end of the file.
- write(s) - write string s to a file
- There are many more functions and methods available to operate on files in Python. You can learn more here.
Errors¶
This section is derived from work that is Copyright (c) The Carpentries.
Errors in Python have a very specific form, called a traceback. Let's examine one:
# This code has an intentional error. You can type it directly or
# use it for reference to understand the error message below.
def favorite_ice_cream():
    ice_creams = [
        'chocolate',
        'vanilla',
        'strawberry'
    ]
    print(ice_creams[3])

favorite_ice_cream()
When run, this code produces the following result:
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
<ipython-input-1-70bd89baa4df> in <module>()
9 print(ice_creams[3])
10
----> 11 favorite_ice_cream()
<ipython-input-1-70bd89baa4df> in favorite_ice_cream()
7 'strawberry'
8 ]
----> 9 print(ice_creams[3])
10
11 favorite_ice_cream()
IndexError: list index out of range
This particular traceback has two levels.
The first shows code from the cell above, with an arrow pointing to Line 11 (which is favorite_ice_cream()). The second shows some code in the function favorite_ice_cream, with an arrow pointing to Line 9 (which is print(ice_creams[3])).
The last level is where the error occurred. The other level(s) show what function the program executed to get to the next level down.
So, in this case, the program first called the function favorite_ice_cream. Inside this function, the program encountered an error on Line 9, when it tried to run the code print(ice_creams[3]).
So what error did the program actually encounter?
In the last line of the traceback, Python helpfully tells us the category or type of error (in this case, it is an IndexError) and a more detailed error message (in this case, it says "list index out of range").
If you encounter an error and don't know what it means, it is still important to read the traceback closely. That way, if you fix the error, but encounter a new one, you can tell that the error changed. Additionally, sometimes knowing where the error occurred is enough to fix it, even if you don't entirely understand the message.
If you do encounter an error you don't recognize, try looking at the official documentation on errors. However, note that you may not always be able to find the error there, as it is possible to create custom errors. In that case, hopefully the custom error message is informative enough to help you figure out what went wrong.
Syntax Errors¶
When you forget a colon at the end of a line, accidentally add one space too many when indenting under an if statement, or forget a parenthesis, you will encounter a syntax error. This means that Python couldn't figure out how to read your program.
This is similar to forgetting punctuation in English:
for example,
this text is difficult to read there is no punctuation there is also no capitalization
why is this hard because you have to figure out where each sentence ends
you also have to figure out where each sentence begins
to some extent it might be ambiguous if there should be a sentence break or not
People can typically figure out what is meant by text with no punctuation, but people are much smarter than computers. If Python doesn't know how to read the program, it will give up and inform you with an error. For example:
def some_function()
    msg = 'hello, world!'
    print(msg)
     return msg
File "<ipython-input-3-6bb841ea1423>", line 1
def some_function()
^
SyntaxError: invalid syntax
Here, Python tells us that there is a SyntaxError on line 1, and even puts a little arrow in the place where there is an issue. In this case the problem is that the function definition is missing a colon at the end.
Actually, the function above has two issues with syntax. If we fix the problem with the colon, we see that there is also an IndentationError, which means that the lines in the function definition do not all have the same indentation:
def some_function():
    msg = 'hello, world!'
    print(msg)
     return msg
File "<ipython-input-4-ae290e7659cb>", line 4
return msg
^
IndentationError: unexpected indent
Both SyntaxError and IndentationError indicate a problem with the syntax of your program, but an IndentationError is more specific: it always means that there is a problem with how your code is indented.
Tabs and Spaces¶
Some indentation errors are harder to spot than others.
In particular, mixing spaces and tabs can be difficult to spot
because they are both whitespace.
In the example below, the first two lines in the body of the function some_function are indented with tabs, while the third line is indented with spaces.
def some_function():
msg = 'hello, world!'
print(msg)
return msg
Visually it is impossible to spot the error. Fortunately, Python does not allow you to mix tabs and spaces.
File "<ipython-input-5-653b36fbcd41>", line 4
return msg
^
TabError: inconsistent use of tabs and spaces in indentation
Variable Name Errors¶
Another very common type of error is called a NameError, and occurs when you try to use a variable that does not exist. For example:
print(a)
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
<ipython-input-7-9d7b17ad5387> in <module>()
----> 1 print(a)
NameError: name 'a' is not defined
Variable name errors come with some of the most informative error messages, which are usually of the form "name 'the_variable_name' is not defined".
Why does this error message occur? That's a harder question to answer, because it depends on what your code is supposed to do. However, there are a few very common reasons why you might have an undefined variable. The first is that you meant to use a string, but forgot to put quotes around it:
print(hello)
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
<ipython-input-8-9553ee03b645> in <module>()
----> 1 print(hello)
NameError: name 'hello' is not defined
The second reason is that you might be trying to use a variable that does not yet exist. In the following example, count should have been defined (e.g., with count = 0) before the for loop:
for number in range(10):
    count = count + number
print('The count is:', count)
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
<ipython-input-9-dd6a12d7ca5c> in <module>()
1 for number in range(10):
----> 2 count = count + number
3 print('The count is:', count)
NameError: name 'count' is not defined
Finally, the third possibility is that you made a typo when you were writing your code. Let's say we fixed the error above by adding the line Count = 0 before the for loop. Frustratingly, this actually does not fix the error. Remember that variable names are case-sensitive, so the variable named count is different from Count. We still get the same error, because we still have not defined count:
Count = 0
for number in range(10):
    count = count + number
print('The count is:', count)
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
<ipython-input-10-d77d40059aea> in <module>()
1 Count = 0
2 for number in range(10):
----> 3 count = count + number
4 print('The count is:', count)
NameError: name 'count' is not defined
Index Errors¶
Next up are errors having to do with containers (like lists and strings) and the items within them. If you try to access an item in a list or a string that does not exist, then you will get an error. This makes sense: if you asked someone what day they would like to get coffee, and they answered "caturday", you might be a bit annoyed. Python gets similarly annoyed if you try to ask it for an item that doesn't exist:
letters = ['a', 'b', 'c']
print('Letter #1 is', letters[0])
print('Letter #2 is', letters[1])
print('Letter #3 is', letters[2])
print('Letter #4 is', letters[3])
Letter #1 is a
Letter #2 is b
Letter #3 is c
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
<ipython-input-11-d817f55b7d6c> in <module>()
3 print('Letter #2 is', letters[1])
4 print('Letter #3 is', letters[2])
----> 5 print('Letter #4 is', letters[3])
IndexError: list index out of range
Here, Python is telling us that there is an IndexError in our code, meaning we tried to access a list index that did not exist.
File Errors¶
The last type of error we'll cover today is the kind associated with reading and writing files, such as FileNotFoundError. If you try to read a file that does not exist, you will receive a FileNotFoundError telling you so. If you attempt to write to a file that was opened read-only, Python 3 raises an UnsupportedOperation error. More generally, problems with input and output manifest as IOErrors or OSErrors, depending on the version of Python you use (in Python 3, IOError is simply another name for OSError).
file_handle = open('myfile.txt', 'r')
---------------------------------------------------------------------------
FileNotFoundError Traceback (most recent call last)
<ipython-input-14-f6e1ac4aee96> in <module>()
----> 1 file_handle = open('myfile.txt', 'r')
FileNotFoundError: [Errno 2] No such file or directory: 'myfile.txt'
One reason for receiving this error is that you specified an incorrect path to the file. For example, if I am currently in a folder called myproject, and I have a file in myproject/writing/myfile.txt, but I try to open myfile.txt, this will fail. The correct path would be writing/myfile.txt. It is also possible that the file name or its path contains a typo.
A related issue can occur if you mix up the "read" and "write" flags. Python will not give you an error if you try to open a file for writing when the file does not exist. However, if you meant to open a file for reading, but accidentally opened it for writing (which also erases any existing contents), and then try to read from it, you will get an UnsupportedOperation error telling you that the file was not opened for reading:
file_handle = open('myfile.txt', 'w')
file_handle.read()
---------------------------------------------------------------------------
UnsupportedOperation Traceback (most recent call last)
<ipython-input-15-b846479bc61f> in <module>()
1 file_handle = open('myfile.txt', 'w')
----> 2 file_handle.read()
UnsupportedOperation: not readable
These are the most common errors with files, though many others exist. If you get an error that you've never seen before, searching the Internet for that error type often reveals common reasons why you might get that error.
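To see how these file errors relate to one another, the sketch below triggers the UnsupportedOperation error on purpose and inspects it. It uses try/except, which is introduced in the Exceptions section of this notebook, and assumes Python 3. The file name errors_demo.txt and the variable names are made up for the example.

```python
import io
import os
import tempfile

# Hypothetical scratch file in the system's temporary folder.
path = os.path.join(tempfile.gettempdir(), 'errors_demo.txt')

caught = None
with open(path, 'w') as file_handle:
    try:
        file_handle.read()   # the file was opened for writing only
    except io.UnsupportedOperation as err:
        caught = err

print(type(caught).__name__, '-', caught)
print(isinstance(caught, OSError))            # UnsupportedOperation is a kind of OSError
print(IOError is OSError)                     # in Python 3 these are the same type
print(issubclass(FileNotFoundError, OSError)) # so is FileNotFoundError
os.remove(path)
```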
Challenges¶
Reading Error Messages¶
Read the Python code and the resulting traceback below, and answer the following questions:
- How many levels does the traceback have?
- What is the function name where the error occurred?
- On which line number in this function did the error occur?
- What is the type of error?
- What is the error message?
# This code has an intentional error. Do not type it directly;
# use it for reference to understand the error message below.
def print_message(day):
    messages = {
        'monday': 'Hello, world!',
        'tuesday': 'Today is Tuesday!',
        'wednesday': 'It is the middle of the week.',
        'thursday': 'Today is Donnerstag in German!',
        'friday': 'Last day of the week!',
        'saturday': 'Hooray for the weekend!',
        'sunday': 'Aw, the weekend is almost over.'
    }
    print(messages[day])

def print_friday_message():
    print_message('Friday')

print_friday_message()
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
<ipython-input-1-4be1945adbe2> in <module>()
14 print_message('Friday')
15
---> 16 print_friday_message()
<ipython-input-1-4be1945adbe2> in print_friday_message()
12
13 def print_friday_message():
---> 14 print_message('Friday')
15
16 print_friday_message()
<ipython-input-1-4be1945adbe2> in print_message(day)
9 'sunday': 'Aw, the weekend is almost over.'
10 }
---> 11 print(messages[day])
12
13 def print_friday_message():
KeyError: 'Friday'
Solution¶
- 3 levels
- print_message
- 11
- KeyError
- There isn't really a message; you're supposed to infer that Friday is not a key in messages.
Identifying Syntax Errors¶
1. Read the code below, and (without running it) try to identify what the errors are.
2. Run the code, and read the error message. Is it a SyntaxError or an IndentationError?
3. Fix the error.
4. Repeat steps 2 and 3, until you have fixed all the errors.
def another_function
  print('Syntax errors are annoying.')
   print('But at least Python tells us about them!')
  print('So they are usually not too hard to fix.')
Solution¶
SyntaxError for missing (): at end of first line, IndentationError for mismatch between second and third lines.
A fixed version is:
def another_function():
    print('Syntax errors are annoying.')
    print('But at least Python tells us about them!')
    print('So they are usually not too hard to fix.')
Identifying Variable Name Errors¶
1. Read the code below, and (without running it) try to identify what the errors are.
2. Run the code, and read the error message. What type of NameError do you think this is? In other words, is it a string with no quotes, a misspelled variable, or a variable that should have been defined but was not?
3. Fix the error.
4. Repeat steps 2 and 3, until you have fixed all the errors.
for number in range(10):
    # use a if the number is a multiple of 3, otherwise use b
    if (Number % 3) == 0:
        message = message + a
    else:
        message = message + 'b'
print(message)
Solution¶
3 NameErrors: for number being misspelled, for message not defined, and for a not being in quotes.
Fixed version:
message = ''
for number in range(10):
    # use a if the number is a multiple of 3, otherwise use b
    if (number % 3) == 0:
        message = message + 'a'
    else:
        message = message + 'b'
print(message)
Identifying Index Errors¶
- Read the code below, and (without running it) try to identify what the errors are.
- Run the code, and read the error message. What type of error is it?
- Fix the error.
seasons = ['Spring', 'Summer', 'Fall', 'Winter']
print('My favorite season is ', seasons[4])
Solution¶
IndexError; the last entry is seasons[3], so seasons[4] doesn't make sense.
A fixed version is:
seasons = ['Spring', 'Summer', 'Fall', 'Winter']
print('My favorite season is ', seasons[-1])
Exceptions¶
When a Python program has an error, something called an exception is raised. If nothing special is done to handle the exception, the program stops running and an error message describing the exception is displayed, like this:
>>> int('x')
Traceback (most recent call last):
File "<pyshell#0>", line 1, in <module>
int('x')
ValueError: invalid literal for int() with base 10: 'x'
With Python's exception handling facilities, you can:
- handle exceptions before they stop your program
- raise your own exceptions
- structure your error handling code in a simpler, more natural way
Exception Handling¶
Here's how to handle (or catch) an exception in Python:
try:
    <block of code>
except:
    <exception handler block of code>
How do we delineate the scope of the try and except blocks? As usual, by indentation.
Python tries to run the try block. If that causes an exception, then it runs the except block. If no exception is generated by the try block, the except block is skipped.
Exception Handling Example¶
Here's an example exception handler:
resp = input('enter an integer: ')
try:
    num = int(resp)
except:
    print('problem converting', resp, 'to int')
This code handles every possible exception type, which can hide unrelated problems. You can name the specific exception type you want to handle by including it after the except keyword, like this:
except ValueError: # handle ValueError exceptions only
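As a sketch of the difference this makes, the helper below catches only ValueError, so other kinds of errors still surface. The function name to_int is hypothetical and the return value of None on failure is a design choice made for this example.

```python
def to_int(text):
    """Hypothetical helper: return int(text), or None if text is not a valid integer."""
    try:
        return int(text)
    except ValueError:   # handle ValueError exceptions only
        print('problem converting', text, 'to int')
        return None

print(to_int('42'))   # converts normally
print(to_int('x'))    # ValueError is handled inside to_int

# int(None) raises TypeError, which to_int does not handle,
# so the exception reaches the caller.
try:
    to_int(None)
    type_error_escaped = False
except TypeError:
    type_error_escaped = True
print(type_error_escaped)
```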
Common Exception Types¶
Exception Type | Description |
---|---|
Exception | base type of nearly all exceptions (all except system-exiting ones such as KeyboardInterrupt) |
IOError | I/O operation failed |
IndexError | invalid index applied to a sequence |
KeyError | invalid key applied to a dictionary |
NameError | invalid or unknown variable or function name |
SyntaxError | invalid Python language syntax encountered |
TypeError | operator or function applied to inappropriate type |
ValueError | operator or function applied to invalid value |
ZeroDivisionError | division or modulus by zero |
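The sketch below deliberately triggers several of the exception types listed above and records the name of each one. It is only an illustration: the samples list and the small lambda functions in it are assumptions made for this example, and lambda is simply a compact way to write a one-line function.

```python
# Each entry is a tiny function that fails in a different way when called.
samples = [
    lambda: [1, 2, 3][5],                          # IndexError
    lambda: {'a': 1}['b'],                         # KeyError
    lambda: undefined_name,                        # NameError
    lambda: 'abc' + 1,                             # TypeError
    lambda: int('x'),                              # ValueError
    lambda: 1 / 0,                                 # ZeroDivisionError
    lambda: compile('def f(', '<demo>', 'exec'),   # SyntaxError
]

names = []
for sample in samples:
    try:
        sample()
    except Exception as err:   # every type above derives from Exception
        names.append(type(err).__name__)

print(names)
```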
Handling Multiple Exceptions¶
You can handle multiple exception types with one except clause by specifying a comma-separated list of exception types in parentheses, like this:
except (TypeError, ValueError):
You can also use multiple except clauses in a single statement, like this:
def convert(param):
    try:
        value = int(param)
        return value
    except TypeError:
        print('can\'t convert', type(param), 'to int')
    except ValueError:
        print('can\'t convert', param, 'to int')
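To see which except clause runs for which input, here is a short usage sketch. It repeats the convert function so the cell runs on its own; the results list is an assumption added for illustration.

```python
def convert(param):
    try:
        value = int(param)
        return value
    except TypeError:    # int() cannot work with this type at all
        print("can't convert", type(param), 'to int')
    except ValueError:   # right type, but the contents are not a number
        print("can't convert", param, 'to int')

# When a handler runs, the function falls off the end and returns None.
results = [convert('12'), convert('x'), convert(None)]
print(results)
```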
Getting Exception Arguments¶
Exceptions often come with data passed as an exception argument, which can be obtained like this:
except ValueError as msg:
    print(msg)
For example, this code:
try:
    int('x')
except ValueError as msg:
    print(msg)
displays this message:
invalid literal for int() with base 10: 'x'
Exceptions Can Have Else Clauses¶
An optional else clause, if included, is executed if no exception occurs, for example:
try:
    int(resp)
except ValueError as msg:
    print(msg)
else:
    print('conversion succeeded')
This statement is certain to display one (and only one!) of the two print statements above. What it won't do is terminate your program due to a conversion error.
How to Raise an Exception¶
You can raise your own exceptions, anywhere in a Python program, using the raise statement, like this:
raise <exception type>
or like this, to pass an argument along with the exception:
raise <exception type>(<argument>)
This causes an exception to be raised. Control resumes at the first enclosing code (in inner-to-outer order) that handles the raised exception type. If no enclosing code handles the exception, Python terminates the program and displays information about the exception.
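The inner-to-outer search for a handler can be sketched with two small functions. The names check_age and register, and the error text, are hypothetical and exist only for this example.

```python
def check_age(age):
    if age < 0:
        # Raise an exception with an argument describing the problem.
        raise ValueError('age cannot be negative')
    return age

def register(age):
    # No try/except here, so an exception raised in check_age
    # passes straight through this function to its caller.
    return check_age(age)

try:
    register(-5)
    message = None
except ValueError as err:
    # The first enclosing handler for ValueError is out here.
    message = str(err)

print(message)
```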
Structuring Code Around Exceptions¶
What happens if we pass an empty list to this function:
def avg(numbers):
    sum = 0
    cnt = 0
    for i in numbers:
        sum += i
        cnt += 1
    return sum // cnt
We can use exceptions to catch errors like this:
try:
    result = avg([])
except:
    print('something went wrong')
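To answer the question concretely: with an empty list the loop never runs, so the function divides 0 by 0. The sketch below records which exception that produces. It uses the names total and count rather than sum and cnt, a small change made here to avoid hiding Python's built-in sum() function.

```python
def avg(numbers):
    total = 0
    count = 0
    for i in numbers:
        total += i
        count += 1
    return total // count   # with an empty list this is 0 // 0

try:
    result = avg([])
    error_name = None
except ZeroDivisionError as err:
    error_name = type(err).__name__

print(error_name)
print(avg([2, 4, 6]))
```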
Instead of writing code like this:
if choice == "a":
    word = input('word to add: ')
    err = add_word(word)
    if (err):
        print(err)
    else:
        print(word, 'successfully added')
elif choice == "d":
    word = input('word to delete: ')
    err = del_word(word)
    if (err):
        print(err)
    else:
        print(word, 'successfully deleted')
We can write code like this (one exception handler vs. multiple ifs):
try:
    if choice == 'a':
        word = input('word to add: ')
        add_word(word)
        print(word, 'successfully added')
    elif choice == 'd':
        word = input('word to delete: ')
        del_word(word)
        print(word, 'successfully deleted')
except Exception as msg:
    print('ERROR:', msg)
Error Handling Styles¶
Style 1 (nested):
def add_friend(user, friend):
    if user in users:
        if friend in users:
            if friend not in users[user]:
                users[user].append(friend)
            else:
                raise Err('friend already on list')
        else:
            raise Err('unregistered friend name')
    else:
        raise Err('unregistered user name')
Style 2 (linear):
def add_friend(user, friend):
    if user not in users:
        raise Err('unregistered user name')
    elif friend not in users:
        raise Err('unregistered friend name')
    elif friend in users[user]:
        raise Err('friend already on list')
    users[user].append(friend)
    return None
I like this style because I find it more readable. Why is readability so important? Readable code is maintainable code.
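For anyone who wants to run the linear style, here is a self-contained sketch. The lesson does not define Err or users, so both are assumptions here: Err is a simple custom exception and users is a dictionary that maps each user name to a list of friends.

```python
class Err(Exception):
    """Hypothetical application-specific exception."""

# Assumed data structure: user name -> list of that user's friends.
users = {'ann': [], 'bob': []}

def add_friend(user, friend):
    # Check each failure case first, then do the real work at the end.
    if user not in users:
        raise Err('unregistered user name')
    if friend not in users:
        raise Err('unregistered friend name')
    if friend in users[user]:
        raise Err('friend already on list')
    users[user].append(friend)

add_friend('ann', 'bob')   # succeeds

errors = []
for user, friend in [('zed', 'bob'), ('ann', 'zed'), ('ann', 'bob')]:
    try:
        add_friend(user, friend)
    except Err as msg:
        errors.append(str(msg))

print(users)
print(errors)
```

Because each raise leaves the function immediately, plain if statements behave the same as the elif chain in the version above.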
Defining and Using Our Own Exceptions¶
We can limit the scope of our exception handler to our own exceptions, like this:
# define a new type of exception called avgError
class avgError(Exception):
    pass

def avg(numbers):
    if len(numbers) <= 0:
        raise avgError('empty sequence not supported')
    sum = 0
    cnt = 0
    for i in numbers:
        sum += i
        cnt += 1
    return sum // cnt
In calling code:
try:
    result = avg([])
except avgError as msg:
    print('ERROR:', msg)
This is just like our previous version except that we're handling an application specific exception. Other exceptions will not be caught, which is good (why?).
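One possible answer to the "why?" is sketched below: a handler limited to avgError deals with the case we planned for, while an unexpected bug (here, a string mixed into the numbers) still surfaces as a TypeError instead of being silently hidden. The helper name report and the sample data are assumptions for this example, and total and count are used in place of sum and cnt.

```python
class avgError(Exception):
    pass

def avg(numbers):
    if len(numbers) <= 0:
        raise avgError('empty sequence not supported')
    total = 0
    count = 0
    for i in numbers:
        total += i
        count += 1
    return total // count

def report(data):
    """Hypothetical caller that handles only the application's own exception."""
    try:
        return avg(data)
    except avgError as msg:
        print('ERROR:', msg)
        return None

handled = report([])   # the anticipated problem is handled

try:
    report([1, 'two', 3])   # a genuine bug in the data
    bug_surfaced = False
except TypeError:
    bug_surfaced = True     # the bug is not swallowed by report()

print(handled, bug_surfaced)
```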
Homework¶
Question 1¶
Write a function called write_file() that takes two arguments: a filename and a list of strings, opens the named file for write access and uses a for loop to write the list contents into the file, one string per line.
For example:
li = ['test', 'another test', 'last test']
write_file('output.txt', li)
Using your system's file explorer or command line, verify the file was created and has the expected contents. If you're not sure how to do that, you could also use your new read_file() function!