| What is a "gotcha"? The word "gotcha" started out as the expression "Got you!" This is something that someone who speaks idiomatic American English might say when he succeeds in playing a trick or prank on someone else. "I really got you with that trick!" The expression "Got you!" is pronounced "Got ya!" or "Got cha!". Among computer programmers, a "gotcha" has become a term for a feature of a programming language that is likely to play tricks on you by behaving differently from what you expect. Just as a fly or a mosquito can "bite" you, we say that a gotcha can "bite" you. |
About this Page
This page is being migrated to the Python Conquers the Universe blog.
The gotchas that have not been migrated will not apply after Python 3.0 is released in August 2008.
This is a page devoted to Python "gotchas". Python is a very clean and intuitive language, so it hasn't got many gotchas, but it still has a few that often bite beginning Python programmers. My hope is that if you are warned in advance about these gotchas, you won't be bitten quite so hard!
Note that a gotcha isn't necessarily a problem in the language itself. Rather, it is a situation in which there is a mismatch between the programmer's expectations of how the language will work, and the way the language actually does work. Often, the source of a gotcha lies not in the language, but in the programmer. Part of what creates a programmer's expectations is his own personal background. A programmer with a Windows or mainframe background, or a background in COBOL or the Algol-based family of languages (PL/1, Pascal, etc.), is especially prone to experiencing gotchas in Python, a language that evolved in a Unix environment and incorporates a number of conventions of the C family of programming languages (C, C++, Java).
If you're such a programmer, don't worry. There aren't many Python gotchas. Keep learning Python. It is a great language, and you'll soon come to love it.
Other Lists of Python Gotchas
This post has moved to the Python Conquers the Universe blog.
Integer division is the feature of Python that (along with case-sensitivity) most frequently bites students who are learning Python as their first programming language.
The gotcha is this: if you divide an integer by an integer, you get an integer. If the division produces a remainder, the result is rounded downward. (Note that this is flooring, that is, rounding toward negative infinity, not truncation toward zero, so a negative quotient moves away from zero.)
print 5/3       # produces 1
print (-5)/3    # produces -2
print 5/3.0     # produces 1.66666666667, because 3.0 is a float, not an integer
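To make the direction of the rounding concrete, here is a small sketch (mine, not part of the original example). It is written with the // floor operator, which is introduced further down this page and gives the same integer results as classic division, so the snippet should also run unchanged on later versions of Python. The variable names are just illustrative.

```python
# Floor division rounds toward negative infinity.
floored = (-5) // 3                   # -2

# Converting a float quotient with int() truncates toward zero instead.
truncated = int(-5 / 3.0)             # -1

# Quotient and remainder always satisfy: a == b * quotient + remainder
quotient, remainder = divmod(-5, 3)   # (-2, 1)
```

The last line shows why the downward rounding is consistent: with a quotient of -2, the remainder comes out as 1, and 3 * (-2) + 1 gives back -5.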
This behavior is called "classic" division, and is something that Python inherited from the C programming language. What one would like, of course, is for Python division to behave more intuitively, like "true" division:
print 5/3       # produces 1.66666666667
print (-5)/3    # produces -1.66666666667
Starting in Python 2.2, the process of slowly changing the behavior of the division operator began. First, a new floor operator // was introduced to provide division with downward truncation of the results. Second, a technique was introduced for a programmer to make the division operator behave in the more intuitive "true division" fashion. In order to get true division, you need to put this statement at the top of your module.
from __future__ import division
Throughout the Python 2 series, Python continues to use old-style classic division by default. If you run Python with the -Qwarn command-line option, it will also issue a warning whenever division is applied to two integers. You can use this to find code that's affected by the change and fix it. The fix, depending on what you want your program to be doing, will mean either changing the / division operator to the // floor operator, or adding the from __future__ import division statement to your module.
Eventually (in Python version 3) the new "true division" behavior for the division operator will become standard, and you won't need the from __future__ import division statement.
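If you want to compare the two kinds of division side by side, one option is to use the operator module from the standard library. As far as I know, operator.truediv and operator.floordiv give the same results whether or not the from __future__ import division statement is present, so this sketch does not depend on which behavior your interpreter uses for the / operator. The variable names are only illustrative.

```python
import operator

# True division: the result keeps its fractional part.
true_quotient = operator.truediv(5, 3)       # about 1.667
negative_true = operator.truediv(-5, 3)      # about -1.667

# Floor division: the same results that the // operator gives.
floor_quotient = operator.floordiv(5, 3)     # 1
negative_floor = operator.floordiv(-5, 3)    # -2

print(floor_quotient)
print(negative_floor)
```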
The bottom line is that you should not be running a version of Python older than 2.2, and should upgrade to 2.3 as soon as possible, and you should put from __future__ import division at the beginning of all of your modules.
For more information on this subject, see section 6 of What's New in Python 2.2.
A print statement that ends with a comma doesn't really write a trailing space. It just causes an immediately subsequent print statement to write a leading space!
The Python Language Reference Manual says, about the print statement,
A "\n" character is written at the end, unless the print statement ends with a comma.
But it also says that if two print statements in succession write to stdout, and the first one ends with a comma (and so doesn't write a trailing newline), then the second one prepends a leading space to its output. (See section "6.6 The print statement" in the Python Reference Manual. Thanks to Marcus Rubenstein and Hans Meine for pointing this out to me.)
So
for i in range(10): print "*",
print
produces
* * * * * * * * * *
If you want to print a string without any trailing characters at all, your best bet is to use sys.stdout.write()
import sys
for i in range(10): sys.stdout.write("*")
sys.stdout.write("\n")
produces
**********
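A variation on the same idea, sketched here with a hypothetical helper named write_stars, is to build the whole string first and write it in one go. Passing the output object in as an argument is my own design choice. It makes the function easy to try out with something other than sys.stdout.

```python
import sys

def write_stars(out, count):
    """Write `count` asterisks and a newline to the file-like object `out`."""
    out.write("*" * count)   # no separators are inserted between the stars
    out.write("\n")

write_stars(sys.stdout, 10)
```

Because write() outputs exactly the characters it is given, there is no leading or trailing space to worry about.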
There's a Python gotcha that bites everybody as they learn Python. In fact, I think it was Tim Peters who suggested that every programmer gets caught by it exactly two times. It is called the mutable defaults trap. Programmers are usually bitten by the mutable defaults trap when coding class methods, but I'd like to begin by explaining it in functions, and then move on to talk about class methods.
The gotcha occurs when you are coding default values for the arguments to a function or a method. Here is an example for a function named functionF:
def functionF(argString = "abc", argList = []):
Here's what most beginning Python programmers believe will happen when functionF is called without any arguments:
A new string object containing "abc" will be created and bound to the "argString" variable name. A new, empty list object will be created and bound to the "argList" variable name. In short, if the arguments are omitted by the caller, the functionF will always get "abc" and [] in its arguments.
This, however, is not what will happen. Here's why.
The objects that provide the default values are not created at the time that functionF is called. They are created at the time that the statement that defines the function is executed. (See the discussion at Default arguments in Python: two easy blunders: "Expressions in default arguments are calculated when the function is defined, not when it’s called.")
If functionF, for example, is contained in a module named moduleM, then the statement that defines functionF will probably be executed at the time when moduleM is imported.
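Here is a small sketch that makes the timing visible. The helper make_default_list and the evaluations list are hypothetical names that I have introduced just for this illustration. The helper records every time the default expression is evaluated.

```python
evaluations = []

def make_default_list():
    # Record each time the default expression is evaluated.
    evaluations.append("evaluated")
    return []

# The default expression runs here, once, when the def statement executes.
def functionF(argString="abc", argList=make_default_list()):
    return argList

count_after_def = len(evaluations)     # 1, before functionF is ever called

first = functionF()
second = functionF()
count_after_calls = len(evaluations)   # still 1
same_object = first is second          # True: both calls got the same list
```

Calling functionF does not evaluate the default expression again. Every call that omits argList receives the one list object that was created when the def statement ran.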
When the def statement that creates functionF is executed:
A new function object is created, bound to the name functionF, and stored in the namespace of moduleM.
Within the functionF function object, for each argument with a default value, the default object is created and stored. In the case of functionF, a string object containing "abc" is created as the default for the argString argument, and an empty list object is created as the default for the argList argument.
After that, whenever functionF is called without arguments, argString will be bound to the default string object, and argList will be bound to the default list object. In such a case, argString will always be "abc", but argList may or may not be an empty list. Here's why.
There is a crucial difference between a string object and a list object. A string object is immutable, whereas a list object is mutable. That means that the default for argString can never be changed, but the default for argList can be changed.
Let's see how the default for argList can be changed. Here is a program. It invokes functionF four times. Each time that functionF is invoked it displays the values of the arguments that it receives, then adds something to each of the arguments.
def functionF(argString="abc", argList = []):
    print argString, argList
    argString = argString + "xyz"
    argList.append("F")

for i in range(4): functionF()
The output of this program is:
abc []
abc ['F']
abc ['F', 'F']
abc ['F', 'F', 'F']
As you can see, the first time through, the arguments have exactly the defaults that we expect. On the second and all subsequent passes, the argString value remains unchanged, just what we would expect from an immutable object. The line
argString = argString + "xyz"
creates a new object — the string "abcxyz" — and binds the name "argString" to that new object, but it doesn't change the default object for the argString argument.
But the case is quite different with argList, whose value is a list — a mutable object. On each pass, we append a member to the list, and the list grows. On the fourth invocation of functionF — that is, after three earlier invocations — argList contains three members.
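You can watch the default objects directly. In a reasonably recent Python (2.6 or later, as far as I know; older versions call the same attribute func_defaults), a function object exposes its defaults through the __defaults__ attribute. This sketch uses a version of functionF without the print statement and calls it three times.

```python
def functionF(argString="abc", argList=[]):
    argString = argString + "xyz"   # rebinds the local name only
    argList.append("F")             # mutates the shared default list

for i in range(3):
    functionF()

# __defaults__ holds the default objects in parameter order.
string_default, list_default = functionF.__defaults__
# string_default is still "abc"; list_default is now ['F', 'F', 'F']
```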
This behavior is not a wart in the Python language. It really is a feature, not a bug. There are times when you really do want to use mutable default arguments. One thing they can do (for example) is retain a list of results from previous invocations, something that might be very handy.
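As a hedged illustration of such a deliberate use, here is a sketch of a function that keeps a history of the values it has been given. The names record and _history are hypothetical. The leading underscore is only a hint to callers that they are not expected to pass that argument.

```python
def record(value, _history=[]):
    # _history is created once, at def time, and survives between calls.
    _history.append(value)
    return _history

after_first = list(record("first"))    # ['first']
after_second = list(record("second"))  # ['first', 'second']
```

The calls to list() take a snapshot of the history at each point, since record always returns the same list object.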
But for most programmers — especially beginning Pythonistas — this behavior is a gotcha. So for most cases we adopt the following rules.
So... we plan always to follow rule #1. Now, the question is how to do it... how to code functionF in order to get the behavior that we want.
Fortunately, the solution is straightforward. The mutable objects used as defaults are replaced by None, and then the arguments are tested for None.
def functionF(argString="abc", argList = None):
    if argList is None: argList = []
    ...
Another solution that you will sometimes see is this:
def functionF(argString="abc", argList=None):
    argList = argList or []
    ...
This solution, however, is not equivalent to the first, and should be avoided. See Learning Python p. 123 for a discussion of the differences. Thanks to Lloyd Kvam for pointing this out to me.
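One difference that I can demonstrate (offered as an illustration, not as a summary of the book's discussion) shows up when the caller passes in an empty list of his own. An empty list counts as false, so the "or" version quietly replaces it with a new list, and the caller's list is never updated. The function names with_is_none and with_or are hypothetical.

```python
def with_is_none(argList=None):
    if argList is None:
        argList = []
    argList.append("F")
    return argList

def with_or(argList=None):
    argList = argList or []   # also replaces an empty list passed by the caller
    argList.append("F")
    return argList

callers_list = []
with_is_none(callers_list)
# callers_list is now ['F']

other_list = []
with_or(other_list)
# other_list is still []; the function appended to a fresh list instead
```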
And of course, in some situations the best solution is simply not to supply a default for the argument.
Now let's look at how the mutable arguments gotcha presents itself when a class method is given a mutable default for one of its arguments. Here is a complete program.
# define a class for company employees
class Employee:

    def __init__ (self, argName, argDependents=[]):
        # an employee has two attributes: a name, and a list of his dependents
        self.name = argName
        self.Dependents = argDependents

    def addDependent(self, argName):
        # an employee can add a dependent by getting married or having a baby
        self.Dependents.append(argName)

    def show(self):
        print
        print "My name is.......: ", self.name
        print "My dependents are: ", str(self.Dependents)

#---------------------------------------------------
# main routine -- hire employees for the company
#---------------------------------------------------

# hire a married employee, with dependents
joe = Employee("Joe Smith", ["Sarah Smith", "Suzy Smith"])

# hire a couple of unmarried employees, without dependents
mike = Employee("Michael Nesmith")
barb = Employee("Barbara Bush")

# mike gets married and acquires a dependent
mike.addDependent("Nancy Nesmith")

# now have our employees tell us about themselves
joe.show()
mike.show()
barb.show()
Let's look at what happens when this program is run. First, the code that defines the Employee class is run. Then we hire Joe. Joe has two dependents, so that fact is recorded at the time that the joe object is created. Next we hire Mike and Barb. Then Mike acquires a dependent. Finally, the last three statements of the program ask each employee to tell us about himself. Here is the result.
My name is.......: Joe Smith
My dependents are: ['Sarah Smith', 'Suzy Smith']

My name is.......: Michael Nesmith
My dependents are: ['Nancy Nesmith']

My name is.......: Barbara Bush
My dependents are: ['Nancy Nesmith']
Joe is just fine. But somehow, when Mike acquired Nancy as his dependent, Barb also acquired Nancy as a dependent. This of course is wrong. And we're now in a position to understand what is causing the program to behave this way.
When the code that defines the Employee class is run, objects for the class definition, the method definitions, and the default values for each argument are created. The constructor has an argument argDependents whose default value is an empty list, so an empty list object is created and attached to the __init__ method as the default value for argDependents.
When we hire Joe, he already has a list of dependents, which is passed in to the Employee constructor, so the argDependents argument does not use the default empty list object.
Next we hire Mike and Barb. Since they have no dependents, the default value for argDependents is used. Remember: this is the empty list object that was created when the code that defined the Employee class was run. So in both cases, the empty list is bound to the argDependents argument, and then, again in both cases, it is bound to the self.Dependents attribute. The result is that after Mike and Barb are hired, the self.Dependents attributes of both Mike and Barb point to the same object: the default empty list object.
When Mike gets married, and Nancy Nesmith is added to his self.Dependents list, Barb also acquires Nancy as a dependent, because Barb's self.Dependents attribute is bound to the same list object as Mike's self.Dependents attribute.
So this is what happens when mutable objects are used as defaults for arguments in class methods. If the defaults are used when the method is called, different class instances end up sharing references to the same object.
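The sharing can be confirmed with the "is" operator, which tests whether two names refer to the very same object. This is a cut-down sketch of the Employee class (only the constructor is kept), not the full program.

```python
class Employee:
    def __init__(self, argName, argDependents=[]):
        self.name = argName
        self.Dependents = argDependents

mike = Employee("Michael Nesmith")
barb = Employee("Barbara Bush")

# Both attributes refer to the single default list object.
shared = mike.Dependents is barb.Dependents   # True

# A change made through one employee is visible through the other.
mike.Dependents.append("Nancy Nesmith")
barbs_dependents = list(barb.Dependents)      # ['Nancy Nesmith']
```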
And that is why you should never, never, NEVER use a list or a dictionary as a default value for an argument to a class method. Unless, of course, you really, really, REALLY know what you're doing.
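Applying the None test described earlier to the Employee class gives something like the following sketch. It is one reasonable way to write the constructor, not the only one, and the show method is left out to keep it short.

```python
class Employee:
    def __init__(self, argName, argDependents=None):
        self.name = argName
        if argDependents is None:
            argDependents = []   # a fresh list for each employee
        self.Dependents = argDependents

    def addDependent(self, argName):
        self.Dependents.append(argName)

mike = Employee("Michael Nesmith")
barb = Employee("Barbara Bush")
mike.addDependent("Nancy Nesmith")
# mike.Dependents is ['Nancy Nesmith']; barb.Dependents is still []
```

Now each employee hired without dependents gets his own empty list, so Mike's marriage no longer affects Barb.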
This work is licensed under the Creative Commons Attribution 2.0 License. You are free to copy, distribute, and display the work, and to make derivative works (including translations). If you do, you must give the original author credit. The author specifically permits (and encourages) teachers to post, reproduce, and distribute some or all of this material for use in their classes or by their students.