Note: This article refers to division as it exists in Python 2.7.x. Subsequent releases of Python do not behave the same way in all cases listed here.
When learning a new programming language, one of the first things you encounter is how to do arithmetic on different types of numbers. It is such a common part of every language that we tend to take for granted that our calculations will "just work" in all cases. I learned that is not necessarily true.
I recently wrote a program that had a bug in it - a bug based on a feature of the Python implementation of division. What follows is an overview of how Python handles division and some of the ways it can lead to unexpected behavior for the unaware.
Python 2, like many programming languages, does integer division: when you divide two integers, the quotient is also an integer, and any nonzero remainder is simply discarded. For example, consider these calculations:
>>> 1/2
0
>>> 3/2
1
The correct mathematical answer to the first computation is 0.5. However, since both operands are integers, the answer (i.e., the quotient) is also an integer. Therefore, we get 0 as the solution to 1/2, and 1 as the solution to 3/2, even though the remainder is 1 in both cases. While this can be confusing to the uninitiated, it is a familiar result for experienced programmers.
Speaking of remainders, you cannot really talk about dividing two integers without also considering their remainder (we will return to this soon). If a is evenly divisible by b, then the remainder will be 0. Otherwise, for a positive divisor b, the remainder will be an integer between 1 and b-1.
In Python, it is easy to get remainders using the modulus operator, %.
>>> 1%2
1
>>> 3%2
1
>>> 4%2
0
>>> 16%9
7
Sometimes, it is helpful to get both the quotient and the remainder from an integer division at the same time. For this purpose, we have the builtin divmod function. This function returns a tuple (q, r), where q is the quotient and r is the remainder of an integer division.
>>> divmod(1, 2)
(0, 1)
>>> divmod(3, 2)
(1, 1)
>>> divmod(16, 9)
(1, 7)
>>> divmod(16, 4)
(4, 0)
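As a small illustration of where divmod is handy, here is a minimal sketch that splits a duration in seconds into minutes and seconds. The name split_duration is made up for this example; the only thing it relies on is the builtin divmod, and it should behave the same in Python 2.7 and Python 3.

```python
def split_duration(total_seconds):
    """Split a non-negative number of seconds into (minutes, seconds).

    Hypothetical helper for illustration; not part of any library.
    """
    minutes, seconds = divmod(total_seconds, 60)
    return minutes, seconds


# divmod(a, b) returns the same pair as (a // b, a % b).
assert divmod(16, 9) == (16 // 9, 16 % 9)

print(split_duration(125))  # (2, 5)
```

Using divmod here avoids computing the quotient and the remainder in two separate steps.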
What if you want the result to keep its fractional part? The Python int type is not designed to represent fractional components of numbers (they are integers, after all!), so this will require a type change. Since Python treats division on integers as integer division, we have to coerce the interpreter into using float division instead. One might think that something like float(1/2) would work, but that just gives a float version of the answer above (0.0, if you are keeping score at home). The easiest way to accomplish this is by explicitly forcing one of the operands to be a float.
>>> 1.0/2
0.5
>>> 1.0*3/2
1.5
>>> float(3)/2
1.5
>>> 3./2
1.5
As expected, the first computation gives us the result 0.5 (0, with a remainder of 1), and the others give 1.5 (1, with a remainder of 1). Note that if either of the two operands is a float, the result will always be a float.
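To make the coercion trick concrete, here is a minimal sketch of a mean function for integer data. The name mean is just an illustrative helper, not a library function. It converts the sum to a float before dividing, so it should give the same answer under Python 2.7 and Python 3.

```python
def mean(values):
    """Arithmetic mean of a non-empty sequence of numbers.

    Converting the sum to float first keeps the fractional part even when
    every value is an int, which matters for Python 2.7's / operator.
    """
    return float(sum(values)) / len(values)


print(mean([1, 2]))  # 1.5
```

Without the float() call, Python 2.7 would compute 3/2 with integer division and report a mean of 1.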
This introduces some unnecessary complexity into the behavior of the / division operator: the type of the result of a / computation cannot be determined from the expression alone; we have to know the types of the operands. This is not a problem with trivial examples like the ones here, but in a large program it can lead to unexpected behavior. (One might argue that this is a problem with almost all numeric operators in dynamically typed languages, e.g., 1.0 + 2 == 3.0. However, operations like addition do not lose information when both operands are of the same type, whereas integer division does.)
At some point, Python's maintainers decided to resolve this by making the behavior of the division operator more explicit, because explicit is better than implicit. This became the default behavior in Python 3, but it is also available as a backport in Python 2.
There is a way to force Python 2 division to behave like Python 3 division, i.e., make it always do float division irrespective of the operand types.
>>> from __future__ import division
>>> 1/2
0.5
As you can see, this import declaration changes the behavior of the / operator, making this example produce the same output that 1.0 / 2 would normally produce in Python 2. This can be useful if you know you are going to be dealing with integers and division and you want to avoid unexpected outcomes (for example, when calculating the mean of an integer-valued data set). However, it cannot be undone within a module or an interactive session, so make sure that is what you want before you commit to it.
There is also a way to force Python into always discarding the fractional part, regardless of the type of the operands. This is often called truncating division, although floor division is the more accurate name (the difference only shows up with negative operands), and it is expressed using the // operator. It has been available by default since Python 2.2, so there is no need to import anything; you can just use it whenever you do not care about the remainder of a division operation.
>>> 1 // 2
0
>>> 3 // 2
1
>>> 1.0 // 2.0
0.0
>>> 3.0 // 2
1.0
>>> 3 // 2
1
Note that even though we discard any remainder of the division, the result still follows the rule that it matches the type of the operand requiring the greatest precision (here, always float). Even when doing truncating division on floats, your answer will be a float.
So far, so good. However, we have only dealt with positive numbers. What happens when we allow our operands to be negative? Let us take a look at a few examples.
>>> -1/2
-1
What!! How can -1/2 give a result of -1? Mathematically, -1/2 should give -0.5 (2 goes into -1 zero times, and there is -1 as a remainder). Even if we use Python's truncating integer division, we should expect to see 0 as the answer, not -1. Let us check some other examples to see if we can make sense of it.
>>> -1.0 / 2
-0.5
>>> -1.0 // 2
-1.0
>>> -3.0 // 2
-2.0
Okay, the first result looks fine, but whenever integer division is required (the remainder discarded), it looks like dividing negative numbers makes the answer one whole integer less than the expected result. What is going on?
Think back to the section on remainders. There, we noted that it is difficult to talk about integer division without also talking about remainders at the same time. As it turns out, these two concepts are not only closely related mathematical entities, they are also tightly coupled in Python's implementation of integer division.
At some point in Python's development, Guido decided to preserve the formal, mathematical definition of integer division and its remainder from number theory. Here it is:
a / b = q
with remainder r
such that a = b*q + r
and 0 <= r < b
This means that whenever dividing an integer a by a positive integer b, you can always expect b*q <= a and 0 <= r < b. No matter what.
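The definition above is easy to check empirically. The following is a minimal sketch, assuming a positive divisor b as in the definition; the function name check_division_identity is made up for this example. It uses only // and %, so it should run unchanged on Python 2.7 and Python 3.

```python
def check_division_identity(a, b):
    """Return True if a // b and a % b satisfy a == b*q + r with 0 <= r < b.

    Assumes b is a positive integer, as in the definition above.
    """
    q, r = a // b, a % b
    return a == b * q + r and 0 <= r < b


# Try a range of dividends, including negative ones, against a few divisors.
results = [check_division_identity(a, b)
           for a in range(-20, 21)
           for b in (1, 2, 3, 7)]
print(all(results))  # True
```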
For integer division problems where both dividend and divisor are positive numbers, this is fine. But what about our problem of a negative dividend? Well, if you do the normal integer division thing and truncate in the direction of 0 (so that -1 divided by 2 gives a quotient of 0), then a = b*q + r forces the remainder to be negative (here, r = -1). This breaks the nice mathematical relationship between quotient and remainder mentioned above.
A second option is to round the quotient toward negative infinity, effectively doing int(math.floor(1.0 * a/b)), but at a much lower level. Now the remainder satisfies the same constraint as above: for -1 divided by 2, we get q = -1 and r = 1.
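The two options can be compared side by side. This is only a sketch for small integers: trunc_div and floor_div are hypothetical helpers written for this post, and they go through float, which would lose precision for very large values. The remainder is derived from a == b*q + r in both cases.

```python
import math


def trunc_div(a, b):
    """Integer quotient rounded toward zero (the first option)."""
    return int(float(a) / b)


def floor_div(a, b):
    """Integer quotient rounded toward negative infinity (what // does)."""
    return int(math.floor(float(a) / b))


a, b = -1, 2
for name, div in (("truncate", trunc_div), ("floor", floor_div)):
    q = div(a, b)
    r = a - b * q  # remainder implied by a == b*q + r
    print("%s: q=%d, r=%d" % (name, q, r))
# truncate: q=0, r=-1
# floor: q=-1, r=1
```

Only the floor variant keeps the remainder in the range 0 <= r < b when the dividend is negative.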
Python's designer chose this second option because it keeps the division and remainder operators mathematically consistent with each other; it has the added benefit of making some calculations involving the remainder simpler. For example, if it were possible to obtain a negative remainder when using the % operator, it would needlessly complicate some easy calculations involving negative offsets from periodic patterns (e.g., what day of the week was it three days ago?). See the links at the end of this post for some concrete examples.
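The weekday question can be sketched in a few lines. The DAYS list and the day_offset function are assumptions made up for this illustration. Because % never returns a negative result for a positive divisor, a negative offset needs no special handling.

```python
DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]


def day_offset(today, offset):
    """Return the weekday name `offset` days from `today`.

    `offset` may be negative; the floored remainder keeps the index
    inside the range 0..6 either way.
    """
    index = (DAYS.index(today) + offset) % len(DAYS)
    return DAYS[index]


print(day_offset("Monday", -3))  # Friday
```

With a truncating remainder, (0 - 3) % 7 would come out as -3, and the code would need an extra branch to wrap the index back into range.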
Many other language designers made different decisions. For example, most C-style languages, such as C and Java, chose to keep a more intuitive (less surprising, at least) definition for division and remainder.
#include <stdio.h>

int main(void) {
    /* C99 and later truncate integer division toward zero. */
    int q = -1 / 2;
    printf("%s\n", q == 0 ? "true" : "false"); /* prints "true" */
    return 0;
}
class ExampleDivision {
    public static void main(String[] args) {
        // Java integer division truncates toward zero.
        System.out.println(-1 / 2 == 0); // true
    }
}
Why did they make different decisions? I do not know, but in the case of C, Guido speculates that it may have been due to hardware limitations of the time, perhaps combined with the internal format used to represent signed ints (sign + magnitude, rather than two's complement).
Even though I have been doing Python for quite a while now, it still has many dark secrets to yield. In the course of writing a program, I hit an unexpected behavior related to division that led me to do the research featured in this post. It was nice to find that there is a rational (ha-ha!) explanation for the surprising behavior I encountered, but there are definitely some caveats that need to be attached to Python division. So here are a few guidelines for dealing with Python's division quirks:
- Use from __future__ import division to remove ambiguity about the / operator.
- Use // for integer division when it is necessary.