By: Mark Mruss
Note: This article was first published in the November 2007 issue of Python Magazine.
While the equality operator works great on numbers and strings, the way it treats your custom objects is not that useful. This article looks into overloading the equality operator so that you can easily compare instances of your custom classes.
- Introducing the terms: operators and operator overloading
- A Quick Example of the Default Equality Operator
- Overloading the Equality Operator
- Telling Python that the Comparison has Not Been Implemented
- The Inequality Operator
In my experience as a professional programmer, testing for the equality between two instances of a class is a fairly common task. In other words, you are comparing the data that each class contains and checking whether the data in one class is identical to the data in the other class.
One of the nice features of Python is that it has a default equality operator defined for any custom objects that you create. The unfortunate thing about this default equality operator is that it doesn’t provide the functionality that you expect. This is because the equality operator (==) actually performs an identity comparison, rather than an equivalence test. If you were to run the following code:
if (object_one == object_two):
Python would, by default, check whether object_one is object_two (the same comparison that can be made using the is keyword) instead of determining whether object_one is equivalent to object_two. Fortunately for us, overloading the default equality operator in Python is a relatively easy task. There are, however, some “gotchas” and other interesting features of which one should be aware.
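The default behaviour described above can be seen in a minimal sketch. It is written for Python 3, and the Point class and variable names are hypothetical, chosen only for illustration:

```python
class Point:
    """Hypothetical class that does not define __eq__."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


first = Point(1, 2)
second = Point(1, 2)  # same data, but a different object
alias = first         # a second name for the same object

print(first == second)  # False: the default == compares identity
print(first == alias)   # True: both names refer to one object
print(first is alias)   # True: "is" performs the same identity check
```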
An operator can be difficult to define, and like many programming definitions, sometimes the definition only serves to confuse the matter further. In general though, you can think of operators as being very similar to the operators that you encountered in math class, such as the + operator, the - operator, and so forth.
In Python the following are operators:
+ - * / // % << >> & | ^ ~ < > <= >= == != <>
In programming languages we generally encounter binary operators. This means that each operator takes two operands. An operand serves as input to an operator. For example, in the statement:
2 + 6
+ is a binary operator that takes two operands, 2 and 6 as inputs. Similarly, in this statement:
my_value - 6
the - operator takes two operands, my_value and 6, as inputs.
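One way to see that a binary operator is really a function of two operands is to compare it with the standard library's operator module, which exposes the same operations as ordinary functions. This is only an illustrative sketch, and the value assigned to my_value is an assumption:

```python
import operator

my_value = 10  # assumed value, for illustration only

# Each operator expression has an equivalent two-argument function call.
print(2 + 6)                      # 8
print(operator.add(2, 6))         # 8
print(my_value - 6)               # 4
print(operator.sub(my_value, 6))  # 4
```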
Operator overloading is a programming term that means taking the default behaviour of an operator and replacing it with your own. That is, changing the default implementation of an operator for a given object. An example of this (although something that you should never do) would be to overload the + operator so that it actually performs subtraction when it is applied to your class.
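As a rough sketch of a sensible overload (unlike the subtraction example), here is a hypothetical Vector2 class whose + operator adds two vectors component by component. The class and its attribute names are assumptions made for illustration, and the code targets Python 3:

```python
class Vector2:
    """Hypothetical two-dimensional vector."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __add__(self, other):
        # Called for "self + other"; returns a new vector.
        return Vector2(self.x + other.x, self.y + other.y)


total = Vector2(1, 2) + Vector2(3, 4)
print(total.x, total.y)  # 4 6
```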
Now that the definitions are out of the way, let’s look at an example where one might want to overload the equality operator. For this example I will bring back a favourite example from my Computer Science days: the Student class.
class Student(object):
    def __init__(self, name, student_number):
        self.name = name
        self.student_number = student_number
As you can see, the Student class has two data members: 1) the student’s name, and 2) her student number.
If we run the following code:
mark = Student("Mark Mruss", 067213)
guido = Student("Guido van Rossum", 000001)
if (mark == guido):
    print "Equal"
else:
    print "Not Equal"
“Not Equal” will be printed out as you would expect since the two students are clearly not equivalent. But what about this code:
mark = Student("Mark Mruss", 067213)
mark_two = Student("Mark Mruss", 067213)
if (mark == mark_two):
    print "Equal"
else:
    print "Not Equal"
Here, as in the previous example, “Not Equal” will be printed out. This is because, as mentioned earlier, the default implementation of the equality operator is to perform an identity comparison. In other words, the default equality operator asks: is mark the same object as mark_two? In Python the equality comparison depends on the type of objects being compared. For custom classes that you or I will create, the equality comparison will perform an identity comparison by comparing the object’s internal id. In other words, it will only result in True if the objects being compared actually are each other. For example:
student_one = Student("Mark Mruss", 067213)
student_two = student_one
if (student_one == student_two):
    print "Equal"
else:
    print "Not Equal"
Results in “Equal” being printed out, as would:
student_one = Student("Mark Mruss", 067213)
student_two = student_one
if (id(student_one) == id(student_two)):
    print "Equal"
else:
    print "Not Equal"
Note: The equality comparison behaves differently for built-in objects and types like numbers, strings, lists, tuples, and mappings. Numbers are compared arithmetically. Strings are compared using the numerical values of their characters. Lists and tuples are compared by comparing their corresponding elements, while mappings are compared as sorted lists of their (key, value) pairs.
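A few quick checks illustrate how the built-in types in the note compare by value rather than by identity. This is a small Python 3 sketch and the variable names are arbitrary:

```python
first_list = [1, 2, 3]
second_list = [1, 2, 3]

print(1 == 1.0)                   # True: numbers compare arithmetically
print("abc" == "abc")             # True: same characters
print(first_list == second_list)  # True: same elements in the same order
print(first_list is second_list)  # False: two distinct list objects
print((1, 2) == (1, 3))           # False: second elements differ
print({"a": 1, "b": 2} == {"b": 2, "a": 1})  # True: same key/value pairs
```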
Hopefully the above example illustrated a case where we might want to overload the equality operator to make it so that the following code:
student_one = Student("Mark Mruss", 067213)
student_two = Student("Mark Mruss", 067213)
if (student_one == student_two):
    print "Equal"
else:
    print "Not Equal"
Would result in “Equal” being printed out, i.e. a true equality comparison as opposed to an identity comparison. In order to do this we need to change the default functionality of the equality operator. In other words, we need to overload it.
In general, operator overloading in Python means adding a special function to your class that performs the work of the operator it is meant to represent. There are two ways in which one can overload the equality operator in Python: 1) The first is to use the __eq__ function, a so-called “rich comparison” function. “Rich comparison” functions are functions that overload specific comparison operators (e.g. __eq__ overloads ==). 2) The second is to use the __cmp__ function, which is used to overload all comparison operators if no “rich comparison” functions are present. Because __cmp__ is used to override all comparison operators (==, !=, <, <=, >, >=), I would suggest using the “rich comparison” method unless you are using a version of Python that is earlier than version 2.1, or you are convinced that you know what <= means for our Student class. Let’s forget about the __cmp__ function for now and focus on using the “rich comparison” functions to overload the equality operator.
“Rich comparison” functions can return any value, but you should try to return a value that is, or can be, interpreted as a boolean value. This is important because these functions will often be used in situations where the return value will be used in a boolean comparison.
When using the “rich comparison” functions it is important to know which functions are being called internally. For example, when we run:
student_one == student_two
and __eq__ exists in the Student class, the following is actually being called:
student_one.__eq__(student_two)
When we run:
student_two == student_one
The following is actually called:
student_two.__eq__(student_one)
As you can see, it is the operand on the left-hand side whose __eq__ function is called first. Note that if the operand on the left-hand side lacks the __eq__ function while the operand on the right-hand side has one, Python falls back to the right-hand operand’s __eq__ function (the reflected operation).
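To make the call order visible, the following sketch records which operand's __eq__ runs. It targets Python 3, and the Traced class and the calls list are hypothetical names used only for this illustration:

```python
calls = []  # records the label of each operand whose __eq__ runs


class Traced:
    """Hypothetical class that logs every __eq__ call."""

    def __init__(self, label):
        self.label = label

    def __eq__(self, other):
        calls.append(self.label)
        return True


left = Traced("left")
right = Traced("right")

left == right   # calls left.__eq__(right)
right == left   # calls right.__eq__(left)
print(calls)    # ['left', 'right']
```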
Let’s start off with a simple, but incorrect, example (the reasons for its incorrectness will be explained below):
def __eq__(self, other):
    return ((self.name == other.name) and
            (self.student_number == other.student_number))
This is very straightforward. In the equality comparison, we simply compare the Student class’s two data members. This performs as expected when we run:
student_one = Student("Mark Mruss", 067213)
student_two = Student("Guido van Rossum", 000001)
student_three = Student("Mark Mruss", 000001)
print (student_one == student_two)
print (student_one == student_three)
But what happens when we introduce the Professor class and try the overloaded equality operator:
class Professor(object):
    def __init__(self, instructor, course):
        self.instructor = instructor
        self.course = course
As you can see, the Professor class lacks the name and student_number data members. What happens when we compare an instance of the Professor class with an instance of the Student class?
guido = Student("Guido van Rossum", 000001)
rob = Professor("Rob Ward", "74-300")
print (guido == rob)
It results in something like this:
File "operators.py", line 10, in __eq__
    return ((self.name == other.name)
AttributeError: 'Professor' object has no attribute 'name'
The way we are overriding the equality operator is not correct because it automatically assumes that the other object has the name and student_number data members. There are a number of ways to get around this problem, including: 1) using the hasattr function, or 2) using the isinstance function. Using the hasattr function lets us determine whether other has the attributes we are looking for before actually querying them. hasattr simply tells us whether or not an object has a specific attribute. Here is a quick example illustrating how to do this:
def __eq__(self, other):
    if (hasattr(other, "name") and hasattr(other, "student_number")):
        return ((self.name == other.name) and
                (self.student_number == other.student_number))
    else:
        return False
First, we check to see if other has the name and student_number attributes. If it does, we proceed as normal. If it does not, we simply return False. When we compare the professor and the student we get “False” as expected.
What’s nice about this method is that we don’t have to care what type other is. We only care whether or not it contains the attributes we need to compare. However, the drawback to this approach is that you have to test for the existence of each attribute. Although this may not always be a big deal, if you are dealing with fifty data members in your classes this can quickly become a pain in the neck.
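One possible way to ease that pain, sketched here as an assumption rather than something the approach above prescribes, is to keep the attribute names in a tuple and loop over them. The _compared_attributes name is hypothetical, the code targets Python 3, and the student numbers drop the leading zeros because Python 3 rejects them:

```python
class Student:
    # Hypothetical helper: the attributes that take part in equality.
    _compared_attributes = ("name", "student_number")

    def __init__(self, name, student_number):
        self.name = name
        self.student_number = student_number

    def __eq__(self, other):
        for attribute in self._compared_attributes:
            # Unequal if other lacks the attribute or its value differs.
            if not hasattr(other, attribute):
                return False
            if getattr(self, attribute) != getattr(other, attribute):
                return False
        return True


class Professor:
    def __init__(self, instructor, course):
        self.instructor = instructor
        self.course = course


mark = Student("Mark Mruss", 67213)
mark_two = Student("Mark Mruss", 67213)
rob = Professor("Rob Ward", "74-300")

print(mark == mark_two)  # True
print(mark == rob)       # False
```

Adding a data member then only means adding its name to the tuple.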
Another solution to the problem with our first overloading example is to use the isinstance function to make sure that other is an instance of our class type. This has the drawback of forcing other to be the same type as your class. In practice, however, I believe this to be more of an advantage than a disadvantage.
def __eq__(self, other):
    if (isinstance(other, Student)):
        return ((self.name == other.name) and
                (self.student_number == other.student_number))
    else:
        return False
The first thing we do is check the variable other to make sure that it is an instance of the Student class. If it is, we then compare all of the data members in the Student class. If other is not an instance of the Student class, we return False.
In my opinion, this is the preferred method since knowing that the class is the correct type is often important. The hasattr method seems more appropriate for simple data containers like a “rect” or “vector” class where you are only interested in three or four data members.
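One detail worth keeping in mind is that isinstance also accepts instances of subclasses. The sketch below, written for Python 3 with a hypothetical GraduateStudent subclass, shows a Student and a GraduateStudent with the same data comparing as equal:

```python
class Student:
    def __init__(self, name, student_number):
        self.name = name
        self.student_number = student_number

    def __eq__(self, other):
        if isinstance(other, Student):
            return (self.name == other.name and
                    self.student_number == other.student_number)
        return False


class GraduateStudent(Student):
    """Hypothetical subclass; it inherits __eq__ unchanged."""


mark = Student("Mark Mruss", 67213)
graduate_mark = GraduateStudent("Mark Mruss", 67213)

print(mark == graduate_mark)  # True: a GraduateStudent is also a Student
print(mark == "Mark Mruss")   # False: a string is not a Student
```

Whether that is desirable depends on the class hierarchy. If subclasses should never equal their parents, comparing type(self) with type(other) is one alternative.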
Up until this point we have been returning False when our __eq__ function does not support the type of object passed in as other. While this is acceptable and correct given the Python documentation, it seems to be more “proper” to actually return NotImplemented. According to the Python documentation, “Numeric methods and rich comparison methods may return this value if they do not implement the operation for the operands provided. (The interpreter will then try the reflected operation, or some other fallback, depending on the operator.)” In other words, if the left operand returns NotImplemented, Python will attempt to use the right-hand operand’s equality operator. And if that does not exist, Python will fall back to the default equality operator.
We can return NotImplemented from our Student class if the operand passed in is not an instance of the Student class:
def __eq__(self, other):
    if (isinstance(other, Student)):
        return ((self.name == other.name) and
                (self.student_number == other.student_number))
    else:
        return NotImplemented
Now if we perform the following comparison:
guido = Student("Guido van Rossum", 000001)
rob = Professor("Rob Ward", "74-300")
print guido == rob
The first step in the processing will be:
guido.__eq__(rob)
This returns NotImplemented. As a result, the reflected operation is attempted:
rob == guido
Because the Professor class does not have the equality operator overloaded, the default operation is executed and False is printed out, just like we wanted.
NotImplemented is useful because instead of returning False, which means that the two operands are not equivalent, you return a value that says that the comparison between the operands has not been implemented.
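The fallback can be observed in a short sketch. It targets Python 3, and the StudentRecord class is a hypothetical addition, included only to show a right-hand operand that does know how to compare itself with a Student:

```python
class Student:
    def __init__(self, name, student_number):
        self.name = name
        self.student_number = student_number

    def __eq__(self, other):
        if isinstance(other, Student):
            return (self.name == other.name and
                    self.student_number == other.student_number)
        return NotImplemented  # let Python try the other operand


class Professor:
    """Defines no __eq__, so the default comparison applies."""

    def __init__(self, instructor, course):
        self.instructor = instructor
        self.course = course


class StudentRecord:
    """Hypothetical class that can compare itself with a Student."""

    def __init__(self, student_number):
        self.student_number = student_number

    def __eq__(self, other):
        if isinstance(other, Student):
            return self.student_number == other.student_number
        return NotImplemented


guido = Student("Guido van Rossum", 1)
rob = Professor("Rob Ward", "74-300")
record = StudentRecord(1)

print(guido.__eq__(rob) is NotImplemented)  # True
print(guido == rob)     # False: falls back to the default comparison
print(guido == record)  # True: StudentRecord.__eq__ handles the reflected call
```

Had Student.__eq__ returned False instead, the last comparison would print False, because StudentRecord would never be consulted.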
Now that we know how to overload the equality operator, it stands to reason that we have the opposite operation, the inequality operator (!=), covered as well. But not so fast. In Python the inequality and equality operators are handled separately, meaning that inequality is not simply the opposite of equality. This means that whenever you overload the equality operator, you have to be sure to overload the inequality operator as well. If you don’t, you might get some strange results. For example, consider what happens when we run the following with the current code (without the inequality operator overloaded):
guido = Student("Guido van Rossum", 000001)
guido_too = Student("Guido van Rossum", 000001)
print guido == guido_too
print guido != guido_too
In the first comparison the overloaded equality operator is used, and results in True being printed. Because the inequality operator is not overloaded, the second comparison uses the default inequality operator (the identity comparison). True is printed because guido and guido_too are not the same instance.
Thankfully, once you have overloaded the equality operator, overloading the inequality operator is very easy. As a general rule, you have to return the opposite of the equality operator, but because we are working with NotImplemented, we have to do a bit more processing to ensure that we don’t return False when we really want to return NotImplemented. Here is how we can overload the inequality operator in the Student class:
def __ne__(self, other):
    equal_result = self.__eq__(other)
    if (equal_result is not NotImplemented):
        return not equal_result
    return NotImplemented
First, we call self.__eq__ to test whether or not we are equal to other. We then check to make sure that equal_result is not NotImplemented. If it is not, we know that the equality test was implemented and we can safely return its opposite. If the result for the equality comparison was NotImplemented, we return NotImplemented for the inequality comparison.
Note: It is safe to use the is check on NotImplemented (rather than an isinstance check) because NotImplemented is a singleton, meaning that there is only ever one instance of NotImplemented at any time.
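Putting the two operators together gives something like the following sketch. It is written so that it runs on Python 3, where, as far as the language documentation describes it, != already inverts __eq__ by default unless the result is NotImplemented, so the explicit __ne__ matters mainly on Python 2:

```python
class Student:
    def __init__(self, name, student_number):
        self.name = name
        self.student_number = student_number

    def __eq__(self, other):
        if isinstance(other, Student):
            return (self.name == other.name and
                    self.student_number == other.student_number)
        return NotImplemented

    def __ne__(self, other):
        equal_result = self.__eq__(other)
        if equal_result is NotImplemented:
            return NotImplemented  # pass the "not implemented" signal on
        return not equal_result


guido = Student("Guido van Rossum", 1)
guido_too = Student("Guido van Rossum", 1)

print(guido == guido_too)        # True
print(guido != guido_too)        # False
print(guido != "not a student")  # True: falls back to the default comparison
```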
While it may seem like operator overloading should become part of every class that you write, a word of warning is necessary. There is a large school of thought that views operator overloading as a dangerous programming technique. They argue that overloading operators changes the default way that an operator works, and not always correctly. Moreover, instead of overriding the equality operator, one can simply add an is_equal_to function to perform the equality check.
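For comparison, here is a minimal sketch of that alternative in Python 3. The body of is_equal_to is an assumption about how such a function might look. Note that == keeps its default identity behaviour:

```python
class Student:
    def __init__(self, name, student_number):
        self.name = name
        self.student_number = student_number

    def is_equal_to(self, other):
        # Explicit, named equality check; == is left untouched.
        return (isinstance(other, Student) and
                self.name == other.name and
                self.student_number == other.student_number)


mark = Student("Mark Mruss", 67213)
mark_two = Student("Mark Mruss", 67213)

print(mark.is_equal_to(mark_two))  # True
print(mark == mark_two)            # False: default identity comparison
```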
The logic behind this criticism is that when someone is using a class or reading some code that you wrote, they will be unable to tell what the equality operator is doing. For example, if they see:
value = MyClass(10)
value_two = MyClass(10)
print value == value_two
What gets printed out? True or False? If “MyClass” overrode the equality operator then True will be printed. However, if the equality operator is not overloaded, the standard Python behaviour of equality will result in False being printed out.
While it’s true that overloading the equality operator does change the default way that Python functions, I feel that it’s generally a safe and beneficial addition to your classes, especially since, unless people know the ins and outs of the equality operator, they will generally assume that it works the way it does when you overload it. Like all the decisions that you make when working with Python, context is key.