As an informatics student you learn hundreds of mathematical definitions and theorems and all of them seem to be useless programming. Here the evidence from the “real programmer life” to demonstrate you that’s not true.
I was asked to analyze some very strange behaving class. The Contains(object obj) method was not working correctly on after populating the list with instances of this class. The problem was a mess implementation of bool Equals(Object obj) overloaded methods. There was no logical or some kind of conditional mistake in code. So what was wrong?
There is an article on MSDN named Guidelines for Implementing Equals Method describing the way you should implement Equals method when overriding it. Sounds like a set of non mandatory recommendations. Let’s look at them in deep. Most of them are almost intuitive, but these three are essential ones:
Follow the contract defined on the Object.Equals Method as follows:
1. x.Equals(x) returns true.
2. x.Equals(y) returns the same value as y.Equals(x).
3. (x.Equals(y) && y.Equals(z)) returns true if and only if x.Equals(z) returns true.
And now some algebra; the definition of equivalence relation says (see: Wikipedia – Equivalence relation):
A given binary relation ~ on a set A is said to be an equivalence relation if and only if it is reflexive, symmetric and transitive.
Equivalently, for all a, b and c in A:
1. a ~ a. (Reflexivity)
2. if a ~ b then b ~ a. (Symmetry)
3. if a ~ b and b ~ c then a ~ c. (Transitivity)
So the Equals method must be an equivalence relation, thus only equivalence relation partitions a set into several disjoint subsets, called equivalence classes. All the elements in a given equivalence class are equivalent among themselves, and no element is equivalent with any element from a different class.
This theorem states that if we define a function which satisfies this tree simple rules, the result we get will correspond our “common sense” and what we intuitively understand under classifying objects and (IMPRTANT!) if one of the rules is violated the result will definitively lead to “inconsistent” behavior.
Examples
Let’s try to check following sample relation against this tree rules and see weather they can be used for equality or not.
class Foo
{
public string Text { get; set; }
public override bool Equals(object obj)
{
// Check for null values and compare run-time types.
if (obj == null || GetType() != obj.GetType()) { return false; }
this.Equals((Foo)obj);
}
private bool Equals(Foo obj)
{
return this.Text.Contains(obj.Text);
}
}
Here we try to use String.Contains(string substring) as equality relation.
1. String contains called on itself returns always true.
3. If string “ABC” contains “BC” and “BC” contains “C” than “ABC” contains “C” is also truth.
However the second rule will not pass. For instance “ABC” contains “BC” but “BC” does not contain “ABC”.
Another interesting sample would be do let two strings be equal if they are reverse to each other. In this case the rules 2. and 3. are satisfied but the rule 1. is not.
Testing
Testing several combinations is generally not enough to prove that a specific implementation of equality really satisfies these 3 requirements. You need either check all combinations or mathematically prove every of this three facts. So you still need quite a bit of mathematics.
Soon I am going to write about how implementations of Equals() and GetGashCode() are related to each other, what is the mathematical background of this relation and how many truth is in guidelines and recommendations on this topic.