Saturday, March 16, 2013


Surprises in SortedSet

There is more information in a collection than its elements. There is also its structure.

For instance, if the elements of a collection were steps in a recipe, the order of these steps is significant. Change the order and your meal may taste bad.

So, if I used a SortedSet to hold the steps in my recipe, I might reasonably assume that this is not the same recipe as one with the same steps but in a different order.

However, in Java this is not true.

        SortedSet sortedSet = new TreeSet();
        assertOrder(sortedSet); // passes
        Set noSortGuarantee = new HashSet();
        assertEquals(sortedSet, noSortGuarantee);
        assertOrder(noSortGuarantee); // fails

TreeSet.equals doesn't take order into account. Indeed, the equals method is in the superclass - the same superclass as HashSet, which has no notion of order.

Consistent With Equals

The JavaDocs say:

"The natural ordering for a class C is said to be consistent with equals if and only if e1.compareTo(e2) == 0 has the same boolean value as e1.equals(e2) for every e1 and e2 of class C.... Virtually all Java core classes that implement Comparable have natural orderings that are consistent with equals. One exception is java.math.BigDecimal, whose natural ordering equates BigDecimal objects with equal values and different precisions (such as 4.0 and 4.00)."

BigDecimal is not consistent with equals. To check whether two BigDecimals are equal as most people would understand it, we must compare them thus:

        BigDecimal _4_00 = new BigDecimal("4.00");
        BigDecimal _4 = new BigDecimal("4");
        assertFalse(_4_00.equals(_4));        // wrong
        assertEquals(0, _4_00.compareTo(_4)); // right


Comparisons are useful in identity elements. An identity element "is a special type of element of a set with respect to a binary operation on that set. It leaves other elements unchanged when combined with them."

For instance, for the + operator, the identity is 0 (since x + 0 = x).

For the * operator, the identity element is 1 (since x * 1 = x).

Where it gets interesting is that for the maximum operator, the identity element is negative infinity. That is, any argument to the partially applied function max(NEGATIVE_INFINITY, ... is itself.

For floating point numbers, x > NEGATIVE_INFINITY for all x.

The converse is true for the minimum function.

No comments:

Post a Comment