Comparing Two Floating-Point Numbers

The mathematician Leopold Kronecker is believed to have said:

God made the integers, all else is the work of man.

And Kronecker didn’t even know the floating-point numbers “made” for computers. Comparing two numbers of type float or double in C++ and related languages is a source for regular errors. When are the if-conditions in the following code snippet true?

float actual = calculate(10.0f, 0.2f, 45); // 10 - 0.2 * 45 = 1
if (actual == 1.0f) {
    // is this ever executed?
}
if (actual - 1.0f == 0.0f) {
    // is this ever executed?
}

The function calculate produces the result 10 – 0.2 * 45 = 10 – 9 = 1 – at least for human beings. So, both conditions should be true. However, they are both false.

If we turn on the warning -Wfloat-equal, the compiler will warn us that comparing floating point with == or != is unsafe. The compiler tells us what is wrong, but not how to fix it. I’ll explain why the above “solution” is wrong, why another obvious “solution” is nearly as bad, and why even the Qt function qFuzzyCompare can still be improved.

The Setup

The function calculate produces the result start - decrement * count. The important thing is that each of the subtractions may introduce a rounding error.

float calculate(float start, float decrement, int count)
{
    for (int i = 0; i < count; ++i)
        start -= decrement;
    return start;
}

The function testEquality calls calculate(x, 0.2f, 45) for every value from 9.0 to 1009 in steps of 0.1 and produces actual = x - 9 as the result. It counts how many times the if-condition is true. I’ll try different implementations for the floatCompare function in the next sections.

void testEquality()
{
    const auto total = 10000;
    auto count = 0;
    for (auto i = 0; i < total; ++i)
    {
        auto expected = (i / 10.0f);
        auto actual = calculate(9.0f + expected, 0.2f, 45);
        if (floatCompare(actual, expected))
            ++count;
    }
    QCOMPARE(count, total);
}

The Built-In Equality Operator

The first implementation of floatCompare is the built-in equality operator for floats.

bool floatCompare(float f1, float f2) const
{
    return f1 == f2;
}

Running the test case yields a disastrous result. Only 0.09% of the comparisons yielded the correct result!

FAIL!  : TestCompareFloats::testEquality() 
         Compared values are not the same
   Actual   (count): 9
   Expected (10000): 10000

For the expected value 1.0, the situation looks as follows:

expected = 1, actual = 1.000005007, error = 5.006790161e-06

The actual value has its first non-zero digit in the 6th decimal place. The digits after the decimal point are the rounding error, which is in the order of 1e-06 (10 to the power of -6). The calculation of the actual value introduced some rounding error, whereas the expected value didn’t suffer from a rounding error.

Rounding errors occur, when a number cannot be represented in a finite way. For example, we can represent the fraction 1/5 as a finite decimal number 0.2, but wecannot represent 1/3 as a finite decimal number. It’s 0.3333… with infinitely many 3s after the decimal point.

In the binary system, only fractions with a power of 2 as their denominator like 1/2, 1/4 and 1/8 have finite representations. Other fractions like 1/5 = 0.2 and 1/10 = 0.1 only have infinite binary representations. Computers must round these numbers to the next number with a finite representation. Hence, the binary representations of the actual value 1.000005007 and expected value 1 differ and the condition is false.

Comparisons with Fixed Epsilons

At this point, we probably have some fond memories of our undergraduate course in numerical mathematics. When we check whether actual is equal to 1.0, we actually want to check whether actual is in a small range around 1.0. In general, we check whether actual is in the range from expected - epsilon to expected + epsilon. epsilon is the rounding error, which is a small number. What is a good value for epsilon?

The C++ header <limits> provides a constant std::numeric_limits<float>::epsilon() for this purpose. The “standard” epsilon has the value 1.192092896e-07. This is to small for the problem at hand, where the error is roughly 5.0e-06. Just do the simplest thing that could work and use 1.0e-05 for epsilon. The equality function looks like this.

bool floatCompare(float f1, float f2) const
{
    return qAbs(f1 - f2) <= 1.0e-05;
}

Running the test case yields another surprising result.

FAIL!  : TestCompareFloats::testEquality() 
         Compared values are not the same
   Actual   (count): 126
   Expected (10000): 10000

A meagre 1.26% of all conditions produced the correct the result. That’s still bad, but a lot better than before. The obvious thing is to increase epsilon. Here are the results.

1.0-e06    1.0-e05    1.0-e04    1.0e-03
    10        126        648      10000 

For once, this worked as expected. But, it doesn’t help us to come up with an idea how to improve the comparison function. Let’s look at successful comparisons for epsilon 1.0-e05. All the comparison from 0.0 to 11.3 were successful. All the other comparisons except for the 14 comparisons from 56.3 to 57.3 and from 248.8 to 249.2 failed.

For epsilon 1.0-e04, the good range is from 0.0 to 62.2. We will see the same behaviour for epsilon 1.0-e03, if we set total to 100,000. Then, the condition will only be true 10,219 times (roughly 10.2%). If we compared numbers are greater than 16,777,216, we will have to use an epsilon greater than 1. It seems that we must choose a bigger epsilon the bigger the absolute values of the compared numbers are.

Let us look at small numbers closer and closer to 0. If the compared numbers are smaller than epsilon, say 1.0e-05, their difference must be smaller than epsilon as well. The comparison will always yield true. It seems that we must choose a smaller epsilon the smaller the absolute values of the compared numbers are.

We can summarise our observations as: Epsilon must adapt to the absolute values of the compared numbers. Using adaptive epsilons for comparing floats is the topic of the next section.

Comparisons with Adaptive Epsilons

Using adaptive epsilons seems to be a good idea. Should epsilon change proportional to the absolute value of the bigger number or of the smaller number? We choose the bigger number. Then, the implementation of floatCompare looks like this:

bool floatCompare(float f1, float f2) const
{
   return qAbs(f1 - f2) <= 1.0e-5f * qMax(qAbs(f1), qAbs(f2));
}

If we used qMin instead of qMax, epsilon would be smaller than for the other case. The function floatCompare would return true for a smaller range of numbers. The range for qMin would be included in the range of qMax.

The right-hand side of operator<= is the adaptive epsilon. The adaptive epsilon grows, if the maximum of the compared numbers grows. It shrinks, if the maximum shrinks. So, this version of floatCompare should do better than the previous versions. Running the test bears this out.

FAIL!  : TestCompareFloats::testEquality() 
         Compared values are not the same
   Actual   (count): 9995
   Expected (total): 10000

99.95% of the comparisons yield the correct result. The five wrong comparisons are:

expected          actual              error
0                 4.082918167e-06     4.082918167e-06
0.1000000015      0.1000044793        4.477798939e-06
0.200000003       0.2000040859        4.082918167e-06
0.3000000119      0.3000044823        4.470348358e-06
0.400000006       0.4000040889        4.082918167e-06

The Qt function qFuzzyCompare uses adaptive epsilons as well. It uses qMin instead of qMax and has mulitplied the inequation by the inverse of epsilon.

static inline bool qFuzzyCompare(float p1, float p2)
{
    return (qAbs(p1 - p2) * 100000.f <= 
           qMin(qAbs(p1), qAbs(p2)));
}

Not surprisingly, qFuzzyCompare yields the same result as our latest version of floatCompare – with the same five wrong comparisons.

Improving qFuzzyCompare

I am sure that the five wrong comparisons left over from the last section bother you as much as they bother me. If we used the fixed epsilon 1.0e-05 for the bad five, they would turn good. And, that’s the idea for a pragmatic solution. If two numbers are very close and, hence, the absolute value of their difference is smaller than epsilon, then these two numbers should be considered equal.

bool floatCompare(float f1, float f2) const
{
    if (qAbs(f1 - f2) <= 1.0e-5f)
        return true;
    return qAbs(f1 - f2) <= 1.0e-5f * qMax(qAbs(f1), qAbs(f2));
}

Running the test case yields a 100% score! floatCompare computed all comparisons correctly – including a comparison with 0.

The documentation of qFuzzyCompare advises us to add to 1.0f to the compared numbers or to turn to qFuzzyIsNull, if one of the compared numbers may be 0 or . This is a bit tedious, often hard to apply and easy to forget. The next test confirms the behaviour of floatCompare, qFuzzyIsNull and qFuzzyCompare, if the expected value is 0.

void testIsZero()
{
    auto actual = calculate(9.0f, 0.2f, 45);
    QVERIFY(floatCompare(actual, 0.0f));  // PASS
    QVERIFY(qFuzzyIsNull(actual));        // PASS
    QVERIFY(qFuzzyCompare(actual, 0.0f)); // FAIL
}

We can implement floatCompare using qFuzzyCompare and qFuzzyIsNull.

bool qFloatCompare(float f1, float f2) const
{
    if (qFuzzyIsNull(qAbs(f1 - f2)))
        return true;
    return qFuzzyCompare(f1, f2);
}

Conclusion

Do not under any circumstances use == or != to compare two float or double numbers. The comparison result will almost never be what you expect.

Do not use a fixed-epsilon check like qAbs(f1 – f2) <= 1.0e-05f. The results are only marginally better than with == or !=.

Use the adaptive-epsilon check of qFuzzyCompare with a lot of caution. We saw a very simple example, where qFuzzyCompare produced wrong comparison results – and not just comparisons with 0.

Use this more robust implementation for comparing floating-point numbers.

bool floatCompare(float f1, float f2) const
{
    static constexpr auto epsilon = 1.0e-05f;
    if (qAbs(f1 - f2) <= epsilon)
        return true;
    return qAbs(f1 - f2) <= epsilon * qMax(qAbs(f1), qAbs(f2));
}

It is not entirely fool-proof but it is much more reliable than qFuzzyCompare. We can improve its reliabilty even more, if we adapt epsilon to the accuracy requirements of our specific applications. If two decimal places are good enough, then we should change epsilon to 0.01.

The best solution is not to use floating-point numbers at all. Fixed-point numbers are an excellent alternative. A fixed-point number is roughly an integer with a scaling factor (e.g., 100 for 2 decimal places). Comparisons are integer comparisons. Interestingly, machines use fixed-point numbers to send their values over networks to other machines, display computers, the cloud or PCs. Unfortunately, the receiving code almost always converts them into float and double.

References

I learned a lot from Bruce Dawson’s excellent post Comparing Floating Point Numbers, 2012 Edition. The post is a part of a series about floating-point numbers. The links to the other parts of the series are at the beginning of the post. If you want to know more about floating-point numbers, Bruce’s series is the right starting point.

4 Comments

  1. Marcelo Almeida
    2019/08/28

    Most of your code snippets are indented more than necessary. If you remove this “initial indentation “ your code will be easier to read

  2. Burkhard Stubert
    2019/08/28

    Thanks for the hint. Fixed.

  3. Benjamin Mahr
    2019/08/30

    A really nice article, comparing floating point numbers is a problem indeed. I agree totally and as you already referenced Bruce Dawnson’s excellent article on this topic I would like to add that it is also very good to know about the floating point model in general.

    1. Burkhard Stubert
      2019/08/30

      You are welcome. I see that you have written quite extensively on floating-point numbers on your site as well – including a post about the floating-point model of IEEE 754 numbers. I’ll certainly check out your articles.

Comments are closed.

Scroll to top