Wednesday, October 7, 2009

BigDecimal

Whenever you're working with money in Java, you should be using BigDecimal. There are other applications, but far and away the most common is monetary calculations. The biggest reason is accuracy: binary floating point values (stored using the IEEE 754 standard) can't exactly represent many common decimal amounts. For that, you need arbitrary-precision decimal math, the same kind you learned in grade school. That's what BigDecimal provides. Yes, the operations are clumsy-looking and require a lot of typing. But wouldn't you want your bank or your tax office to have full control over the precision of the values they use to calculate your interest or taxes? You owe it to your users to provide the same precision.

The problem is simple to illustrate: pick any decimal fraction whose denominator has a prime factor other than 2 (0.35 is 7/20, for example), store it in a float, then print it out.

System.out.printf("float: %60.55f\n",0.35f);

float: 0.3499999940395355000000000000000000000000000000000000000

Using doubles helps, but doesn't make the problem go away--it just delays the point where you'll see a problem. The only real solution is to use arbitrary-precision math (or switch to a CPU which works in base 10 rather than base 2).
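To see how quickly doubles run into trouble, here is a minimal sketch that adds ten cents to a running total ten times, once with double and once with BigDecimal. The class and method names are mine, made up for illustration.

```java
import java.math.BigDecimal;

class AccumulationDemo {
    // Adds 0.10 to a running double total the given number of times.
    static double sumAsDouble(int times) {
        double total = 0.0;
        for (int i = 0; i < times; i++) {
            total += 0.10;
        }
        return total;
    }

    // Same loop, but every value is a BigDecimal built from a string.
    static BigDecimal sumAsBigDecimal(int times) {
        BigDecimal total = BigDecimal.ZERO;
        BigDecimal dime = new BigDecimal("0.10");
        for (int i = 0; i < times; i++) {
            total = total.add(dime);
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println("double:     " + sumAsDouble(10));     // 0.9999999999999999
        System.out.println("BigDecimal: " + sumAsBigDecimal(10)); // 1.00
    }
}
```

The double total misses 1.0 after only ten additions; the BigDecimal total is exact and even keeps the two decimal places it started with.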

Wherever you can, make your first conversion from user input or database storage to BigDecimal using strings. Converting from a float or a double instead exposes how those values are actually stored:

System.out.printf("BD by string: %60.55f\n",new BigDecimal("0.35"));
System.out.printf("BD by double: %60.55f\n",new BigDecimal(0.35d));
System.out.printf("BD by float: %60.55f\n",new BigDecimal(0.35f));

BD by string: 0.3500000000000000000000000000000000000000000000000000000
BD by double: 0.3499999999999999777955395074968691915273666381835937500
BD by float: 0.3499999940395355224609375000000000000000000000000000000
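In practice that first conversion might look something like the sketch below. The helper names are hypothetical. The fromDouble fallback relies on BigDecimal.valueOf(double), which goes through Double.toString() rather than the raw binary value; it only helps when the double was never calculated with, so treat it as a last resort rather than a fix.

```java
import java.math.BigDecimal;

class ConversionDemo {
    // Preferred: parse the text exactly as it was entered or stored.
    // The BigDecimal(String) constructor rejects surrounding whitespace, hence trim().
    static BigDecimal fromText(String text) {
        return new BigDecimal(text.trim());
    }

    // Fallback when all you have is a double: valueOf uses Double.toString(d),
    // so 0.35d becomes "0.35" instead of the long binary expansion.
    static BigDecimal fromDouble(double d) {
        return BigDecimal.valueOf(d);
    }

    public static void main(String[] args) {
        System.out.println(fromText(" 0.35 "));   // 0.35
        System.out.println(fromDouble(0.35d));    // 0.35
        // The double constructor does not produce the same value as the string one.
        System.out.println(new BigDecimal(0.35d).equals(fromText("0.35"))); // false
    }
}
```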

Once you have values stored as BigDecimals, keep all your math between BigDecimal values. You don't gain anything by storing a value in BigDecimal, then multiplying it by a float or a double! Doing that reintroduces the very problem you sidestepped by moving to BigDecimal. I've seen senior developers do the following:

public BigDecimal multiply(BigDecimal bd, double d){
    return bd.multiply(new BigDecimal(d));
}

Given the "BD by double" result above, is this really going to produce the results you expect? By doing this, you just gave up whatever advantage you had using BigDecimal in the first place!
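Here is a small sketch comparing that pattern with one possible alternative, where the factor arrives as text (or as a BigDecimal) instead of a double. Both method names are hypothetical.

```java
import java.math.BigDecimal;

class MultiplyDemo {
    // The pattern from above: the double's binary error leaks into the product.
    static BigDecimal multiplyByDouble(BigDecimal bd, double d) {
        return bd.multiply(new BigDecimal(d));
    }

    // Hypothetical alternative: the factor stays decimal from end to end.
    static BigDecimal multiplyByText(BigDecimal bd, String factor) {
        return bd.multiply(new BigDecimal(factor));
    }

    public static void main(String[] args) {
        BigDecimal amount = new BigDecimal("100.00");
        // Slightly less than 35, with dozens of trailing digits.
        System.out.println(multiplyByDouble(amount, 0.35d));
        // Exactly 35 (printed as 35.0000, since the scales add).
        System.out.println(multiplyByText(amount, "0.35"));
    }
}
```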

If you use division in BigDecimal, you're going to have to round your results. To illustrate why, consider the following:

BigDecimal result = BigDecimal.ONE.divide(new BigDecimal("3"));

Exception in thread "main" java.lang.ArithmeticException:
Non-terminating decimal expansion; no exact representable decimal result.
at java.math.BigDecimal.divide(BigDecimal.java:1594)
at BDDemo.main(BDDemo.java:12)

Any division whose exact quotient is a repeating decimal will throw this exception unless you tell divide() how to round. To deal with this problem, we need to discuss the relationship between "precision", "scale" and "rounding". "Precision" is the number of "significant digits" we learned about in high school science class. The following numbers all have a precision of 4 (treating the trailing zero in 12340 as a placeholder rather than a significant digit):

123.4
1234
12340
12.34
0.000001234

Unfortunately, BigDecimal doesn't store "precision" as a fixed property the way a float or double has a fixed number of significant bits. It stores its value as an integer of arbitrary precision (the "unscaled value"), and then places the decimal point using a concept called "scale": the number of places to the right of the decimal point. Thus, the scale of "12.34" is 2; the scale of "0.000001234" is 9. Precision is derived from the unscaled value (the precision() method reports it), so when you round a BigDecimal to a smaller scale, you're reducing its precision.
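You can inspect these pieces directly with unscaledValue(), scale() and precision(). The sketch below is only an illustration, and the describe helper is a made-up name. Note that BigDecimal counts every digit of the unscaled value, so "12340" reports a precision of 5 unless you write it with an exponent.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

class ScaleDemo {
    // Reports how BigDecimal stores the given decimal text.
    static String describe(String text) {
        BigDecimal bd = new BigDecimal(text);
        return text + ": unscaled=" + bd.unscaledValue()
                + " scale=" + bd.scale()
                + " precision=" + bd.precision();
    }

    public static void main(String[] args) {
        System.out.println(describe("12.34"));       // unscaled=1234 scale=2 precision=4
        System.out.println(describe("0.000001234")); // unscaled=1234 scale=9 precision=4
        System.out.println(describe("12340"));       // unscaled=12340 scale=0 precision=5
        System.out.println(describe("1.234E+4"));    // unscaled=1234 scale=-1 precision=4

        // Rounding to a smaller scale throws digits away, which lowers the precision.
        BigDecimal rounded = new BigDecimal("12.345").setScale(1, RoundingMode.HALF_EVEN);
        System.out.println(rounded + " precision=" + rounded.precision()); // 12.3 precision=3
    }
}
```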

In order to deal with the 'repeating' result, you're going to have to know enough about your problem domain to specify a scale, and a rounding mode to determine the result.
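As a rough sketch, the failing division from earlier works once it is given a scale and a rounding mode. The scale of 4 and the choice of HALF_EVEN are assumptions picked purely for illustration; your problem domain decides the real values. The divideRounded helper is a hypothetical name.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

class DivideDemo {
    // Divides and rounds the quotient to 4 decimal places using HALF_EVEN.
    // Both the scale and the mode are illustrative assumptions.
    static BigDecimal divideRounded(BigDecimal dividend, BigDecimal divisor) {
        return dividend.divide(divisor, 4, RoundingMode.HALF_EVEN);
    }

    public static void main(String[] args) {
        BigDecimal three = new BigDecimal("3");
        System.out.println(divideRounded(BigDecimal.ONE, three));        // 0.3333
        System.out.println(divideRounded(new BigDecimal("2"), three));   // 0.6667

        try {
            BigDecimal.ONE.divide(three); // no scale or rounding mode given
        } catch (ArithmeticException e) {
            System.out.println("Unrounded divide failed: " + e.getMessage());
        }
    }
}
```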

Rounding modes are discussed in detail in the BigDecimal javadoc, and I'll discuss them in a future entry, but for now, here's the list:

ROUND_UP           round away from zero.
ROUND_DOWN         round towards zero.

ROUND_CEILING      round towards positive infinity.
ROUND_FLOOR        round towards negative infinity.

ROUND_HALF_DOWN    round towards the "nearest neighbor" unless both
                   neighbors are equidistant, in which case round down.
ROUND_HALF_EVEN    round towards the "nearest neighbor" unless both
                   neighbors are equidistant, in which case round towards
                   the even neighbor.
ROUND_HALF_UP      round towards the "nearest neighbor" unless both
                   neighbors are equidistant, in which case round up.

ROUND_UNNECESSARY  assert that the requested operation has an exact
                   result, hence no rounding is necessary.

Note that "ROUND_DOWN" is the same as truncating the result, and by convention, banks and accountants use ROUND_HALF_EVEN (and have for at least the 3 decades I've been paying attention, if not longer).
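The differences only show up at the tie points and with negative numbers, so a quick way to get a feel for them is to round 2.5, -2.5 and 3.5 to whole numbers under each mode. This sketch uses the java.math.RoundingMode enum, whose constants match the names above without the ROUND_ prefix; the class and helper names are mine.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

class RoundingModeDemo {
    // Rounds the given decimal text to a whole number using the given mode.
    static BigDecimal roundToInteger(String value, RoundingMode mode) {
        return new BigDecimal(value).setScale(0, mode);
    }

    public static void main(String[] args) {
        String[] samples = {"2.5", "-2.5", "3.5"};
        for (RoundingMode mode : RoundingMode.values()) {
            if (mode == RoundingMode.UNNECESSARY) {
                continue; // would throw ArithmeticException: these values do need rounding
            }
            StringBuilder line = new StringBuilder(String.format("%-10s", mode));
            for (String s : samples) {
                line.append(String.format("%6s -> %-3s", s, roundToInteger(s, mode)));
            }
            System.out.println(line);
        }
    }
}
```

For example, HALF_UP takes 2.5 to 3, while HALF_EVEN takes 2.5 to 2 and 3.5 to 4, which is why HALF_EVEN doesn't drift upward over many roundings.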

Rules for using BigDecimals are heavily tied to your problem domain, but I do have a few rules of thumb as starting places:

1. Don't round at all until you must: either (a) when you divide, or (b) when you have to store or output results.

2. BigDecimal conversion to and from a database should be done as strings. Hibernate, iBatis, and many other tools support this conversion easily and directly.

3. Standardize your representations early; put a small list of rounding mode/scale pairs for your application in an enumeration somewhere so they will be consistent across your application.

4. Talk frankly with your architects, designers, and requirements people about the use of arbitrary-precision math; make sure everyone is on the same page.
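For rule 3, one possible shape for that enumeration is sketched below. The constant names, scales and rounding modes are all hypothetical placeholders; the real list has to come out of the conversations in rule 4.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

class RoundingPolicyDemo {
    public static void main(String[] args) {
        System.out.println(RoundingPolicy.CURRENCY.apply(new BigDecimal("10.005")));        // 10.00
        System.out.println(RoundingPolicy.CURRENCY.apply(new BigDecimal("10.015")));        // 10.02
        System.out.println(RoundingPolicy.DISPLAY_PERCENT.apply(new BigDecimal("12.25")));  // 12.3
        System.out.println(RoundingPolicy.CURRENCY.divide(BigDecimal.ONE, new BigDecimal("3"))); // 0.33
    }
}

// Hypothetical scale/rounding pairs; the values are illustrative, not recommendations.
enum RoundingPolicy {
    CURRENCY(2, RoundingMode.HALF_EVEN),
    INTEREST_RATE(6, RoundingMode.HALF_EVEN),
    DISPLAY_PERCENT(1, RoundingMode.HALF_UP);

    private final int scale;
    private final RoundingMode mode;

    RoundingPolicy(int scale, RoundingMode mode) {
        this.scale = scale;
        this.mode = mode;
    }

    // Rounds an existing value to this policy's scale.
    BigDecimal apply(BigDecimal value) {
        return value.setScale(scale, mode);
    }

    // Divides, rounding the quotient according to this policy.
    BigDecimal divide(BigDecimal dividend, BigDecimal divisor) {
        return dividend.divide(divisor, scale, mode);
    }
}
```

Keeping the pairs in one place means nobody has to remember whether a given report rounds to two places or six; they just name the policy.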

Using BigDecimal seems clumsy and wordy when you start. You'll develop style conventions and approaches which work if you take the plunge, and you'll have a lot more confidence in your numeric results once you do.

1 comment:

  1. Thanks for the tip! Here I was trying to use BigDecimal.round(), when what I really wanted was BigDecimal.setScale().
