Should Float or Decimal data type be used for dollar amounts?
The answer is easy. Never floats. NEVER!
Under the original IEEE 754 standard, floats were always binary; only its revision (drafted as IEEE 754r, published as IEEE 754-2008) defined decimal formats. Many decimal fractions can never be represented exactly as binary fractions.
Any binary number can be written as m/2^n (m, n positive integers), any decimal number as m/(2^n*5^n). As binaries lack the prime factor 5, all binary numbers can be exactly represented by decimals, but not vice versa.
0.3 = 3/(2^1 * 5^1) = 3/10
Bracketing 0.3 between neighboring binary fractions with a step of 1/4, 1/8, 1/16 and 1/32:
0.3 = [0.25/0.5] [0.25/0.375] [0.25/0.3125] [0.28125/0.3125]
         1/4         1/8          1/16            1/32
So you end up with a number either higher or lower than the given decimal number. Always.
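A minimal Python sketch of this point, using only the standard `decimal` and `fractions` modules (the variable names are illustrative, not from the answer). It shows the exact value a binary double stores for 0.3 and that its denominator is a power of two, so it cannot equal 3/10.

```python
from decimal import Decimal
from fractions import Fraction

# Exact value of the binary double nearest to 0.3, printed without rounding.
stored = Decimal(0.3)
print(stored)

# The same stored value as a fraction m / 2**n.
as_fraction = Fraction(0.3)
denominator = as_fraction.denominator
is_power_of_two = denominator & (denominator - 1) == 0

# A decimal built from the text "0.3" keeps 3/10 exactly.
exact = Decimal("0.3")

print(is_power_of_two)                 # True
print(stored == exact)                 # False
print(stored < exact)                  # True: the double is slightly too low
print(as_fraction == Fraction(3, 10))  # False
```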
Why does that matter? Rounding.
Normal rounding means 0..4 down, 5..9 up. So it does matter whether the result is 0.049999999999... or 0.0500000000... You may know that it means 5 cents, but the computer does not know that: it rounds 0.04999... down (wrong) and 0.05000... up (right).
Given that the results of floating-point computations generally contain small error terms, the decision is pure luck. It gets hopeless if you want decimal round-half-to-even handling with binary numbers.
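The rounding problem can be sketched in Python as follows. The amounts 2.675 and 2.665 are illustrative values chosen for this example, not figures from the answer; `CENT` is a hypothetical helper constant.

```python
from decimal import Decimal, ROUND_HALF_UP, ROUND_HALF_EVEN

CENT = Decimal("0.01")  # hypothetical helper: round to two decimal places

# 2.675 has no exact binary representation; the nearest double is slightly
# below it, so rounding to two places goes down.
float_result = round(2.675, 2)

# A decimal type holds exactly 2.675, so the tie is a real tie and the
# chosen rounding rule decides, not a representation error.
half_up = Decimal("2.675").quantize(CENT, rounding=ROUND_HALF_UP)
half_even = Decimal("2.665").quantize(CENT, rounding=ROUND_HALF_EVEN)

print(float_result)  # 2.67
print(half_up)       # 2.68
print(half_even)     # 2.66
```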
Unconvinced? You insist that in your accounting system everything is perfectly OK?
Assets and liabilities are equal? OK, then take the formatted number of each entry, parse them all, and sum them with an independent decimal system!
Compare that with the formatted sum. Oops, there is something wrong, isn't there?
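A small sketch of such an independent check, assuming a hypothetical ledger of ten entries of 0.10 each (the data and names are made up for illustration):

```python
from decimal import Decimal

# Hypothetical entries, as they would appear formatted on a report.
formatted_entries = ["0.10"] * 10

# Sum as binary floats, the way a float-based system would.
float_total = sum(float(text) for text in formatted_entries)

# Independent check: parse the same text as decimals and sum exactly.
decimal_total = sum((Decimal(text) for text in formatted_entries), Decimal("0"))

print(float_total)                       # 0.9999999999999999
print(decimal_total)                     # 1.00
print(float_total == 1.0)                # False
print(decimal_total == Decimal("1.00"))  # True
```

Formatting the float total to two places would still print 1.00 here, which is why the discrepancy tends to go unnoticed until the value feeds a comparison or a further calculation.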
For that calculation, extreme accuracy and fidelity was required (we used Oracle's FLOAT) so we could record the "billionths of a penny" being accrued.
It doesn't help against this error, because people automatically assume that the computer sums correctly, and practically no one checks independently.
I wouldn't offer this answer except that you worked so hard to document it and it's been upvoted with no answer after a month. So, here goes. Your only choices appear to be to change the data or change the tool.
Probably, I am clearly doing something wrong and missing the obvious. Could someone please explain to me what I am doing wrong here?
When the tool is broken and the vendor doesn't care, it's a mistake to keep trying. It's time to switch. You put a lot of effort into researching exactly how it's broken and demonstrating that it violates not only the RFC but the tool's own prior version. How much more evidence do you need?
CSV is a boat anchor too. If you have the option, you're better off using an ordinary delimited file format. For lots of applications, tab-delimited is good. The best delimiter IMO is '\' because that character has no place in English text. (On the other hand it won't work for data containing Windows pathnames.)
CSV has two problems as an exchange format. First, it's not all that standard; different applications recognize different versions, whatever the RFC may say. Second (and related) is that it doesn't constitute a regular language in CS terms, which is why it can't be parsed with a regular expression. Compare with ^([^\t]*\t)*[^\t]*$ for a tab-delimited line. The practical implication of the complexity of CSV's definition is (see above) the relative dearth of tools to handle CSV files and their tendency to be incompatible, particularly during the wee hours.
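A minimal Python sketch of the tab-delimited pattern. It assumes the final group is meant to be `[^\t]*` (a last field containing no tab), and it additionally excludes newlines from fields, which is an assumption of this example rather than part of the answer. `TAB_LINE` and `split_tab_line` are hypothetical names.

```python
import re

# Assumed pattern: zero or more "field + tab" groups, then a final field.
# Newlines are excluded from fields so that one match means one line.
TAB_LINE = re.compile(r"([^\t\n]*\t)*[^\t\n]*")

def split_tab_line(line):
    """Split one tab-delimited line (without its newline) into fields."""
    if TAB_LINE.fullmatch(line) is None:
        raise ValueError("not a single tab-delimited line")
    return line.split("\t")

# Commas and backslashes need no quoting or escaping in this format.
fields = split_tab_line("42\tSmith, John\tC:\\temp\\report.txt")
print(fields)  # ['42', 'Smith, John', 'C:\\temp\\report.txt']
```

Because a field can never contain the delimiter, a plain `split` is enough once the line is validated; there is no quoting state to track, which is the contrast with CSV being drawn here.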
If you give CSV and DTS the boot, you have good options, one of which is bcp.exe. It's very fast, and safe because Microsoft hasn't been tempted to update it for years. I don't know much about DTS, but in case you have to use it for automation, IIRC there is a way to invoke external utilities. Beware, though, that bcp.exe does not return error status to the shell dependably.
If you're determined to use DTS and to stick with CSV, then really your best remaining option is to write a view that prepares the data appropriately for it. I would, if backed into that corner, create a schema called, say, "DTS2012CSV", so that I could write select * from DTS2012CSV.tablename, giving anyone who cares a fighting chance to understand it (because you'll document it, won't you, in comments in the view text?). If need be, others can copy its technique for other broken extracts.
HTH.
Best Answer
Well, it was pointed out that I needed to add the scale that was missing.
Or, even better, do as this suggests and switch to numeric, where I can set precision and scale.