Python – Adding the Digits of a Number

floating pointloopspython

I am trying to write a program that asks the user for a decimal number and then calculates the sum of its digits. for example if the number is 123.25 then the sum will be 1+2+3+2+5=13.

I decided to first turn the number to 0.12325 (any number will be formatted this way but I am using the example) and then move the decimal point to the right one place and isolate the digits one by one.

here is what I got (I tried it in python and c++ i get the same problem in both):

number = float(raw_input ("please enter a Decimal number: "))
total = 0

while number >= 1:
    number = number / 10 ## turning 123.25 to 0.12325

while number > 0:
    number = number * 10 ## making 0.12325 to 1.2325
    total = total + int(number)  ## storing the digit on the left of the decimal point
    number = number  - int (number) ## getting rid of the digit left to the decimal point

print total

The problem is that everything works fine until the last digit. When it comes to the point where it's 5.0 than the statement number = number – int(number) gives me 1 instead of zero. It is supposed to be 5.0 – 5 = 0 but it isn't.

This is what I get when i print the number at every stage:

please enter a Decimal number: 123.25
0.12325
0.2325
0.325
0.25
0.5    ##### Here is where the problem starts #####
0.999999999999
0.999999999993
0.999999999929
..... here it goes on for  a while. I spared you the rest .....

More accurately, I printed the number at every step in the loop:

please enter a Decimal number: 123.25

number at start of loop:  0.12325
number after 'number * 10' :  1.2325
number at the end of the loop:  0.2325

number at start of loop:  0.2325
number after 'number * 10' :  2.325
number at the end of the loop:  0.325

number at start of loop:  0.325
number after 'number * 10' :  3.25
number at the end of the loop:  0.25

number at start of loop:  0.25
number after 'number * 10' :  2.5
number at the end of the loop:  0.5

number at start of loop:  0.5
number after 'number * 10' :  5.0
number at the end of the loop:  0.999999999999

All of a sudden it stops being accurate.

Best Answer

This can be done with the following python one-liner:

>>> f = 123.25
>>> sum(int(ch) for ch in str(f) if ch.isdigit())
13
>>>

To break down how this works:

Convert to string as @whatsisname recommends to avoid floating point rounding issues

str(f)

Create a list containing each character

ch for ch in str(f)

Throw out everything that isn't a digit

ch for ch in str(f) if isdigit(ch)

Convert everything in the result to an int

int(ch) for ch in str(f) if isdigit(ch)

Sum everything

sum(int(ch) for ch in str(f) if isdigit(ch))

There are three key points here:

Do not be afraid to convert numbers to strings. The performance hit of doing so is usually inconsequential on modern machines, and this makes manipulating digits much more natural
Python list comprehensions combined with standard functions are extremely useful for anything that involves "do something with this list of something".
When using floating point math, you should never expect exact results except in trivial cases (like 0.0) because of precision issues.

Related Solutions

SQL Algorithms – Is There a Purely SQL Alternative to Looping?

Depending on your version of SQL Server, windowing functions will do what you want. 2008 has limited support but 2012 adds nearly all of the standard. The over clause is used for things exactly like this, it is also available in the express versions as well.

http://technet.microsoft.com/en-us/library/ms189461.aspx

Windowing functions are used to perform high-level aggregation, ranking and statistical analysis. They are often used to split and show similarities in a data set.

The over clause can partition a column and provide a running sum like you describe in a T-SQL Statement.

SELECT [id], [vendor_id], [item], [brand], [inventory_version], SUM([quantity])
    OVER (PARTITION BY [vendor_id], [item]
        ORDER BY [vendor_id], [item], [inventory_version]) AS qty_Totals
FROM inventory_snapshots
ORDER BY [vendor_id], [item], [inventory_version]

Code is untested but should work. What it does is take the recordset, partitions it by the vendor_id and item. This effectively resets the sumation to 0 when that combination changes. We make sure to order the results in a logical manner to show the running totals easily. This will show the progression of item versions per manufacturer starting at the lowest and going straight through to the last.

Hopefully this clears things up a bit.

Avoiding Division by Zero Using Float Comparison

I side with OP's personal theory that it is not a normal practice to allow a computer program to proceed with a division-by-zero operation, or to only perform a minimal check before the division.

The exception is when you are implementing something that is too general - a programming language (such as MATLAB) where you (as the programmer) do not know the context / application / use-case / physical meaning of the mathematical operations it is asked to perform. This may be because the formula it is evaluating is provided by the customer, and you do not know the customer's use-case of that formula. In that case you use a special representation such as Inf or NaN as a placeholder.

If, however, the formula is provided as part of a statistical toolbox, then you should be able to provide an explanation when the situation arises. See the "weighted averaging when the total weight is zero" example below.

There is a way to "invert" a divisor underflow test. Mathematically if b is not zero, and
abs(a) / abs(b) > abs(c) where c is the largest representable floating point value, then
abs(a) > abs(c) * abs(b). However, in practice it requires a more careful implementation than that. You may be able to find a mathematical library function that allows you to pass in (a, b) and it will return whether the division will overflow, underflow, or otherwise have poor precision.

Source code analyzers look for patterns in the code; they are not sophisticated enough to decide whether someone's workaround logic is sufficient for the application's design purpose. (In fact even the average programmer may be unqualified to make that decision.) Source code analyzers are supposed to be augmented with a person qualified to make that decision.

A denominator of zero can occur in a lot of mathematical manipulations: formulas, infinite series (summation of sequence), etc. There are many mathematical methods to calculate the result despite having denominators that approach zero (i.e. not exactly zero, but are smaller than the machine-representable value). These means the formulas are not to be evaluated verbatim - they are transformed using some calculus methods, and for each formula there may be several alternative versions which is chosen to avoid the division-by-zero issue.

Another situation arises in weighted averaging of data. If you perform a query that selects a subset of data, and when:

the sum of weights for the subset of data turns out to be zero, or
when the subset is indeed empty, i.e. the query returns no result

then the proper way to phrase that situation is "insufficient samples (data) for the query", etc.

In basic trigonometry, some representations (slope) are very sensitive to division problems, whereas an alternative representation (bearing, i.e. angle) would not be sensitive. For example, to represent a line on a 2D plane, where vertical and near-vertical lines need to be represented as robustly as horizontal and near-horizontal lines, you can:

have a toggle between lines that are steep vs. those that are not. For lines steeper than 45 degrees, you would use (x / y) instead of (y / x) as the "flipped" slope of the line, so as to avoid the division by small numbers.
Use an alternative representation such as a*x + b*y + c == 0 and store the parameters (a, b, c) with the requirement that (a^2 + b^2) must equal 1.0 for normal case, and 0.0 if the line is degenerate (not-a-line).

It is worth mentioning that degeneracy is unavoidable in many different contexts (and in context-specific ways). For example, if user passes in a "line" from point (x1, y1) to point (x2, y2) and asks to calculate its slope, and it happens that (x1 == x2 and y1 == y2), then there is no slope, because there is no line, because there is only a single point in the user's input.

Best Answer

Related Solutions

SQL Algorithms – Is There a Purely SQL Alternative to Looping?

Avoiding Division by Zero Using Float Comparison

Related Topic