I wrote a program in AVR assembly that converts 32-bit unsigned binary numbers to 8-digit decimals using the shift-add-3 algorithm. (I know that a 32-bit number can exceed 8 digits, but I only need 8.)
The 32-bit input is in R16-R19 (low to high).
The 8-digit output is in R20-R23 (low to high), two digits per byte: one in the lower nibble, one in the higher nibble.
My problem: it takes ~1500 cycles to convert a 16-bit number and ~2000 cycles to convert a 32-bit number.
Can anybody suggest a faster, more professional method for this? Running a 2000-cycle procedure on an ATtiny at 32.768 kHz (roughly 61 ms per conversion) is not something I am comfortable with.
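For readers unfamiliar with shift-add-3 (also known as double dabble), here is a minimal C reference model of the technique, useful for checking the assembly's output on a PC. It is a sketch, not a translation of the routine: the function name `bin_to_bcd8` is hypothetical, and it corrects the digits before each shift, which gives the same result as shifting first and skipping the correction after the last bit.

```c
#include <stdint.h>

/* Hypothetical reference model of shift-add-3 (double dabble).
 * Returns 8 packed BCD digits, lowest digit in the lowest nibble.
 * Digits above the 8th are silently dropped, so inputs larger than
 * 99,999,999 come out as their low 8 decimal digits. */
uint32_t bin_to_bcd8(uint32_t bin)
{
    uint32_t bcd = 0;

    for (int bit = 0; bit < 32; bit++) {
        /* Add 3 to every BCD digit that is 5 or more, so that the
         * following doubling carries correctly into the next digit. */
        for (int n = 0; n < 8; n++) {
            uint32_t digit = (bcd >> (4 * n)) & 0xFu;
            if (digit >= 5)
                bcd += (uint32_t)3 << (4 * n);
        }

        /* Shift the next input bit (MSB first) into the BCD register. */
        bcd = (bcd << 1) | (bin >> 31);
        bin <<= 1;
    }
    return bcd;
}
```

For example, `bin_to_bcd8(12345678)` yields `0x12345678`, which corresponds to R23:R20 = 0x12, 0x34, 0x56, 0x78 in the register layout described above.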
Memory usage map:
Definitions:
.def a0 = r16
.def a1 = r17
.def a2 = r18
.def a3 = r19
.def b0 = r20
.def b1 = r21
.def b2 = r22
.def b3 = r23
.def i = r24
.def j = r25
The code:
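BinaryToBCD:
    clr b0                      ; clear the packed BCD result
    clr b1
    clr b2
    clr b3
    ldi i, 32                   ; 32 input bits to process
    sts 0x0068, i ;(SRAM s8)    ; counter lives in SRAM because i is also scratch
BinaryToBCD_1:
    clc                         ; shift a 0 into the input's lowest bit
    rol a0                      ; shift input and result left as one 64-bit value
    rol a1
    rol a2
    rol a3
    rol b0                      ; the input's top bit enters the BCD result
    rol b1
    rol b2
    rol b3
    lds i, 0x0068 ;(SRAM s8)    ; reload the bit counter
    dec i
    sts 0x0068, i ;(SRAM s8)    ; sts leaves the flags from dec untouched
    brne BinaryToBCD_2          ; bits remain: correct digits before next shift
    ret                         ; all 32 bits shifted in: done
BinaryToBCD_2:
    cpi b0, 0                   ; skip the correction if both digits are 0
    breq BinaryToBCD_3
    mov i, b0
    rcall Add3ToNibbles
    mov b0, i
BinaryToBCD_3:
    cpi b1, 0
    breq BinaryToBCD_4
    mov i, b1
    rcall Add3ToNibbles
    mov b1, i
BinaryToBCD_4:
    cpi b2, 0
    breq BinaryToBCD_5
    mov i, b2
    rcall Add3ToNibbles
    mov b2, i
BinaryToBCD_5:
    cpi b3, 0
    breq BinaryToBCD_1
    mov i, b3
    rcall Add3ToNibbles
    mov b3, i
    rjmp BinaryToBCD_1
Add3ToNibbles:                  ; in/out: i = two packed BCD digits, clobbers j
    mov j, i
    andi j, 0b00001111          ; isolate the low digit
    cpi j, 5                    ; carry is set when the digit is below 5
    in j, SREG
    sbrs j, 0                   ; SREG bit 0 is carry: skip the add if digit < 5
    subi i, -3                  ; low digit >= 5: add 3
    mov j, i
    swap j
    andi j, 0b00001111          ; isolate the high digit
    cpi j, 5
    in j, SREG
    sbrs j, 0                   ; skip the add if the high digit is below 5
    subi i, -48                 ; high digit >= 5: add 3 to the high nibble
    ret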
Best Answer
This is based on venny's approach (venny called it triangulation), expressed in "pseudo-C":
The add and divide routines need no explanation, in my opinion.
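For comparison with shift-add-3, one widely used alternative that avoids the 32-iteration bit loop is repeated subtraction of powers of ten. The C sketch below illustrates that general idea only. It is not venny's routine or the pseudo-C this answer refers to; the function name `bin_to_bcd8_subtract` is hypothetical, and the input is assumed to be at most 99,999,999.

```c
#include <stdint.h>

/* Hypothetical sketch: binary to 8 packed BCD digits by repeatedly
 * subtracting powers of ten, highest digit first.
 * Assumption: bin <= 99,999,999 (otherwise the top digit overflows). */
uint32_t bin_to_bcd8_subtract(uint32_t bin)
{
    static const uint32_t pow10[7] = {
        10000000u, 1000000u, 100000u, 10000u, 1000u, 100u, 10u
    };
    uint32_t bcd = 0;

    for (int k = 0; k < 7; k++) {
        uint32_t digit = 0;

        /* Count how many times this power of ten fits (at most 9). */
        while (bin >= pow10[k]) {
            bin -= pow10[k];
            digit++;
        }
        bcd = (bcd << 4) | digit;   /* append the digit as a nibble */
    }

    /* What remains is the units digit. */
    return (bcd << 4) | bin;
}
```

Each digit needs at most nine subtractions, so the worst case is 63 multi-byte subtractions in total, and small inputs finish after only a few. Whether this beats the shift-add-3 routine on an ATtiny depends on how the 32-bit compare and subtract are coded, so it is worth measuring rather than assuming.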