# [Coco] Mod10 Suggestions

William Mikrut wmikrut72 at gmail.com
Sat Feb 18 14:03:41 EST 2017

```
You are right -- I looked at it closer.
One thing I need to do is reverse the order of operations.

The LSLA is performed first.
First I need to store the byte and LSLA the next byte.

Otherwise if I flip it from left to right:
LEAX CCD,PCR
...
LDA ,X+
...

it works perfectly.

On Sat, Feb 18, 2017 at 11:35 AM, William Astle <lost at l-w.ca> wrote:

> Take a closer look. It only does the LSLA on every other digit. It does
> *two* digits per loop, just like Brett's version.
>
> You can easily pretend all numbers are 16 digits by right justifying the
>
>
> On 2017-02-18 10:06 AM, William Mikrut wrote:
>
>> I like how this works from right to left.
>> The only issue is the LSLA on every number.
>>
>> The algo is to double every other number, starting with the right most
>> digit, and sub 9 if the result is 10 or more.
>>
>> Now if the number is always 16 digits, Brett's 16 bit word seems the
>> easiest way to go.
>> If the number is 13 digits long the 16 bit word method won't work, but I am
>> happy to pretend all numbers are 16 digits!
>>
>> I am going to try to include a couple things you showed me into Brett's 16
>> bit chunk method and try a slightly different routine!
>>
>>
>> On Sat, Feb 18, 2017 at 10:22 AM, William Astle <lost at l-w.ca> wrote:
>>
>> On 2017-02-18 12:43 AM, msmcdoug wrote:
>>>
>>> Actually I'm surprised no one has suggested BCD arithmetic on the result
>>>> to eliminate divide by 10 loop
>>>>
>>>>
>>> BCD would certainly give a predictable overall cycle count. It would
>>> require a significantly different approach, though. The only register you
>>> can use for BCD arithmetic is A and DAA is only useful after ADDA or
>>>
>>> possible complexity. However, upon reflection, the extra cycles to use BCD
>>> would probably be less than the average cycle time of the modulus loop
>>> combined or checking for digit overflow during the loop.
>>>
>>> I think you could use code that looks something like the following which
>>> is based off Mr. Mikrut's most recent posted code. (warning: mailer code™
>>> follows so it may have errors)
>>>
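>>>         ORG $1200
>>> CCD     RMB 16              ; 16 digits, one per byte (values 0-9)
>>> RESULT  RMB 1               ; low BCD digit of the sum
>>> START   LEAX CCD+16,PCR     ; point just past the last digit
>>>         CLRA                ; running BCD sum
>>>         LDB #8              ; 8 passes, two digits per pass
>>> LOOP    PSHS A              ; save running sum on the stack
>>>         LDA ,-X             ; fetch next digit, moving right to left
>>>         LSLA                ; double it
>>>         CMPA #10
>>>         BLO LOOP2           ; below 10: use as is
>>>         SUBA #9             ; otherwise subtract 9
>>> LOOP2   ADDA ,S+            ; add the saved sum back, popping it
>>>         DAA                 ; decimal adjust after the addition
>>>         ADDA ,-X            ; add the next digit undoubled
>>>         DAA
>>>         DECB
>>>         BNE LOOP
>>>         ANDA #$0F           ; keep low BCD digit = sum mod 10
>>>         STA RESULT,PCR
>>> ENDPGM  RTS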
>>>
>>> I'm using the stack for a temporary storage location instead of something
>>> PCR relative for code size reasons. You could use the "RESULT" variable for
>>> the temporary to eliminate stack usage. That would probably be slightly
>>> faster at the expense of two more code bytes. This is one of those
>>>
>>> DAA has to be used after every addition and only applies to A. Using BCD
>>> means we can eliminate the mod 10 loop and just mask off the upper digit
>>> (BCD stores two decimal digits in a byte). That gives a constant time for
>>> the "mod 10" result and also only takes 2 bytes (and 2 cycles).
>>>
>>> I have also eliminated the STATUS variable and just store the result. You
>>> can test RESULT for non-zero trivially so there's no need for a separate
>>> STATUS value.
>>>
>>> By my calculation, this version is 32 bytes, requires 1 byte of stack
>>> space, 17 bytes of data space, and runs in a maximum of 351 cycles (and a
>>> minimum of 336 cycles if none of the doubled digits goes above 9). For this
>>> analysis, I've assumed 8 bit offsets for the PCR references. 16 bit offsets
>>> in PCR mode are quite a bit more expensive (4 extra cycles and 1 extra byte).
>>>
>>>
>>> --
>>> Coco mailing list
>>> Coco at maltedmedia.com
>>> https://pairlist5.pair.net/mailman/listinfo/coco
>>>
>>>
>>
>
```
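For anyone who wants to check the arithmetic off the CoCo, here is a rough Python model of the BCD idea discussed in the thread. It is only a sketch and not a translation of the posted 6809 code: the function names `bcd_add_digit` and `mod10_check` are made up for illustration, and it assumes the usual Luhn convention in which the rightmost digit is added as is and the digit to its left is the first one doubled. `bcd_add_digit` imitates an `ADDA` followed by `DAA` on a packed-BCD byte, and the final mask with `0x0F` plays the role of `ANDA #$0F`.

```python
def bcd_add_digit(acc, digit):
    """Add one decimal digit (0-9) to a packed-BCD byte, like ADDA then DAA.

    The carry out of the high digit is dropped, as it would be when only
    the low digit of the sum matters.
    """
    raw = acc + digit                      # at most 0x99 + 9 = 0xA2
    half_carry = (acc & 0x0F) + digit > 0x0F
    low = raw & 0x0F
    high = raw >> 4
    fix = 0
    if half_carry or low > 9:              # low nibble left the 0-9 range
        fix |= 0x06
    if high > 9 or (high > 8 and low > 9): # high nibble will leave 0-9
        fix |= 0x60
    return (raw + fix) & 0xFF


def mod10_check(digits):
    """Return the low decimal digit of the mod 10 sum (0 means it passes).

    Assumption: digits are walked right to left and every second digit,
    starting with the one left of the rightmost, is doubled.
    """
    acc = 0
    for pos, d in enumerate(reversed(digits)):
        if pos % 2 == 1:
            d *= 2
            if d >= 10:
                d -= 9                     # same as adding the two digits
        acc = bcd_add_digit(acc, d)
    return acc & 0x0F                      # mask off the upper BCD digit


if __name__ == "__main__":
    number = [int(c) for c in "4111111111111111"]
    print(mod10_check(number))
```

Because the accumulator is kept in BCD, the low nibble is always the sum modulo 10, so no divide or subtract loop is needed at the end.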