[Coco] Mod10 Suggestions

William Mikrut wmikrut72 at gmail.com
Sat Feb 18 18:06:46 EST 2017


A slight reordering of the code and it works perfectly!
48 bytes total, less 17 for storage -- 31 program bytes to get the job done.

My original code was 61 program bytes... down to half the size and does the
exact same thing.
Absolutely amazing!


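        ORG $1200
CCD     RMB 16              ; 16 digits, one per byte (0-9), leftmost first
RESULT  RMB 1               ; sum mod 10; 0 means the number passes

START   LEAX CCD+16,PCR     ; X points just past the last digit
        CLRA                ; A holds the running BCD sum
        LDB #8              ; 8 passes, two digits per pass

LOOP    ADDA ,-X            ; add the undoubled digit (rightmost first)
        DAA                 ; keep the sum in BCD
        PSHS A              ; park the sum on the stack
        LDA ,-X             ; fetch the digit that gets doubled
        LSLA                ; double it
        CMPA #10
        BLO LOOP2           ; below 10: use it as is
        SUBA #9             ; 10 or more: subtract 9
LOOP2   ADDA ,S+            ; add the parked sum back and pop it
        DAA                 ; keep the sum in BCD

        DECB
        BNE LOOP

        ANDA #$0F           ; low BCD digit = sum mod 10
        STA RESULT,PCR
ENDPGM  RTS
        END START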
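
For anyone who wants to check the routine without a CoCo at hand, here is a
rough Python model of it. This is only a sketch: the helper names (adda, daa,
mod10_bcd, luhn_mod10) are mine, the DAA rule is my reading of the 6809
documentation, and the digits are assumed to be stored one per byte as 0-9.
It follows the loop above step by step and can be compared against a plain
mod 10 calculation.

```python
def adda(a, operand):
    """Model of ADDA on 8-bit values: returns (result, half_carry, carry)."""
    total = a + operand
    half = ((a & 0x0F) + (operand & 0x0F)) > 0x0F
    return total & 0xFF, half, total > 0xFF


def daa(a, half, carry):
    """Model of DAA applied right after ADDA: returns the BCD-adjusted byte."""
    lo = a & 0x0F
    hi = a >> 4
    corr = 0
    if half or lo > 9:
        corr |= 0x06
    if carry or hi > 9 or (hi > 8 and lo > 9):
        corr |= 0x60
    return (a + corr) & 0xFF


def mod10_bcd(digits):
    """Follow the assembly loop for exactly 16 digits (ints 0-9)."""
    if len(digits) != 16:
        raise ValueError("the routine expects exactly 16 digits")
    x = 16                      # LEAX CCD+16,PCR
    a = 0                       # CLRA
    for _ in range(8):          # LDB #8 ... DECB / BNE LOOP
        x -= 1
        a, h, c = adda(a, digits[x])    # ADDA ,-X
        a = daa(a, h, c)                # DAA
        saved = a                       # PSHS A
        x -= 1
        a = (digits[x] << 1) & 0xFF     # LDA ,-X / LSLA
        if a >= 10:                     # CMPA #10 / BLO LOOP2
            a -= 9                      # SUBA #9
        a, h, c = adda(a, saved)        # ADDA ,S+
        a = daa(a, h, c)                # DAA
    return a & 0x0F                     # ANDA #$0F


def luhn_mod10(digits):
    """Plain reference: double every second digit from the right, sum, mod 10."""
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d >= 10:
                d -= 9
        total += d
    return total % 10
```

Calling mod10_bcd([int(c) for c in "4111111111111111"]) should agree with
luhn_mod10 on the same list; a result of 0 means the check digit is good.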

On Sat, Feb 18, 2017 at 1:03 PM, William Mikrut <wmikrut72 at gmail.com> wrote:

> You are right -- I looked at it closer.
> One thing I need to do is reverse the order of operations.
>
> As written, the LSLA is performed on the first byte.
> Instead I need to store the first byte and LSLA the next byte.
>
> Otherwise if I flip it from left to right:
> (LEAX CCD,PCR
> ...
> LDA ,X+
> ...
> ADDA ,X+)
>
>  it works perfectly.
>
>
> On Sat, Feb 18, 2017 at 11:35 AM, William Astle <lost at l-w.ca> wrote:
>
>> Take a closer look. It only does the LSLA on every other digit. It does
>> *two* digits per loop, just like Brett's version.
>>
>> You can easily pretend all numbers are 16 digits by right justifying the
>> numbers in your buffer and padding with zeros.
>>
>>
>> On 2017-02-18 10:06 AM, William Mikrut wrote:
>>
>>> I like how this works from right to left.
>>> The only issue is the LSLA on every number.
>>>
>>> The algo is to double every other number, starting with the right most
>>> digit, and sub 9 if the result is 10 or more.
>>>
>>> Now if the number is always 16 digits, Brett's 16 bit word seems the
>>> easiest way to go.
>>> If the number is 13 digits long the 16 bit word method won't work, but
>>> I am happy to pretend all numbers are 16 digits!
>>>
>>> I am going to try to include a couple things you showed me into Brett's
>>> 16 bit chunk method and try a slightly different routine!
>>>
>>>
>>> On Sat, Feb 18, 2017 at 10:22 AM, William Astle <lost at l-w.ca> wrote:
>>>
>>>> On 2017-02-18 12:43 AM, msmcdoug wrote:
>>>>
>>>>> Actually I'm surprised no one has suggested BCD arithmetic on the result
>>>>> to eliminate the divide-by-10 loop.
>>>>>
>>>> BCD would certainly give a predictable overall cycle count. It would
>>>> require a significantly different approach, though. The only register
>>>> you can use for BCD arithmetic is A and DAA is only useful after ADDA
>>>> or ADCA.
>>>>
>>>> I had thought about using BCD but had initially dismissed it due to
>>>> possible complexity. However, upon reflection, the extra cycles to use
>>>> BCD would probably be less than the average cycle time of the modulus
>>>> loop combined or checking for digit overflow during the loop.
>>>>
>>>> I think you could use code that looks something like the following which
>>>> is based off Mr. Mikrut's most recent posted code. (warning: mailer
>>>> code™ follows so it may have errors)
>>>>
>>>>         ORG $1200
>>>> CCD     RMB 16
>>>> RESULT  RMB 1
>>>> START   LEAX CCD+16,PCR
>>>>         CLRA
>>>>         LDB #8
>>>> LOOP    PSHS A
>>>>         LDA ,-X
>>>>         LSLA
>>>>         CMPA #10
>>>>         BLO LOOP2
>>>>         SUBA #9
>>>> LOOP2   ADDA ,S+
>>>>         DAA
>>>>         ADDA ,-X
>>>>         DAA
>>>>         DECB
>>>>         BNE LOOP
>>>>         ANDA #$0F
>>>>         STA RESULT,PCR
>>>> ENDPGM  RTS
>>>>
>>>> I'm using the stack for a temporary storage location instead of
>>>> something PCR relative for code size reasons. You could use the
>>>> "RESULT" variable for the temporary to eliminate stack usage. That
>>>> would probably be slightly faster at the expense of two more code
>>>> bytes. This is one of those size/speed trade-offs.
>>>>
>>>> DAA has to be used after every addition and only applies to A. Using BCD
>>>> means we can eliminate the mod 10 loop and just mask off the upper digit
>>>> (BCD stores two decimal digits in a byte). That gives a constant time
>>>> for the "mod 10" result and also only takes 2 bytes (and 2 cycles).
>>>>
>>>> I have also eliminated the STATUS variable and just store the result.
>>>> You can test RESULT for non-zero trivially so there's no need for a
>>>> separate STATUS value.
>>>>
>>>> By my calculation, this version is 32 bytes, requires 1 byte of stack
>>>> space, 17 bytes of data space, and runs in a maximum of 351 cycles (and
>>>> a minimum of 336 cycles if none of the doubled digits goes above 9).
>>>> For this analysis, I've assumed 8 bit offsets for the PCR references.
>>>> 16 bit offsets in PCR mode are quite a bit more expensive (4 extra
>>>> cycles and 1 extra byte).
>>>>
>>>>
>>>> --
>>>> Coco mailing list
>>>> Coco at maltedmedia.com
>>>> https://pairlist5.pair.net/mailman/listinfo/coco
>>>>
>>>>
>>>
>>
>
>
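
On the padding idea quoted above: leading zeros add nothing to the sum
(a doubled 0 is still 0), so right justifying a shorter number in the 16 byte
buffer gives the same mod 10 result. A small Python sketch of that, with
hypothetical helper names (pad_to_16, luhn_mod10) and assuming one digit per
byte as in CCD:

```python
def pad_to_16(number):
    """Right justify a digit string into 16 slots, padding with zeros."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    if len(digits) > 16:
        raise ValueError("more than 16 digits will not fit in the buffer")
    return [0] * (16 - len(digits)) + digits


def luhn_mod10(digits):
    """Double every second digit from the right, subtract 9 if >= 10, sum, mod 10."""
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d >= 10:
                d -= 9
        total += d
    return total % 10
```

For a 13 digit number such as "4222222222222", luhn_mod10 gives the same
answer on the raw 13 digits and on the list returned by pad_to_16, so the
16 digit routine can be used unchanged.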

