[Coco] Devastated. Long term OVCC project falls short

Walter Zambotti zambotti at iinet.net.au
Sat Oct 5 08:26:47 EDT 2019


James.


Regarding profiling: already done that. Depending on the emulated CoCo 
clock speed (MHz), the VCC and OVCC emulators spend 90% of their time in 
the CPU exec.  Of course, breaking the profile down to individual 
instructions is more complex.

But that is the next step.  I'll concentrate on the instructions that 
get called most, but my guess is that, since almost every instruction 
calls MemRead8/16/32 and MemWrite8/16/32, those will be the functions to 
target.

Here is a link to the profile analysis:

https://drive.google.com/file/d/1iSLqk8x1xi90U5SHGns6HbtwkgwP__Ua/view?usp=sharing

Here are the top few lines:

Flat profile:

Each sample counts as 0.01 seconds.
   %   cumulative   self              self     total
  time   seconds   seconds    calls  ms/call  ms/call  name
  23.08      2.19     2.19 840892115     0.00     0.00  MemRead8
  12.86      3.41     1.22   874786     0.00     0.00  UpdateScreenSDL
   8.06      4.18     0.77  4566595     0.00     0.00  HD6309Exec
   7.17      4.86     0.68                             MmuWrite8
   4.22      5.26     0.40 140686259     0.00     0.00  MemWrite8
   2.21      5.47     0.21 26289713     0.00     0.00  Lda_D
   2.00      5.66     0.19  1206308     0.00     0.01  CPUCycle
   1.84      5.83     0.18 34568159     0.00     0.00  CalculateEA
   1.79      6.00     0.17 66518941     0.00     0.00  MemRead16
   1.48      6.14     0.14 14136205     0.00     0.00  Std_D
   1.48      6.28     0.14 17103978     0.00     0.00  Rol_D
   1.00      6.38     0.10  6429800     0.00     0.00  Puls_M
   0.95      6.47     0.09                             MemRead8_s
   0.95      6.56     0.09                             fMemWrite8
   0.90      6.64     0.09  8337765     0.00     0.00  Adda_M
   0.79      6.72     0.08  8313951     0.00     0.00  Ldb_D
   0.79      6.79     0.08  9188895     0.00     0.00  Sta_D
   0.79      6.87     0.08  7817206     0.00     0.00  Rola_I
   0.74      6.94     0.07 10456927     0.00     0.00  Bpl_R
   0.74      7.01     0.07  5180531     0.00     0.00  Pshu_M
   0.74      7.08     0.07  5502404     0.00     0.00  Addd_D
   0.68      7.14     0.07  7496343     0.00     0.00  Mul_I
   0.63      7.20     0.06                             Bita_D
   0.63      7.26     0.06  3384107     0.00     0.00  AudioOut
   0.63      7.32     0.06  2022184     0.00     0.00  Tfr_M
   0.58      7.38     0.06  5829175     0.00     0.00  Std_X
   0.58      7.43     0.06  4277075     0.00     0.00  Ldd_X
   0.53      7.48     0.05  4086732     0.00     0.00  Beq_R
   0.53      7.53     0.05 23321909     0.00     0.00  MemWrite16
   0.53      7.58     0.05  5094112     0.00     0.00  Coma_I
   0.47      7.63     0.05 13093215     0.00     0.00  Ldd_D
   0.47      7.67     0.05  7883191     0.00     0.00  Aslb_I
   0.47      7.72     0.05  2426965     0.00     0.00  Adca_D

So, as I suspected, MemRead and MemWrite are way up there. HD6309Exec 
(the main CPU loop) is near the top as well.
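
One way to read the table is to divide self seconds by calls, since the ms/call columns round to 0.00 for functions this small. The sketch below does that for four rows copied from the profile above. The struct and function names are illustrative only and are not part of VCC or OVCC.

```c
#include <stddef.h>

/* One row of the flat profile: self seconds and call count,
   copied from the listing above. */
struct profile_row {
    const char *name;
    double self_seconds;
    double calls;
};

static const struct profile_row rows[] = {
    {"MemRead8",        2.19, 840892115.0},
    {"UpdateScreenSDL", 1.22,    874786.0},
    {"HD6309Exec",      0.77,   4566595.0},
    {"MemWrite8",       0.40, 140686259.0},
};

/* Average self time per call, in nanoseconds. */
static double ns_per_call(const struct profile_row *r)
{
    return r->self_seconds * 1e9 / r->calls;
}
```

With these figures MemRead8 works out to roughly 2.6 ns per call and MemWrite8 to roughly 2.8 ns, which suggests their share of the profile comes from call volume rather than from a slow function body. A profiling build also adds bookkeeping to every call, which may distort the figures for very small, very frequently called functions.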

Walter



On 10/5/19 11:07 AM, James Jones wrote:
> Without profiling data (or complexity theory analysis of algorithms),
> you're shooting in the dark when it comes to trying to speed up code.
> Compile a version for each target with profiling turned on, run it, and
> take a look at the data it generates to see where the code is really
> spending its time, and work on that.
>
> On Thu, Oct 3, 2019 at 11:16 PM Walter Zambotti <zambotti at iinet.net.au>
> wrote:
>
>> Hello all.
>>
>> As you all know I have been working on OVCC which is the portable
>> version of VCC for Linux/OSX and Windows.
>>
>> I have just released version 1.1.0 which has finally added the CPU 6309
>> Turbo feature.
>>
>> The 6309 turbo feature is a reworking of the 6309 CPU emulator,
>> originally written in C, as x86 assembly.
>>
>> I actually targeted this project for VCC 1.43 and started before writing
>> OVCC. So it was put on hold until OVCC was complete.
>>
>> It is now complete.
>>
>> As I was writing the assembly emulator I ran across some major obstacles
>> that I had to overcome.
>>
>> The first was functional testing.  To address this I wrote a
>> verification suite that could execute any instruction in isolation for
>> all its data and condition code combinations and compare the results to
>> another assumed working 6309 emulator (already available in VCC).
>>
>> After many months' work, testing of all 437 6309 instructions, with
>> countless billions of test iterations, was completed about two weeks
>> ago. Some instructions that work with 16 x 16 bit data take 4-8 billion
>> iterations to test. A single test of one of these instructions takes
>> about 10-20 minutes on a decent Intel system and there are about a
>> hundred or so of these.
>>
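
The verification approach described in the quoted message above (run one instruction over every input and flag combination, then compare against a trusted emulator) can be sketched for a single 8-bit add-with-carry. This is only an illustration: the two functions below are hypothetical stand-ins for the C and assembly handlers, not OVCC code. Only the condition-code bit positions follow the 6809/6309 CC register layout.

```c
#include <stdint.h>

/* Condition-code bits (6809/6309 CC register layout). */
#define CC_C 0x01
#define CC_V 0x02
#define CC_Z 0x04
#define CC_N 0x08
#define CC_H 0x20

struct alu_result { uint8_t value; uint8_t cc; };

typedef struct alu_result (*adc8_fn)(uint8_t a, uint8_t b, uint8_t carry_in);

/* Hypothetical "trusted" implementation: flags derived with bit tricks. */
static struct alu_result adc8_reference(uint8_t a, uint8_t b, uint8_t carry_in)
{
    unsigned sum = (unsigned)a + b + carry_in;
    struct alu_result r = { (uint8_t)sum, 0 };
    if (sum & 0x100)                    r.cc |= CC_C;
    if (r.value == 0)                   r.cc |= CC_Z;
    if (r.value & 0x80)                 r.cc |= CC_N;
    if ((a ^ r.value) & (b ^ r.value) & 0x80) r.cc |= CC_V;
    if ((a ^ b ^ r.value) & 0x10)       r.cc |= CC_H;
    return r;
}

/* Hypothetical "new" implementation: same flags, computed another way. */
static struct alu_result adc8_candidate(uint8_t a, uint8_t b, uint8_t carry_in)
{
    struct alu_result r = { (uint8_t)(a + b + carry_in), 0 };
    if ((unsigned)a + b + carry_in > 0xFF)          r.cc |= CC_C;
    if (r.value == 0)                               r.cc |= CC_Z;
    if (r.value & 0x80)                             r.cc |= CC_N;
    if ((a & 0x80) == (b & 0x80) && (r.value & 0x80) != (a & 0x80))
                                                    r.cc |= CC_V;
    if ((a & 0x0F) + (b & 0x0F) + carry_in > 0x0F)  r.cc |= CC_H;
    return r;
}

/* Run every operand pair with both carry-in states (256 * 256 * 2 cases)
   and count the cases where value or flags disagree. */
static unsigned long count_mismatches(adc8_fn ref, adc8_fn cand)
{
    unsigned long mismatches = 0;
    for (unsigned a = 0; a < 256; a++) {
        for (unsigned b = 0; b < 256; b++) {
            for (unsigned c = 0; c < 2; c++) {
                struct alu_result x = ref((uint8_t)a, (uint8_t)b, (uint8_t)c);
                struct alu_result y = cand((uint8_t)a, (uint8_t)b, (uint8_t)c);
                if (x.value != y.value || x.cc != y.cc)
                    mismatches++;
            }
        }
    }
    return mismatches;
}
```

The same loop over two 16-bit operands covers 2^32 (about 4.3 billion) pairs, or about 8.6 billion once a carry-in is included, which is consistent with the 4-8 billion iterations quoted above.
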
>> During the testing, the verification suite, which was also timing the
>> execution of each instruction, indicated that the assembly was twice
>> as fast as the C code.
>>
>> However, late in the testing stage I remembered that the C code was
>> being compiled without optimization, for debugging purposes.
>>
>> After recompiling the C code with optimization the assembly was only 25%
>> faster. Wow, I was taken aback, but at the same time really impressed
>> with the optimizations of modern compilers.
>>
>> But the real performance test is where the rubber meets the road, and
>> until I got the new 6309 emulator into OVCC I couldn't be sure what
>> the performance gain would be.
>>
>> Well I have now completed the integration of the new 6309 turbo emulator
>> into OVCC.
>>
>> Along the way I had some major hurdles.  At first the new emulator
>> wouldn't work at all, even with all the verification testing. How was I
>> going to find which instruction(s) were causing the problem?
>>
>> I had an awful (but incorrect) realization that I was going to have to
>> make a special version of OVCC that ran both the old and new cpu
>> emulators side by side comparing all results after each instruction.
>>
>> I was beside myself for a few days until I remembered to divide and
>> conquer the problem.
>>
>> Basically each CPU emulator contains a jump table (3 actually) to each
>> of the cpu instructions.  The only difference is the C emulator has C
>> functions and the assembly emulator has assembly functions.
>>
>> Neither the jump table nor the CPU executive knows or cares which
>> type of function it is calling.
>>
>> So I just substituted a handful of the C function pointers for
>> assembly function pointers (in the jump table) at a time and waited for
>> OVCC to crash.
>>
>> In the end there were only a few instructions that were causing problems
>> and I managed to get it all going.
>>
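
The jump-table substitution described in the quoted message above can be sketched as a binary search over opcodes. Everything here is an assumption made for illustration: a single 256-entry table (the message mentions three), exactly one faulty handler, a handler signature that reports faults through a flag, and a check that stands in for "run OVCC and see whether it crashes".

```c
#include <stddef.h>

#define NUM_OPCODES 256

/* Hypothetical handler type: a real handler would act on CPU state.
   Here it can only report a fault through the flag. */
typedef void (*op_handler)(int *fault_flag);

/* Hypothetical check: returns nonzero if the table behaves correctly. */
typedef int (*table_check)(const op_handler table[NUM_OPCODES]);

/* Use the new handlers for opcodes in [lo, hi), trusted ones elsewhere. */
static void build_mixed_table(op_handler out[NUM_OPCODES],
                              const op_handler trusted[NUM_OPCODES],
                              const op_handler candidate[NUM_OPCODES],
                              size_t lo, size_t hi)
{
    for (size_t op = 0; op < NUM_OPCODES; op++)
        out[op] = (op >= lo && op < hi) ? candidate[op] : trusted[op];
}

/* Halve the suspect range until one opcode is left. */
static size_t find_faulty_opcode(const op_handler trusted[NUM_OPCODES],
                                 const op_handler candidate[NUM_OPCODES],
                                 table_check check)
{
    op_handler mixed[NUM_OPCODES];
    size_t lo = 0, hi = NUM_OPCODES;
    while (hi - lo > 1) {
        size_t mid = lo + (hi - lo) / 2;
        build_mixed_table(mixed, trusted, candidate, lo, mid);
        if (check(mixed))
            lo = mid;   /* [lo, mid) is clean, fault is in [mid, hi) */
        else
            hi = mid;   /* fault is in [lo, mid) */
    }
    return lo;
}

/* Demo stand-ins so the sketch can be exercised without an emulator. */
static void handler_ok(int *fault_flag)  { (void)fault_flag; }
static void handler_bad(int *fault_flag) { *fault_flag = 1; }

static int demo_check(const op_handler table[NUM_OPCODES])
{
    int fault = 0;
    for (size_t op = 0; op < NUM_OPCODES; op++)
        table[op](&fault);
    return !fault;
}

/* Plant one bad handler at `faulty` and return what the search finds. */
static size_t demo_find(size_t faulty)
{
    op_handler trusted[NUM_OPCODES], candidate[NUM_OPCODES];
    for (size_t op = 0; op < NUM_OPCODES; op++) {
        trusted[op] = handler_ok;
        candidate[op] = (op == faulty) ? handler_bad : handler_ok;
    }
    return find_faulty_opcode(trusted, candidate, demo_check);
}
```

With one faulty handler this takes 8 checks for 256 opcodes. When several handlers are faulty, as was the case here, the search finds one of them per pass, so it would be repeated after each fix.
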
>> So what are the results?
>>
>> Disappointing!!!!!!!!!!!!!!!!!!!!!!!!!!!!
>>
>> On Windows the assembly 6309 CPU runs about 1-1.5% SLOWER!
>>
>> On Linux the assembly 6309 CPU runs about 45%-50% SLOWER!!!
>>
>> I can't accurately estimate the amount of time I spent on this project
>> but it is a lot (I guesstimate about 6 months worth).
>>
>> The only recompense is along the way I did learn a lot about X86 machine
>> instructions.
>>
>> And...
>>
>> The new cpu emulator can also gather instruction usage statistics.
>>
>> And, and ...
>>
>> As I was verifying/comparing the 6309 C code to the assembly code I
>> added all the missing 6309 instructions and made some corrections to
>> existing instructions that did not fully comply with the 6309
>> instruction manual.
>>
>> So the 6309 emulation in OVCC is full and complete with all
>> instructions, in both the C and assembly versions.  This could easily
>> be backported to VCC!!!
>>
>> For now I will leave the 6309 turbo option in OVCC out of interest only.
>>
>> I have also placed an additional indicator on the OVCC message bar that
>> shows whether the 6309 is running in emulation or native mode.  You will
>> see either:
>>
>> 6309E at 0.89mhz  <-- 6309 in emulation mode
>>
>> 6309N at 0.89mhz  <-- 6309 in native mode
>>
>> 6309XE at 0.89mhz  <-- 6309 turbo in emulation mode
>>
>> 6309XN at 0.89mhz  <-- 6309 turbo in native mode
>>
>> Hope I didn't rant for too long!
>>
>> Walter
>>
>>
>> --
>> Coco mailing list
>> Coco at maltedmedia.com
>> https://pairlist5.pair.net/mailman/listinfo/coco
>>

