[Coco] Devastated. Long term OVCC project falls short

John E. Malmberg wb8tyw at qsl.net
Fri May 1 09:34:06 EDT 2020


On 10/3/2019 11:06 PM, Walter Zambotti wrote:
> 
> After recompiling the C code with optimization the assembly was only 25% 
> faster. Wow I was taken aback but at the same time really impressed with 
> the optimizations of modern compilers.

I have not looked at your source at all, yet.

A lot of this depends on your platform.  My C programming has mostly 
been on Alpha, VAX, and Itanium, much of it porting code from Linux.

A compiler manual should document each type of optimization the 
compiler performs and why it performs it.  The DEC compiler 
documentation on this was very good.

There are many common C programming "tricks" that effectively shaft the 
ability of the optimizer to properly optimize code, and can really slow 
down the generated code.

The call frame for external functions and syscalls can be complex to set 
up.  The compiler usually has a list of built-ins that it knows how to 
inline or to call using other "fast" call techniques.

I have encountered projects that for some reason decided to wrap those 
functions in an external module.  Some of this is done to make handling 
null terminated strings allegedly safer.

In one case I found wrappers to stat(), which on Linux is a very fast 
return from cached memory.  On VMS, though, each call required a very 
expensive translation of a Unix path name to VMS path name, and then an 
expensive look up of the file information.

They had wrappers for stat() such as "file_exists()", "file_size()", 
and I think at least one other, and they would usually call them in 
sequence.
This totally destroyed the performance of the ported code.
They also had external wrappers for all the string functions.
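
As a rough sketch of the stat() point above: the facts those wrappers 
each fetched separately can be gathered with a single call and reused. 
The names "struct file_info" and "get_file_info()" are hypothetical and 
are not from the project described; only stat() and S_ISREG() are real 
POSIX interfaces.

```c
#include <stdbool.h>
#include <sys/types.h>
#include <sys/stat.h>

/* Hypothetical summary of what the separate wrappers each looked up. */
struct file_info {
    bool  exists;      /* stat() succeeded */
    bool  is_regular;  /* S_ISREG() was true for st_mode */
    off_t size;        /* st_size; meaningful only if exists is true */
};

/* One stat() call answers all of the questions at once, so an
 * expensive path translation and lookup happens once, not three times. */
static struct file_info get_file_info(const char *path)
{
    struct file_info info = { false, false, 0 };
    struct stat st;

    if (stat(path, &st) == 0) {
        info.exists = true;
        info.is_regular = S_ISREG(st.st_mode) != 0;
        info.size = st.st_size;
    }
    return info;
}
```

A caller that needs both the existence check and the size would call 
get_file_info() once and read both fields from the result.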

String functions are typically in-lined by the compiler.
So are memmove() and related routines.

I have also seen platforms where code that you would expect to be 
in-lined is replaced by an internal function call to a library routine.

If you have small routines in a library, it may be better to move them 
into a header file as static routines so the compiler can inline 
them.  The inline code can take up less space than the setup for 
the external call, and it removes the overhead of setting up a call frame.
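
A minimal sketch of that header approach, assuming a C99 compiler for 
the "inline" keyword (plain "static" also works on older compilers). 
The header name and the clamp_u8() routine are made up for illustration.

```c
/* clamp_u8.h: hypothetical header, names are illustrative only */
#ifndef CLAMP_U8_H
#define CLAMP_U8_H

/* Defined in the header so every caller sees the body, which leaves
 * the compiler free to inline it instead of emitting an external call.
 * "static" gives each translation unit its own private copy. */
static inline unsigned char clamp_u8(int value)
{
    if (value < 0)
        return 0;
    if (value > 255)
        return 255;
    return (unsigned char)value;
}

#endif /* CLAMP_U8_H */
```

Whether the compiler actually inlines it is still up to the optimizer 
and the optimization level in use.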

Calling 32 bit routines on a 64 bit build may or may not make a 
difference, depending on whether the library has a "Thunking" layer that 
adds a wrapper to translate from 64 bit to 32 bit.

As already mentioned, the memory alignment of variables in extern 
structures or structures passed by reference does matter.  The DEC 
compilers can attempt to diagnose this condition.
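
One common way to stay safe when data handed in by reference might not 
be aligned is to copy it out with memcpy() rather than dereference a 
cast pointer.  The sketch below assumes that situation; the function 
name read_u32_unaligned() is hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Reads a 32 bit value from an address that may not be 4-byte aligned.
 * Dereferencing (const uint32_t *)p directly would be undefined if p
 * is misaligned, and on some platforms it traps or is fixed up slowly.
 * A fixed-size memcpy() like this is typically in-lined by the
 * compiler, so it usually costs little where unaligned loads are cheap. */
static uint32_t read_u32_unaligned(const unsigned char *p)
{
    uint32_t value;

    memcpy(&value, p, sizeof value);
    return value;
}
```

The value comes back in the host's byte order; converting from a 
specific wire or file byte order is a separate step.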

Some compilers will silently fix up the alignment of internal structures 
by default.

I have seen that some compilers are supposed to be able to inline 
routines from external modules.  I am not sure how they are able to do that as 
they would need something to tell them where the external module is, 
unless they are given a list of those modules or some other hint.

C is also one of those languages where the specification has a lot of 
places where it states that the result of some input is "undefined".
Most programmers never see those cases bite them because they 
program only on one platform.
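
Signed integer overflow is one well-known example of such undefined 
behavior: code that adds first and checks afterwards may appear to work 
on one compiler and be optimized differently on another.  A minimal 
sketch of checking before the addition follows; checked_add() is a 
hypothetical helper, not a standard library routine.

```c
#include <limits.h>
#include <stdbool.h>

/* Adds a and b only if the sum fits in an int.
 * Returns false, leaving *result untouched, when a + b would overflow,
 * so the undefined operation is never actually performed. */
static bool checked_add(int a, int b, int *result)
{
    if ((b > 0 && a > INT_MAX - b) ||
        (b < 0 && a < INT_MIN - b))
        return false;

    *result = a + b;
    return true;
}
```

The range test uses only subtractions that cannot themselves overflow, 
which is why it is written in terms of INT_MAX - b and INT_MIN - b.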

Regards,
-John


