[Coco] Devastated. Long term OVCC project falls short
John E. Malmberg
wb8tyw at qsl.net
Fri May 1 09:34:06 EDT 2020
On 10/3/2019 11:06 PM, Walter Zambotti wrote:
>
> After recompiling the C code with optimization the assembly was only 25%
> faster. Wow I was taken back but at the same time really impressed with
> the optimizations of modern compilers.
I have not looked at your source at all, yet.
A lot of this depends on your platform. My C programming has mostly
been on Alpha, VAX, and Itanium. A lot of it porting code from Linux.
A compiler manual should document each type of optimization the
compiler performs and why it does it. The DEC compiler documentation
on this was very good.
There are many common C programming "tricks" that effectively shaft the
ability of the optimizer to properly optimize code, and can really slow
down the generated code.
The call frame for external functions and syscalls can be complex to set
up. The compiler usually has a list of built-ins that it knows how to
inline or call using other "fast" call techniques.
I have encountered projects that for some reason decided to wrap those
functions in an external module. Some of this is done to make handling
null terminated strings allegedly safer.
In one case I found wrappers to stat(), which on Linux is a very fast
return from cached memory. On VMS, though, each call required a very
expensive translation of a Unix path name to VMS path name, and then an
expensive look up of the file information.
They had wrappers for stat() named "file_exists()" and "file_size()",
and I think at least one other, and they would usually call them in
sequence.
This totally destroyed the performance of the ported code.
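One way to avoid the repeated lookups described above is to call stat()
once and keep the result. This is only a minimal sketch, not the code
from that project; struct file_info and get_file_info() are names made
up for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <time.h>

/* Hypothetical result type: everything the caller wants from one lookup. */
struct file_info {
    bool   exists;   /* true if stat() succeeded */
    off_t  size;     /* st_size, meaningful only if exists is true */
    time_t mtime;    /* st_mtime, meaningful only if exists is true */
};

/* Hypothetical helper: a single stat() call answers all the questions,
 * so any expensive path translation and lookup happens only once.
 * Any stat() failure (not just "no such file") is reported as
 * exists == false in this sketch. */
static struct file_info get_file_info(const char *path)
{
    struct file_info info = { false, 0, 0 };
    struct stat st;

    assert(path != NULL);
    if (stat(path, &st) == 0) {
        info.exists = true;
        info.size   = st.st_size;
        info.mtime  = st.st_mtime;
    }
    return info;
}
```

A caller that needs both the existence check and the size would call
get_file_info() once and read both fields, instead of calling two
wrappers that each run stat() again.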
They also had external wrappers for all the string functions.
String functions are typically in-lined by the compiler. So are
memmove() and related routines.
I have also seen platforms where code that you would expect to be
in-lined is replaced by an internal function call to a library routine.
If you have small routines in a library, it may be better to move them
into a header file as static routines so the compiler can inline them.
The inline code can take up less space than the setup for the external
call, and it removes the overhead of setting up a call frame.
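As a rough sketch of that idea, here is a NULL-tolerant string length
wrapper written as a static inline routine in a header. The header name
safe_str.h and the function safe_strlen() are hypothetical. Whether the
compiler actually inlines it, and whether it treats strlen() as a
built-in, depends on the compiler and the optimization level.

```c
/* safe_str.h: hypothetical header name used for this illustration */
#ifndef SAFE_STR_H
#define SAFE_STR_H

#include <stddef.h>
#include <string.h>

/* Hypothetical NULL-tolerant strlen(). Because the body is visible in
 * every file that includes this header, the compiler is free to inline
 * it and to apply its own handling of strlen() at the call site. */
static inline size_t safe_strlen(const char *s)
{
    return (s != NULL) ? strlen(s) : 0;
}

#endif /* SAFE_STR_H */
```

The same wrapper placed in a separate module would normally cost a full
external call every time, unless the compiler has some way to see into
that module.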
Calling 32 bit routines on a 64 bit build may or may not cause
performance differences if the library has a "Thunking" layer that adds
a wrapper to translate from 64 bit to 32 bit.
As already mentioned, the memory alignment of variables in extern
structures or structures passed by reference does matter. The DEC
compilers can attempt to diagnose this condition.
Some compilers will silently fix up the alignment of internal structures
by default.
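If the compiler does not diagnose this for you, the layout it chose can
be inspected with offsetof() and sizeof. The sketch below assumes a C11
compiler; the two structures and the is_aligned() helper are made up for
this illustration, and the actual offsets and sizes depend on the
platform ABI.

```c
#include <stdalign.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical record whose member order forces padding before
 * 'value' on most platforms. */
struct rec_padded {
    char   tag;
    double value;
    char   flag;
};

/* Same members with the most strictly aligned one first, which
 * usually needs less padding. */
struct rec_reordered {
    double value;
    char   tag;
    char   flag;
};

/* Hypothetical helper: true if 'offset' is a multiple of 'alignment'. */
static bool is_aligned(size_t offset, size_t alignment)
{
    return offset % alignment == 0;
}
```

For example, is_aligned(offsetof(struct rec_padded, value),
alignof(double)) tells you whether the member sits on its natural
boundary, and comparing sizeof for the two structures shows how much
padding the member order costs on your platform.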
I have seen that some compilers are supposed to be able to inline
routines from external modules. I am not sure how they are able to do
that, as they would need something to tell them where the external
module is, unless they are given a list of those modules or some other
hint.
C is also one of those languages whose specification has a lot of
places where it states that the result of some input is "undefined".
Most programmers never see those cases bite them because they program
on only one platform.
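One well-known case is signed integer overflow, which the C standard
leaves undefined, so code that relies on wraparound may behave
differently between compilers and optimization levels. A minimal sketch
of avoiding it is to test before adding; checked_add() is a name made up
for this illustration.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: stores a + b in *sum and returns true only when
 * the result fits in an int. The range test is done before the
 * addition, so the overflowing addition is never executed. */
static bool checked_add(int a, int b, int *sum)
{
    if ((b > 0 && a > INT_MAX - b) ||
        (b < 0 && a < INT_MIN - b)) {
        return false;   /* would overflow; *sum is left untouched */
    }
    *sum = a + b;
    return true;
}
```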
Regards,
-John