[Coco] Update on new Coco 3 game engine

Fedor Steeman petrander at gmail.com
Tue Aug 13 05:28:33 EDT 2013


Holy crap that is awesome!!!

I have long been considering something similar, but never had the time nor
the know-how.

This could become an enormous stimulus for CoCo game development!

Cheers,
Fedor



On 13 August 2013 07:26, Richard Goedeken
<Richard at fascinationsoftware.com> wrote:

> Hello Coco fans!
>
> In my long-term quest to write a side-scrolling arcade/adventure game for
> my daughter, I began earlier this year with one of the hardest parts:
> building a fast enough graphics engine to handle the scrolling and sprites.
>  I figured that if I couldn't get this working well enough, then there
> would be no point in doing all of the work creating the game elements.  I'm
> writing this email because this graphics code is nearly complete, and I
> wanted to share some of the many interesting things that I learned about
> the Coco and the 6809 during this development.
>
> In the design of the graphics engine, there are many decisions to be made
> which trade off between performance and visual quality.  The one major
> advantage that the Coco has over the other 8-bit micros of the era is the
> large available memory pool of 512k.  I wanted to use this to my advantage
> as much as possible, and you will see it in some of the choices that I made.
>
> I decided to use double buffering, which is very common, to eliminate
> tearing and flashing artifacts.  This requires twice as much memory usage,
> and also requires us to redraw twice as many background pixels as we
> otherwise would.  For example, consider the case in which the screen
> background is moving at 1 byte (2 pixels) per frame horizontally.  Buffer 0
> is drawn at a starting point of (0,0).  For the next frame, buffer 1 is
> drawn at (1,0).  For the following frame we will switch back to buffer 0.
>  We need to draw the new pixels for this screen buffer at a starting point
> of (2,0). We have already drawn this buffer at (0,0), so we only need to
> add two columns of bytes (4 pixels) on the right side to paint in the
> missing part of the screen.  So we must draw a column 4 pixels wide, even
> though we have only moved 2 pixels from the previous frame. This is because
> each buffer only gets updated every other frame.
>
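A minimal sketch of the catch-up arithmetic described above, in Python rather than 6809 code. The function and variable names are made up for illustration; the only thing taken from the email is that each buffer is redrawn every other frame and so has to cover two frames of scrolling.

```python
PIXELS_PER_BYTE = 2  # 16-color mode packs 2 pixels into each byte


def columns_to_redraw(scroll_x_bytes, buffer_last_x):
    """Byte columns a buffer must paint to catch up to scroll_x_bytes.

    buffer_last_x is the X start (in bytes) at which this buffer was last drawn.
    """
    return abs(scroll_x_bytes - buffer_last_x)


# Buffer 0 was fully drawn at x=0 and buffer 1 at x=1, as in the example above.
last_x = [0, 1]
history = []
for frame in range(2, 5):
    x = frame        # scrolling right at 1 byte per frame
    buf = frame % 2  # the two buffers alternate
    cols = columns_to_redraw(x, last_x[buf])
    history.append((frame, buf, cols, cols * PIXELS_PER_BYTE))
    last_x[buf] = x

for entry in history:
    print("frame %d: buffer %d paints %d byte columns (%d pixels)" % entry)
```

Every frame paints 2 byte columns even though the screen only moved 1 byte since the previous frame.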
> I also decided to use the 256-byte wide screen mode.  This increases
> memory usage for the screens by 60%, but it gives us some good advantages:
>  1. Pixel location calculations are greatly simplified (no need to multiply
> by 160).  2. We do not need to clip sprites on the sides when we draw them,
> because it's okay to draw a little offscreen.  3. Background block
> redrawing can be faster and more consistent in time between frames by
> always drawing the full width of the blocks.
>
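To illustrate the first advantage above, here is a small Python sketch comparing the two address calculations. The function names are hypothetical. The 160 bytes per row comes from the "multiply by 160" remark, and the 256-byte case assumes the offset fits in 16 bits.

```python
NARROW_BYTES_PER_ROW = 160  # row width implied by "multiply by 160"
WIDE_BYTES_PER_ROW = 256    # the 256-byte wide screen mode


def offset_narrow(x_byte, y):
    """Screen offset when rows are packed 160 bytes apart (needs a multiply)."""
    return y * NARROW_BYTES_PER_ROW + x_byte


def offset_wide(x_byte, y):
    """Screen offset when rows are 256 bytes apart (valid for 0 <= x_byte < 256).

    The row number simply becomes the high byte of a 16-bit offset.
    """
    return (y << 8) | x_byte


# Extra screen memory used by the wider rows, as a percentage.
extra_percent = (WIDE_BYTES_PER_ROW - NARROW_BYTES_PER_ROW) * 100 // NARROW_BYTES_PER_ROW

print(hex(offset_wide(0x12, 0x34)), extra_percent)
```

On the 6809 the wide case amounts to putting the row in the high byte and the column in the low byte of a 16-bit register, so no multiply is needed.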
> I really wish the GIME designers had provided for byte-level horizontal
> screen positioning.  It is extremely unfortunate that it can only set the
> horizontal scroll position in 2-byte (4 pixel) increments.  The only way to
> make it scroll smoothly with this constraint is to scroll A) very fast, and B)
> at a constant speed.  Some games (Crystal City) do this and it looks
> impressive, but this scrolling is faster than I want, and I also would like
> to vary the scrolling speed.  Slower scrolling is too jerky with 2-byte
> positioning.  In software, we can do 2-pixel scrolling by using a pair of
> screen planes (even and odd) for each of the front and back buffers.  One
> screen plane is offset by one byte, and we choose which plane to display on
> the monitor when we are flipping front/back buffers in the vertical
> interrupt by looking at the lowest bit of the X screen start position in
> bytes.  The penalty for this finer scrolling is doubling the video memory
> usage, and about 30% more time to draw the background pixels.
>
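The plane selection can be sketched in a few lines of Python. This is a guess at the bookkeeping, not the engine's code: it assumes the hardware offset is counted in 2-byte units and that the odd plane holds the same image stored one byte over. `select_plane` is a made-up name.

```python
def select_plane(x_start_bytes):
    """Split a byte-level X start into (plane, hardware offset).

    plane: 0 = even plane, 1 = odd plane (lowest bit of the X start in bytes)
    hardware offset: the coarse scroll position in 2-byte (4 pixel) units
    """
    plane = x_start_bytes & 1
    hw_offset = x_start_bytes >> 1
    return plane, hw_offset


for x in range(4, 8):
    print(x, select_plane(x))
```

Each byte position maps to a distinct (plane, offset) pair, which is what gives 2-pixel steps on hardware that only moves in 4-pixel steps.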
> So that's the background scrolling engine.  I posted a demo on this list a
> few months ago.  I recently rewrote the block drawing functions with an
> improved copying algorithm, so that it is now interrupt friendly (this is
> required for sound), and also a little bit faster.
>
> Regarding performance, the amount of time which passes between one field
> of the NTSC video output from the Coco and the next is 16.7 milliseconds.
>  This is our "time budget".  To achieve 60fps operation, we must
> draw/erase/redraw everything necessary, as well as read input
> keyboard/joystick state and do physics calculations in less than this time.
>  Similarly, to run at 30fps we need to finish all these calculations in
> less than 33.4ms.  One thing that I realized is that the computational
> workload for the game can vary greatly depending upon number of objects on
> the screen, the positions of the objects, whether the background is
> scrolling and by how much, etc.  Rather than try to achieve a constant
> frame rate (at which every frame will be bound by the worst case), it is
> better to support a variable frame rate.  This is a common technique used
> in modern games, and in fact I even noticed this in the new Pikmin 3 game
> for the Wii U. Since we already use double-buffering, this can be supported
> with a small penalty when doing the physics calculations.  So my game
> engine does this: I track the number of 60 Hz fields which pass between
> frame updates, and use this value for updating the game state ('physics'
> calculations).  For example, all objects will move at 3* their nominal
> speed if there were 3 field durations which passed between the last pair of
> frame updates.  For simplicity and performance, I only support 1x, 2x, and
> 3x field times for the variable frame rate.  If it takes more than 3 field
> durations to calculate a frame, then the game will appear to 'lag' or slow
> down.  Otherwise, it will just get a little choppier as it slows down, but
> will appear to run at the same perceptual speed.
>
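A short Python sketch of the variable frame rate update described above. The names are invented for illustration; the 1x to 3x clamp is the one stated in the email.

```python
MAX_FIELDS = 3  # only 1x, 2x and 3x field times are supported


def advance(position, velocity_per_field, fields_elapsed):
    """Move an object by its nominal per-field speed times the elapsed fields.

    The multiplier is clamped to 1..MAX_FIELDS, so a frame that took longer
    than 3 fields moves objects less than real time and the game appears to lag.
    """
    steps = max(1, min(fields_elapsed, MAX_FIELDS))
    return position + velocity_per_field * steps


print(advance(100, 2, 1))  # one field passed: moves 2
print(advance(100, 2, 3))  # three fields passed: moves 6
print(advance(100, 2, 5))  # five fields passed: still moves only 6
```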
> The performance of the scrolling engine is pretty good.  Here is a table
> which shows the number of milliseconds required to update the background
> (in terms of bytes for horizontal scrolling, and rows for vertical
> scrolling):
>
> Time (millisec)  -8    -6    -4    -2     0     2     4     6     8
> -------------------------------------------------------------------
> Horizontal     12.3   9.7   7.2   4.3   1.4   4.3   7.2   9.8  12.3
> Vertical       12.0  10.2   8.4   5.0   1.4   5.0   8.3  10.1  12.0
> -------------------------------------------------------------------
>
> The total overhead of the engine with no objects running is 1.4
> milliseconds.  This includes reading 2 axes of one joystick and all the
> screen redraw logic.  One thing that I noticed is that IRQ overhead of the
> 6809 is really high.  The horizontal interrupt is a killer.  The overhead
> for even an FIRQ is 21 cycles (10 cycles to enter, 5 for the LBRA at $FExx,
> and 6 for the RTI).  The fastest routine that I can come up with to handle
> both VSync and sound is a minimum of 45 cycles in 9 instructions, and I
> would probably need more than this to dynamically update the screen based
> on row number.  So we need a minimum of 66 cycles for this interrupt
> routine, and here's the kicker: the horizontal interrupt signals arrive
> only 114 cycles apart.  Therefore, using the horizontal interrupt will
> occupy a minimum of 58% of all clock cycles, regardless of frame rate.
>  This is too steep for me, so I will not use this and won't be able to
> split up the screen into horizontal regions, like Nick is doing for Popstar
> Pilot.  I'll run the sound at a lower frequency off of the 12-bit timer,
> and turn off this interrupt source when the sound is not playing.
>
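The cycle budget above works out as follows. This is just the email's own figures redone in Python, with made-up constant names.

```python
FIRQ_ENTRY_CYCLES = 10    # cycles to enter the FIRQ
LBRA_CYCLES = 5           # LBRA at $FExx
RTI_CYCLES = 6            # return from interrupt
HANDLER_CYCLES = 45       # fastest VSync + sound routine quoted above
CYCLES_BETWEEN_HSYNC = 114

overhead = FIRQ_ENTRY_CYCLES + LBRA_CYCLES + RTI_CYCLES
total = overhead + HANDLER_CYCLES
fraction = total / CYCLES_BETWEEN_HSYNC

print(overhead, total, round(fraction * 100))
```

That gives 21 cycles of fixed overhead, 66 cycles per interrupt, and about 58% of all cycles spent in the horizontal interrupt.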
> During the last few months I've made a lot of progress on the sprite
> portion of the graphics engine.  I believe that my design for this is
> novel, and it is about as fast as it could be.  My goal here was maximum
> theoretical performance.  Again, I traded off memory consumption for speed.
>  Part of the challenge with drawing sprites is that the 16 color mode packs
> 2 pixels into a single byte.  If you want to support sprites with 1-pixel
> wide features, you must mask the background bytes with a logical AND, and
> then OR/ADD the results with the sprite pixels before writing back to the
> screen.  The fastest general-purpose sprite routines that I can write
> require 3720 cycles to erase and write (while saving background data for
> later erasing) a 16x16 sprite.  This works out to 14.5 cycles per pixel
> (assuming that all 256 pixels are drawn).
>
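Here is a rough Python model of the AND/OR masking step for one screen byte. It assumes a sprite pixel of `None` means transparent, and it makes no claim about how the real routines lay out their data. The helper names are hypothetical.

```python
def make_mask_and_data(hi, lo):
    """Build the (mask, data) pair for one screen byte holding two 4-bit pixels.

    hi and lo are colors 0-15, or None for a transparent pixel.
    The mask keeps the background nibble wherever the sprite is transparent.
    """
    mask = (0xF0 if hi is None else 0x00) | (0x0F if lo is None else 0x00)
    data = ((hi or 0) << 4) | (lo or 0)
    return mask, data


def blend_byte(background, mask, data):
    """AND the background with the mask, then OR in the sprite pixels."""
    return (background & mask) | data


mask, data = make_mask_and_data(0xA, None)  # only the high-nibble pixel is drawn
print(hex(blend_byte(0x12, mask, data)))

cycles_per_pixel = 3720 / (16 * 16)         # the general-purpose routine's cost
print(cycles_per_pixel)
```

The background's low nibble (2) survives while the high nibble is replaced by color A, and the cost figure comes out to 14.53125 cycles per pixel, which the email rounds to 14.5.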
> To achieve the maximum possible performance with my sprite engine, I wrote
> a sprite compiler.  This software is a large and complex Python script,
> which reads sprite data from a file and writes out near-optimal 6809
> assembly for drawing and erasing sprites on the screen.  It basically
> paints them, byte by byte.  Even though the sprite compiler includes a lot
> of crazy optimizations, the performance gains that I get on a
> cycle-per-pixel basis are relatively small and mostly attributable to two
> techniques: 1) I don't need to AND mask the bytes/words which will get
> completely overwritten, and 2) I can minimize foreground pixel loads by
> grouping together writes with the same byte/word values.  For the few
> sprites with which I've been testing, the compiled sprite code takes an
> average of 12.9 cycles/pixel to erase and draw, which is only 12% faster
> than the general purpose routine, but the big gain comes from the fact that
> we only draw and erase the bytes which contain non-transparent pixels.
>  When we look at the overall time consumed (rather than cycles per pixel),
> the new sprite engine turns out to be much faster than the general purpose
> routine.  For example, I can draw+erase a 15-pixel diameter ball in under
> 2000 cycles.  This is much faster than the general purpose routine, which
> would take 3720.  I can draw+erase nice outlined 8x16 numeric characters in
> 1000 cycles or less each.
>
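To make the two techniques above concrete, here is a toy sprite compiler in Python. It is not the DynoSprite compiler, only an illustration: it emits byte-wide draw code only (no erase, no background save, no word-sized writes), assumes X points at the sprite's top-left screen byte, and assumes rows are 256 bytes apart. All names are made up.

```python
from collections import defaultdict

BYTES_PER_ROW = 256  # assumes the 256-byte wide screen mode


def compile_sprite(rows):
    """Turn rows of pixels (0-15, or None = transparent) into 6809 draw code."""
    opaque = defaultdict(list)  # byte value -> offsets that are fully overwritten
    masked = []                 # (offset, mask, data) for half-transparent bytes
    for y, row in enumerate(rows):
        for i in range(0, len(row), 2):
            hi, lo = row[i], row[i + 1]
            if hi is None and lo is None:
                continue        # fully transparent byte: emit nothing at all
            offset = y * BYTES_PER_ROW + i // 2
            data = ((hi or 0) << 4) | (lo or 0)
            if hi is not None and lo is not None:
                opaque[data].append(offset)
            else:
                mask = (0xF0 if hi is None else 0) | (0x0F if lo is None else 0)
                masked.append((offset, mask, data))
    asm = []
    for data, offsets in opaque.items():
        # Technique 2: load each distinct value once, then store it everywhere.
        asm.append(f"    LDA  #${data:02X}")
        asm.extend(f"    STA  {off},X" for off in offsets)
    for off, mask, data in masked:
        # Only bytes with a transparent pixel pay for the AND mask (technique 1).
        asm += [f"    LDA  {off},X", f"    ANDA #${mask:02X}",
                f"    ORA  #${data:02X}", f"    STA  {off},X"]
    asm.append("    RTS")
    return asm


sprite = [[None, 5, 5, 5, 5, None],
          [None, None, 5, 5, None, None]]
print("\n".join(compile_sprite(sprite)))
```

For this 6x2 sprite the output has one immediate load shared by both fully opaque bytes, two masked bytes, and nothing at all for the two transparent bytes.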
> So I'm happy with the performance.  As I mentioned before, the tradeoff is
> increased memory consumption.  For a general-purpose engine, you would use
> probably 2 bytes per pixel to store sprite data (each sprite object would
> have 2 copies to get single-pixel positioning, and each copy would contain
> a mask byte and a foreground pixel byte for each screen byte).  For my
> engine, the generated machine code for drawing and erasing the sprites
> varies, but comes out to about 6 bytes per pixel if you only need
> byte-level positioning (i.e., for letters/numbers), or 9 bytes per pixel if
> you want pixel-level positioning.  It's a pretty heavy memory penalty, but
> I think it's worth the speed.  The maximum sprite size is about 62x32, and
> the cool thing is that the sprites can be any shape or size, and will be
> optimized for just the pixels which get written to the screen.
>
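The memory figures above can be put side by side with a little Python. The 2, 6 and 9 bytes-per-pixel numbers are the email's; applying them to a fully drawn 16x16 sprite is my own example, and the function names are hypothetical.

```python
def general_bytes_per_pixel(copies=2, bytes_per_screen_byte=2, pixels_per_screen_byte=2):
    """2 shifted copies, each with a mask byte and a pixel byte per screen byte."""
    return copies * bytes_per_screen_byte / pixels_per_screen_byte


def sprite_storage(pixels_drawn, bytes_per_pixel):
    """Approximate storage for a sprite, given the pixels it actually draws."""
    return pixels_drawn * bytes_per_pixel


pixels = 16 * 16
print(sprite_storage(pixels, general_bytes_per_pixel()))  # general-purpose data
print(sprite_storage(pixels, 6))                          # compiled, byte-level
print(sprite_storage(pixels, 9))                          # compiled, pixel-level
```

Under those assumptions a solid 16x16 sprite costs about 512 bytes as data, versus roughly 1536 or 2304 bytes as compiled code.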
> With the high memory consumption (and a scrolling graphics aperture which
> can move anywhere in the physical RAM space), it is desirable to abstract
> the 8k memory page (de)allocation and mapping.  So, one of the very first
> modules of code that I wrote is a simple virtual memory manager which
> tracks the 8k pages which are allocated by different parts of the engine.
>  It automatically moves them when the screen aperture moves to overlap
> with an in-use block.
>
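A minimal Python sketch of what such a page manager might look like. The class, its methods and the first-free allocation policy are all assumptions for illustration; the email only says that 8k pages are tracked and moved out of the way of the screen aperture.

```python
PAGE_SIZE = 8 * 1024
NUM_PAGES = 512 * 1024 // PAGE_SIZE  # 64 physical pages in a 512k machine


class PageManager:
    def __init__(self):
        self.owner = [None] * NUM_PAGES  # physical page -> handle, or None if free
        self.location = {}               # handle -> physical page
        self.aperture = set()            # pages currently covered by the screen
        self._next_handle = 0

    def _find_free(self):
        for page in range(NUM_PAGES):
            if self.owner[page] is None and page not in self.aperture:
                return page
        raise MemoryError("no free 8k page")

    def allocate(self):
        """Claim a free page outside the aperture and return a stable handle."""
        page = self._find_free()
        handle = self._next_handle
        self._next_handle += 1
        self.owner[page] = handle
        self.location[handle] = page
        return handle

    def move_aperture(self, first_page, page_count):
        """Move the screen aperture, relocating any in-use page it now covers."""
        self.aperture = set(range(first_page, first_page + page_count))
        for page in sorted(self.aperture):
            handle = self.owner[page]
            if handle is not None:
                new_page = self._find_free()
                # A real engine would copy the 8k of data to new_page here.
                self.owner[page] = None
                self.owner[new_page] = handle
                self.location[handle] = new_page


pm = PageManager()
pm.move_aperture(0, 4)   # screen occupies pages 0-3
h = pm.allocate()        # lands on page 4, the first page outside the aperture
pm.move_aperture(2, 4)   # screen now occupies pages 2-5, so page 4 must move
print(pm.location[h])
```

Callers hold on to the handle rather than a physical page number, so a relocation only changes the manager's own tables.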
> I'm really excited about this graphics/game engine, because it is
> sufficiently generalized that it could be used for a lot of great games in
> addition to platformers.  It's not suitable for every genre, but it would
> work well for several different game types.  I would love to do a top-down
> racer like Micro Machines.  If it were simple enough, it could look
> beautiful running at 60fps.  This engine is also suitable for horizontal
> shoot-em-ups and top-down or isometric RPG, graphic adventure, or arcade
> action games.  The sprite functionality could be extracted separately from
> the background scrolling engine and could be used in any type of game.
>
> I also came up with a name for this engine: I call it DynoSprite.  I have
> a few more weeks of work to do on a demo that I will release to show the
> sprite functionality.  With any luck I should have something cool to show
> you soon.
>
> Richard
>
> --
> Coco mailing list
> Coco at maltedmedia.com
> http://five.pairlist.net/mailman/listinfo/coco
>