[Coco] A Technical Question

John Kent jekent at optusnet.com.au
Fri May 18 01:29:24 EDT 2012



On 18/05/2012 5:05 AM, jdaggett at gate.net wrote:
> The Coco line uses what is called IDMA, Interleaved DMA, to access a 
> shared RAM block for video and data/program storage. This is done 
> using the two-phase clock system of the board. The E clock is the 
> controlling clock for peripheral access on the MC6809 series of 
> processors. When the E clock is low the microprocessor is not 
> accessing any external memory or peripherals. Therefore the video 
> controller can access the memory at that time to read data. When the E 
> clock is high the microprocessor is accessing memory or a 
> peripheral device. This method is okay up to clock speeds of 
> about 25MHz. After that, standard DRAM can become too slow. Even static 
> RAM is close to not usable. The reason is that the access time of the 
> RAM can be no more than half of the E clock cycle. So for a 
> 2MHz E clock, 500nS cycle time, the RAM used must be no slower than 
> 250nS access time. To move up to a 25 MHz E clock speed, 40nS cycle 
> period, the RAM used must be no slower than 20nS access time. There 
> are advantages to what the COCO does, and disadvantages also. One 
> disadvantage is that any shared video memory access has to be done based 
> on a multiple of the E clock for the 6809. Another disadvantage is 
> that 10nS SRAM is rather expensive per megabyte. If you want 2 
> megabytes of 10nS SRAM, you had better be prepared to plunk down about $25 
> or so. The advantage of this system is with slower CPU clock speeds. An 
> example would be with a 2 MHz bus speed for the 6809. The cpu cycle 
> time is 500nS. The CPU access of memory/peripheral would be on the 
> second half of the CPU cycle. During the first 250 nS you can read as 
> many bytes as you can depending on how fast the ram access is. SRAM at 
> 20nS could read 8 bytes easily during the first 250nS of the machine 
> cycle. DRAM could be put into burst mode and read as many bytes as the 
> dram will allow in 250nS, providing the dram is fast enough to read 
> the desired number of bytes. In such a system you would have to pass 
> all the 6809 read data through a controller much like the GIME chip to 
> multiplex between the CPU and video controller. It gets a bit nasty, 
> but is quite doable especially in an FPGA. Then there is the route 
> that most PCs use. Access video during the blanking intervals. This 
> requires DDR or QDR dram chips to read enough data fast enough 
> depending on what the pixel clock is and the pixels per row and the 
> color depth. Most of the 6809 FPGA designs use a single clock and are 
> pretty much limited to about 25MHz bus speed. IDMA would require 20nS 
> SRAM or faster. DDR ram is 3.5nS, but a more complex controller is 
> needed, and running the CPU at even 20 MHz creates its own problems. 
> james 
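The half-cycle budget in the quoted message can be checked with a few lines of arithmetic. This is only a sketch: the function names are made up for illustration, and it assumes an ideal RAM with no address setup or bus turnaround time, which real hardware will not give you.

```python
def e_cycle_ns(e_mhz):
    """Length of one E clock cycle in nanoseconds."""
    return 1000.0 / e_mhz


def max_access_ns(e_mhz):
    """Slowest usable RAM access time: half of the E cycle,
    because CPU and video each get one half (interleaved access)."""
    return e_cycle_ns(e_mhz) / 2.0


def video_reads_per_half_cycle(e_mhz, access_ns):
    """Upper bound on back-to-back video reads in the E-low half cycle.
    Assumes zero overhead between reads (an idealisation)."""
    return int(max_access_ns(e_mhz) // access_ns)


if __name__ == "__main__":
    for mhz in (2, 25):
        print(mhz, "MHz E clock:", e_cycle_ns(mhz), "ns cycle,",
              max_access_ns(mhz), "ns max access time")
```

Under these assumptions a 2 MHz E clock leaves 250 ns per half cycle and a 25 MHz E clock leaves 20 ns, matching the figures quoted above. The ideal upper bound for 20 ns SRAM at 2 MHz is 12 reads, which is consistent with "8 bytes easily" once real overheads are allowed for.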
I think the 1MB SRAM on the Digilent Spartan 3 starter board has a 
10nsec access time. It's organized as 256K x 32 bits. You have to use a 
multiplexer and demultiplexer on the data bus to convert each 32 bit word 
to 4 consecutive bytes. That will introduce propagation delays too. You have 
to deselect the SRAM when you change the address if you are doing a write, so 
the cycle effectively becomes 20nsec if you use the FPGA clock to deselect it.
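As a rough behavioural model of that byte multiplexing (not the actual FPGA logic), the sketch below maps an 8 bit CPU access onto a 32 bit wide memory. The class name `WordSram` and the choice of lane 0 as the least significant byte are assumptions made for illustration; a real design can wire the lanes either way.

```python
class WordSram:
    """32 bit wide memory accessed one byte at a time.

    Assumption: byte lane 0 is the least significant byte of the word.
    """

    def __init__(self, words):
        self.mem = [0] * words

    def read_byte(self, byte_addr):
        word = self.mem[byte_addr >> 2]      # upper address bits pick the word
        lane = byte_addr & 3                 # low two bits pick the byte lane
        return (word >> (8 * lane)) & 0xFF   # multiplexer: word -> byte

    def write_byte(self, byte_addr, value):
        index = byte_addr >> 2
        shift = 8 * (byte_addr & 3)
        keep = ~(0xFF << shift) & 0xFFFFFFFF  # mask that clears only the target lane
        # demultiplexer: steer the byte into its lane, leave the others alone
        self.mem[index] = (self.mem[index] & keep) | ((value & 0xFF) << shift)


if __name__ == "__main__":
    ram = WordSram(256 * 1024)               # 256K x 32 bits = 1MB
    for addr, value in enumerate((0x11, 0x22, 0x33, 0x44)):
        ram.write_byte(addr, value)
    print(hex(ram.mem[0]), hex(ram.read_byte(2)))
```

In hardware the write side would normally be done with per-lane byte enables rather than a read-modify-write, if the SRAM provides them.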

I think Gary Becker multiplexed the 16 bit data bus on the Terasic DE1 
board between video and CPU, which meant the video could access the RAM 
at twice the data rate. Because the FPGA 6809 is running with a 25MHz E 
clock, you have to insert wait states on the processor while the video 
controller accesses memory. I've implemented a hold signal on my 6809 
design that disables the clock signal to all the registers so it 
effectively introduces a wait cycle. I'm not sure what the RAM access 
speed is on the DE1 board. It might be 15nsec.
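The effect of a hold signal like that can be shown with a toy cycle-by-cycle model. This is a sketch and not the actual 6809 core: the single `pc` counter stands in for all the CPU registers, and `run` is a hypothetical name.

```python
def run(hold_pattern):
    """Step a toy CPU through one clock per entry in hold_pattern.

    While hold is asserted the register update is suppressed, as if the
    clock enable to every register were turned off, so the state simply
    repeats: one wait cycle per held clock.
    """
    pc = 0                      # stand-in for the whole register set
    trace = []
    for hold in hold_pattern:
        if not hold:
            pc += 1             # registers only update when not held
        trace.append(pc)
    return trace


if __name__ == "__main__":
    # hold asserted on the clocks where video owns the memory bus
    print(run([0, 1, 0, 1, 1, 0]))
```

Each held clock costs the CPU exactly one cycle and leaves its state untouched, which is why it behaves like a wait state.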

I've implemented my FPGA 6809 on an XESS XSA-3S1000 which uses SDRAM. 
The SDRAM can run with a 100MHz clock I think, but I might run it at 
50MHz. With SDRAM though, accesses are normally delayed by say around 6 
clock cycles so you end up running the CPU at a much slower speed. 
Ideally you need to use the block RAM in the FPGA as cache with SDRAM, 
so that the CPU is only held up if there is a cache miss and the cache 
has to be updated. I have not got as far as implementing cache on my 
FPGA 6809 designs.
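To get a feel for how much a cache would help, here is a minimal direct-mapped cache model that counts clock cycles. It is a sketch under stated assumptions: one byte per cache line, a one cycle hit, and a six cycle miss penalty taken from the rough SDRAM delay mentioned above. The function name and parameters are hypothetical.

```python
def total_cycles(addresses, lines=4, hit_cycles=1, miss_penalty=6):
    """Count cycles for a sequence of reads through a direct-mapped cache.

    Assumptions: one byte per line, reads only, and a miss costs the
    normal hit time plus miss_penalty cycles while the CPU is held.
    """
    tags = [None] * lines           # one tag per cache line, None = empty
    cycles = 0
    for addr in addresses:
        index = addr % lines        # which cache line this address maps to
        tag = addr // lines         # which memory block is held there
        if tags[index] == tag:
            cycles += hit_cycles    # hit: served from block RAM
        else:
            tags[index] = tag       # miss: fill the line from SDRAM
            cycles += hit_cycles + miss_penalty
    return cycles


if __name__ == "__main__":
    loop = [0, 1, 0, 1, 0, 1]       # a tight loop touching two addresses
    print("with cache:", total_cycles(loop))
    print("every access to SDRAM:", len(loop) * (1 + 6))
```

With these numbers the tight loop takes 18 cycles instead of 42. Two addresses that map to the same line, such as 0 and 4 with four lines, miss on every access and gain nothing.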

Ideally you want a Harvard design with separate instruction and data 
caches. It would require separate address and data buses for instructions 
and for data, each with its own cache. The data memory accesses 
would then need to be pipelined but could be performed concurrently 
with the next instruction fetch. The only issue though is that on 
conditional branches, the previous data operation must be complete 
before the branch is taken. It might be something that could be looked 
at in the future to make the FPGA 6809 run faster.

John.

-- 
http://www.johnkent.com.au
http://members.optusnet.com.au/jekent



