[Coco] DW4 problems

Aaron Wolfe aawolfe at gmail.com
Sat Oct 13 04:14:22 EDT 2012


I replied directly to your bug report, but some general info in case
other people see this message...

Generally, if you see "OP_WRITE took X ms.  Server loaded or low on
ram?" in the log, it actually has nothing to do with server load or
RAM.  I need to change that message.  If you happen to be running DW4
on an embedded Linux system with a 133 MHz processor and 16 MB of RAM,
as I was when that message was put into the monitoring code, it is
often accurate :)  However, on a modern PC it usually means something
quite different.

Every command/response exchange between the CoCo and the DW server is
governed by a timeout of ~200ms.  Both the OS9 driver and the DW
server will generally consider an operation to have failed if they
wait that long for data from the other side.  Writing a sector is a
particularly sensitive operation because the DW server doesn't want to
say "OK, it worked" until it knows the sector has been committed to
storage.  However, writing that sector to storage can take a while,
especially if said storage is a remote FTP server or something like
that.  For this reason DW times the write operations, and if writes
take longer than a certain amount of time, DW switches to a caching
strategy where sectors are first written to memory, the CoCo is told
"OK", and then a separate thread worries about actually getting them
written to the source later.  In the event that DW is already using
the memory cache and a write *still* takes too long, it logs the
"Server loaded or low on ram?" warning, assuming incorrectly that the
delay is its own fault.
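
A minimal Python sketch of that write strategy, for illustration only.
This is not DW's actual code: SectorWriter, SLOW_WRITE_S, store and
flush are hypothetical names, and the 200 ms threshold is an
assumption, since the real switch-over limit isn't given here.

```python
import time

SLOW_WRITE_S = 0.200  # assumed threshold; the real limit is not stated here


class SectorWriter:
    """Write-through until a write is slow, then acknowledge from memory."""

    def __init__(self, store, clock=time.monotonic):
        self.store = store      # callable(lsn, data): commit one sector
        self.clock = clock
        self.cached = False     # True once we have switched to the memory cache
        self.pending = {}       # lsn -> data not yet committed
        self.warnings = []

    def write(self, lsn, data):
        start = self.clock()
        if self.cached:
            self.pending[lsn] = data    # reply "OK" now, commit later
        else:
            self.store(lsn, data)       # commit first, then reply "OK"
        elapsed = self.clock() - start
        if elapsed > SLOW_WRITE_S:
            if self.cached:
                # Storage is out of the picture, so the server blames itself.
                self.warnings.append("Server loaded or low on ram?")
            else:
                self.cached = True
        return "OK"

    def flush(self):
        """The job of the separate thread: commit whatever is cached."""
        while self.pending:
            lsn, data = self.pending.popitem()
            self.store(lsn, data)
```

In a real server flush() would run on its own thread, so pending
would need a lock around it; the sketch leaves that out.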

At the time I didn't think about the other way an operation can take
too long, which is simply that the CoCo doesn't send an entire 256
byte sector within the time limit.  However, this is the most likely
case if you see that error on a modern PC.  For whatever reason, an
operation was started and then the DW server didn't get enough bytes
from the CoCo to complete it.
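
To make the short-read case concrete, here is a rough Python sketch
of collecting one sector against a deadline.  It is an illustration
under assumptions, not the server's real read loop: read_sector and
read_some are hypothetical names, and TIMEOUT_S just reuses the
~200ms figure from above.

```python
import time

SECTOR_SIZE = 256
TIMEOUT_S = 0.200   # the ~200ms per-exchange limit described above


def read_sector(read_some, clock=time.monotonic):
    """Collect SECTOR_SIZE bytes via read_some(n), or give up at the deadline.

    read_some(n) is a hypothetical callable that returns up to n bytes
    (possibly none) per call, like a serial port with a short read timeout.
    Returns (sector, bytes_received); sector is None on a timeout.
    """
    deadline = clock() + TIMEOUT_S
    buf = bytearray()
    while len(buf) < SECTOR_SIZE:
        if clock() >= deadline:
            return None, len(buf)   # the CoCo sent too few bytes in time
        buf += read_some(SECTOR_SIZE - len(buf))
    return bytes(buf), len(buf)
```

A server that only looks at total elapsed time cannot tell this
timeout apart from slow storage, which is how the misleading
message ends up in the log.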

In Bob's case I suspect it's happening due to a clock operation that
was added to the OS9 driver but not to the DW software (I thought it
had been removed/fixed, but apparently not in the disk Bob has, at
least).  This causes the CoCo to send a clock timestamp every 60
seconds.  Because the server doesn't know about that operation, it
ignores the opcode.
However, the timestamp contains arbitrary values depending on the
current time, and eventually it will contain a valid opcode in that
data such as OP_WRITE.  As soon as this happens, the server expects
sector data, but of course it doesn't get any, and we get the "server
loaded" message.
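
The failure can be sketched in a few lines of Python.  Everything
specific here is an assumption for illustration: the OP_WRITE value
(0x57), the number of bytes expected after it, and the scan helper
are not taken from the DW source, and the sample bytes in the usage
note are made up rather than a real timestamp.

```python
OP_WRITE = 0x57             # assumed opcode value; check the protocol spec
WRITE_ARGS = 4 + 256 + 2    # assumed: drive + LSN, sector data, checksum


def scan(stream, known):
    """Walk a byte stream the way a server that skips unknown opcodes would.

    known maps opcode -> number of argument bytes that must follow it.
    Returns a list of (offset, opcode, status) for each opcode recognised.
    """
    events = []
    i = 0
    while i < len(stream):
        op = stream[i]
        i += 1
        if op not in known:
            continue            # unknown byte: ignored, one byte at a time
        need = known[op]
        have = len(stream) - i
        if have < need:
            # The rest of the operation never arrives, so the server times out.
            events.append((i - 1, op, "timeout: got %d of %d bytes" % (have, need)))
            break
        events.append((i - 1, op, "ok"))
        i += need
    return events
```

With known = {OP_WRITE: WRITE_ARGS}, a seven-byte burst containing
no 0x57 yields no events at all, while the same burst with 0x57 in
the sixth position yields a single OP_WRITE that times out waiting
for sector data.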

tl;dr - If you see this message, it's probably due to garbage from the
CoCo that happens to match the start of a valid operation.

-Aaron





On Sat, Oct 13, 2012 at 3:53 AM, Bob Devries <devries.bob at gmail.com> wrote:
> I seem to be getting a lot of time outs using DW4, which cause ERROR 245 on the coco end. Here's some lines from the log:
>
> Sat Oct 13 2012 17:36:09.250  WARN   DWProtocolHandler   dwproto-0-10        Timed out reading from CoCo in OP_WRITE
> Sat Oct 13 2012 17:36:09.250  WARN   DWProtocolHandler   dwproto-0-10        OP_WRITE took 235ms.  Server loaded or low on ram?
>
> DW4 is being run on an Intel P4 2.8GHz with 2GB ram. Besides the usual clutter running in a windows background (AVG etc) the only application running is Outlook Express, set to grab emails every 10 mins.
>
> The connection is like this:
>
> Coco <--> ATEN USB dongle <--> 8-port powered USB hub <--> USB2 Port on PC.
>
> The CRC of DW is $5136C3
>
> Regards, Bob Devries
> Dalby, QLD, Australia
>
> PS: I sent a more extended bug report via the DW4 UI to wherever that ends up.
>
> --
> Coco mailing list
> Coco at maltedmedia.com
> http://five.pairlist.net/mailman/listinfo/coco


