Author Topic: FAST VDC ???  (Read 841 times)

Hydrophilic
FAST VDC ???
« on: April 10, 2011, 01:53 PM »
This has been bugging me for a while now, but I've kept it on the "back burner" until I thought of it again due to Tokra's threads on VDC 8x1 and 8x2x2 video modes...
 
So as some of you may know, the VDC is accessed at only 1MHz speed... even though the VDC is the "default" device for 2MHz speed (because the VIC can't go that fast).
 
For some of you, this may come as a surprise.  It surprised me when I heard about it.  In fact, I had to test it.
 
I don't have the test code, but trust me, it is true.  You can experiment for yourself.  I'm quite certain this is because the VDC registers are in the $Dxxx region of memory ($d000 ~ $dfff).
 
Another very interesting and related experiment you can try deals with the MMU.  You can access the MMU "special" registers at $ff00~$ff04 at full 2MHz speed.  But if you try to access the equivalent MMU registers at $d500~$d504 you will discover that the CPU slows down to 1MHz.
 
Why?  I'm thinking this is due to /IOACC being generated by the PLA (U11 pin 43) whenever it receives /IOCS on pin 38.   Note the /IOACC signal from the PLA goes to the VIC-IIe, which in turn slows down ("stretches") the 2MHz clock to 1MHz speed.
 
Of course to talk to the VDC, you need to access $d600 or $d601, which are also in the $Dxxx region.  Thus, "by default" the system slows down to 1MHz.
 
But I was thinking (danger!) that maybe the VDC is like the MMU... perhaps it is fully capable of talking at 2MHz speed, but the PLA slows things down.
 
There are 3 facts in support of my supposition:
   
  • The VDC (unlike the CIAs which really do need 1MHz) is not primarily timed via the 1MHz clock, but instead by its own 16MHz oscillator
  • The VDC chip select (pin 4) is controlled by the 2MHz line (unlike the CIAs, which are selected by the 1MHz line)
  • The VDC R/W line is controlled by a special "fast" line (F R/W in the schematics), as opposed to the "normal" R/W line of the CIAs.
Points 2 and 3 can be verified by opening your C128, by consulting the schematics in the C128 PRG on page 724, or by looking at the image I clipped from the schematic in attachment #1.
 
So the "professional" way to implement 2MHz VDC might be to re-write the PLA... but even if you could make your own custom PLA, you still would have to replace the 48-pin monster that is soldered directly to the circuit board!  Not something I would ever want to do...
 
Now the critical PLA input (/IOCS on pin 38) actually comes from a simple 3-to-8 line decoder (U3, type 74LS138).
 
My "hacker" solution is to cut the /IOCS line between U3 and the PLA and insert a simple logic circuit that calculates NOT /IOCS NAND /CS8563, with the result going to the PLA.  See attachment #2.
 
For those not familiar with logic circuits, let me summarize the idea: use 2MHz for the VDC but use 1MHz for any other access in the $Dxxx region.
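For anyone who wants to check the gate logic on paper first, here is a minimal truth-table sketch in Python. It only models the Boolean function described above, (NOT /IOCS) NAND /CS8563, with both inputs treated as active-low. The function name `gated_iocs_n` and the reading of a low output as "stretch the clock" are assumptions made for illustration; nothing here models real timing or the actual signal routing.

```python
from itertools import product


def gated_iocs_n(iocs_n: int, cs8563_n: int) -> int:
    """Active-low output of the proposed gate: (NOT /IOCS) NAND /CS8563.

    iocs_n   -- 0 when the I/O block is being addressed, 1 otherwise
    cs8563_n -- 0 when the VDC is selected, 1 otherwise
    """
    io_selected = 1 - iocs_n              # the inverter on /IOCS
    return 1 - (io_selected & cs8563_n)   # the NAND gate


for iocs_n, cs8563_n in product((0, 1), repeat=2):
    out = gated_iocs_n(iocs_n, cs8563_n)
    # Assumption: a low on the gated line is what ends up stretching the clock.
    action = "stretch to 1MHz" if out == 0 else "stay at 2MHz"
    print(f"/IOCS={iocs_n} /CS8563={cs8563_n} -> gated={out} ({action})")
```

The only combination that drives the output low is an I/O access where the VDC is not selected, which matches the summary above. The row with /IOCS high and /CS8563 low should never occur on a real board, since the VDC select is decoded from the I/O block.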
 
So has anybody tried this?  Has anyone heard of this being attempted (and the results) ?
 
Surely there are reasons this might not work...  Such as propagation delays through the logic network... or the simple fact the VDC can't talk at 2MHz... but is there any obvious reason it would not work?
 
I have not tried this.  I haven't even installed my 64K VRAM chips!  I'm the kind of guy who thinks "if it ain't broke, don't fix it".  But I thought I would toss out the idea for the more adventurous guys (and girls) out there.
 
Edit
I removed the attached schematic of how I thought this might work, because as RichardC64 points out a few posts below, pin 38 of the PLA is an output, and you don't want to connect two outputs together!  A re-worked version that might work can be found later in this thread...
 
« Last Edit: April 14, 2011, 10:36 PM by Hydrophilic »
I'm kupo for kupo nuts!

BigDumbDinosaur
Re: FAST VDC ???
« Reply #1 on: April 11, 2011, 01:48 PM »

Quote
This has been bugging me for a while now, but I've kept it on the "back burner"...the VDC is accessed at only 1MHz speed... even though the VDC is the "default" device for 2MHz speed (because the VIC can't go that fast)...I'm quite certain this is because the VDC registers are in the $Dxxx region of memory ($d000 ~ $dfff).

Your supposition is correct.  Any access to the I/O range is with the stretched (1 MHz) clock...
 
Quote
Another very interesting and related experiment you can try deals with the MMU.  You can access the MMU "special" registers at $ff00~$ff04 at full 2MHz speed.  But if you try to access the equivalent MMU registers at $d500~$d504 you will discover that the CPU slows down to 1MHz.

Because it too is in the I/O block.
 
Quote
Why?  I'm thinking this is due to /IOACC being generated by the PLA (U11 pin 43) whenever it receives /IOCS on pin 38.   Note the /IOACC signal from the PLA goes to the VIC-IIe, which in turn slows down ("stretches") the 2MHz clock to 1MHz speed.

Correct.

Quote
So the "professional" way to implement 2MHz VDC might be to re-write the PLA... but even if you could make your own custom PLA, you still would have to replace the 48-pin monster that is soldered directly to the circuit board!  Not something I would ever want to do...

It would have to be done with a CPLD.  Even then, you'd be in for quite a task ripping out the PLA and replacing it with the CPLD.
 
Quote
My "hacker" solution is to cut the /IOCS line between U3 and the PLA and insert a simple logic circuit that calculates NOT /IOCS NAND /CS8563, with the result going to the PLA.  See attachment #2...So has anybody tried this?  Has anyone heard of this being attempted (and the results)?

I'm not aware of anyone trying it.  Obviously, this isn't your trivial hardware hack, as you run the risk of converting a functional C-128 into a large door stop.
 
Quote
Surely there are reasons this might not work...  Such as propagation delays through the logic network... or the simple fact the VDC can't talk at 2MHz... but is there any obvious reason it would not work?

Logic prop delays are not that critical at the low speeds at which the 128 runs.  You can replace some silicon with 74F, 74ABT or 74AC if it gives you a warm and fuzzy feeling. :)

Quote
I have not tried this.  I haven't even installed my 64K VRAM chips!  I'm the kind of guy who thinks "if it ain't broke, don't fix it".  But I thought I would toss out the idea for the more adventurous guys (and girls) out there.

I'd be leery of trying it, as you don't really know for certain the results.  BTW, the timing specs for the 8563/8568 suggest they can run at an effective 8 MHz bus rate without any problems.  I believe one of the systems for which the 8563 was intended was a low-cost UNIX workstation that was to be run by a Zilog Z8000 MPU, which would have had an elevated clock rate.  Could be wrong on that though.

However, assuming your hack would produce the desired result, I question whether the change will be perceptible to the average user.  Much of the MPU time in driving the display is eaten up in screen kernel processing, since a lot of things are done each time a character is written.  As you may know, I've spent a lot of time studying the screen kernel routines, and in doing so, have seen that the amount of processing involved in actually driving the VDC is relatively small.  The most intensive operations are in scrolling or clearing the screen, and the grunt work for that is done inside the VDC, which runs with its own clock.  So I'd be surprised if the display rate significantly improved.  When you switch from SLOW to FAST mode, the change in apparent display rate is in the more rapid execution of the screen kernel routines.  That should give you a clue as to what to expect.
x86?  We don't got no x86.  We don't NEED no stinking x86!

Hydrophilic
Re: FAST VDC ???
« Reply #2 on: April 12, 2011, 12:15 AM »
Thanks for the detailed reply, BDD.
 
Quote
Obviously, this isn't your trivial hardware hack, as you run the risk of converting a functional C-128 into a large door stop.
I don't think it would damage anything.... if it didn't work, you could just reconnect the original lines.
 
Quote
BTW, the timing specs for the 8563/8568 suggest they can run at an effective 8 MHz bus rate without any problems
That's good to know.  Assuming those specs are reliable, then 2MHz should not be a problem at all.
 
Quote
I question whether the change will be perceptible to the average user.  Much of the MPU time in driving the display is eaten up in screen kernel processing...
I agree, it would not make much difference for normal applications, which is probably a good reason I've never heard of anybody doing it.  It would mainly benefit only VRAM-intensive programs, such as GEOS, some games, or a picture viewer...
 
Even then, it would not help much for 'traditional' software.  Consider
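Code:
     STX $D600 ;[4 or 5] [4]   select VDC register (stock / modded)
wait BIT $D600 ;[4]      [4]   poll status, bit 7 set = ready
     BPL wait  ;[2]      [2]   falls through once the VDC is ready
     STA $D601 ;[4]      [4]   write the data byte
;total         ;[14 or 15][14]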
The numbers in brackets are instruction cycles: the first column is a stock C128 and the second is a unit with the 2MHz VDC mod.  The "4 or 5" depends on whether the 2MHz clock is 'in phase' with the 1MHz clock.  If it is out of phase, 5 will be needed, but then (because all opcodes here have an even cycle count) each subsequent I/O access will be in phase.  Note that if the VDC is not ready, the BPL instruction will take 3 cycles, and the first instruction will be out of phase again, thus requiring 5 cycles again.
 
Anyway, assuming the VDC is ready, a normal C128 would take 14.5 cycles on average and a 2MHz-VDC-mod unit would take 14 cycles.  Which is only a 3.6% improvement!
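For anyone who wants to play with the numbers, here is a rough Python sketch of the phase argument. It is a toy model, not a description of the real VIC-IIe timing: the assumption is that any instruction touching the I/O block has to finish on an even 2MHz cycle, and if it would finish on an odd one, a single stretch cycle is added. The function name `run` and the `(cycles, touches_io)` tuples are made up for illustration.

```python
def run(instructions, start, stretch_io=True):
    """Total 2MHz cycles for a list of (base_cycles, touches_io) pairs.

    Toy model (assumption): an instruction that touches I/O must end on an
    even 2MHz cycle; ending on an odd one costs one extra stretch cycle.
    """
    t = start
    for base_cycles, touches_io in instructions:
        t += base_cycles
        if stretch_io and touches_io and t % 2:
            t += 1
    return t - start


# STX $D600, BIT $D600, BPL (not taken), STA $D601
simple = [(4, True), (4, True), (2, False), (4, True)]

stock = [run(simple, start) for start in (0, 1)]   # in phase, out of phase
modded = run(simple, 0, stretch_io=False)          # no stretching at all
average = sum(stock) / len(stock)

print(stock, modded, average)
print(f"improvement: {(average - modded) / modded:.1%}")
```

Under that assumption the model gives 14 or 15 cycles for the stock machine and 14 for the modified one, the same figures as the listing, so the percentage is taken relative to the modified cycle count.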
 
However, that is using 'conventional' programming.  A trick used by several programs is to write consecutively to the VDC once it is ready.  The number of consecutive writes possible is determined by the video mode and the DRAM refresh register.  It has been reported that over 24 bytes can be written consecutively when in text mode with a DRAM refresh of 0.  Using the default setting of 5, I have myself observed that at least 8 bytes can be written consecutively.
 
So let's consider that
Code:
     STX $D600   ;[4 or 5] [4]   select VDC register (stock / modded)
wait BIT $D600   ;[4]      [4]   poll status, bit 7 set = ready
     BPL wait    ;[2]      [2]   falls through once the VDC is ready
     LDA ($FC),Y ;[5]      [5]
     STA $D601   ;[5]      [4]   5 on stock: the odd-length LDA leaves this access out of phase
     INY         ;[2]      [2]
     LDA ($FC),Y ;[5]      [5]
     STA $D601   ;[5]      [4]
     INY         ;[2]      [2]
     LDA ($FC),Y ;[5]      [5]
     STA $D601   ;[5]      [4]
     INY         ;[2]      [2]
     LDA ($FC),Y ;[5]      [5]
     STA $D601   ;[5]      [4]
     INY         ;[2]      [2]
     LDA ($FC),Y ;[5]      [5]
     STA $D601   ;[5]      [4]
     INY         ;[2]      [2]
     LDA ($FC),Y ;[5]      [5]
     STA $D601   ;[5]      [4]
     INY         ;[2]      [2]
     LDA ($FC),Y ;[5]      [5]
     STA $D601   ;[5]      [4]
     INY         ;[2]      [2]
     LDA ($FC),Y ;[5]      [5]
     STA $D601   ;[5]      [4]
;total           ;[104 or 105] [96]
So assuming the VDC is ready and there is no page-boundary crossing, a stock C128 would take an average of 104.5 cycles, while a modified unit would take 96 cycles... for an improvement of 8.9%...
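The same arithmetic can be generalised to any burst length. The Python sketch below just adds up the per-instruction cycle counts from the listing (5 for LDA ($FC),Y, 5 or 4 for STA $D601, 2 for each INY between writes, and 10.5 or 10 for the STX/BIT/BPL setup). The helper names `burst_cycles` and `improvement` are hypothetical, and the sketch assumes the VDC accepts every write in the burst, which depends on the video mode and the DRAM refresh setting.

```python
def burst_cycles(n_writes, sta_cycles, setup):
    """Cycles for setup (STX + BIT + BPL) plus n_writes LDA/STA pairs.

    Each write is LDA ($FC),Y (5 cycles) plus STA $D601 (sta_cycles),
    with one INY (2 cycles) between consecutive writes.
    """
    return setup + n_writes * (5 + sta_cycles) + (n_writes - 1) * 2


def improvement(n_writes):
    """Return (stock, modded, relative gain) for a burst of n_writes bytes."""
    stock = burst_cycles(n_writes, sta_cycles=5, setup=10.5)   # averaged setup
    modded = burst_cycles(n_writes, sta_cycles=4, setup=10)
    return stock, modded, (stock - modded) / modded


for n in (8, 24):
    stock, modded, gain = improvement(n)
    print(f"{n:2d} writes: stock {stock}, modded {modded}, gain {gain:.1%}")
```

For 8 writes this reproduces 104.5 versus 96 cycles. For longer bursts the gain creeps toward 1/11 (about 9.1%), because each extra byte costs 12 cycles on a stock machine and 11 on a modified one, so even very long bursts would not change the picture much.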
 
Well that's not very impressive. :'(   So I guess this is not a good idea unless somebody is really desperate to eke out a few percent of speed and/or likes doing hardware mods...

BigDumbDinosaur
Re: FAST VDC ???
« Reply #3 on: April 12, 2011, 02:53 PM »
Quote
Obviously, this isn't your trivial hardware hack, as you run the risk of converting a functional C-128 into a large door stop.
I don't think it would damage anything.... if it didn't work, you could just reconnect the original lines.
Assuming the PCB didn't fall apart from the heat of soldering.  You're talking about PCBs that are over a quarter-century old, and were not of the highest quality when new.  In fact, as PCBs go, Commodore's were pretty junky.  They were designed to be cheap, not good.  Traces might start peeling off if you're not careful.

Quote
Anyway, assuming the VDC is ready, a normal C128 would take 14.5 cycles on average and a 2MHz-VDC-mod unit would take 14 cycles.  Which is only a 3.6% improvement!
Correct.  As I said, most of the wallclock time is in code execution, not VDC processing.
 
Quote
However, that is using 'conventional' programming.  A trick used by several programs is to write consecutively to the VDC once it is ready.  The number of consecutive writes possible is determined by the video mode and the DRAM refresh register.  It has been reported that over 24 bytes can be written consecutively when in text mode with a DRAM refresh of 0.  Using the default setting of 5, I have myself observed that at least 8 bytes can be written consecutively.
 
So let's consider that
Code:
     STX $D600   ;[4 or 5] [4]   select VDC register (stock / modded)
wait BIT $D600   ;[4]      [4]   poll status, bit 7 set = ready
     BPL wait    ;[2]      [2]   falls through once the VDC is ready
     LDA ($FC),Y ;[5]      [5]
     STA $D601   ;[5]      [4]   5 on stock: the odd-length LDA leaves this access out of phase
     INY         ;[2]      [2]
     LDA ($FC),Y ;[5]      [5]
     STA $D601   ;[5]      [4]
     INY         ;[2]      [2]
     LDA ($FC),Y ;[5]      [5]
     STA $D601   ;[5]      [4]
     INY         ;[2]      [2]
     LDA ($FC),Y ;[5]      [5]
     STA $D601   ;[5]      [4]
     INY         ;[2]      [2]
     LDA ($FC),Y ;[5]      [5]
     STA $D601   ;[5]      [4]
     INY         ;[2]      [2]
     LDA ($FC),Y ;[5]      [5]
     STA $D601   ;[5]      [4]
     INY         ;[2]      [2]
     LDA ($FC),Y ;[5]      [5]
     STA $D601   ;[5]      [4]
     INY         ;[2]      [2]
     LDA ($FC),Y ;[5]      [5]
     STA $D601   ;[5]      [4]
;total           ;[104 or 105] [96]

So assuming the VDC is ready and there is no page-boundary crossing, a stock C128 would take an average of 104.5 cycles, while a modified unit would take 96 cycles... for an improvement of 8.9%...
 
Well that's not very impressive.  So I guess this is not a good idea unless somebody is really desperate to eke out a few percent of speed and/or likes doing hardware mods...
You've succeeded in proving my earlier assertion, that the average user will not perceive any improvement.  It might help a bit running a hi-res screen, but in text mode I don't think you'll see anything at all.  The simple fact is the VDC is actually faster than the 8502, even in 2 MHz mode.  Now, if the W65C816S was in there doing 16 bit loads and stores at 20 MHz...  :D

RobertB
Re: FAST VDC ???
« Reply #4 on: April 12, 2011, 05:20 PM »
Quote
Assuming the PCB didn't fall apart from the heat of soldering.  You're talking about PCBs that are over a quarter-century old, and were not of the highest quality when new.  In fact, as PCBs go, Commodore's were pretty junky.  They were designed to be cheap, not good.  Traces might start peeling off if you're not careful.
     And yet, master C= technician Ray Carlsen was able to take my C128DCR and install sockets for all the chips.

          Now that's skill,
          Robert Bernardo
          Fresno Commodore User Group
          http://videocam.net.au/fcug
          July 23-24 Commodore Vegas Expo 2011 - http://www.portcommodore.com/commvex
« Last Edit: April 12, 2011, 05:22 PM by RobertB »

richardc64
Re: FAST VDC ???
« Reply #5 on: April 13, 2011, 03:19 AM »

Quote
Why?  I'm thinking this is due to /IOACC being generated by the PLA (U11 pin 43) whenever it receives /IOCS on pin 38.   Note the /IOACC signal from the PLA goes to the VIC-IIe, which in turn slows down ("stretches") the 2MHz clock to 1MHz speed.

Nope, sorry. You got that part backwards. At U3 pin 5, /IOCS is an input, one of two active-low enables that set up U3 to make one of its outputs low, enabling SID, CIAs, VDC, etc. PLA pin 38 is an output.

The PLA doesn't really call the shots as far as I/O access speed goes. The MMU lets the PLA know (via MS2, I/OSEL) when $Dxxx is being addressed as I/O. That, and other conditions, determine when the PLA tells the VIC to stretch the clock.
"I am endeavoring, ma'am, to create a mnemonic memory circuit... using stone knives and bearskins." -- Spock to Edith Keeler

Hydrophilic
Re: FAST VDC ???
« Reply #6 on: April 14, 2011, 03:12 PM »
You're right, BDD, I was proving your point... it isn't very useful.  I also agree about soldering on the circuit board.  It is very easy to lift the copper traces by accident... especially if you're trying to 'unbend' the pins to remove a chip!
 
Quote from: richardc64
You got that part backwards. At U3 pin 5, /IOCS is an input, one of two active-low enables that set up U3 to make one of its outputs low, enabling SID, CIAs, VDC, etc. PLA pin 38 is an output.

After looking at the schematics again, I agree.

This forum discussion has been able to flesh out an idea that has been stuck in the back of my mind for a few years now.  So thanks everyone for your comments!
 
Edit
The ultimate I/O output signal comes from the PLA to tell the VIC-II to stretch the 2MHz clock down to 1MHz.  But the MMU and PLA work together as a team, so the MMU /IOSEL line (aka MS2, pin 13 of U7) might be needed.
 
So I attached an updated schematic.  The idea is the same as before, only the details are changed...
 
The /IOACC line going to VIC (U21 Pin 22) is cut.  Then the circuit in the diagram is connected in its place.  I'm not sure if /IOCS from PLA (U11 Pin 38) or the /IOSEL line from MMU (U7 Pin 13) should be used as input.  Perhaps neither will work.
 
I have not tried this, nor do I recommend it.  Not only is it untested, but as BDD pointed out, the improvement will hardly be noticeable.  But I thought it sounded like a fun experiment for anybody who likes doing hardware mods.
« Last Edit: April 14, 2011, 11:40 PM by Hydrophilic »