03.06.08

A Different Approach

Posted in Apple ][ at 8:56 am by site admin

So I was thinking about GTE last night and came to the realization that the fundamental structure of the blitter needs to be changed. “What’s that!”, you cry. “How can he possibly consider throwing everything out at this point! Why, he hasn’t even released a functional tool yet!!”. True, I have not gotten the Tool Set finished, but I think it’s important to have a solid 1.0 release that can be incrementally extended. I have no desire to overhaul the code once it gets out into the wild.

That said, the changes I’m going to propose do not require rewriting a significant amount of code. It’s really a reorganization of the sequence of operations. Currently, the core rendering algorithm of GTE is scanline-based. A full line of data is blitted and the sprites are composited on top of it. This is synchronized with the Vertical Blank (VBL) in order to avoid flicker. This works.

While converting the code base to a Tool Set, I’ve been consciously trying to reduce the complexity of the code and, consequently, lower the amount of overhead. By overhead, I mean all the instructions that need to be executed to maintain data structures, shuffle data to the proper location, etc. It is often the case that a simpler but less efficient approach can outperform a theoretically faster method if the overhead is significantly reduced.

The renderer in GTE is theoretically fast because it only draws data once. There is no erasing or restoration of the background data when sprites are composited on top. However, a significant amount of overhead is required to decompose the sprites on a per-line basis and integrate their drawing into the blitter inner loop. I thought that there was no way around this problem without introducing flicker into the blitter, but after reviewing TN #70, I have changed my mind.

I think the renderer can be changed to the following sequence:

  • Turn shadowing on
  • Blit all lines without sprites
  • Turn shadowing off
  • Blit the lines with sprites
  • Blit the full sprites onto the shadow screen
  • Turn shadowing on
  • Expose the lines via PEI slamming
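The sequence above can be sketched as a toy model in Python. This is only an illustration of the ordering, not GTE code: the `ShadowedScreen` class, its `fast` and `visible` buffers, and the `(x, y, rows)` sprite tuples are all made up for the example, and a pixel value of 0 is assumed to mean "transparent". Writes reach the visible screen only while shadowing is on, and the PEI slam is modeled as rewriting a line onto itself with shadowing on.

```python
class ShadowedScreen:
    """Toy model: `fast` is the fast-memory copy, `visible` is what is displayed."""

    def __init__(self, height, width):
        self.fast = [[0] * width for _ in range(height)]
        self.visible = [[0] * width for _ in range(height)]
        self.visible_writes = [0] * height  # how often each visible line was written
        self.shadowing = True

    def write_line(self, y, pixels):
        # Every write lands in fast memory; it is mirrored only if shadowing is on.
        self.fast[y] = list(pixels)
        if self.shadowing:
            self.visible[y] = list(pixels)
            self.visible_writes[y] += 1

    def draw_sprite(self, sprite):
        # In this sketch sprites are only ever drawn while shadowing is off.
        assert not self.shadowing
        x0, y0, rows = sprite
        for dy, row in enumerate(rows):
            for dx, pixel in enumerate(row):
                if pixel:  # 0 is treated as transparent (an assumption)
                    self.fast[y0 + dy][x0 + dx] = pixel

    def pei_slam(self, y):
        # Rewrite the line onto itself so shadowing copies it to the visible screen.
        assert self.shadowing
        self.write_line(y, self.fast[y])


def render_frame(screen, background, sprites):
    # Work out which lines are covered by at least one sprite.
    sprite_lines = set()
    for _, y0, rows in sprites:
        sprite_lines.update(range(y0, y0 + len(rows)))

    screen.shadowing = True                      # 1. shadowing on
    for y, line in enumerate(background):        # 2. lines without sprites
        if y not in sprite_lines:
            screen.write_line(y, line)

    screen.shadowing = False                     # 3. shadowing off
    for y in sorted(sprite_lines):               # 4. lines with sprites
        screen.write_line(y, background[y])
    for sprite in sprites:                       # 5. full sprites, off screen
        screen.draw_sprite(sprite)

    screen.shadowing = True                      # 6. shadowing on
    for y in sorted(sprite_lines):               # 7. expose via PEI slam
        screen.pei_slam(y)
    return sprite_lines


background = [[1] * 8 for _ in range(6)]
sprites = [(2, 1, [[9, 0], [9, 9]])]
screen = ShadowedScreen(height=6, width=8)
touched = render_frame(screen, background, sprites)
```

In this model every visible line is written exactly once per frame, which is why the approach stays flicker-free: the background under a sprite is never shown on its own.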

TN #70 documents the amount of time it takes to copy data to the graphics screen with shadowing on or off. Because the IIgs does not need to synchronize the fast and slow sides when shadowing is off, the code can run faster. Still, the time spent copying the data via PEI slamming is much greater than the time saved by avoiding synchronization. In the worst case, we would have to save an additional 320 to 480 cycles per rendered scan line to make up the difference in full screen (320×200) mode.

Fortunately, we get a real win from disentangling the sprite rendering from blitting the background data. Just a cursory check of the code shows that by removing some excess code for VBL synchronization, sprite dispatch and softswitch toggling, we can save over 100 cycles per scan line, which is already over a fifth of the required saving. There are similar savings to be had in the sprite rasterizer, and I’m sure that the inner loop of the blitter will be able to enjoy some simplification as well. This is a conservative analysis, since it doesn’t take into account the fraction of the time that the renderer is required to wait for the VBL, which can stall the code for up to 10 scan lines, or 630 microseconds, which corresponds to around 1,500 cycles.
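To make the arithmetic concrete, here is a rough back-of-the-envelope check in Python. It only uses the figures quoted in this post (nothing is measured), and the constant names are made up for the example. "Over 100 cycles" is treated as exactly 100, so the fractions are lower bounds.

```python
# Figures quoted in the post; none of these are measurements.
REQUIRED_SAVING_CYCLES = (320, 480)  # per rendered scan line, worst case
IDENTIFIED_SAVING_CYCLES = 100       # "over 100" cycles per scan line found so far
VBL_STALL_SCANLINES = 10
VBL_STALL_MICROSECONDS = 630
VBL_STALL_CYCLES = 1500              # "around 1,500"

low, high = REQUIRED_SAVING_CYCLES

# Share of the required saving that is already covered.
covered_if_low = IDENTIFIED_SAVING_CYCLES / low
covered_if_high = IDENTIFIED_SAVING_CYCLES / high

# Cycles per scan line that still have to be found elsewhere.
remaining = (low - IDENTIFIED_SAVING_CYCLES, high - IDENTIFIED_SAVING_CYCLES)

# What the VBL stall figures imply about timing.
microseconds_per_scanline = VBL_STALL_MICROSECONDS / VBL_STALL_SCANLINES
cycles_per_microsecond = VBL_STALL_CYCLES / VBL_STALL_MICROSECONDS

print(f"covered: {covered_if_high:.1%} to {covered_if_low:.1%}")
print(f"still needed: {remaining[0]} to {remaining[1]} cycles per scan line")
print(f"{microseconds_per_scanline:.0f} us per scan line, "
      f"{cycles_per_microsecond:.2f} cycles per us")
```

So even against the 480-cycle figure, the savings found so far cover a bit more than a fifth, and 220 to 380 cycles per scan line remain to be found in the sprite rasterizer and the blitter inner loop.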

So, to summarize, what appears to be a suboptimal way of rendering may actually be faster by reducing the overhead of the renderer. In addition, we are guaranteed to have flicker-free updates and the frame rate will be more consistent when a fixed number of sprites are on screen due to the removal of VBL synchronization.

All in all, a pretty nice win!
