> colour changes only every few rows near the top of the screen. E.g. let those ...

vidarh · on June 17, 2022

> It might look like it's not, particularly on the first level, because you think the graphics look too simple. But sadly that's just how it looks.

I specifically did not say that Cobra isn't using full colour scrolling. You're probably right that it is.

But what you've described does not require it.

I am saying that Cobra is an example of the style of level design that could easily get savings without sacrificing the graphics, including the accent colours. Cobra is otherwise simple enough that it probably wouldn't be worth it, but that is entirely beside the point I was making.

I did note the windows etc., yes. But the point is that there is no place in the whole game where there are a lot of big or different transitions going on at once, the number of transitions is very small, and notably a lot of the transitions only occur at specific rows. As it stands, the graphics are very constrained.

The 5m mark is a perfect example of exactly what I'm saying: only about 11 rows change colour, and from what I saw that's one of the most extreme transitions in the game. But while it affects many rows, it also uses few colours and so you can offset even some of that cost.

The way it's structured in faux 3D "layers" seems ripe for "decompressing" the levels into a set of transition functions inspired by the painter's algorithm: "paint" bottom up, front to back, and then JSR at an offset into the transition function for the object "behind" if it extends past. You'd want to flatten some of the "layers" (at the cost of increasing the number of transition functions).

Here's what I'd try:

Since the transition functions can hardcode LDA immediates, and STA absolute gets X indexing for one extra cycle, you can have the transition functions X index and use that to reuse the transition functions for every column. So e.g. instead of this:

   scroll_splat_colour:
      LDA #$00  ; colour data for char
      STA $D800 ; colour RAM
      LDA #$00
      STA $D801

You'd get this for each transition sequence (could be a slice of an object, but in practice you'd probably want to flatten it somewhat, at the cost of more transition functions but reducing the cycle cost because of fewer JSR/RTS pairs):

    some_transition:
       LDA #col1
       STA $D800+offset_of_last_row_start,X
       STA $D800+offset_of_second_to_last_row_start,X
       ; ... and so on until the colour changes
       LDA #col2
       ; ... STA to the next row
       RTS

You then decompress the level into a series of JSR calls per column, right to left + DEX, and a set of offsets. Every time you scroll, you then overwrite the DEX after the last (leftmost) transition with an RTS (and fix up the previous one), and JSR to the first (rightmost) list of bottom-up JSRs.
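A rough sketch of what the off-target "decompressor" could emit for this, in Python. Everything here is an assumption for illustration: the names `emit_transition` and `emit_call_list` are hypothetical, row 0 is taken to be the top row of colour RAM, and each column is assumed to use a single transition function with no skipped rows.

```python
COLOUR_RAM = 0xD800   # start of C64 colour RAM
ROW_STRIDE = 40       # colour RAM is laid out 40 cells per row


def emit_transition(name, colours_top_to_bottom):
    """Return 6502 source text that paints one column bottom-up.

    An LDA is emitted only when the colour differs from the one already
    in A; every row gets one X-indexed STA.
    """
    lines = [f"{name}:"]
    in_a = None  # colour currently held in the accumulator
    for row in range(len(colours_top_to_bottom) - 1, -1, -1):
        colour = colours_top_to_bottom[row]
        if colour != in_a:
            lines.append(f"    LDA #${colour:02X}")
            in_a = colour
        lines.append(f"    STA ${COLOUR_RAM + row * ROW_STRIDE:04X},X")
    lines.append("    RTS")
    return "\n".join(lines)


def emit_call_list(names_right_to_left):
    """Return the per-column JSR list, rightmost column first.

    A DEX separates the columns; the slot after the last (leftmost)
    call holds an RTS instead, as described in the post.
    """
    lines = []
    for i, name in enumerate(names_right_to_left):
        lines.append(f"    JSR {name}")
        last = i == len(names_right_to_left) - 1
        lines.append("    RTS" if last else "    DEX")
    return "\n".join(lines)


if __name__ == "__main__":
    print(emit_transition("t0", [1, 1, 5]))
    print(emit_call_list(["t0", "t1"]))
```

For a three-row column with colours 1, 1, 5 (top to bottom) this emits two LDAs and three STAs, which is where the saving over one LDA per cell comes from.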

[Beware the likely errors in counting; long time since I've looked at cycle counts..]

You need to save 12 cycles per transition function for the JSR/RTS plus 2 for the DEX per column plus a handful of cycles to do the setup per frame to break even, plus 1 cycle per STA.

An LDA immediate/STA absolute pair is 6 cycles, so moving, say a 39x20 field adds up to 4680 cycles that way. If "everything" fails and you have one JSR + one DEX for 39 columns + STA absolute X-indexed for all 780 characters, you pay a cost of an extra 1287 cycles plus setup. Let's call it 1320. But this is before taking into account that you now don't need to do this:

    LDA scroll_splat_colour+6
    STA scroll_splat_colour+1
    LDA scroll_splat_colour+11
    STA scroll_splat_colour+6

Assuming you split that into 7 segments of 111 updates (780/7) at 4+4 cycles each, for 888 cycles, we're 432 short in the pathological case where every colour is different to the one below it and different to the one to its right.
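The arithmetic so far can be re-tallied with a few lines of Python, assuming the standard 6502 timings (LDA immediate 2, LDA/STA absolute 4, STA absolute,X 5, JSR 6, RTS 6, DEX 2). The variable names are only for illustration. Counted this way the worst-case overhead comes to 1326 rather than 1287, which is the sort of slip the caveat above allows for and is still close to the rounded 1320 used for the shortfall.

```python
# Cycle counts for the NMOS 6502.
LDA_IMM, LDA_ABS = 2, 4
STA_ABS, STA_ABS_X = 4, 5
JSR, RTS, DEX = 6, 6, 2

COLS, ROWS = 39, 20
CELLS = COLS * ROWS                      # 780 colour cells

# Baseline: one LDA immediate + STA absolute per cell.
baseline = CELLS * (LDA_IMM + STA_ABS)

# Worst case for the per-column scheme: one JSR/RTS pair and one DEX per
# column, plus one extra cycle per store for the X indexing.
extra = COLS * (JSR + RTS + DEX) + CELLS * (STA_ABS_X - STA_ABS)

# Operand shifting that is no longer needed, spread over 7 frames.
shift_per_frame = (CELLS // 7) * (LDA_ABS + STA_ABS)

ROUNDED_EXTRA = 1320                     # rounded figure used in the post
shortfall = ROUNDED_EXTRA - shift_per_frame

print(baseline, extra, shift_per_frame, shortfall)
```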

But each STA absolute X-indexed we save, because colours do not change right to left, saves us 5 cycles. Each LDA we save because colours often repeat bottom to top saves us 2 cycles. If we assume these are evenly spread, we need 62 of each every frame to break even. In other words only 8% need to repeat. Of course if it just breaks even it's pointless.
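The break-even figure follows directly from the numbers in the post; a minimal check in Python, taking the 432-cycle shortfall as given:

```python
import math

CELLS = 39 * 20            # 780 colour cells
SHORTFALL = 432            # worst-case deficit, taken from the post
STA_ABS_X, LDA_IMM = 5, 2  # cycles saved per skipped store / skipped load

# Assume saved stores and saved loads come in equal numbers ("evenly spread").
saved_per_pair = STA_ABS_X + LDA_IMM
pairs_needed = math.ceil(SHORTFALL / saved_per_pair)
share = pairs_needed / CELLS

print(pairs_needed, f"{share:.1%}")
```

That gives 62 of each, or just under 8% of the 780 cells.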

The only constraint on the artist to do this would be to set a "budget" for transitions to make sure there's always enough repetition to make it worthwhile, but it's certainly doable.

jimsmart · on June 17, 2022

The main issue here is that this doesn't really work for anything but horizontal scrolling. The code I explained works for all directions. One of the games I coded was an 8-way scroller. Another had horizontal sections seamlessly connected to diagonal sections.

Granted I didn't show different versions of moving the data through the splat code, for different directions, I only showed a single direction. And I now realise that multi-direction was not mentioned in my OP. Which is most likely why we seem to be talking at odds with one another somewhat here.

The code I showed works for all of these cases (8-way, horizontal-diagonal transitions, pure horizontal or vertical) — sure, one needs extra variations on the actual scroll code, but not the splat code — plus it is easier to reason about, quicker to develop, and doesn't impose artificial constraints on the graphics/map/tile-usage — which means the graphics won't take as many iterations to get right (you can bet the artist will make mistakes in the map, 100% guaranteed — cos without also modifying the tooling to account for all of this, that's what artists always do. Speaking from experience!).

> The only constraint on the artist to do this would be to set a "budget" for transitions to make sure there's always enough repetition to make it worthwhile, but it's certainly doable.

Even for a single-way scroller, it's quite hard to tell if the benefits you claim here really outweigh all of the costs involved (not just CPU costs - but dev time, map design time, adjustments to tooling, etc). IDK if it'd get the go ahead in any of the places I've worked, without a fully working demo.

But yes, sure, your idea may well be more optimal at runtime, for single direction scrollers. But I don't see how it would / could work for all scroll directions. And the additional constraints would likely give the artists nightmares! ;)

And, as you say, worst case is still gonna be ~25% slower.

The biggest problem that one is trying to overcome here is the additional hit to CPU budget every n-frames (usually 8), when the color RAM update is needed. Having something that might, under certain conditions, be slower, could be problematic. Whereas having something that is constant time is, perhaps arguably, always going to be easier to deal with.

But thanks for taking the time to explain. I kinda thought that that's what you were implying.

Apologies I hadn't been clear that the code I showed (immediate load-stores) was a colour RAM splat technique that was applicable to scrolling in any direction. It's kind of a key point really, and I see I completely omitted it :/ I thought it was mentioned, but I now see that the 8-way scrolling conversation was a slightly different thread.

vidarh · on June 18, 2022

Sure, doing it for 8-way would require a different approach and would likely be a lot harder to make work (not impossible, but it would likely constrain level design quite a bit).

To be clear, the one you showed is a great baseline, and likely the best option for most cases unless you hit a wall where you absolutely need to save extra cycles.

Only then would it be worth exploring something this much more complex.

I guess my main point is that if you hit that wall, then an approach specialised for the level design can give extra savings.

But of course you're probably absolutely right that it'd not fit the budget of a typical C64 game back in the day.

I'd also add that this is far easier to do today, so it's easy for me to propose as an alternative now, but it's unsurprising if nobody used it back then.

E.g. I could throw together a script to do a near-optimal arrangement of transition functions (I'm pretty sure finding optimal arrangements reduces to bin packing, which is NP-hard) to maximise the minimum saved cycles for any given level data on screen, and run it fairly easily on my laptop. But the number of permutations of "flattening" of layers would have forced you to resort to educated guesses back in the day, and so your savings might be more marginal.

(But now I kinda wish I had the time to put together a working example and see how far it could be pushed)

In terms of the worst case: to be clear, for this to be useful at all you'd need to verify it never hit it. Given how many random colour changes would be required to hit it, I think you could guarantee that simply with a few minor constraints on the tile set (e.g. every 2x2 tile on screen with the same colours in all four positions saves you 2 STAs and 2 LDAs) coupled with a willingness to make small tweaks (e.g. deleting a window from an otherwise "busy" part of a level etc.).

You can relatively cheaply validate it, since calculating the cycles for any given screen contents is easy, so you could validate levels by "scrolling" through the whole thing and calculating a graph of cycle counts at any given point. Of course, this is again much easier to do today when we can hack up a trivial script to run through the level data in seconds than it would have been back in the day.
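A minimal sketch of such a validation script in Python. The cost model is an assumption made up for illustration, not something taken from a real game: one JSR/RTS pair and one DEX per column, a store only where a cell's colour differs from what is already on screen after a one-character scroll, and a load only when the colour to store differs from what is in A. The function names and the level representation (a list of columns, each a list of colours top to bottom) are hypothetical, and the level must be wider than the visible window.

```python
JSR, RTS, DEX = 6, 6, 2
LDA_IMM, STA_ABS_X = 2, 5


def column_cost(new, old):
    """Cycles to repaint one screen column, painting bottom-up.

    `old` is what the column shows now, `new` is what it must show after
    the scroll. Cells that already hold the right colour need no STA.
    """
    cycles = JSR + RTS + DEX
    in_a = None                      # colour currently in the accumulator
    for row in range(len(new) - 1, -1, -1):
        if new[row] == old[row]:
            continue                 # colour RAM already correct
        if new[row] != in_a:
            cycles += LDA_IMM
            in_a = new[row]
        cycles += STA_ABS_X
    return cycles


def scroll_costs(level, width):
    """Cost of every one-character scroll step across the whole level."""
    costs = []
    for pos in range(len(level) - width):
        total = 0
        for i in range(width):
            total += column_cost(level[pos + 1 + i], level[pos + i])
        costs.append(total)
    return costs


def worst_step(level, width, budget):
    """Return (worst cost, scroll position, whether it fits the budget)."""
    costs = scroll_costs(level, width)
    worst = max(costs)
    return worst, costs.index(worst), worst <= budget


if __name__ == "__main__":
    level = [[1, 1], [2, 2], [2, 3], [2, 3]]
    print(scroll_costs(level, 2))
    print(worst_step(level, 2, budget=40))
```

Plotting the list returned by `scroll_costs` gives the graph of cycle counts per scroll position, and `worst_step` points at the part of the level that would need a tweak.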

And of course as you point out, you'd have needed a very good reason to spend the extra time (and impose the extra limitations on the level design) to make such small savings, or you'd just not have been able to justify it.

So unless you had a game so complex you absolutely had no choice but to find ways of cutting it down, it'd have made no sense. Even then I'm guessing back then people would have made the static parts of the screen bigger instead of anything complex like this.

jimsmart · on June 18, 2022

Sure. But arguably, that's a lot of extra work, plus some arbitrary restrictions (tiles of single colour, scroll directions), for a system that, at worst case, is surely going to be doing at least a little bit more work than the simple optimal colour splat technique I described.

Whereas if one has a system with no worst case because it runs in constant time, and that is as optimal as possible (i.e. the fastest way of updating all the necessary data), one that requires no constraints on the graphics, works no matter which way the game/player scrolls, doesn't require much extra work on other frames (i.e. when not updating the colour RAM), and needs no extra work with custom-made external tooling, then it would, with little argument from most folk, clearly seem to be the better choice.

There are reasons why simple solutions often win out over more complex ones.

Particularly when coding on limited hardware. Particularly when under time pressure to publish. Particularly when a graphics designer's iteration time has a cost per hour.

> So unless you had a game so complex you absolutely had no choice but finding ways of cutting it down, it'd have made no sense.

I've worked on lots of games projects that never actually got published. Some of them failed because we hit a wall and, despite trying lots of novel solutions, many of them 'outside the box', those issues couldn't be overcome. These things are massive time sinks, and, often, solutions that seem like good ideas simply aren't.

Sometimes one simply has to accept that you cannot squeeze an elephant into a matchbox! It becomes easier to spot when solutions might not work as one becomes more experienced. But sometimes one might still spend time trying to squeeze the elephant.

Solving the worst case, with the least restrictions, in both the simplest and the most optimal way, often gives one a sensible measure as to what is actually reasonably possible.

— If you can reduce the time needed to do a full-colour update when scrolling arbitrarily (including 8-ways), with unrestricted use of colour design on the maps/tiles, on the C64 platform, then I'm sure there's a reasonable handful of folk that would like to hear your suggestions. Otherwise, technically you're solving another problem, and it's not the one which we were dealing with back then.

vidarh · on June 19, 2022

We absolutely agree it'd be a lot of extra work, hence I agree with you that it'd not have been applicable in most instances back then. Most of the time if you faced a squeeze like that you'd be more likely to agree to reduce the playing field by a row or two instead, or ditch something else. But being freed from those considerations and considering what is actually technically possible is what is interesting to me.

> — If you can reduce the time needed to do a full-colour update when scrolling arbitrarily (including 8-ways), with unrestricted use of colour design on the maps/tiles, on the C64 platform, then I'm sure there's a reasonable handful of folk that would like to hear your suggestions. Otherwise, technically you're solving another problem, and it's not the one which we were dealing with back then.

They are different problems, yes, but plenty of games have gameplay where 8-way scrolling is not what you're dealing with. And additional constraints also often fall out of the level design. To me it's the way those design constraints create opportunities to use additional tricks that is interesting, not the problems you were dealing with back then.

Frankly it's an extra artificial problem, because the state of the art of scrolling on the C64 today involves using "dirty tricks" from the demo-scene like VSP/HSP/AGSP which no "hardscroll" based approach like the ones we're discussing can compete with.