part 1/23
23 steps to high resolution on MSX1
The first thing you must realize about coding 8-bit demos is that the math involved isn’t really that complicated. Still checking out those 1240 texturemapped polygons on screen and wondering how on earth those 8-bit wizards managed to optimize it all to run at 50 Hz on a 1 MHz machine in realtime? Well, I’m sorry if I’m the first one to tell you this, but they did not. Nothing at all in 8-bit demos is either very realtime or very optimized. Or maybe it’s realtime enough to have a realtime screen dumper, but that’s pushing it. If any optimization at all was performed, it was the precalc that was optimized in size and speed, not the actual code. That’s why I won’t go into any depth on mathematics in these articles. I’m not going to tell you how to write your own 3D engine on the MSX, because I’ve never written one myself. You must realize it doesn’t really matter how fast the precalc algorithms are, since all they really do is precalc. Of course, the faster the routine, the less time you’ll spend waiting.
So what’s the catch? Why doesn’t everyone start coding their own animation players for the 8-bits, pretending the stuff is very realtime and making people believe they’re actually very skilled coders? Well, first of all, nobody has yet ported Java(tm) to the MSX, and most of the wannabe industrial programmers are just too lazy to learn any language that doesn’t carry the stamp “commercially profitable”. Now who on earth would hire an expert in hacking an over 15-year-old computer with an 8-bit processor already long out of production? Second, it’s not all that easy. You see, the real problem is not coming up with the actual precalced effect, but outputting the data to the screen. For a 32-bit programmer this will seem simply ridiculous. Picture the scene where you could perfectly render a realistic Doom on your 64×128 offscreen buffer at the rate of 20 Hz, but simply couldn’t draw the darn thing to the screen fast enough. Well, that’s almost exactly the case with 8-bits.
So when the closing 3D scene from the C64 version of 2nd Reality starts rolling, both the superhuman PC coder and I will be gasping, but for different reasons. He’ll be going ‘how do they draw all those 1024 polygons in realtime?’ whilst I’m thinking ‘To hell with the polygons, that has got to be an animation. But how on earth do they output it that fast?’.
MSX-idiosyncrasies
Still, I’d be willing to say that the MSX is an extremely awkward machine to code hires on. For starters, you don’t have real vram. Oh yeah, there’s a whole 16 kB of it, but it’s not mapped to your regular ram at all. Instead the vram is accessed through ports, one byte at a time; a feature that for reasons beyond my understanding also seems to be implemented on Sega’s Master System, Game Gear and Mega Drive series. The benefit of having real vram mapped into your regular ram is obvious: speed. On the darker side you do lose some memory, but really, even 48 kB ought to be enough for everybody. Or maybe organise the memory in several 64 kB banks, one of them being the video memory. Anyhow, we’re stuck with this feature and that’s it. The real problem is not the speed of writing or reading as such (ram access isn’t that fast either), but that you always read or write the data in the same order. The vram pointer increases every time you IN or OUT the port, so it’s easy to write 512 sequential bytes to vram, but writing 256 bytes to every other memory location is just as slow, since you have to fill the gaps with blanks. Or picture having several bobs rotating around the screen. With real vram it would be easy to make the effect fullscreen by just outputting the bobs to the desired areas of the screen buffer. It’s of course possible to use the same method on the MSX, but you’ll have to set the vram pointer again every time you start to output a new bob. Setting the vram pointer requires two OUT operations, so you can easily see that as the number of bobs increases, the MSX is left further and further behind its 8-bit cousins with real vram.
Second, you can only write at full speed during the vblank. If you try to output too fast while the screen is being drawn, the data will become distorted. Of course you could stuff your code with NOPs, slowing it down intentionally, but hey, the MSX is no monster with its 3.5 MHz, and slowing down even more from there would be just plain laughable.
So we’ll have to make our output during the vblank and that’s it. In practice I’ve found that 2048 writes per vblank is just about the maximum you can output, if your inner loop is optimized enough. In SCREEN 2 the amount of vram in use is 12 kB, so it would take six vblanks just to refresh the entire screen. That wouldn’t be so bad if you could doublebuffer, but for the entire screen you can’t, so you end up with six raster splits. So summa summarum: fullscreen hires effects can’t be created on the MSX. But you should also never say never, since in a future article I will show you how to make fullscreen hires effects by using character-based graphics.
Let’s consider the 2048 bytes once more. SCREEN 3 uses just 1536 bytes for the onscreen data, and if you want to take the easy way, you could stick with your lego-sized blocks for the rest of your life. And there’s nothing wrong with that; the PC sceners will be very impressed even if your effect is running with 4×4 pixels in fullscreen. I just tend to think that they’ll be even more impressed when the same effect is running with 1×1 pixels in fullscreen. But maybe that’s just me.
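To put some numbers on the above, here is a rough model in Python (not MSX code) of what the port-based access costs. It is only a sketch under assumptions of mine: the port numbers 0x98 and 0x99 are the usual MSX data and control ports but are not taken from this article, the two setup bytes follow the TMS9918-style convention (low address byte first, then the high six bits with bit 6 set for writing), and all function names are made up for illustration.

```python
# Rough model of port-based vram writes; a sketch, not real MSX code.
# Port numbers and helper names are assumptions for illustration only.

VDP_DATA_PORT = 0x98      # assumed: usual MSX vram data port
VDP_CONTROL_PORT = 0x99   # assumed: usual MSX vdp control port
WRITES_PER_VBLANK = 2048  # the practical budget quoted in the article


def vram_write_setup(address):
    """Return the two control-port bytes that aim the vram pointer at `address`."""
    if not 0 <= address < 0x4000:
        raise ValueError("MSX1 vram is 16 kB: address must be 0..0x3FFF")
    low = address & 0xFF
    high = ((address >> 8) & 0x3F) | 0x40  # bit 6 set selects write mode
    return [low, high]


def outs_for_runs(runs):
    """Count OUT operations for a list of (address, length) sequential runs.

    Every run costs two OUTs for the pointer setup plus one OUT per byte.
    """
    return sum(2 + length for _address, length in runs)


def vblanks_needed(total_bytes, budget=WRITES_PER_VBLANK):
    """Number of vblanks needed to push `total_bytes` at `budget` writes each."""
    return -(-total_bytes // budget)  # ceiling division


if __name__ == "__main__":
    one_run = [(0, 512)]                              # 512 sequential bytes
    scattered = [(a, 1) for a in range(0, 512, 2)]    # 256 bytes, every other address
    print(outs_for_runs(one_run), outs_for_runs(scattered))
    print(vblanks_needed(12 * 1024))
```

In this model one 512-byte run costs 514 OUTs, while 256 single bytes with a pointer reset each cost 768, which is why filling the gaps with blanks is the cheaper option. The 12 kB of SCREEN 2 data comes out at six vblanks, matching the figure above.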
Moving on from block city to hires-euphoria
Using the graphics like this takes some extraordinary effort, and that’s because the graphics modes on the MSX are nothing more than text modes. That is, you have a ‘font’: a set of 8×8 characters with non-fixed patterns and colours. In these articles I will refer to the character forms as patterns and to the table of all the character patterns as the pattern table. One pattern row is 8 pixels stored in 8 bits, which equals one byte. One character takes up 8 bytes in the pattern table. The other attribute defining the character is its colour, stored in the colour table. If we’re in SCREEN 1, the colour only takes up one byte, so the entire character is the same colour. In SCREEN 2, each character row has its own colour, so one character takes up 8 bytes in the colour table. The colour is always calculated as ‘16 * foreground_colour + background_colour’, where the background colour is used for the pixels stored as 0b in the character pattern and the foreground colour for the ones stored as 1b. Now we have our ‘font’, but we still need to know how to place it on screen. Enter our ‘text’, the name table. The name table is a 32×24 table with one byte devoted to each ‘letter’ on screen, the ‘letter’ being 8×8 pixels. One character set contains 256 characters, that is, a pattern table of 256*8=2048 bytes and a colour table of the same size in SCREEN 2 or 256 bytes in SCREEN 1. SCREEN 1 has only one character set, so you really could not utilize the full screen even if you wanted to. SCREEN 1 is ok for some special pattern based effects, but the one-colour-per-row feature of SCREEN 2 is really such a luxury that from now on you can always presume I’m talking about SCREEN 2 unless I state otherwise. So SCREEN 2 is really fullscreen, with every single pixel on the screen changeable as desired. You can’t change every colour because of the ‘2 colours per 8 pixels’ restriction, but you can still change every pattern. Now our entire screen is 256×192 pixels, which equals 32×24 characters.
A bit of calculating reveals that the 256 characters of one character set only fill one third of the screen. And as I stated earlier, this is exactly the case with SCREEN 1. But for SCREEN 2 to fill the entire screen you actually have three character sets independent of each other. Let’s call these character sets charset 1, charset 2 and charset 3. charset 1 always takes up pixel rows 0..63 on screen, charset 2 rows 64..127 and charset 3 rows 128..191. As for the name table (32 characters x 24 characters), charset 1 is the first 8 rows, charset 2 is rows 8..15 and charset 3 rows 16..23. A method like this is needed because each name table entry is only one byte, while 768 characters are needed for the entire screen. So a write to the name table in rows 0..7 always picks a character from charset 1. If you write something in rows 8..15 it picks a character from charset 2, and the same goes for rows 16..23 and charset 3. For instance, the character ‘A’ at name table position (16,22) will always be the 66th character of charset 3. The character ‘A’ at position (16,4), on the other hand, will always be the 66th character of charset 1. Charsets are divided into patterns and colours just as earlier, so that charset 1 is bytes 0..2047 of the pattern table and the colour table, charset 2 bytes 2048..4095 and charset 3 bytes 4096..6143. You must realize that by default these charsets have nothing to do with each other. So the characters ‘A’ in the above example need not be the same characters. By tweaking a few vdp registers you can actually alter the number of charsets to three, two or one, a feature no-one ever bothered to document, but which is sometimes very useful. So for instance, you could have charset 1 copied over charset 3, so that the ‘A’s in the above example would be the same characters. The most bizarre part about this vdp feature is that the number of pattern tables actually needn’t be the same as the number of colour tables.
So you could for example have three totally independent colour tables but just one pattern table for the entire screen. This can be used for some very fruitful and seemingly impossible demo-effects, but I’ll cover these more in depth in the future, when I write an article on the ‘charset duplication and deproduction’-vdp feature.
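The default three-charset layout described above can be written down as a few lines of arithmetic. The following Python sketch is only a model of the addressing, not MSX code; the function names are mine, and offsets are relative to the start of the name table and of the pattern (or colour) table, wherever those happen to sit in vram.

```python
# Model of the default SCREEN 2 layout with three independent charsets.
# Function names are illustrative; offsets are relative to each table's start.

COLUMNS, ROWS = 32, 24
ROWS_PER_CHARSET = 8
CHARSET_BYTES = 256 * 8  # 2048 bytes of patterns (or colours) per charset


def name_table_index(column, row):
    """Offset of the character cell (column, row) inside the name table."""
    if not (0 <= column < COLUMNS and 0 <= row < ROWS):
        raise ValueError("cell outside the 32x24 name table")
    return row * COLUMNS + column


def charset_for_row(row):
    """Charset (1..3) that a name table row picks its characters from."""
    if not 0 <= row < ROWS:
        raise ValueError("row outside the 24-row name table")
    return row // ROWS_PER_CHARSET + 1


def pattern_offset(row, code, line):
    """Offset of one 8-pixel row of character `code` shown on name table row `row`.

    `line` is the pixel row inside the character (0..7). The same offset is
    valid in the colour table, since both tables are laid out identically.
    """
    if not (0 <= code < 256 and 0 <= line < 8):
        raise ValueError("code must be 0..255 and line 0..7")
    return (charset_for_row(row) - 1) * CHARSET_BYTES + code * 8 + line


if __name__ == "__main__":
    a = ord("A")  # code 65, i.e. the 66th character of a charset
    print(name_table_index(16, 22), charset_for_row(22), pattern_offset(22, a, 0))
    print(name_table_index(16, 4), charset_for_row(4), pattern_offset(4, a, 0))
```

For the two ‘A’s of the example, the one at (16,22) lands in charset 3 at offset 4616 and the one at (16,4) in charset 1 at offset 520, so they are fetched from different bytes and can look entirely different.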
More & Faster
Now someone could point out that we should go for SCREEN 1, which has enough memory for 3 pattern tables and 24 colour tables, and best of all, colour table and pattern table base registers that actually act the way they’re supposed to. Well, that’s a good idea, but remember we only have 256 characters in SCREEN 1. Those 256 characters make 2048 outputs, and you could output that much in a frame anyway. So for now, we’re sticking with the 2048 outputs. That’s not very much really. You could refresh one pattern table or colour table of one charset, or half of a pattern table and half of a colour table of a charset. Since one charset is only 256×64 pixels (one third of the whole screen), we will have to get realistic and only refresh either the colour table or the pattern table. Hence the terms chunky based effects and pattern based effects. Chunky based effects have a constant pattern table and pattern based effects a constant colour table for the whole of the screen. In the next issue I’ll be going more in depth on how to actually code chunky and pattern based effects in practice, complete with some source code too. And yet, 256×64 is nobody’s idea of full screen and that’s the most we can update, so what is it these SCREEN 2 demos are using, magic? Oh no, 8-bits have one God-given gift on their side and that’s character based graphics. Duplicate, reduplicate, copy, replicate and unduplicate as much as you ever please, because that’s what fools the eye. Remember, character ‘A’ is always the 66th character, with the appropriate colour and pattern, within the same charset. It’s basically the very same case as when you’re reading this text right now. You can write for instance ‘JGT’ in any part of the screen and it will still look the same. Maybe have the effect running in an 80×192 window and replicate it three times vertically.
With just one register tweak you can reduce the number of charsets to one and have horizontally replicating effects too, sort of like a super SCREEN 1. If you look closely at the SCREEN 2 effects coded so far, you’ll notice almost every single one uses some sort of replication; either that, or they’re really small. Or they’re ‘impossible’, using clever vdp features; there are a few of those in the MSX1. Remember, 256×64, or an area of any shape adding up to 2048 updates, is the biggest updatable unit for pattern based and chunky based effects. Look at the SCREEN 2 effects again and you’ll notice none of them actually updates an area larger than this, apart from the impossible ones of course. Actually the super SCREEN 1 I mentioned earlier is a well known feature amongst the MSX scene, implemented by first setting SCREEN 1 via the bios and then setting the SCREEN 2 mode bit in the vdp registers. This, however, is nothing more than one single implementation of the ‘..deproduction’ feature. The register values just happen to be right by accident for one charset in SCREEN 2; there’s a lot more than that in ‘..deproduction’.
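The replication trick can be sketched as a name table that points several screen areas at the same characters. The Python model below is my own reading of the 80×192 window example, not code from any existing demo: it takes the window as a strip 10 characters wide, repeated three times side by side, with the default three charsets each holding the 80 characters of their own third of the strip. The function names and the choice of filler character are assumptions for illustration.

```python
# Model of a replicated window in SCREEN 2; a sketch under my own assumptions.
# The name table is written once; only the unique characters change per frame.

COLUMNS, ROWS = 32, 24
ROWS_PER_CHARSET = 8


def replicated_name_table(window_columns, copies):
    """Build a 32x24 name table repeating a window `copies` times side by side.

    Within each charset the window's characters are numbered row by row,
    so every copy of the window refers to the very same character codes.
    """
    if window_columns < 1 or copies < 1 or window_columns * copies > COLUMNS:
        raise ValueError("the copies do not fit on a 32-column screen")
    # First unused character code, kept empty and used to pad the right edge.
    blank = window_columns * ROWS_PER_CHARSET
    table = []
    for row in range(ROWS):
        local_row = row % ROWS_PER_CHARSET  # row inside the current charset
        first = local_row * window_columns
        line = list(range(first, first + window_columns)) * copies
        line += [blank] * (COLUMNS - len(line))
        table.append(line)
    return table


def bytes_per_frame(window_columns):
    """Pattern (or colour) bytes to rewrite per frame for all three charsets."""
    return window_columns * ROWS_PER_CHARSET * 8 * 3


if __name__ == "__main__":
    table = replicated_name_table(10, 3)   # 80 pixels wide, three copies
    print(table[0][:12])
    print(bytes_per_frame(10))
```

With a 10-character-wide window this comes to 1920 bytes per frame, which fits inside the 2048-write budget even though 240×192 pixels of the screen appear to be moving.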
Let’s talk physical
Kultajyva