Plumbing the Depths of Keen
Author: adurdin
Date: April 7th, 2002

Welcome to the first in a series of articles examining the structure and layout of Commander Keen's data files. The aim of these articles is to provide a complete reference to Keen data formats which can be used by people writing utilities to modify Keen, or just as a source of ideas for people writing their own games. These rather technical articles will probably be interspersed with others on more general topics which would appeal to more people. Well, let's get down to business. The topic of discussion today is... maps!

Keen Cartography
All the Keen games store their level data in one or more map files; Invasion of the Vorticons uses one file for each level, while Goodbye, Galaxy! and Aliens Ate My Babysitter! each store all their maps in one file. I'll begin by examining the format of level files in Vorticons, as it is simpler, and moreover the later formats are based on it.

Marooned on Mars, The Earth Explodes, and Keen Must Die! store their levels in files named LEVELnn.CKe, where nn is a two-digit level number, and e the episode number. Most of the files are just levels in the game, but there are a few special ones: LEVEL80 is the large world map, LEVEL81 is a special map, used in the story, and LEVEL90 is one long level containing the title screen and the backgrounds to the High Scores, Ordering Info, and About ID screens.
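
As a small illustration of the naming scheme above, here is a minimal Python sketch that builds a level file name from a level number and an episode number. The function name level_filename is my own invention for this example and is not part of any Keen tool.

```python
def level_filename(level: int, episode: int) -> str:
    """Build a Vorticons level file name of the form LEVELnn.CKe."""
    # nn is the two-digit level number, e the episode number.
    return f"LEVEL{level:02d}.CK{episode}"
```

For example, level_filename(80, 1) gives the name of the episode 1 world map, LEVEL80.CK1.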

I'll now describe the format of these files. In the rest of this article, I'll use the terms byte, word, and dword to describe 8-, 16-, and 32-bit integers respectively.

All the level files are compressed, using a simple word-based RLE scheme: a word 0xFEFE indicates a run of identical words; the next two words indicate the length of the run, and the repeated word. At the highest level then, the level format looks like this:

Uncompressed length: dword
Compressed data: 1 or more words

A basic RLE decompression algorithm would be:

Read length (dword)
Allocate output buffer of length bytes
While there is input and the output buffer is not full:
    Read datum (word)
    If datum is 0xFEFE Then:
        Read count (word)
        Read datum (word)
        Do count times:
            Write datum (word)
    Else:
        Write datum (word)
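
The decompression steps above can be sketched in Python roughly as follows. This is only an illustration: the function name rle_decompress is hypothetical, and I am assuming that all values are stored little-endian, which the format description does not state explicitly but which is the natural byte order on the PC.

```python
import struct

RLE_FLAG = 0xFEFE  # run marker used by the Vorticons level files


def rle_decompress(data: bytes, flag: int = RLE_FLAG) -> bytes:
    """Expand a whole compressed level file into its raw bytes."""
    # The first dword is the length of the uncompressed data in bytes.
    # Assumption: little-endian byte order throughout.
    (length,) = struct.unpack_from("<I", data, 0)
    out = bytearray()
    pos = 4
    while pos + 2 <= len(data) and len(out) < length:
        (datum,) = struct.unpack_from("<H", data, pos)
        pos += 2
        if datum == flag:
            # The flag word is followed by the run length and the repeated word.
            count, value = struct.unpack_from("<HH", data, pos)
            pos += 4
            out += struct.pack("<H", value) * count
        else:
            # Any other word is copied through unchanged.
            out += struct.pack("<H", datum)
    # Trim in case the last run overshoots the declared length.
    return bytes(out[:length])
```

Passing the flag as a parameter is a design choice of this sketch, so that the same routine can be tried with a different run marker.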

Similarly, a basic RLE algorithm for compression:

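Set length to input length
Write length (dword)
Set count to one
Read lastdatum (word)
While there is input:
    Read datum (word)
    If datum equals lastdatum Then:
        Increment count
    Else:
        If count >= 4 Or lastdatum is 0xFEFE Then:
            Write 0xFEFE (word)
            Write count (word)
            Write lastdatum (word)
        Else:
            Do count times:
                Write lastdatum (word)
        Set count to one
        Set lastdatum to datum
(After the loop, write out the final run in the same way)
If count >= 4 Or lastdatum is 0xFEFE Then:
    Write 0xFEFE (word)
    Write count (word)
    Write lastdatum (word)
Else:
    Do count times:
        Write lastdatum (word)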
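
A possible Python rendering of the compression side is sketched below. Again this is an illustration rather than a reference implementation: the name rle_compress is hypothetical, little-endian byte order is assumed, and the sketch caps each run at 0xFFFF words so that the count always fits in one word, a detail the pseudocode leaves out. It also makes sure the last run is written out once the input is exhausted.

```python
import struct

RLE_FLAG = 0xFEFE  # run marker used by the Vorticons level files


def rle_compress(raw: bytes, flag: int = RLE_FLAG) -> bytes:
    """Compress raw level data (a whole number of words) with word-based RLE."""
    if len(raw) % 2:
        raise ValueError("input must be a whole number of 16-bit words")
    # Assumption: little-endian byte order throughout.
    words = struct.unpack(f"<{len(raw) // 2}H", raw)
    out = [struct.pack("<I", len(raw))]  # uncompressed length in bytes
    i = 0
    while i < len(words):
        value = words[i]
        count = 1
        # Measure the run, capped so the count fits in one word.
        while (i + count < len(words) and words[i + count] == value
               and count < 0xFFFF):
            count += 1
        if count >= 4 or value == flag:
            # Worth encoding as a run; a literal flag word must always be
            # encoded this way so it is not mistaken for a run marker.
            out.append(struct.pack("<3H", flag, count, value))
        else:
            out.append(struct.pack("<H", value) * count)
        i += count
    return b"".join(out)
```

Runs shorter than four words are written literally because the flag, count, and value take three words by themselves.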

Now we get into the real guts, the actual map data. After decompressing, we're left with a format like this:

Width: word
Height: word
Planes: word
Reserved: 4 words
Plane Size: word
Reserved: 8 words
Plane 0: (Plane Size) bytes
Plane 1: (Plane Size) bytes

The Width and Height specify the width and height of the map in terms of tiles (for example, LEVEL03.CK1 is 81 tiles wide and 53 high). Planes specifies the number of data planes in the level; for Keen Vorticons this is always 2. The next four words are ignored by Keen. The Plane Size is the size in bytes of each data plane; it is equal to Width * Height * (word size), rounded up to the next multiple of 16 bytes (for example, this value in LEVEL03.CK1 is 8592, since 81 * 53 * 2 = 8586). This is followed by eight more words that are ignored. Then comes the plane data: for each plane, this is Width * Height words, padded to a 16-byte boundary.
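
To make the layout and the padding rule concrete, here is a minimal Python sketch that reads the header fields from an already-decompressed level and computes the expected plane size. The names HEADER_SIZE, padded_plane_size, and parse_header are my own, and little-endian byte order is an assumption. The byte offsets follow directly from the field list: three words, four ignored words, the Plane Size at byte 14, and eight more ignored words, for 32 bytes in total.

```python
import struct

HEADER_SIZE = 32  # 16 words in total, per the field list


def padded_plane_size(width: int, height: int) -> int:
    """Width * Height words, rounded up to a multiple of 16 bytes."""
    raw = width * height * 2  # one 16-bit word per tile
    return (raw + 15) // 16 * 16


def parse_header(level: bytes) -> dict:
    """Read the fields Keen uses from a decompressed Vorticons level."""
    # Assumption: little-endian byte order.
    width, height, planes = struct.unpack_from("<3H", level, 0)
    # Bytes 6..13 are the four ignored words; Plane Size sits at byte 14.
    (plane_size,) = struct.unpack_from("<H", level, 14)
    return {"width": width, "height": height,
            "planes": planes, "plane_size": plane_size}
```

With the LEVEL03.CK1 dimensions, padded_plane_size(81, 53) returns 8592, which agrees with the value quoted above.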

Of course, this is fairly meaningless without an explanation of the different planes. Plane 0 is the foreground plane, consisting of a two-dimensional array of tile numbers (one per word) for the tiles that make up the level. Plane 1 is the so-called info plane, and contains one word of extra information per tile in the level. This information includes sprites, message numbers (for the yorp-statues and Vorticon elders), and switch information (for the bridge switches) in the level maps, and level numbers and teleporter information on the world map.
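
As a sketch of how the planes might be pulled apart in practice, the following Python function returns one plane as a list of rows. The name read_plane is hypothetical. I am assuming little-endian words and that the two-dimensional array is stored row by row (all of the top row first, left to right), which the description above does not spell out.

```python
import struct

HEADER_SIZE = 32  # size of the decompressed level header in bytes


def read_plane(level: bytes, plane: int) -> list:
    """Return plane 0 (tiles) or plane 1 (info) as a list of rows."""
    width, height = struct.unpack_from("<2H", level, 0)
    (plane_size,) = struct.unpack_from("<H", level, 14)
    # Planes follow the header back to back, each plane_size bytes long.
    start = HEADER_SIZE + plane * plane_size
    words = struct.unpack_from(f"<{width * height}H", level, start)
    # Assumption: row-major order, one word per tile.
    return [list(words[y * width:(y + 1) * width]) for y in range(height)]
```

With this layout, read_plane(level, 0)[y][x] is the tile number at column x, row y, and read_plane(level, 1)[y][x] is the info word for the same tile.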

One last point to note about the levels is that there is a two-tile-wide border surrounding each level that is never seen on screen; this is primarily due to a requirement of the smooth-scrolling engine that Keen uses.

The Next Generation
Although Keens Galaxy and Aliens store their level maps in one file, GAMEMAPS.CKe, the format is substantially similar to the format used by Keen Vorticons. Also associated with the maps is a data table in the .exe that I shall refer to as MAPHEAD (indeed, some other games based on the same engine, such as Bio Menace, store this table in a file called MAPHEAD). I'll start by describing the structure and usage of this table:

RLE Flag: word
Header Offsets: 100 dwords

The MAPHEAD contains the flag word (0xABCD in both Galaxy and Aliens) signifying a run in the RLE-compressed level data, followed by an array of 100 dwords being the offset in the GAMEMAPS file of the level header, or zero if the level does not exist. The first of these, level 0, is the world map. Because this MAPHEAD table is stored in the .exe, modifying it is complicated: one of the reasons that level editors for these episodes have taken a long time to appear.
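
Assuming the MAPHEAD table has already been located and copied out of the .exe (finding it is a separate problem that I don't cover here), reading it could look something like this Python sketch. The name parse_maphead is hypothetical and little-endian byte order is assumed.

```python
import struct


def parse_maphead(table: bytes):
    """Split a MAPHEAD table into the RLE flag and the level header offsets."""
    # Assumption: little-endian byte order.
    (rle_flag,) = struct.unpack_from("<H", table, 0)
    offsets = struct.unpack_from("<100I", table, 2)
    # An offset of zero means the level does not exist.
    levels = {number: offset
              for number, offset in enumerate(offsets) if offset != 0}
    return rle_flag, levels
```

The returned dictionary maps level numbers to offsets in GAMEMAPS, so entry 0, when present, is the world map.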

The GAMEMAPS file starts with a signature identifying the version of TED it was made with ("TED5v1.0" in all cases), followed, neither immediately nor necessarily in any order, by the level headers and plane data for each level. In fact, due to a bug in TED5, there are usually several junk bytes between the signature and the start of the important data.

Dealing with the GAMEMAPS file requires getting the offsets of the level headers from MAPHEAD before reading the header. The level header has the following format:

Plane 0 offset: dword
Plane 1 offset: dword
Plane 2 offset: dword
Plane 0 length: word
Plane 1 length: word
Plane 2 length: word
Width: word
Height: word
Level name: 16 bytes (null-terminated)
Signature: 4 bytes "!ID!"

It starts with three dwords specifying the offsets from the start of GAMEMAPS of the compressed plane data, followed by three words giving the lengths (in bytes) of this compressed data. Then come two words giving the width and height of the level respectively (in tiles), and then a null-terminated string with the name of the level (used only by the editor), and a signature marking the end of the header.
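
The header layout maps quite directly onto a fixed-size record of 42 bytes. Here is a minimal Python sketch that reads one header and slices out the three compressed planes; the names LEVEL_HEADER and parse_level_header are my own, little-endian byte order is assumed, and the offset argument is expected to come from the MAPHEAD table.

```python
import struct

# 3 dwords, 3 words, 2 words, 16-byte name, 4-byte signature = 42 bytes.
LEVEL_HEADER = struct.Struct("<3I3H2H16s4s")


def parse_level_header(gamemaps: bytes, offset: int) -> dict:
    """Read one level header and the compressed planes it points to."""
    fields = LEVEL_HEADER.unpack_from(gamemaps, offset)
    plane_offsets, plane_lengths = fields[0:3], fields[3:6]
    width, height, raw_name, signature = fields[6:10]
    if signature != b"!ID!":
        raise ValueError("no !ID! signature at the end of the level header")
    # The name is null-terminated inside its 16-byte field.
    name = raw_name.split(b"\0", 1)[0].decode("ascii", errors="replace")
    # Plane offsets are measured from the start of GAMEMAPS.
    planes = [gamemaps[start:start + size]
              for start, size in zip(plane_offsets, plane_lengths)]
    return {"width": width, "height": height, "name": name, "planes": planes}
```

The planes returned here are still compressed; they need the two decompression passes described next before they can be used.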

Once more the level data is compressed, but this time each plane is compressed separately, and two compression algorithms are applied one after the other. The first is RLE, identical (except for a different flag) to that used for Keen Vorticons; the second is a simple repeated-sequence removal technique, known as "Carmack compression". When decompressing, you must first de-Carmackise and then un-RLE the data to get the plane data.

The Carmack compression is used to remove repeated sequences of words, replacing them with a reference to the first instance of the sequence. It has two reference types, near and far. A near reference is marked by three bytes: the number of words in the sequence, 0xA7, and the relative offset of the start of the sequence from the start of the reference, counting backwards. Thus the reference 0x03 0xA7 0x06 would effectively mean "repeat the first 3 of the last 6 words". A far reference is marked by four bytes, and is used to reference sequences that are more than 255 bytes from the reference. The first byte is once more the length of the sequence in words. It is followed by 0xA8, and a word specifying the absolute offset of the start of the sequence from the start of the uncompressed data. If a word in the uncompressed input stream had a high byte of 0xA7 or 0xA8, it would be replaced in the compressed data by a reference with a sequence length of zero, with the byte following the reference signature byte being the low byte of the input word. Note that a zero-length reference like this will only ever have a single byte following the signature. As for the RLE data, the first word in the Carmack-compressed data is the length of the uncompressed data in bytes.

I won't present algorithms for Carmack compression and decompression here, because they should be fairly obvious given the description above; for example code, look at the TED5 source code (compression) and the Wolfenstein 3D source code (decompression).
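
For readers who would rather not dig through those sources first, here is a rough Python sketch of the decompression side only, written from the description above rather than from the TED5 or Wolfenstein 3D code. The name carmack_decompress is hypothetical. I am assuming little-endian byte order, and that both the near offset (counting backwards from the current position) and the far offset (counting from the start of the output) are measured in words, as the "last 6 words" example suggests.

```python
import struct

NEAR_TAG = 0xA7  # signature byte of a near reference
FAR_TAG = 0xA8   # signature byte of a far reference


def carmack_decompress(data: bytes) -> bytes:
    """Expand Carmack-compressed data; the result is still RLE-compressed."""
    # The first word is the length of the expanded data in bytes.
    (length,) = struct.unpack_from("<H", data, 0)
    out = []  # expanded words
    pos = 2
    while len(out) * 2 < length and pos + 2 <= len(data):
        count, tag = data[pos], data[pos + 1]
        pos += 2
        if tag in (NEAR_TAG, FAR_TAG) and count == 0:
            # Zero-length reference: an escaped literal word whose high
            # byte is the tag and whose low byte follows.
            out.append((tag << 8) | data[pos])
            pos += 1
        elif tag in (NEAR_TAG, FAR_TAG):
            if tag == NEAR_TAG:
                # Assumption: one offset byte, in words, counting backwards.
                start = len(out) - data[pos]
                pos += 1
            else:
                # Assumption: one offset word, in words, from the start.
                (start,) = struct.unpack_from("<H", data, pos)
                pos += 2
            if start < 0:
                raise ValueError("reference points before the start of output")
            # Copy one word at a time so overlapping references work.
            for i in range(count):
                out.append(out[start + i])
        else:
            # Ordinary word: low byte first, then high byte.
            out.append(count | (tag << 8))
    return struct.pack(f"<{len(out)}H", *out)[:length]
```

Fed the example from the text, six literal words followed by the bytes 0x03 0xA7 0x06, this appends copies of the first three of those six words.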

There are three planes in Keen Galaxy and Aliens levels: plane 0 contains background tiles which the player does not interact with at all; plane 1 contains all the foreground tiles, those that the player can walk on or behind; and plane 2 is the info plane, similar to Keen Vorticons. However, the info plane also holds information for the new features in Keen Galaxy, such as special markers that split a level into different sections, and markers for moving platforms, switches, doors, etc. Keen Galaxy and Aliens levels also have a two-tile-wide border that is never seen.

The End is Nigh
I think that about covers it. As usual, if you have any comments or further questions, don't hesitate to email me:

Note that this information may also apply to some degree to other games; for example, Shadow Knights and some of the Dangerous Dave series apparently use the same file format as Keen Vorticons, while newer games like Bio Menace use the same format as Keen Galaxy.
