Reverse Engineering Earth 2150, Part 6: Texture Files

So you know all those things I said I was going to do next? I didn’t. I decided to take a look at texture files instead. I know others have already reverse-engineered this format, so this was purely for my own enjoyment.

Deliberately not looking at anyone else’s work, I started with a clean slate, which meant extracting a TEX file and looking at it in my hex editor. I decided to go with the compass texture because there are several known things about it.

It’s a nice, simple texture with regular patterns. Measuring it, it appears at first to be 128x128 pixels, but if you look a little closer, you’ll see it’s composed of 2x2 pixel blocks, meaning it’s been scaled up from 64x64 without filtering. These are useful properties.

So I extract compass.tex from the WD file and up comes the hex editor. I immediately see that the file begins with the string TEX␀, which is pretty obviously a format identifier and can be safely ignored.

Following on from this, there are what appear to be a few 32-bit integers, which fits with how common they were in the PAR file. 1, 38, 64, 64, 7. Aha. See those 64s in there? I bet those are image dimensions, as I suspected this was a 64x64 image.

After confirming that, I check the file size: 21872 bytes. Quite large for 4096 pixels, so I suspect this may be uncompressed, relying on the deflate compression of the WD files to keep the size down. In fact, that’s 5.34 bytes per pixel overall, including whatever headers are present. A 32-bit RGBA pixel is, of course, 4 bytes; if it were made of these, that would take 16384 bytes, leaving 5488 bytes for metadata. Seems a bit large.

Alternatively, if I assume 5 bytes per pixel, that leaves 1392 bytes for metadata, but I have to wonder what that extra byte would be about.
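The arithmetic is quick to double-check. Here’s a minimal sketch using only the numbers quoted above; the names are made up for illustration and aren’t from my renderer.

```python
# Sizes taken from the text: a 21872-byte file holding a 64x64 image.
FILE_SIZE = 21872
PIXELS = 64 * 64


def leftover(bytes_per_pixel):
    """Bytes left over for headers if every pixel takes bytes_per_pixel bytes."""
    return FILE_SIZE - PIXELS * bytes_per_pixel


print(round(FILE_SIZE / PIXELS, 2))  # overall bytes per pixel, headers included
print(leftover(4))                   # assuming plain 32-bit RGBA
print(leftover(5))                   # assuming a hypothetical 5-byte pixel
```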

To settle this conundrum, I go back to the hex editor. I deliberately picked a texture with lots of greyscale elements, so there should be runs of three identical values if it’s just RGB pixel values.

Sure enough, I scroll down a bit to offset 870 and see: 07070700 26262600 3F3F3F00. If that doesn’t look like three RGBA pixels, I don’t know what does. The zeroes in the fourth byte of each suggest to me that it’s using a big-endian structure (alpha value last) with 00 being full opacity and FF being full transparency.

I also see a lot of long runs of 00 bytes. Either that’s a large black area, or it regards all-zeroes as fully transparent. The compass texture is red, green, white, and transparency, so I’m thinking the latter.

I decide to ignore the alpha channel for now, pick byte 1C as a starting point since it’s right after that 7, read off from there as RGB pixels, and render what I’ve got.
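Roughly, that first pass amounts to something like the sketch below. It assumes four bytes per pixel in R, G, B, fourth-byte order, and `read_pixels` is a hypothetical helper rather than the code in my actual renderer.

```python
def read_pixels(data, offset, width, height):
    """Split raw bytes into (r, g, b, x) tuples, four bytes per pixel."""
    size = width * height * 4
    chunk = data[offset:offset + size]
    if len(chunk) != size:
        raise ValueError("not enough data for the requested image size")
    return [tuple(chunk[i:i + 4]) for i in range(0, size, 4)]


# The three greyscale pixels spotted in the hex editor.
sample = bytes.fromhex("07070700" "26262600" "3F3F3F00")
pixels = read_pixels(sample, 0, 3, 1)

# Ignore the fourth byte for now and keep only RGB.
rgb_only = [p[:3] for p in pixels]
print(rgb_only)
```

For the real file, the call would be `read_pixels(data, 0x1C, 64, 64)`, with the resulting list handed to whatever image library does the drawing.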

Well. That’s promising. It even looks like my starting offset might be right. Not sure what’s going on with the red fading to white; the in-game compass doesn’t do that.

Let’s see what that fourth value does if I read it as greyscale, which should get me an alpha mask…

Yeah, that looks rather like an alpha mask, at least for the four-spoked bit that rotates. So putting it all together as RGBA gives me, yes, the rotatey cross part of the compass. Looks like I was wrong about 00 being fully opaque, though. FF is fully opaque, it’s just masking only the rotatey part of the compass to visibility here. The circular bit must be split out to a separate background file and is just included here because it was easier on the artist.

But at this point, I’ve only read ~75% of the file. I decide to try reading the next 4096 bytes as a 64x64 8-bit greyscale image, and I get this.

Is that… interlaced? There’s something weird going on with this format. I do see the compass shape in there, but it’s all jumbled up. I try a few different ways of rendering things, like reading two pixels per byte, then stumble upon trying it with double the width.

Aha, look at that. There is a series of progressively smaller images, meaning this texture file is mipmapped. I need to read these as 32-bit images of successively smaller sizes. Starting with the end of the first image at offset 0x401C, I read a 32x32 image, again discarding the fourth byte, and I get a half-size compass. Nice.

At this point I wonder if that 7 in the header is the number of mipmaps, so I tweak my renderer to put them side by side, halving the size for each of 7 iterations. That gives me this:

And that brings us right up to byte 21872, the end of the file.
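The numbers line up exactly, which is reassuring. A small sketch of the sum, assuming 4 bytes per pixel and pixel data starting at 0x1C; clamping dimensions to a minimum of 1 is my own assumption and never kicks in for a 64x64 image with 7 levels.

```python
HEADER_SIZE = 0x1C  # where the pixel data starts in compass.tex


def mip_chain(width, height, count):
    """Return (width, height) for each level, halving every time."""
    levels = []
    for _ in range(count):
        levels.append((width, height))
        # Assumption: never let a dimension drop below 1.
        width, height = max(width // 2, 1), max(height // 2, 1)
    return levels


def data_size(levels, bytes_per_pixel=4):
    """Total bytes of pixel data across the given levels."""
    return sum(w * h * bytes_per_pixel for w, h in levels)


levels = mip_chain(64, 64, 7)
total = HEADER_SIZE + data_size(levels)
print(levels)
print(total)
```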

So now I’m left with two last mysteries: what do those 1 and 38 numbers mean in the header? An hour after starting, I believe I’ve reached the end of what compass.tex can teach me alone, so let’s try something more ambitious.

I dump out LCUMO3.TEX, the texture file for that unit I used in the screenshot last time, and run it through my renderer.

And I get a splattered mess. Lovely. The renderer reports that it got to byte 30763, but the file is 43760 bytes, so I’ve badly misread something. To the hex editor once again!

I see the familiar 40000000 40000000 07000000 there, indicating a 64x64 texture with 7 mipmaps, but it’s later in the file, and there are extra fields before it. One of them is TEX␀ again. Does this file contain multiple textures?

It looks like there’s a 16-byte TEX header, then another 16-byte TEX header, then the image data starts. I read that off, check where it finishes, and… sure enough, another 16-byte TEX header, then more image data. I read both off.

That’s the Moon unit, intact and damaged respectively. Nice. So let’s try and figure out these headers.

I read off the four headers I’ve seen so far, one from compass.tex and the three from LCUMO3.TEX. I notice that they all begin TEX␀ and then a 1. That leaves 8 bytes that are different between the four.

Header                  Value
compass.tex             26 00 00 03 88 88 00 00
LCUMO3.TEX start        02 00 00 43 02 00 00 00
LCUMO3.TEX 1st image    06 00 00 03 88 88 00 00
LCUMO3.TEX 2nd image    06 00 00 03 88 88 00 00

First thing I notice is that the actual images have something in the form ?6000003 88880000 for theirs. The second is that the very first header in LCUMO3.TEX has the value 2 in its last set of 4 bytes. Image count in file? I need more data points.
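To pull those fields apart consistently, something like this sketch does the job. It assumes the 16-byte layout described above (magic, a little-endian 32-bit 1, then two 4-byte words I keep as raw bytes); `parse_tex_header` is a name invented for the example.

```python
import struct

MAGIC = b"TEX\x00"


def parse_tex_header(data, offset=0):
    """Unpack one 16-byte header: magic, a 32-bit integer, two raw 4-byte words."""
    magic, one, word1, word2 = struct.unpack_from("<4sI4s4s", data, offset)
    if magic != MAGIC:
        raise ValueError(f"no magic string at offset {offset}")
    return one, word1, word2


# The compass.tex header, rebuilt from the values in the table.
compass = MAGIC + struct.pack("<I", 1) + bytes.fromhex("26000003" "88880000")
one, word1, word2 = parse_tex_header(compass)
print(one, word1.hex(), word2.hex())
```

Keeping the two words as bytes rather than integers makes it easier to compare individual byte positions, which is where the patterns seem to be.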

So next, I wrap my renderer in a loop that will just dump all of the textures to see which ones work, and spit out a CSV file with their header values for good measure. Seems files with 06000003 in the first header field work as expected. Not surprising considering I wrote the renderer to handle one of those. Helpfully, 06020003, 06040003, and 06050003 also work fine. This suggests to me that anything with 06 in that first byte is a single-texture file. But what do those other bytes mean?

There’s only one with 06020003, and it appears to contain four frames of animation. Its second word is 88880000, same as the compass, something it shares with the following two types as well.

Also only one with 06040003, which is a gradient. The filename is nefire so I’m thinking it’s used for a fire effect, or possibly the “getting hot” glow things being hit by lasers acquire.

Then there are three with 06050003, which are a lava texture and two energy shield effects. This suggests to me that the 04 bit in the second byte means that the texture emits light.

It also seems that 16000003 renders correctly, so x6 may be the initial byte that indicates a single image.

With that in mind, I change my script to try to read a multi-image file for all prefixes except those to see what happens. My computer chugs for several minutes on that one and eats up several gigabytes of memory. Considering the entire corpus of uncompressed texture files is 50MB… oops. I try again, but this time making sure that the sub-images have that TEX␀ sequence in place as a basic validity check. It spits out the result in a couple of seconds this time.

A lot more stuff is rendering, but some still isn’t. I notice that some things are reporting weird image dimensions, others aren’t finding the TEX␀ header where they expect it. I quickly realise that my logic for telling single-texture files from multi-texture ones is incomplete. But I do notice something else: all the single-texture files have 88880000 as their second word, all the others have a small integer. That’s pretty easy to check for.

Run it again, and there are still quite a lot of failures. Some are now reporting that they overran their buffers while trying to read in the image data. My rendering library caught that and correctly raised an exception. I was confused, though; the way I wrote it, the buffer should always be exactly the same size as the amount of data read.

I put some extra debug logging in to see what was going on there, and found that when it raised that particular error, it was always several mipmaps in and was trying to read to a 0x0 image. That meant the buffer had size 0, and I was calling the read function with a value of zero - which in Python land means “read the whole rest of the file”. So that explained that, but why was it trying to read too many maps?

I checked how many it was trying to read, and the number was 4 279 272 068. That is an absurdly huge number, pretty close to the largest possible 32-bit integer, which tells me it’s trying to read binary data as an integer. So I check the hexadecimal value: 84 82 10 FF. That’s an RGBA pixel; you can tell by the FF alpha mask in the last byte indicating full opacity. A dull shade of yellow, by the looks of the numbers.
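The same four bytes read both ways, as a quick sketch. It assumes the count field is a little-endian unsigned 32-bit integer, which fits how the 64s and the 7 are stored.

```python
raw = bytes.fromhex("848210FF")

# How the renderer misread it: as a little-endian mipmap count.
as_count = int.from_bytes(raw, "little")

# How it should have been read: as one RGBA pixel.
r, g, b, a = raw

print(as_count)
print(r, g, b, a)
```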

What this must mean is that not all textures have mipmaps, and those that don’t go straight into the pixel data rather than specifying a value of 1. It must be possible to tell which is which from the earlier headers. The second word is 88 88 00 00 when you get here, so that leaves the first one. I write a quick script to read off the first word of every header in every file and whether it’s followed by a sensible mipmap number, which I defined as 0 < sensible < 100.

The script spat out a useful result pretty much right away: only when the first byte is 06, 16, or 26 is there a mipmap count. So I change my renderer to only look for a mipmap count when it sees those bytes, and run again.
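Put together, the rules of thumb so far look something like this. It’s a sketch of the heuristics as I understand them at this point, not a format specification, and the function names are invented for the example.

```python
SINGLE_TEXTURE_WORD2 = bytes.fromhex("88880000")
MIPMAPPED_FIRST_BYTES = {0x06, 0x16, 0x26}


def is_single_texture(word2):
    """Headers that introduce image data have 88 88 00 00 as their second word."""
    return word2 == SINGLE_TEXTURE_WORD2


def has_mipmap_count(word1):
    """Only these first bytes were followed by a believable mipmap count."""
    return word1[0] in MIPMAPPED_FIRST_BYTES


def is_sensible_mip_count(n):
    """The sanity check used when surveying the files."""
    return 0 < n < 100


print(has_mipmap_count(bytes.fromhex("26000003")))  # compass.tex
print(is_sensible_mip_count(4279272068))            # the misread pixel
```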

This time, only one file failed. The error message: TreeShadows.tex: No magic string in sub header 0 b'\x02\x00\x00\x00TEX\x00'. Interesting. That’s offset by 4 bytes from where it’s supposed to be. For some reason, this file has an extra header field. The first word of the header is 20 00 00 C0. I’ve seen files with 20 in the first byte render fine, but this C0 in the fourth is new. That must be what indicates there’s an extra field. But what does it mean? And which one is the number of textures?

I check the values of those two fields: 4 and 2. Okay, easy enough. Now I have to count the number of TEX␀ headers in the file. I do this and… 8? Is it really just a case of multiplying them together? Might be categories and textures per category.
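Counting the headers doesn’t need anything clever. A sketch, with the caveat that pixel data could in principle contain the same four bytes by chance, so this is a rough count rather than a proper parse; the sample bytes are synthetic, not taken from a real file.

```python
MAGIC = b"TEX\x00"


def count_tex_headers(data):
    """Count non-overlapping occurrences of the magic string in the file."""
    return data.count(MAGIC)


# Synthetic stand-in: three 16-byte headers with filler bytes after each magic.
fake = (MAGIC + b"\xAB" * 12) * 3
print(count_tex_headers(fake))
```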

Add handling logic for that, run again. Success! I now have a complete texture library for the game.

My renderer’s up on my git repo as usual if you’re curious, but like I said, this has already been done by others. Now, you might be thinking that all this is just an excuse to avoid writing that documentation I know I really should, and you’d be right.

