ReVVVVVVerse Engineering

Who’s reverse-engineering game data files for fun again? That’s right, it’s me.

I decided to go digging into VVVVVV this time around. Turns out there’s not a whole lot to it. Everything is wrapped up in a file called data.zip, which is just a standard zip file.

Spritesheets are just PNGs. Sound effects are wave files. Levels are XML documents which have been given a .vvv extension for some reason.

Oh, but what’s this? The music’s got a custom archive format. I already have the PPPPPP album, but sure, let’s see if I can reverse this.

So I open up vvvvvvmusic.vvv in my hex editor and take a look. No header, it just opens with a filename: data/music/0levelcomplete.ogg. That seems promising. That’s followed by a longish string of 00 bytes, then four 01 bytes, then four bytes that look like a 32-bit integer, then four more 01 bytes. Then it jumps into another filename. Interesting.

So looking at the other filenames I see in this header, I can see that they’re all null-padded to 48 bytes. The 12 bytes after that are obviously some sort of metadata.

The four bytes in the middle are 1A 47 03 00, which definitely looks like a 32-bit little-endian integer. If it is, that’s probably a file size or offset within this archive file. This has a decimal value of 214810. I know that the path complete jingle is only about 8 seconds long, and 214810 bytes is a reasonable size for an Ogg Vorbis file that long.
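Here’s a quick sketch that reads those four bytes both ways round, so you can see which byte order produces 214810. The variable names are mine; only the four bytes come from the file.

```python
# The four bytes from the middle of the first metadata block.
raw = bytes.fromhex('1A 47 03 00')

# Read least significant byte first (little-endian)...
little = int.from_bytes(raw, 'little')
# ...and most significant byte first (big-endian), for comparison.
big = int.from_bytes(raw, 'big')

print(little)  # 214810
print(big)     # 440861440
```

Only the least-significant-byte-first reading gives a number that makes sense as the size of a short jingle.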

Checking the next file, data/music/1pushingonwards.ogg, I see it has a value in this place of 4935877. If that’s a file size, it’s a little under five megabytes, which again is a pretty reasonable size for an Ogg file of a few minutes in length.

All right. Let’s skip ahead a bit. I scroll down and see that there are 16 filenames in all, followed by some more blocks with completely null filenames, and metadata blocks set to 01010101 01010101 00010101. 01010101 seems to be some sort of placeholder “nothing here” value in this format. Interestingly, byte 08 has a value of 00 for these rather than the 01 the ones with actual filenames have, so I’m tentatively assigning this byte as a “file actually exists” boolean.
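Putting that together, one 60-byte slot can be picked apart with struct. This is only a sketch of my working guess at the layout: the sample slots are built by hand rather than read from the real file, and parse_slot and SLOT_FORMAT are names I made up, not anything official.

```python
from struct import calcsize, unpack

# Guessed layout: 48-byte name, 4 unknown bytes, a 32-bit little-endian
# value (probably the size), a 1-byte "exists" flag, 3 unknown bytes.
SLOT_FORMAT = '<48s4xi?3x'

def parse_slot(slot):
    """Return (name, size, exists) for one 60-byte header slot."""
    raw_name, size, exists = unpack(SLOT_FORMAT, slot)
    name = raw_name.rstrip(b'\0').decode('ascii')
    return name, size, exists

# A hand-built copy of the first slot as described above.
used = (b'data/music/0levelcomplete.ogg'.ljust(48, b'\0')
        + bytes.fromhex('01010101 1A470300 01010101'))
# An empty slot: null name, placeholder metadata, byte 08 set to 00.
empty = bytes(48) + bytes.fromhex('01010101 01010101 00010101')

print(calcsize(SLOT_FORMAT))  # 60
print(parse_slot(used))       # ('data/music/0levelcomplete.ogg', 214810, True)
print(parse_slot(empty))      # ('', 16843009, False)
```

The 16843009 in the empty slot is just the 01010101 placeholder read as a number, which is why the “exists” byte is the thing to check rather than the size.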

After these, there’s a big block of all 00 bytes starting at offset 14A0, for a grand total of 5280 bytes of filename header. Divide this by the 60 bytes (48 name bytes + 12 metadata) each file slot has, and we get 88. There is space for exactly 88 files in this header structure. Why 88? Who knows?
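The arithmetic is easy to double-check. A throwaway sketch, with constant names of my own choosing:

```python
HEADER_END = 0x14A0   # where the run of 00 bytes starts
SLOT_SIZE = 48 + 12   # name bytes plus metadata bytes

# divmod gives the slot count and any leftover bytes in one go.
slots, leftover = divmod(HEADER_END, SLOT_SIZE)

print(HEADER_END)       # 5280
print(slots, leftover)  # 88 0
```

No leftover bytes, so the 60-byte slot size fits the header exactly.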

The null block extends from offset 14A0 to 1E00, giving us a whole 2400 bytes of nothing. That’s exactly enough room for another 40 slots at 60 bytes each. Whatever format this is, I’m getting the impression it’s capable of a lot more than it’s being used for here.

Then finally, we get down to the actual file data. I can tell it’s the file data, because the first four bytes from 1E00 are 4F 67 67 53 - this is the magic number that appears at the start of Ogg container files, and encodes the letters OggS. Looks like the file data is uncompressed and unencrypted.
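Checking for that signature is nearly a one-liner. In this sketch looks_like_ogg is a helper name I made up, and the buffer is a stand-in built by hand instead of the real archive:

```python
OGG_MAGIC = bytes.fromhex('4F 67 67 53')
DATA_START = 0x1E00

def looks_like_ogg(blob, offset=0):
    """True if the four bytes at offset spell out the Ogg magic number."""
    return blob[offset:offset + 4] == OGG_MAGIC

print(OGG_MAGIC)  # b'OggS'

# Stand-in for the archive: a zero-filled header area, then the start of an Ogg file.
fake_archive = bytes(DATA_START) + b'OggS' + bytes(16)
print(looks_like_ogg(fake_archive, DATA_START))  # True
print(looks_like_ogg(fake_archive, 0))           # False
```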

So all that remains is to write myself a quick and dirty script to unpack the files for me:

from struct import unpack
from os.path import dirname
from os import makedirs


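NAME_LENGTH = 0x30       # filenames are null-padded to 48 bytes
HEADER_META_LENGTH = 12  # 4 unknown bytes, 4-byte size, 1-byte flag, 3 unknown bytes
DATA_START = 0x1E00      # the first OggS shows up here
FILE_SLOTS = 88 # Why 88? We may never know

names = list()
sizes = list()

with open('vvvvvvmusic.vvv', 'rb') as data:
    for _ in range(FILE_SLOTS):
        name = data.read(NAME_LENGTH)
        # '<' pins the byte order to little-endian and turns off alignment padding
        length, exists = unpack('<4xi?3x', data.read(HEADER_META_LENGTH))
        if not exists: continue
        name = name.decode().replace('\0','')
        names.append(name)
        sizes.append(length)

    data.seek(DATA_START)

    # Read each file's bytes one after another, in header order
    for name, size in zip(names, sizes):
        makedirs(dirname(name), exist_ok=True)
        with open(name, 'wb') as outfile:
            while size > 0:
                chunk = data.read(min(size, 0x80000))
                if not chunk: break  # archive ended before the header said it would
                size -= outfile.write(chunk)
        print(name)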

You know you’re dealing with a simple format when you can post the script inline like this and not have it kill the flow of the post.

After running it, cursing, fixing the obvious mistakes, and running it again, it spits out 16 .ogg files in the new data/music directory it’s just created for itself. Play the first one and… ahhh. Level complete.

This file wasn’t hard to unpack and really left more questions than answers. What are the first 4 bytes in the file metadata for? What’s the 2400-byte null block supposed to contain? Why exactly 88 files? The file format hints at being capable of more than it’s being made to do here. Do those extra bytes hint at a choice of compression schemes? Is this a common file format that I just don’t recognise? I’m not even sure how to begin answering those questions.

This was all pointless, of course. I already had the soundtrack album and didn’t need to crack the game files open to get the music. Except… what’s this? There’s a file called predestinedfatefinallevel.ogg that I didn’t expect. Didn’t the final level use a combination of ordinary Predestined Fate and Positive Force for its background music?

Turns out it’s a nice remix of Predestined Fate with a heavier synthy bassline. Nice. I get a bonus prize for doing all this after all!

Tags: Games, Reverse Engineering