Yaz0 compression
================

Version 1.01


Yaz0 compression is reportedly used in quite a few Nintendo datafiles. I have seen it in SuperMario Sunshine's .szs files for example, and I heard that it is used in Windwaker and Majoras Mask as well.

The first 16 bytes of a Yaz0-compressed data block are the data header. The first 4 bytes of the header are 'Y', 'a', 'z', '0', so you can easily see in your hex editor that there's a Yaz0 block waiting for you :-) The second 4 bytes are a single uint32 (big-endian of course) that tells you the size of the decompressed data, so you know how large your working buffer has to be. The next 8 bytes are always zero.

Next comes the actual compressed data. Yaz0 is some kind of RLE compression (ShizZie says: "This is somewhat misleading. Run length encoding is only compression on a character-to-character basis, i.e. aaabbbbcccccc encodes to a2b3c5. Yaz0 compresses string-to-string, so it's more like LZMA."). You decode it as follows: First you read a "code" byte that tells you for the next 8 "read operations" what you have to do. Each bit of the "code" byte represents one "read operation" (from left to right, that is, 0x80 first, 0x01 last). If the bit is 1, copy one byte from the input buffer to the output buffer. Easy. If the bit is 0, things are a little bit more complicated, RLE compressed data is ahead. You have to read the next two bytes to decide how long your run is and what you should write to your output buffer.

+---------+---------+
|  a | b  |    |    |
+---------+---------+
 byte1       byte2

 The upper nibble of the first byte (a) contains the information you need to determine how many bytes you're going to write to your output buffer for this "read operation". if a == 0, then you have to read a third byte from your input buffer, and add 0x12 to it. Otherwise, you simply add 2 to a. This is the number of bytes to write ("count") in this "read operation". byte2 and the lower nibble of byte1 (b) tell you from where to copy data to your output buffer: you move (dist = (b << 8)|byte2 + 1) bytes back in your outputBuffer and copy "count" bytes from there to the end of the buffer. Note that count could be greater than dist which means that the copy source and copy destination might overlap. Here's some sourcecode explaining what I am trying to say:


//src points to the yaz0 source data (to the "real" source data, not at the header!)
//dst points to a buffer uncompressedSize bytes large (you get uncompressedSize from
//the second 4 bytes in the Yaz0 header).
void decode(u8* src, u8* dst, int uncompressedSize)
{
  int srcPlace = 0, dstPlace = 0; //current read/write positions
  
  u32 validBitCount = 0; //number of valid bits left in "code" byte
  u8 currCodeByte;
  while(dstPlace < uncompressedSize)
  {
    //read new "code" byte if the current one is used up
    if(validBitCount == 0)
    {
      currCodeByte = src[srcPlace];
      ++srcPlace;
      validBitCount = 8;
    }
    
    if((currCodeByte & 0x80) != 0)
    {
      //straight copy
      dst[dstPlace] = src[srcPlace];
      dstPlace++;
      srcPlace++;
    }
    else
    {
      //RLE part
      u8 byte1 = src[srcPlace];
      u8 byte2 = src[srcPlace + 1];
      srcPlace += 2;
      
      u32 dist = ((byte1 & 0xF) << 8) | byte2;
      u32 copySource = dstPlace - (dist + 1);

      u32 numBytes = byte1 >> 4;
      if(numBytes == 0)
      {
        numBytes = src[srcPlace] + 0x12;
        srcPlace++;
      }
      else
        numBytes += 2;

      //copy run
      for(int i = 0; i < numBytes; ++i)
      {
        dst[dstPlace] = dst[copySource];
        copySource++;
        dstPlace++;
      }
    }
    
    //use next bit from "code" byte
    currCodeByte <<= 1;
    validBitCount-=1;    
  }
}

Thanks to _demo_ for his help.
Have the appropriate amount of fun with this information,
thakis (http://www.amnoid.de/gc/)

Changelog
=========

20050211: Initial release
20050712: Added comment about LZMA