Blocks (400 + 300 pts)

Question:

I recovered as much data as I could. Can you recover the flag?

blocks-data

Write-Up:

Good to Know: There are several instances of this challenge, so various teams may have received different instances (with different flags). However, the solution is essentially the same.

  1. Using the command file blocks-data does note reveal anything useful about the file.

  2. Next, run the command strings blocks-data to see the string contents of the file:

    @
    tabledatadata
    CREATE TABLE "data" (
    `ID`
    INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT UNIQUE,
    `Data`
    BLOB NOT NULL,
    `Cat`
    INTEGER NOT NULL
    indexsqlite_autoindex_data_1data
    Ytablesqlite_sequencesqlite_sequence
    CREATE TABLE sqlite_sequence(name,seq)
    _tablecategorycategory
    CREATE TABLE `category` (
    `ID`
    INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT UNIQUE,
    `Cat`
    TEXT NOT NULL
    ...
    

    It seems like a DB, most probably a SQLite DB (due to its small size).

  3. The command hd data | head reveals that the file header is corrupt.

    Perhaps we can use a valid file header to repair it?

    The script repair-header.sh does exactly that: It generates an empty SQLite DB, and concatenates various prefixes of its header (of length 1 to 100), to data file.

    The script succeeds after a few trials, and generates data.sqlite.

  4. Use SQLite Manager or any SQLite tool of your choice to investigate the DB. You see that it has two tables: category and data.

  5. category contains 8 rows:

    ID  CAT
    ---------
    1   CHRM
    2   IDAT
    3   ICCP
    4   IHDR
    5   TEXT
    6   TIME
    7   PLTE
    8   TRNS
    

    Simply Google a few of these keywords to reveal that they refer to various chunks in PNG data structure.

  6. The data table has three columns: ID, Data, and Cat.

    Perhaps the data table contains the actual contents of the PNG file?

    Each row of the table seems to contain the data related to one chunk of the PNG file, whose type is specified by the Cat column.

  7. A minimal PNG file has the following structure:

    MAGIC BYTES     # = '89504e470d0a1a0a'
    IHDR            # Must appear first
    IDAT
    IEND            # Must appear last
    

    The data table, however, specifies multiple IDAT chunks, a PLTE chunk, a tRNS chunk, and no IEND chunk!

    We'll soon fix that, but let's first take a closer look at the chunk data structure.

  8. Each chunk must have the following structure:

    Length
    Type
    Data
    CRC
    

    A quick look at the data table reveals that none of the chunks have the above structure; they simply contain raw data.

    So, we write a function raw2chunk() to compute the remaining fields for each chunk (See blocks-solution.py).

  9. The contents of the IDAT chunk is scattered among multiple rows. That's typical of PNG files to have multiple IDAT chunks, as it aids in better streaming the image.

    The only problem is to determine the order of the IDAT chunks.

    A huge help is to know that the raw data of IDAT chunks is compressed using zlib.

    So, if we have IDAT chunks, we can try all combinations, and check which combination decompresses correctly.

    However, that strategy is time consuming, so we can come up with some clever tricks:

    • zlib header of PNG files (almost always) starts with the byte 0x78, which indicates deflate compression with a 32-KB window. Only one chunk satisfies this property, which is deemed as the first chunk.

    • All chunks have equal size, except one which is smaller. The smaller chunk is the last chunk.

    • Removing the first and last chunks, only chunks remain. In Python, zlib has a decompressobj() function, which returns an object capable of decompressing partial streams (or erring on invalid streams). We use this capability in a backtracking algorithm, which creates a search tree and prunes those subtrees with no valid child as early as possible. This way, we don't need to try all combinations.

    • Using zlib.decompress() function, we prune those combinations of length which are valid as partial streams but invalid as a whole stream.

    The above algorithm is implemented in find_correct_idat() function (See blocks-solution.py).

    It returns as soon as one valid combination is found. The calculations may take several minutes.

  10. We reconstruct all chunks using the raw2chunk() function. (For IEND, the data field is empty.)

    The chunks are then concatenated in arbitrary order, with the constraint that IHDR and IEND must be first and last, respectively.

    The MAGIC bytes are added, and the file is saved.

  11. As the file is flipped and rotated, the necessary transformations are applied to make it upright.

  12. The following text is written in the file:

SharifCTF{D2E857E07AD021CB52AE76DD9AD9B6DB}

Attachments: