The Realm Files - Vol 3 - The Realm Header

In the previous installment of The Realm Files, I discussed how a Realm database maintains two top nodes as a direct result of its copy-on-write architecture. To start decoding these two distinct nodes, we need a reliable anchor point, and that anchor is the file header. The file header occupies the first 24 bytes of a Realm database file and contains seven informational entries, as shown in the table below.

Note: All integer values in the header are stored in little-endian byte order.


Offset   Size (Bytes)   Description
0        8              Top Reference 0 – Offset to Top-Level Node
8        8              Top Reference 1 – Offset to Top-Level Node
16       4              Mnemonic (File Signature) – 54 2D 44 42 (T-DB)
20       1              File Format (Top Ref 0)
21       1              File Format (Top Ref 1)
22       1              Reserved Byte (Currently Not Used)
23       1              Flag – Determines which Root Node is currently active

The first 16 bytes of the Realm file header consist of two 8-byte integers known as Top Reference 0 and Top Reference 1. Each value holds the file offset of one of the two top-level nodes. These offsets are the starting points for walking each node's arrays.
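As a minimal sketch of reading these two values, the snippet below unpacks them with Python's struct module. It assumes the offsets are unsigned 64-bit integers; the function name read_top_refs and the sample bytes are hypothetical and hand-encoded for illustration, not copied from a real file.

```python
import struct
from typing import Tuple


def read_top_refs(header: bytes) -> Tuple[int, int]:
    """Return (top_ref_0, top_ref_1) from the first 16 bytes of a header."""
    if len(header) < 16:
        raise ValueError("need at least 16 bytes to read both top references")
    # "<QQ" = two little-endian 64-bit integers, at offsets 0 and 8.
    # Treating them as unsigned is an assumption of this sketch.
    return struct.unpack_from("<QQ", header, 0)


# Hand-encoded sample bytes (illustrative only).
sample = bytes.fromhex("903f030000000000" "183b030000000000")
top_ref_0, top_ref_1 = read_top_refs(sample)
print(top_ref_0, top_ref_1)  # 212880 211736
```

Note how the least significant byte (90, 18) comes first in each 8-byte group, which is what little-endian storage looks like in a hex viewer.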

At offset 16 is a 4-byte mnemonic, also known as the file signature. In current implementations, this value is 54 2D 44 42 in hexadecimal, which corresponds to the ASCII string "T-DB".
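A quick sanity check on the mnemonic might look like the sketch below. The helper name has_realm_mnemonic is hypothetical, and the check only confirms the four signature bytes; it does not validate anything else about the file.

```python
# The four signature bytes described above: 54 2D 44 42.
REALM_MNEMONIC = bytes.fromhex("542d4442")  # equal to b"T-DB"


def has_realm_mnemonic(header: bytes) -> bool:
    """Return True if bytes 16..19 of the header match the Realm mnemonic."""
    return header[16:20] == REALM_MNEMONIC


# Illustrative 24-byte header with zeroed top references and trailing bytes.
fake_header = bytes(16) + b"T-DB" + bytes(4)
print(has_realm_mnemonic(fake_header))  # True
print(has_realm_mnemonic(bytes(24)))    # False
```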

Offsets 20 and 21 store the file format version for each of the two top references. Offset 20 corresponds to the file format of Top Reference 0, while offset 21 corresponds to the file format of Top Reference 1. In current implementations, the file format value is 24.

Note: Typically, the file format for each distinct node will be the same. However, if the last transaction involved upgrading the database to a newer version, the version numbers may differ. One example of this scenario is opening an older-format Realm database in Realm Studio, where the tool upgrades the database to the latest format in order to preview its contents.
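To flag the mismatch scenario described in the note, a small sketch like the following compares the two file format bytes. The function names are hypothetical, and a difference is only a hint that an upgrade may have occurred in the last transaction, not proof of it.

```python
from typing import Tuple


def file_formats(header: bytes) -> Tuple[int, int]:
    """Return the file format bytes for Top Ref 0 (offset 20) and Top Ref 1 (offset 21)."""
    if len(header) < 22:
        raise ValueError("need at least 22 bytes to read both file formats")
    # Indexing a bytes object yields an int, so no unpacking is needed.
    return header[20], header[21]


def formats_differ(header: bytes) -> bool:
    """True when the two top references were written with different file formats."""
    fmt_0, fmt_1 = file_formats(header)
    return fmt_0 != fmt_1


# Illustrative headers: one consistent, one with made-up differing versions.
same = bytes(16) + b"T-DB" + bytes([24, 24, 0, 0])
mixed = bytes(16) + b"T-DB" + bytes([23, 24, 0, 0])
print(formats_differ(same), formats_differ(mixed))  # False True
```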

Offset 22 is a reserved byte which is currently not in use.

Offset 23 is a flag which determines which root node is currently active. To determine which root node is active, we need to look at the least significant bit (LSB) of the flag, which is the rightmost bit:

  • When the least significant bit is 0, Top Reference 0 is active
  • When the least significant bit is 1, Top Reference 1 is active
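The bit test above can be sketched in one line of Python. The function name active_top_ref_index is hypothetical; the sketch looks only at the least significant bit and deliberately ignores the other seven bits of the flag byte.

```python
def active_top_ref_index(flag: int) -> int:
    """Return 0 if Top Reference 0 is active, 1 if Top Reference 1 is active."""
    # Masking with 1 keeps only the least significant (rightmost) bit.
    return flag & 1


print(active_top_ref_index(0x00))  # 0
print(active_top_ref_index(0x01))  # 1
print(active_top_ref_index(0x02))  # 0  (the LSB is still 0)
```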

Walking Through an Example

As an example, the file header from the iOS Replika app's Realm database gives the following information:

  • Top Ref 0 is at offset 212880
  • Top Ref 1 is at offset 211736
  • The Mnemonic is 54 2D 44 42 (T-DB)
  • The file format is 24 for both Top Refs
  • The reserved byte is 0
  • The flag which determines the currently active node is hex 00. Since the least significant bit is 0, Top Reference 0 is the active node and represents the most recent state of the database.
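The walkthrough above can be reproduced with a short Python sketch. It is not taken from Realm's own code: the names HEADER_FORMAT, RealmHeader and parse_header are hypothetical, the top references are assumed to be unsigned, and the 24 bytes are rebuilt from the values listed above rather than read from the original Replika file.

```python
import struct
from typing import NamedTuple

# Little-endian: 2 x uint64, 4 raw bytes, 4 x uint8 (no padding with "<").
HEADER_FORMAT = "<QQ4sBBBB"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 24 bytes


class RealmHeader(NamedTuple):
    top_ref_0: int
    top_ref_1: int
    mnemonic: bytes
    file_format_0: int
    file_format_1: int
    reserved: int
    flag: int

    @property
    def active_top_ref(self) -> int:
        # LSB of the flag: 0 selects Top Reference 0, 1 selects Top Reference 1.
        return self.top_ref_1 if self.flag & 1 else self.top_ref_0


def parse_header(data: bytes) -> RealmHeader:
    """Decode the first 24 bytes of a Realm file into its seven entries."""
    if len(data) < HEADER_SIZE:
        raise ValueError("a Realm header is 24 bytes long")
    return RealmHeader(*struct.unpack_from(HEADER_FORMAT, data, 0))


# Rebuild the example header from the values in the list above.
example = struct.pack(HEADER_FORMAT, 212880, 211736, b"T-DB", 24, 24, 0, 0x00)
header = parse_header(example)
print(header.active_top_ref)  # 212880
```

To try this against a real database, the first 24 bytes could be read with open(path, "rb").read(24), where path is whatever file you are examining.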


Now that we have identified our starting points, we can begin decoding both the active node and the inactive node.

In the next post, we will dig into Realm arrays so we can decode the top-level structure. From there, we can start walking the node, identify the ObjectClasses (tables) from the schema, and decode the clusters to find the objects.



 
