The Realm Files - Vol 1 - Intro to RealmDB

In this new blog series, "The Realm Files", I’ll be digging into the physical structure of RealmDB. The goal is to give examiners a deeper understanding of how Realm actually works, so they can validate what their tools parse when Realm is supported and follow a clear methodology to extract the data when it is not.

Unlike other database formats such as SQLite or LevelDB, the file structure of RealmDB is not well documented in the public domain. While a few academic papers touch on specific applications that utilize Realm databases, and one mobile forensics book includes a chapter on the subject, there are still significant gaps. What is missing is detailed guidance on how to interpret the payload of an array and rebuild the database. This process is not as straightforward as it might first appear.

Below is the link to the Realm Forensics chapter in Mobile Forensics - The File Format Handbook, which offers a great introduction to RealmDB:

Realm Forensics Chapter: Mobile Forensics - The File Format Handbook

I will be publishing articles periodically, each one building on the last, with the ultimate goal of creating a complete resource on RealmDB forensics. Throughout this series, I will be using the RealmDB from the iOS AI companion app Replika as the primary example. Replika is an excellent case study because it relies heavily on RealmDB to store user interactions rather than using a traditional SQLite database. This makes it a practical and relatable dataset for demonstrating the structures, artifacts, and forensic techniques we’ll explore during this series.

To kick off this series, we first need to answer three key questions: What exactly is RealmDB, how can we identify it in the wild, and how can we look at the data stored inside it?

What is RealmDB?

Realm is an open-source object-oriented database optimized for mobile platforms (iOS and Android). It was originally developed by Realm Inc., with its official launch for iOS in 2014 and Android support added in 2015. In 2019, MongoDB acquired Realm and has continued to maintain and expand the platform.

Realm was designed as a replacement for SQLite by taking a different, more efficient approach to data storage. While many applications use Realm today, as of 2025, SQLite remains the more widely adopted database for application data.

As an object-oriented database, Realm stores data as native objects rather than rows in relational tables. Instead of schemas defined with columns and rows, Realm uses class definitions, where each instance represents a persisted object. This mirrors how developers already structure data in object-oriented programming, allowing properties to belong to classes and enabling complex relationships through object references and lists.

SQLite vs RealmDB: Conceptual Comparison

Although SQLite and RealmDB serve a similar purpose in storing and retrieving application data, they do so in very different ways. SQLite is a relational database that uses tables, columns, and rows, while RealmDB is object-oriented and organizes data into classes, properties, and object instances.

The table below highlights these differences:


Basic Explanation of an Object-Oriented Database

The concept of object-oriented databases, and object-oriented programming in general, can be difficult to wrap your head around. Let me try to simplify it.

Think of a relational database like a spreadsheet. Each sheet is a table, the columns are fields, and the rows are records. Everything is structured in grids. Now, think of an object-oriented database like a set of folders. Each folder represents a class, and inside are files that contain the properties of that object. Some folders can contain references to other folders, and some can hold lists of related files. Instead of forcing everything into rows and columns, the data is stored as objects. To make retrieval efficient, the database maintains index structures that map property values to the objects that contain them, allowing it to jump directly to the right folder without flipping through every file one by one.

The diagram below shows how these concepts align:

  • Table = Object Class

  • Columns = Properties

  • Rows/Records = Objects

Another important concept in object-oriented databases is the use of arrays and lists. Instead of relying on foreign keys to connect data, objects can directly reference other objects or maintain lists of them. This creates more natural relationships, since collections of related data are grouped inside the object itself. For example, a Contact object could contain a list of Message objects, and each Message might reference its associated content and metadata. Arrays and lists therefore act as built-in containers for relationships.
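To make the Contact and Message example concrete, here is a rough sketch in plain Python. It does not use any Realm SDK; the class layout, field names, and sample values are purely illustrative assumptions. It only shows how a list inside an object can stand in for a foreign-key relationship.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Message:
    text: str
    timestamp: str


@dataclass
class Contact:
    name: str
    # The relationship lives inside the object: a list of direct references
    # to Message objects, rather than a foreign key column in another table.
    messages: List[Message] = field(default_factory=list)


# Hypothetical sample data, for illustration only.
joe = Contact(name="Joe")
joe.messages.append(Message(text="Hello", timestamp="2025-01-01T10:00:00Z"))
joe.messages.append(Message(text="Are you there?", timestamp="2025-01-01T10:05:00Z"))

# Following the relationship is a plain attribute access, not a JOIN.
for message in joe.messages:
    print(joe.name, message.timestamp, message.text)
```

In a relational design the same data would need a messages table with a contact_id column pointing back at the contacts table.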

How are Objects Retrieved?

In an object-oriented database such as Realm, data is retrieved by querying an Object Class using one of its properties to locate a specific instance. For example, consider a Contact class with the properties contact_id, contact_fname, contact_lname, and contact_number.

If a query specifies contact_id = 1, Realm resolves this request using index numbers rather than scanning through every stored object. Once the object is located, Realm returns the complete record with all of its property values:

  • contact_fname = "Joe"
  • contact_lname = "Carpenter"
  • contact_number = "312-456-****"
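The lookup described above can be modeled with a toy index in Python. This is a sketch of the general idea only and assumes nothing about Realm's real on-disk index format; the dictionary-based index, the function name find_contact, and the second contact are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Contact:
    contact_id: int
    contact_fname: str
    contact_lname: str
    contact_number: str


# Stored objects. The second entry is made-up sample data.
contacts = [
    Contact(1, "Joe", "Carpenter", "312-456-****"),
    Contact(2, "Sample", "Person", "000-000-0000"),
]

# Toy index: property value -> position of the object that holds it.
# Realm's real index structures are more involved; this only shows the idea.
index_by_id = {c.contact_id: pos for pos, c in enumerate(contacts)}


def find_contact(contact_id):
    """Resolve a contact_id through the index instead of scanning the list."""
    pos = index_by_id.get(contact_id)
    return contacts[pos] if pos is not None else None


match = find_contact(1)
print(match.contact_fname, match.contact_lname, match.contact_number)
```

The point of the index is that the cost of the lookup does not grow with the number of stored contacts, whereas a scan would touch every object until it found a match.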


How can I identify a RealmDB?

A RealmDB is stored in a single .realm file. When the database is opened, Realm also creates a lock file to coordinate access; this file generally contains no application data and has limited forensic value beyond its creation timestamp, which can indicate access. Realm additionally creates a management folder upon the first connection. This folder may contain auxiliary state files, but they are typically empty.

The easiest way to identify a RealmDB file is by its .realm extension. To further confirm, you can check the file header for its unique signature. At offset 16 in the file, Realm stores the 4-byte hex value 54 2D 44 42, which translates to the ASCII string “T-DB”. This signature acts as a reliable indicator that the file is indeed a Realm database.
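The signature check can be sketched in a few lines of Python. This is a minimal illustration rather than a validated forensic tool; the function names is_realm_file and find_realm_files are my own, and the only facts the code relies on are the offset and byte values given above. Files are opened read-only in binary mode.

```python
from pathlib import Path

REALM_SIGNATURE = b"T-DB"   # hex 54 2D 44 42
SIGNATURE_OFFSET = 16       # byte offset of the signature in the file header


def is_realm_file(path):
    """Return True if the file carries the Realm signature at offset 16."""
    with open(path, "rb") as fh:
        fh.seek(SIGNATURE_OFFSET)
        return fh.read(len(REALM_SIGNATURE)) == REALM_SIGNATURE


def find_realm_files(root):
    """Yield every file under root that carries the signature, whatever its extension."""
    for candidate in Path(root).rglob("*"):
        if candidate.is_file() and is_realm_file(candidate):
            yield candidate
```

Checking the signature rather than the extension means a renamed or extension-less Realm file in an extraction is still picked up.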

The screenshot below is taken from the Imgur app and can be found in Josh Hickman's public iOS 17 dataset.

Location: /private/var/mobile/Containers/Data/Application/5F8AB4A9-5DC1-4F5C-8637-182C91001AB0/Documents

Bonus Knowledge: The Life360 iOS app also uses a RealmDB, which can likewise be found in the public iOS 17 dataset.


How do I look at the Database Contents?

In the open-source world, there is really only one option: Realm Studio. This GUI-based tool, developed and maintained by MongoDB, allows users to browse, edit, and query a Realm database, much as DB Browser for SQLite does for SQLite files. However, it is important to note that Realm Studio is not designed as a forensic tool. While it can display and query data, it does not preserve forensic integrity or provide validation of findings, so results must always be verified through independent methods. That verification is the challenge.

Realm Studio Link: https://github.com/realm/realm-studio


Unfortunately for the Python wizards out there, there is currently no Python module for directly working with RealmDB. The workaround is to use Realm Studio to export the database as a JSON file, which can then be examined with Python. The limitation is that this approach only provides access to the active data and does not allow for recovery or analysis of deleted data.
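Once Realm Studio has written the export, the JSON can be explored with the Python standard library. The sketch below assumes the export is a single JSON object whose keys are class names and whose values are lists of objects; that layout is an assumption, so check it against your own export before relying on it. The function name summarize_export is a placeholder of my own.

```python
import json


def summarize_export(json_path):
    """Return {class_name: object_count} for a JSON export.

    Assumes the top level is a dict of class name -> list of objects.
    """
    with open(json_path, "r", encoding="utf-8") as fh:
        data = json.load(fh)

    if not isinstance(data, dict):
        raise ValueError("Unexpected layout: top level is not a JSON object")

    summary = {}
    for class_name, objects in data.items():
        # Count only list-valued entries; anything else is reported as 0 objects.
        summary[class_name] = len(objects) if isinstance(objects, list) else 0
    return summary
```

Calling summarize_export on the exported file gives a quick per-class object count, which is a handy first comparison point against what a forensic tool reports for the same database.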

While there is no Python module for Realm, official Realm SDKs are available in several other languages that allow direct interaction with the database:

  • Swift SDK 
  • Kotlin SDK
  • Java SDK
  • C#/.NET SDK
  • Node.js SDK
  • React Native SDK

Like Realm Studio, these SDKs only allow access to active data and not deleted data. To recover deleted content, low-level parsing at the physical level is required.

On the commercial side, I cannot say with certainty which forensic tools currently support RealmDB or how reliable they are at extracting data. While some tools do claim support, every examiner should be aware of exactly which file formats their tools can process and, most importantly, validate the accuracy of those results.

Conclusion

In conclusion, RealmDB is a powerful and efficient database format that has seen some adoption across mobile applications, but its internals remain far less documented than relational formats like SQLite. Forensic examiners cannot rely solely on tools or GUI viewers, as these only expose active data and may not account for deleted records or deeper structural nuances.

The goal of this series, The Realm Files, is to bridge that gap by breaking down how RealmDB is structured physically and how its artifacts can be validated, extracted, and, where possible, reconstructed. By understanding how objects, arrays and lists are stored and retrieved, we build the foundation for accurate analysis that goes beyond surface-level parsing.

In the next articles, we will move deeper into the physical structure of RealmDB files, examining arrays, references, and how data is laid out at the byte level. This knowledge is essential not only for validation of tool output but also for the development of reliable methodologies to extract evidence when tools provide limited or no support.


