A Forensic Look at the Grok Android App

In this post, I’ll be taking a closer look at some of the core artifacts associated with the Grok application on Android devices. Grok is an AI assistant developed by xAI and integrated into the X platform, designed to provide real-time, interactive AI capabilities. Users can interact with Grok using text, voice, or images to generate responses ranging from explanations and summaries to written content, code assistance, and AI-generated images or video. 

From a forensic perspective, these interactions can provide valuable insight into user intent, content creation activity, and AI-assisted communications.

The on-device application artifacts are notably limited compared to other AI applications I have looked at, as Grok operates primarily as a cloud-centric AI service. During testing and research, I was able to identify user account information, artifacts associated with the built-in ExoPlayer framework, and remnants of content generation prompts.

What I have not been able to identify thus far are complete conversation histories and AI-generated images, suggesting that a significant portion of Grok user activity is processed and retained server-side rather than locally on the device.

Grok application data is stored within the standard Android app sandbox at: /data/data/ai.x.grok/

User Account Information

The user account information used to access Grok can be found in multiple XML files in the shared_prefs folder.

INTERCOM_DEDUPER_PREFS.xml

This XML file contains:

  • User Account Name
  • Email Address
  • X Username

INTERCOM_SDK_USER_PREFS.xml

  • Email Address
  • User ID

ExoPlayer Artifacts

ExoPlayer is an open-source media playback library for Android applications. Grok uses this library to render audio and video content within the app, which can lead to the presence of locally cached media files and other artifacts, even though the content is generated or streamed from Grok's cloud-based service.

exoplayer_internal.db

The first artifact of interest is the exoplayer_internal.db SQLite database that gets created by the ExoPlayer Library and is used to track media playback information. 

exoplayer_internal.db location: /data/data/ai.x.grok/databases/

The primary tables of interest within the database are:
  • ExoPlayerCacheFileMetadata<instance_uuid>
  • ExoPlayerCacheIndex<instance_uuid>

The instance_uuid represents a unique identifier for an ExoPlayer cache instance. Each cache instance has its own ExoPlayerCacheFileMetadata and ExoPlayerCacheIndex table, which store metadata and indexing information related to cached media. The ExoPlayerVersions table is used to track the different cache instances created by ExoPlayer.

Note: The feature column contains values of 1 or 2, which appear to represent the different ExoPlayer cache components (FileMetadata and Index); however, I have been unable to determine which value maps to which component.

ExoPlayerCacheFileMetadata<instance_uuid> Tables

The ExoPlayerCacheFileMetadata tables contain the metadata for media that was played in the application:

  • name: The filename for the cached media file
  • length: The size of the cached video in bytes
  • last_touch_timestamp: The last time ExoPlayer interacted with the cached file, stored as a Unix millisecond timestamp. This does not necessarily represent the media generation time!

You can use a simple SQL query to extract and convert the information.
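A minimal sketch of such a query in Python, using the standard sqlite3 module. Note that the real table name carries the instance UUID suffix (ExoPlayerCacheFileMetadata<instance_uuid>), so the generic name used here would need adjusting per database:

```python
import sqlite3

# NOTE: the real table name is suffixed with the instance UUID;
# a generic name is used here for illustration.
QUERY = """
SELECT name,
       length,
       datetime(last_touch_timestamp / 1000, 'unixepoch') AS last_touch_utc
FROM   ExoPlayerCacheFileMetadata
ORDER  BY last_touch_timestamp
"""

def cache_file_metadata(con: sqlite3.Connection):
    """Return (name, length, last_touch_utc) rows from the metadata table."""
    return con.execute(QUERY).fetchall()
```

Dividing the millisecond timestamp by 1000 before passing it to datetime() with the 'unixepoch' modifier yields a human-readable UTC timestamp.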




The cached filename can be broken into five parts, using the period as a separator:
  1. Cache ID
  2. Segment Index (if the media file is multi-part)
  3. Timestamp when the cache file was created or last written to (Unix milliseconds)
  4. Cache Format Version
  5. File Extension
<id>.<segment>.<timestamp>.v<version>.exo

Using the example of: 3.0.1763060312736.v3.exo

Cache ID: 3
Segment Index: 0
Timestamp: 1763060312736 (2025-11-13 18:58:32 UTC)
Cache Format Version: Version 3
File Extension: exo

This information can be extracted directly from the filename field using SQL through a combination of the CAST, SUBSTR, and INSTR string functions to parse the individual components.
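A sketch of that nested SUBSTR/INSTR approach, peeling one period-separated component off at a time (again using a generic table name in place of the UUID-suffixed one):

```python
import sqlite3

# Peel the period-separated filename components one at a time.
# The real table name is suffixed with the instance UUID.
QUERY = """
WITH p1 AS (
  SELECT name,
         CAST(SUBSTR(name, 1, INSTR(name, '.') - 1) AS INTEGER) AS cache_id,
         SUBSTR(name, INSTR(name, '.') + 1) AS rest
  FROM   ExoPlayerCacheFileMetadata
), p2 AS (
  SELECT name, cache_id,
         CAST(SUBSTR(rest, 1, INSTR(rest, '.') - 1) AS INTEGER) AS segment_index,
         SUBSTR(rest, INSTR(rest, '.') + 1) AS rest
  FROM   p1
)
SELECT name, cache_id, segment_index,
       CAST(SUBSTR(rest, 1, INSTR(rest, '.') - 1) AS INTEGER) AS created_ms,
       datetime(CAST(SUBSTR(rest, 1, INSTR(rest, '.') - 1) AS INTEGER) / 1000,
                'unixepoch') AS created_utc
FROM   p2
"""

def parse_cache_filenames(con: sqlite3.Connection):
    return con.execute(QUERY).fetchall()
```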


ExoPlayerCacheIndex<instance_uuid> Tables

The ExoPlayerCacheIndex tables contain additional information about the media:
  • id: The Cache ID, a unique identifier
  • key: The source URL of the media on Grok's servers
  • metadata: A BLOB containing additional metadata

The source URL can be used to distinguish between public content and user generated content.

Public content will have a source URL that starts with: https://imagine-public.x.ai/imagine-public/share-videos/

User generated content starts with: https://assets.grok.com/users/

When the content is user generated, the source URL includes the User ID for the Grok account that generated the content. For example, in the URL https://assets.grok.com/users/527d9b95-3eec-4f1f-962b-0cfaef416965/generated/ab6cf18c-8fda-41f6-a34e-d1d00bc03346/generated_video.mp4, the User ID is 527d9b95-3eec-4f1f-962b-0cfaef416965. This can help differentiate activity in a scenario where multiple Grok accounts are used on the same device.

This User ID corresponds to the User ID found in the INTERCOM_SDK_USER_PREFS.xml file, which can be used to gather further information about the account.

You can use a SQL query to extract the information:
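For example, a sketch that classifies each row by its URL prefix and pulls the 36-character User ID out of user-generated URLs (generic table name, as before):

```python
import sqlite3

QUERY = """
SELECT id,
       key,
       CASE
         WHEN key LIKE 'https://imagine-public.x.ai/%'   THEN 'public'
         WHEN key LIKE 'https://assets.grok.com/users/%' THEN 'user generated'
         ELSE 'unknown'
       END AS origin,
       CASE
         WHEN key LIKE 'https://assets.grok.com/users/%'
         THEN SUBSTR(key, LENGTH('https://assets.grok.com/users/') + 1, 36)
       END AS user_id
FROM   ExoPlayerCacheIndex
ORDER  BY id
"""

def classify_cache_index(con: sqlite3.Connection):
    return con.execute(QUERY).fetchall()
```

The SUBSTR offset assumes the User ID is a standard 36-character UUID immediately following the /users/ path segment, as in the example URL above.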



What is not included in this output is the metadata column. Unlike XML or JSON blobs that can be read directly, the metadata field is a binary blob. 


In my test dataset, each metadata BLOB had a consistent length of 25 bytes. The only discernible plain-text string embedded within these blobs was exo_len, which appears to represent the length of the cached media resource. 

I haven't been able to identify the exact format, but I have been able to identify the structure and the meaning of the values:

Note: Integer Values are Big Endian
If we interpret the information we get:
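The 25-byte layout is consistent with Java's DataOutputStream serialization: a 4-byte big-endian entry count, then per entry a UTF-8 name with a 2-byte length prefix, a 4-byte value length, and the raw value bytes, with exo_len stored as an 8-byte big-endian integer (4 + 2 + 7 + 4 + 8 = 25). A parser written under that assumption:

```python
import struct

def parse_metadata_blob(blob: bytes) -> dict:
    """Parse an ExoPlayerCacheIndex metadata BLOB.

    ASSUMPTION: Java DataOutputStream layout -- int32 entry count, then
    per entry a name (uint16 length prefix + UTF-8 bytes), an int32
    value length, and the value bytes. All integers are big endian.
    An 8-byte value (exo_len) is decoded as a big-endian integer.
    """
    entries, pos = {}, 0
    count = struct.unpack_from('>i', blob, pos)[0]; pos += 4
    for _ in range(count):
        name_len = struct.unpack_from('>H', blob, pos)[0]; pos += 2
        name = blob[pos:pos + name_len].decode('utf-8'); pos += name_len
        value_len = struct.unpack_from('>i', blob, pos)[0]; pos += 4
        value = blob[pos:pos + value_len]; pos += value_len
        if value_len == 8:                      # exo_len: 8-byte integer
            value = struct.unpack('>q', value)[0]
        entries[name] = value
    return entries
```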

Joining the Tables

Unfortunately, joining the information from the two tables is not as simple as a standard join, but it is possible.

The relationship between these two tables is the Cache ID. In the ExoPlayerCacheIndex table, the Cache ID is stored as its own field (id). In the ExoPlayerCacheFileMetadata table, the Cache ID is embedded as the first component of the cached filename (for example, 3.0.1763060312736.v3.exo, where 3 is the Cache ID). To join the tables, the Cache ID must be extracted from the beginning of the filename and cast to an integer, then matched to the ExoPlayerCacheIndex.id value.
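A sketch of that join, extracting the leading Cache ID from the filename on the fly (generic table names, as before):

```python
import sqlite3

QUERY = """
SELECT m.name,
       m.length,
       datetime(m.last_touch_timestamp / 1000, 'unixepoch') AS last_touch_utc,
       i.key AS source_url
FROM   ExoPlayerCacheFileMetadata AS m
JOIN   ExoPlayerCacheIndex AS i
  ON   i.id = CAST(SUBSTR(m.name, 1, INSTR(m.name, '.') - 1) AS INTEGER)
"""

def join_cache_tables(con: sqlite3.Connection):
    return con.execute(QUERY).fetchall()
```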


*Screenshot below does not display all the columns


Cached Media Files

The cached media files associated with entries in the exoplayer_internal database can be found in the application cache.

At the root of the cache folder is a folder with the same name as the User ID. If multiple Grok accounts are used, there will be multiple folders of interest.

Media cache file location: /data/data/ai.x.grok/cache/<userid>

Note: In my research and testing I did not test how long cached files persist for and what the impacts are of clearing the cache.

Within the User ID folder is a video-cache directory containing numbered subfolders (0–9), which, based on my research into ExoPlayer, is a standard cache structure used to distribute cached files for performance reasons.

Inside the numbered folders will be a series of .exo files. These are ExoPlayer's on-device cache files used to store media data that has been streamed or generated within Grok.

The filenames can be used to match the media with its corresponding information found in the exoplayer_internal database. This will help in identifying content that the user generated vs public media that is displayed in the app.


Forensic tools should be able to play the .exo files; if they can't, the media can be exported and reviewed in VLC or the standard Windows Media Player.

Content Generation Prompts

Using the ExoPlayer artifacts, we can potentially identify information about media a user has generated and distinguish them from public media. However, while these artifacts can demonstrate that media was created or accessed, they do not provide sufficient context to determine user intent. To establish intent, the original prompt or input used by the user to generate the media would be required. The only artifact on the local device that I have been able to find that stores anything related to the prompt used to generate content is the androidx.work.workdb SQLite database.

androidx.work.workdb location: /data/data/ai.x.grok/no_backup/

The androidx.work.workdb database is used by Android’s WorkManager framework to track scheduled and executed background tasks. In the context of Grok, when a user initiates video generation, a background task is executed that coordinates communication with the Grok cloud services.

One of the challenges when examining this SQLite database is that records associated with background tasks are periodically cleaned up once the task has completed. However, the database uses write-ahead logging (WAL), which means there is potential to recover historical task data from WAL frames that have not yet been checkpointed back into the main database file.

The main table of interest in this database is the workspec table. While it contains many fields associated with task scheduling and execution, I am going to focus on the fields that are most relevant to establishing when a content generation task was initiated and what the user prompt was.

The following fields are of interest:
  • id: Unique identifier for each task
  • state: The state of the task
    • 0 = Enqueued
    • 1 = Running
    • 2 = Succeeded
    • 3 = Failed
    • 4 = Blocked
    • 5 = Cancelled
  • worker_class_name: The worker class name
    • For video generation, the worker class name will be ai.x.grok.image.work.VideoGenerationWorker
  • last_enqueue_time: The time the task was added to WorkManager’s queue, stored as Unix epoch milliseconds. This timestamp is effectively synonymous with when the user submitted a prompt to generate content.
  • input: A binary BLOB containing instructions and parameters for the worker
  • output: A binary BLOB containing the output returned by the worker after the task succeeds
When extracting records at the physical level from both the main database file and the write-ahead log (WAL), it is possible to identify multiple versions of the same task as it progresses through the WorkManager workflow. Tasks typically begin in an Enqueued (0) state when the request is received, transition to running (1) while the task is executing, and move to succeeded (2) once the task has completed successfully.
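The fields above can be pulled with a query like this sketch, decoding the state value and converting the enqueue timestamp:

```python
import sqlite3

QUERY = """
SELECT id,
       CASE state
         WHEN 0 THEN 'Enqueued'  WHEN 1 THEN 'Running'
         WHEN 2 THEN 'Succeeded' WHEN 3 THEN 'Failed'
         WHEN 4 THEN 'Blocked'   WHEN 5 THEN 'Cancelled'
       END AS state,
       datetime(last_enqueue_time / 1000, 'unixepoch') AS enqueued_utc
FROM   workspec
WHERE  worker_class_name = 'ai.x.grok.image.work.VideoGenerationWorker'
"""

def video_generation_tasks(con: sqlite3.Connection):
    return con.execute(QUERY).fetchall()
```

Remember that completed records are cleaned up; running the same query against records recovered from the WAL can surface the earlier Enqueued and Running versions of each task.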

Input BLOB
The input BLOB can be broken into two parts:
  1. Input Data: A JSON object containing the user prompt and the type of prompt
  2. User ID: The user ID of the account that generated the prompt

In my testing, the embedded JSON object starts 12 bytes into the input BLOB and can be carved by locating the closing delimiter } and extracting from the first { through that point. After the JSON terminator is the text 'input data', followed by the User ID. The ID can be carved by looking for the $ marker (0x24) and then parsing the next 36 bytes. This helps determine which user account generated the content.
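A sketch of that carving logic in Python. It assumes the last } in the BLOB closes the JSON body (i.e., nothing after the JSON contains a closing brace); the test data below reuses the example User ID from earlier:

```python
import json

def carve_input_blob(blob: bytes):
    """Carve the JSON payload and trailing User ID out of an input BLOB.

    Assumes the layout described above: JSON from the first '{' to the
    last '}', then a '$' (0x24) marker followed by the 36-byte User ID.
    """
    start = blob.index(b'{')
    end = blob.rindex(b'}') + 1
    payload = json.loads(blob[start:end])
    marker = blob.index(b'$', end)
    user_id = blob[marker + 1:marker + 37].decode('ascii')
    return payload, user_id
```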



Parsing the JSON after it has been carved from the BLOB is fairly simple.

The Input Array contains the instructions for the task. Depending on the type of content generation, the JSON structure can vary.

Video Generation from Text
  • imageURL: The URL of the image supplied to the generation request
  • postId: The Video ID
  • prompt: The text entered by the user
  • type: Content generation input type

Video Generation from Image
  • imageURL: The URL of the image supplied to the generation request
  • mode
    • displayName: Display Name for the Grok Mode
    • modeName: Grok Mode
    • type: The class name for the generation mode
  • userAssetId: Further research required
  • type: Content generation input type


Below is an example of the results of parsing the components from the input blob and extracting the JSON fields:


Output BLOB
The output field contains the output data once the task has completed. In my testing, the BLOB contains an embedded JSON object that starts 12 bytes in and can be carved by locating the closing delimiter } and extracting from the first { through that point. After the JSON terminator is the text 'output_state'.

Note: The output BLOB will only contain an embedded JSON object if the task state is Succeeded.
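The same carving approach used for the input BLOB works here (a sketch, with the same assumption that the last } closes the JSON body):

```python
import json

def carve_output_blob(blob: bytes):
    """Carve the embedded JSON out of an output BLOB (Succeeded tasks only)."""
    start = blob.index(b'{')
    end = blob.rindex(b'}') + 1
    return json.loads(blob[start:end])
```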


Once the JSON is extracted it can be easily parsed:
  • audioUrls: Further research required
  • customPrompt: The original user prompt
  • hdVideoUrl: Further research required
  • imageURL: The URL of the image generated for the request
  • mode: The content generation mode
    • Text - Generated from Text
    • Normal - Generated from Image
  • modelName: The generation model used
  • uuid: Unique identifier
  • videoGenType: The video generation type
  • videoId: Unique ID assigned to the generated video
  • videoUrl: The URL of the video generated for the request

Below is an example of the results of parsing the JSON from the output blob and extracting the fields:



Conclusion

Analysis of the Grok Android application demonstrates that meaningful user activity can be reconstructed by correlating user account artifacts, ExoPlayer artifacts, and Android WorkManager task records.

User account information, including email address, username, and internal user IDs, can be identified from the application INTERCOM preference files. These identifiers provide critical attribution, particularly in scenarios where multiple accounts may have been used on the same device.

The androidx.work.workdb database offers the strongest evidence of user intent. The WorkManager records document background tasks responsible for content generation, including video generation requests. When the database is examined at the physical level, the records can reveal the embedded input and output data. The input data contains user prompts or source images, while the output data includes generated content identifiers and source URLs.

Once generated content is rendered or played within the application, ExoPlayer artifacts provide evidence of local interaction with that media. The exoplayer_internal.db records the original source URLs, cache identifiers, file sizes, and last interaction timestamps, and these records can be directly correlated to WorkManager output URLs. The corresponding .exo cache files further substantiate that the generated media was accessed or prepared for playback on the device.

Unfortunately, in my research thus far I wasn't able to find conversation history, but there is still a lot more research and testing to do.

In late November (2025) I did contribute scripts to the ALEAPP project that will parse the User Account information and some of the ExoPlayer artifacts including associating the Database Records with the cached video files. I plan to update those scripts in the coming months with my new research.
