RDP Cache Parser

Every pixel tells a story.
A Windows DFIR tool for parsing and reconstructing screen images from Remote Desktop Protocol (RDP) bitmap cache files (Cache####.bin, bcache##.bmc).



Features

| Category | Capability |
| --- | --- |
| Parsing | .bin (Win 7+, 32-bit BGRA) and .bmc (Vista/2008, 8/16/24-bit) |
| Reconstruction | Manual edge-match stitch canvas (authoritative) + automatic edge-matched triage hypothesis |
| GUI | Dark-theme tile viewer, drag-and-drop, manual stitch canvas |
| OCR | Optional text extraction from tiles + forensic keyword detection |
| Reporting | HTML + JSON forensic report with SHA-256 chain-of-custody |
| Export | Individual tiles or full collage (PNG/BMP) |
| Session | Save / load manual stitch canvas state |

Reconstruction — what it can and cannot do

RDP cache files store no screen coordinates. Each 64×64 tile is saved with only an 8-byte content hash (for deduplication); the screen x/y positions exist only in transient MemBlt drawing orders during the live session and are never written to disk. Fully-automatic, accurate screen reconstruction is therefore not achievable — it is, by consensus in the DFIR community, an analyst-assisted task.
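
To make the point about missing coordinates concrete, here is a minimal sketch of walking the tile records in a Cache####.bin file. The byte layout is an assumption based on publicly documented tooling, not taken from this project's parser: a 12-byte file header starting with `RDP8bmp\0`, then per tile an 8-byte key, a 16-bit width, a 16-bit height, and uncompressed top-down BGRA pixels. The names `Tile` and `iter_bin_tiles` are hypothetical.

```python
import struct
from typing import Iterator, NamedTuple

# Assumed layout (not confirmed by this README): 12-byte file header,
# then per tile: 8-byte key, uint16 width, uint16 height, BGRA pixels.
FILE_SIGNATURE = b"RDP8bmp\x00"
FILE_HEADER_SIZE = 12
TILE_HEADER = struct.Struct("<8sHH")


class Tile(NamedTuple):
    offset: int    # byte offset of the tile header within the file
    key: bytes     # 8-byte content hash used for deduplication
    width: int
    height: int
    pixels: bytes  # raw BGRA, top-down, uncompressed


def iter_bin_tiles(data: bytes) -> Iterator[Tile]:
    """Yield every complete tile record found in the given file bytes."""
    if not data.startswith(FILE_SIGNATURE):
        raise ValueError("missing RDP8bmp signature")
    pos = FILE_HEADER_SIZE
    while pos + TILE_HEADER.size <= len(data):
        key, width, height = TILE_HEADER.unpack_from(data, pos)
        size = width * height * 4  # 4 bytes per BGRA pixel
        start = pos + TILE_HEADER.size
        if size == 0 or start + size > len(data):
            break  # truncated record or padding: stop rather than guess
        yield Tile(pos, key, width, height, data[start:start + size])
        pos = start + size
```

Under this assumed layout the record holds a key, dimensions, and pixels, and nothing resembling an x/y position, which is why placement has to be inferred from pixel content.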

This tool reflects that reality:

  • The manual stitch canvas is the authoritative reconstruction path. It uses pixel edge-matching to rank candidate tiles for each cell, and every placement is the analyst's documented choice.
  • Automatic reconstruction is provided for triage only. It is a best-effort hypothesis: every output image carries an "INFERRED — NOT TIMESTAMPED" caveat banner and a confidence grade, and must never be presented as a literal screenshot.
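
As a rough illustration of ranking candidates by pixel edge-matching, the sketch below scores how well a candidate's left edge continues an anchor tile's right edge and sorts candidates by that score. It is not the tool's matcher (which uses NumPy); the cost function (mean absolute channel difference), the tile representation, and the names `right_left_cost` and `rank_right_neighbours` are all assumptions made for the example.

```python
from typing import Sequence

Pixel = tuple[int, int, int]          # (B, G, R); alpha is ignored here
TileRows = Sequence[Sequence[Pixel]]  # rows in top-down order


def right_left_cost(a: TileRows, b: TileRows) -> float:
    """Mean absolute channel difference between a's right edge and b's left edge.

    Assumes both tiles have the same height. Lower means a better match.
    """
    total = 0
    for row_a, row_b in zip(a, b):
        total += sum(abs(x - y) for x, y in zip(row_a[-1], row_b[0]))
    return total / (len(a) * 3)


def rank_right_neighbours(anchor: TileRows, candidates: Sequence[TileRows]) -> list[int]:
    """Return candidate indices ordered from best to worst right-hand neighbour."""
    return sorted(
        range(len(candidates)),
        key=lambda i: right_left_cost(anchor, candidates[i]),
    )
```

A ranking like this only proposes candidates; the analyst still decides which tile is placed in each cell.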

The automatic path works as follows:

  • Strip detection — consecutive tiles in .bin files often correspond to adjacent screen positions; tile order within a strip is verified by edge-matching
  • Temporal grouping — separates tiles from different screen states using file-index proximity (gap > 150 → new snapshot)
  • Edge-matched block placement — blocks are positioned by matching their border pixels to one another (A's bottom edge ↔ B's top edge); blocks with no confident edge match fall back to file-order estimation
  • Auto-resolution — detects actual screen width (1280 / 1366 / 1440 / 1920 / 2560 / 3840 px) from block widths; no manual configuration needed
  • OCR-driven slides — when OCR text is available, screen states are ordered into a numbered slide sequence with a slides_manifest.json index
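
The temporal grouping rule can be sketched in a few lines. This version assumes the gap is measured between consecutive tile indices after sorting, so a gap of exactly 150 stays in the same snapshot and 151 starts a new one; the function name `group_by_index_gap` is hypothetical.

```python
def group_by_index_gap(indices, max_gap=150):
    """Split tile file indices into snapshots wherever the gap exceeds max_gap."""
    groups = []
    for idx in sorted(indices):
        if groups and idx - groups[-1][-1] <= max_gap:
            groups[-1].append(idx)   # close enough: same screen state
        else:
            groups.append([idx])     # gap > max_gap: start a new snapshot
    return groups
```

For example, `group_by_index_gap([0, 10, 160, 311, 312])` keeps 0, 10, and 160 together and starts a second group at 311.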

OCR Scanning (optional)

Runs optical character recognition on every non-blank tile and reports:

  • All detected text with source file, tile index, and byte offset
  • IOC hits — tiles matching a built-in forensic keyword list (mimikatz, certutil, psexec, powershell, password, ntds.dit, and 40+ more)
  • Live keyword filter in the GUI for rapid triage

OCR requires the optional easyocr library (see Installation).
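
Keyword detection over OCR output could look roughly like the following. This is a sketch only: it uses just the six keywords named above (the built-in list is longer), and the shape of each OCR result (a dict with `file`, `tile_index`, `offset`, and `text`) is a hypothetical stand-in for whatever the tool uses internally.

```python
import re

# Only the keywords named in this README; the tool's real list is longer.
KEYWORDS = ("mimikatz", "certutil", "psexec", "powershell", "password", "ntds.dit")
_PATTERN = re.compile("|".join(re.escape(k) for k in KEYWORDS), re.IGNORECASE)


def find_ioc_hits(ocr_results):
    """Return the OCR results whose text contains at least one keyword.

    Each hit is a copy of the input dict with an added "keywords" list
    (lowercased, de-duplicated, sorted).
    """
    hits = []
    for result in ocr_results:
        matched = sorted({m.group(0).lower() for m in _PATTERN.finditer(result["text"])})
        if matched:
            hits.append({**result, "keywords": matched})
    return hits
```

Matching is case-insensitive and substring-based here, so OCR text such as "Invoke-Mimikatz" still registers as a hit.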

Forensic Report

The Generate Report toolbar button produces two files:

  • report.html — human-readable report with case metadata, source file hashes, parse statistics, OCR IOC hits, and reconstructed screen thumbnails
  • report.json — machine-readable structured output for SIEM / SOAR integration

Chain-of-custody fields included: source file path, size, SHA-256 hash, tool version and run timestamp, tile counts, OCR findings with exact byte offsets.
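
For readers wiring the JSON report into other tooling, a source-file custody entry might be built along these lines. The field names, the `custody_record` function, and the default version string are illustrative assumptions; the actual report.json schema is not documented here.

```python
import hashlib
from datetime import datetime, timezone
from pathlib import Path


def custody_record(path, tool_version="0.0.0"):
    """Hash a source file and return a dict of chain-of-custody fields.

    Field names are hypothetical, not the tool's real report schema.
    """
    source = Path(path)
    digest = hashlib.sha256()
    with source.open("rb") as handle:
        # Read in 1 MiB chunks so large cache files are not loaded whole.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return {
        "source_path": str(source.resolve()),
        "size_bytes": source.stat().st_size,
        "sha256": digest.hexdigest(),
        "tool_version": tool_version,
        "run_timestamp_utc": datetime.now(timezone.utc).isoformat(),
    }
```

The returned dict is plain data, so it can be passed straight to `json.dumps` when writing a report.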


Requirements

  • Python 3.10+
  • PySide6 >= 6.5.0
  • Pillow >= 9.0.0
  • NumPy >= 1.21.0
  • (Optional) easyocr >= 1.7.0 — required only for OCR scanning

Installation

# Core tool (no OCR)
pip install -r requirements.txt

# With OCR support
pip install easyocr

Or install as a package:

pip install .            # core only
pip install ".[ocr]"     # with OCR
pip install ".[dev]"     # with test suite

Usage

python main.py

  1. Drag-and-drop the Cache folder onto the window, or use Open File(s) / Open Folder / Auto-Detect
  2. The tool parses all files and immediately runs smart reconstruction
  3. Browse extracted tiles in the grid (zoom, filter blanks, click to inspect)
  4. Click Reconstruct Screen to open the manual stitch canvas
  5. Click OCR Scan to extract text from all tiles (easyocr required)
  6. Click Generate Report to export an HTML + JSON forensic report
  7. Use Export Tiles / Export Collage to save tile images

Reconstructed screens are written to smart_reconstruction/ next to the cache files automatically after each parse.


Supported File Types

| File | Windows version | Colour depth |
| --- | --- | --- |
| Cache0000.bin to Cache0005.bin | Windows 7+ | 32-bit BGRA |
| bcache2.bmc | Vista / Server 2008 | 8-bit indexed |
| bcache22.bmc | Vista / Server 2008 | 16-bit RGB565 |
| bcache24.bmc | Vista / Server 2008 | 24-bit or 32-bit |

Cache files are located at:

C:\Users\<username>\AppData\Local\Microsoft\Terminal Server Client\Cache
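
When collecting evidence from a mounted image or an exported folder, the cache files can be picked out by name. The sketch below assumes the naming patterns listed above (Cache followed by four digits with a .bin extension, and bcache followed by digits with a .bmc extension); `find_cache_files` is a hypothetical helper, not part of the tool.

```python
import re
from pathlib import Path

_BIN_NAME = re.compile(r"^Cache\d{4}\.bin$", re.IGNORECASE)
_BMC_NAME = re.compile(r"^bcache\d+\.bmc$", re.IGNORECASE)


def find_cache_files(cache_dir):
    """Return {"bin": [...], "bmc": [...]} with matching files, sorted by name."""
    found = {"bin": [], "bmc": []}
    for entry in sorted(Path(cache_dir).iterdir()):
        if not entry.is_file():
            continue
        if _BIN_NAME.match(entry.name):
            found["bin"].append(entry)
        elif _BMC_NAME.match(entry.name):
            found["bmc"].append(entry)
    return found
```

Pointing this at a copy of the Cache folder, rather than the live path, keeps the original evidence untouched.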

Important Limitation — EGFX / H.264 Sessions

RDP sessions using H.264/AVC encoding (the EGFX pipeline, enabled by RemoteFX or modern "Experience" quality settings) do not produce traditional bitmap cache tiles. Azure Virtual Desktop and many cloud-hosted desktops also disable bitmap caching.

An empty cache file (or zero tiles parsed) does not rule out RDP activity — it may simply mean the session used the EGFX pipeline.

Traditional bitmap caching is used by mstsc.exe with Legacy graphics settings (uncheck "Use hardware graphics acceleration" and lower visual quality settings).


How It Works

  1. Parse — reads binary file headers, tile headers, and BGRA pixel data. .bmc tiles use Windows DIB (bottom-up) ordering and optional RLE compression; .bin tiles are top-down, uncompressed.

  2. Detect strips — scans for runs of tiles with matching vertical right/bottom edges; full-width strips (n_cols ≈ screen width ÷ 64) anchor each row.

  3. Group temporally — tiles within 150 file-index positions of each other are considered one "screen snapshot"; larger gaps create new groups.

  4. Position blocks — two-pass placement: full-width blocks first, then partial-width blocks into remaining columns using edge-match scores.

  5. Render — each group is composited onto a virtual canvas; one PNG per group.

  6. OCR (optional) — tiles are upscaled 4× (64 → 256 px) before recognition to give easyocr enough pixels for reliable glyph detection.
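
The relationship n_cols ≈ screen width ÷ 64 from step 2 suggests one simple way a screen width could be inferred from strip lengths. The sketch below is an assumption about how such a lookup might work, not the tool's actual auto-resolution logic: it takes the longest strip, in tiles, and returns the listed width whose rounded-up column count matches. `infer_screen_width` is a hypothetical name.

```python
import math

TILE_SIZE = 64
# Candidate widths listed in this README.
STANDARD_WIDTHS = (1280, 1366, 1440, 1920, 2560, 3840)


def infer_screen_width(strip_lengths):
    """Guess the screen width in pixels from strip lengths measured in tiles.

    Returns None when there are no strips or no listed width fits.
    """
    if not strip_lengths:
        return None
    n_cols = max(strip_lengths)  # treat the longest strip as full-width
    for width in STANDARD_WIDTHS:
        # A partly filled last column still occupies one tile, so round up.
        if math.ceil(width / TILE_SIZE) == n_cols:
            return width
    return None
```

With these six widths the rounded-up column counts are 20, 22, 23, 30, 40, and 60, so each full-width strip length maps to at most one candidate.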


Running the Tests

pip install ".[dev]"
pytest

The test suite (155 passing, 1 skipped, no easyocr or GUI required) covers parsers, tile model, RLE decoder, edge matcher, cluster, smart reconstruct, OCR engine, OCR confidence filter, audit log, packaging contract, GUI pipeline helpers, run-state reset, sidecar JSON, and forensic report.

The skipped test only runs when easyocr is not installed (confirms the auto-install fallback path); it is automatically skipped once easyocr is present.


License

Apache License 2.0 — see LICENSE and NOTICE.

The tool is free to use for everyone, including enterprises. But under Section 4(d) of the Apache License, if you redistribute it — or integrate its code or its reconstruction / edge-matching / OCR logic into your own tool — you must keep the attribution from the NOTICE file:

RDP Cache Parser — Copyright (c) 2026 whitedevil1026 https://github.com/whitedevil1026/Rdp-Cache-parser

In short: use it freely, but credit the author. Apache 2.0 also grants you an explicit patent license.


Acknowledgments & Credits

This tool was built as original work. The following resources provided reference:

Third-Party Libraries

| Library | License | Usage |
| --- | --- | --- |
| PySide6 | LGPL v3 | GUI framework |
| Pillow | HPND | Image processing |
| NumPy | BSD 3-Clause | Vectorised edge matching |
| easyocr (optional) | Apache 2.0 | Tile text extraction |
