MALDOC: A Modular Red-Teaming Platform for Document Processing AI Agents

MALDOC evaluates document-processing agents against document-layer attacks by combining multi-view extraction, risk-aware planning, controlled injection, and agentic propagation analysis.

Ashish Raj Shekhar* · Priyanuj Bordoloi* · Shiven Agarwal* · Yash Shah · Sandipan De · Vivek Gupta

Arizona State University · *Equal contribution

Preprint · 2026

Paper Project Page Demo Try It Video Code

Figure 1: System overview of MALDOC, covering multi-view extraction, risk-aware planning, injection mechanisms, and agentic propagation scoring.

Overview

Document-Layer Red-Teaming

Document-processing AI agents can be compromised by attacks that exploit discrepancies between rendered PDF content and machine-readable representations. MALDOC provides a modular pipeline to generate document-layer adversarial PDFs and measure downstream failures in agent workflows.

Key result: under the default planner, MALDOC achieves 86% ASR, with task degradation accounting for 72% of successes while preserving human-visible fidelity.

Pipeline

Multi-Stage Attack Generation

The system is organized into four attack stages plus propagation evaluation. Each stage emits structured artifacts for reproducibility and ablation studies.

Stage 1: Multi-view extraction (PyMuPDF byte parsing, Docling OCR, Mistral layout).
Stage 2: Risk-aware planning with cross-layer surface mapping and target binding.
Stage 3: Adversarial payload generation (strategy, mechanism, payload, scope).
Stage 4: Injection engine with Hidden Text Injection (HTI), Visual Overlay Injection (VOI), and Font-Glyph Remapping (FGR).
Propagation scoring: Tool Misfire, Task Degradation, and Resource Inflation.

MALDOC pipeline overview — Top-half view of the MALDOC pipeline: extraction, planning, and injection.

Dataset

Evaluation Setup

MALDOC evaluates on the Document Understanding Dataset (DuDE) with finance, healthcare, and education documents. Agents are implemented using LangGraph to simulate realistic workflows.

6 domain-specific agents with 10 functional tools.
Task types: QA, key-field extraction, and summarization.
Attacks tested under HTI, FGR, and VOI mechanisms.

Agentic propagation scoring overview — Bottom-half view highlighting domain-specific agents and propagation metrics.

Demo

End-to-End Walkthrough

The interactive demo allows users to upload PDFs, configure attack settings, and compare clean versus adversarial execution traces. This supports reproducible evaluation across models.

Demo: creation view — Figure 2: End-to-end demo walkthrough (cropped into four panels).

Demo: stage timeline view — Figure 2: End-to-end demo walkthrough (cropped into four panels).

Evaluation

Propagation Metrics & Stealth

Attack success is defined by any propagation signal relative to clean baselines. MALDOC reports an aggregated 86% attack success rate, with task degradation accounting for 72% of successes. ASR decomposes into QA-only (21.5%), workflow-only (41.0%), and QA+workflow (33.5%). Workflow deviations dominate: 74.5% of successful attacks involve Tool Misfire or state drift.

Stealth is enforced with SSIM-based visual invariance (SSIM = 1.0). In a human spot-check of 30 document pairs, annotators reported no visible differences in 97% of cases.

Results

Quantitative Highlights

Use the grid below for key tables or plots from the paper, such as mechanism-specific QA-F1 degradation, Tool Misfire rates, and detection performance.

Table 1: Semantic edit strategy to injection channel mapping

Table 2: Original-document (O-DOC) performance

Table 3: Adversarial-document (M-DOC) performance

Table 1: semantic edit strategies to injection channels. Table 2: O-DOC performance. Table 3: M-DOC performance. Table 4: planner-agnostic ASR. Table 5: stealthiness evaluation.

Citation

BibTeX

@misc{maldoc2026,
  title   = {MALDOC: A Modular Red-Teaming Platform for Document Processing AI Agents},
  author  = {Shekhar, Ashish Raj and Bordoloi, Priyanuj and Agarwal, Shiven and Shah, Yash and De, Sandipan and Gupta, Vivek},
  year    = {2026},
  note    = {Preprint},
  url     = {https://github.com/shekharashishraj/MalDoc}
}