Agent skill

analyzing-malicious-pdf-with-peepdf

Perform static analysis of malicious PDF documents using peepdf, pdfid, and pdf-parser to extract embedded JavaScript, shellcode, and suspicious objects.


Install this agent skill in your project

npx add-skill https://github.com/mukul975/Anthropic-Cybersecurity-Skills/tree/main/skills/analyzing-malicious-pdf-with-peepdf

SKILL.md

Analyzing Malicious PDF with peepdf

When to Use

  • When triaging suspicious PDF attachments from phishing emails
  • During malware analysis of PDF-based exploit documents
  • When extracting embedded JavaScript, shellcode, or executables from PDFs
  • For forensic examination of weaponized document artifacts
  • When building detection signatures for PDF-based threats

Prerequisites

  • Python 3.8+ with peepdf-3 installed (pip install peepdf-3)
  • pdfid.py and pdf-parser.py from the Didier Stevens Suite
  • Isolated analysis environment (VM or sandbox)
  • Optional: PyV8 for JavaScript emulation within peepdf (peepdf-3 relies on the STPyV8 fork instead)
  • Optional: Pylibemu for shellcode analysis

Workflow

  1. Triage with pdfid: Scan the PDF for suspicious keywords (/JS, /JavaScript, /OpenAction, /Launch, /EmbeddedFile).
  2. Interactive Analysis: Open the PDF in peepdf interactive mode (peepdf -i) to explore the object structure.
  3. Identify Suspicious Objects: Locate objects that contain JavaScript, streams, or encoded data.
  4. Extract Content: Dump suspicious streams and decode their filters (FlateDecode, ASCIIHexDecode).
  5. Deobfuscate JavaScript: Analyze the extracted JS for shellcode, heap sprays, or exploit code.
  6. Check VirusTotal: Use peepdf's vtcheck command to cross-reference the file hash with AV detections.
  7. Generate IOCs: Extract URLs, domains, hashes, and shellcode signatures.
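The triage in step 1 can be approximated in a few lines of Python when pdfid is not at hand. The sketch below is not pdfid and does not use its code: it only counts the five keywords listed above in the raw bytes, after resolving #xx name escapes so that an obfuscated /J#53 counts as /JS. The function names are hypothetical, and keywords hidden inside compressed object streams will not be seen by a raw scan like this.

```python
import re

# Keywords named in the triage step; pdfid itself checks a longer list.
SUSPICIOUS_NAMES = ("/JS", "/JavaScript", "/OpenAction", "/Launch", "/EmbeddedFile")

# A PDF name is "/" followed by characters that are neither whitespace nor delimiters.
_NAME_RE = re.compile(rb"/[^\x00\t\n\x0c\r ()<>\[\]{}/%]+")
_HEX_ESCAPE_RE = re.compile(rb"#([0-9A-Fa-f]{2})")


def normalize_name(raw: bytes) -> str:
    """Resolve #xx escapes so that /J#53 is treated as /JS."""
    decoded = _HEX_ESCAPE_RE.sub(lambda m: bytes([int(m.group(1), 16)]), raw)
    return decoded.decode("latin-1")


def count_suspicious_names(pdf_bytes: bytes) -> dict:
    """Count occurrences of each suspicious name in the raw file bytes."""
    counts = dict.fromkeys(SUSPICIOUS_NAMES, 0)
    for match in _NAME_RE.finditer(pdf_bytes):
        name = normalize_name(match.group(0))
        if name in counts:
            counts[name] += 1
    return counts
```

Any non-zero count is a reason to inspect the referenced objects in peepdf or pdf-parser.py, not a verdict on its own; many benign PDFs contain /OpenAction or /JavaScript.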

Key Concepts

| Concept | Description |
|---|---|
| /OpenAction | Action executed automatically when the PDF is opened |
| /JavaScript, /JS | JavaScript code embedded in PDF objects |
| /Launch | Action that launches an external application |
| /EmbeddedFile | File embedded within the PDF structure |
| FlateDecode | Standard zlib/deflate compression filter; legitimate, but often used to obscure content |
| Object Streams | PDF objects stored inside compressed streams (/ObjStm), which hides them from plain keyword scans |
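To make the filter concepts concrete, here is a minimal sketch of decoding a stream by hand once its raw bytes and /Filter list are known. It uses only the Python standard library, handles just FlateDecode and ASCIIHexDecode, and ignores /DecodeParms predictors; decode_stream and the other names are illustrative, not part of peepdf or pdf-parser.py.

```python
import binascii
import zlib
from typing import Sequence


def ascii_hex_decode(data: bytes) -> bytes:
    """Decode /ASCIIHexDecode: hex digits, whitespace ignored, '>' ends the data."""
    hex_part = data.split(b">", 1)[0]
    digits = b"".join(hex_part.split())
    if len(digits) % 2:
        digits += b"0"  # an odd final digit is treated as if followed by 0
    return binascii.unhexlify(digits)


def flate_decode(data: bytes) -> bytes:
    """Decode /FlateDecode (zlib). Predictors are not applied in this sketch."""
    return zlib.decompress(data)


DECODERS = {
    "/FlateDecode": flate_decode,
    "/ASCIIHexDecode": ascii_hex_decode,
}


def decode_stream(data: bytes, filters: Sequence[str]) -> bytes:
    """Apply filters in the order they appear in the stream's /Filter array."""
    for name in filters:
        decoder = DECODERS.get(name)
        if decoder is None:
            raise ValueError(f"unsupported filter: {name}")
        data = decoder(data)
    return data
```

Malicious samples frequently chain several filters on one stream, which is why the function takes a list and applies the decoders in order.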

Tools & Systems

| Tool | Purpose |
|---|---|
| peepdf / peepdf-3 | Interactive PDF analysis with JS emulation |
| pdfid.py | Quick triage scanning for suspicious keywords |
| pdf-parser.py | Deep object-level PDF parsing |
| VirusTotal | Hash lookup and AV detection cross-reference |
| CyberChef | Decode and transform extracted payloads |

Output Format

Analysis Report: PDF-MAL-[DATE]-[SEQ]
File: [filename.pdf]
SHA-256: [hash]
Suspicious Keywords: [/JS, /OpenAction, etc.]
Objects with JavaScript: [Object IDs]
Extracted URLs: [List]
Shellcode Detected: [Yes/No]
Embedded Files: [Count and types]
VirusTotal Detections: [X/Y engines]
Risk Level: [Critical/High/Medium/Low]
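One way to fill in this template consistently is to collect the findings in a small data structure and render it. The sketch below is an assumption about how that could look, not part of the skill: PdfReport, sha256_hex, and extract_urls are hypothetical names, the URL pattern is deliberately simple, and the risk level is left for the analyst to set.

```python
import hashlib
import re
from dataclasses import dataclass, field
from typing import List

# Simple pattern for illustration; it stops at quotes, angle brackets, and ")".
_URL_RE = re.compile(r"https?://[^\s\"'<>)]+", re.IGNORECASE)


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the file contents as lowercase hex."""
    return hashlib.sha256(data).hexdigest()


def extract_urls(text: str) -> List[str]:
    """Return unique URLs found in text, in first-seen order."""
    return list(dict.fromkeys(_URL_RE.findall(text)))


@dataclass
class PdfReport:
    report_id: str
    filename: str
    sha256: str
    keywords: List[str] = field(default_factory=list)
    js_objects: List[int] = field(default_factory=list)
    urls: List[str] = field(default_factory=list)
    shellcode: bool = False
    embedded_files: str = "0"
    vt_detections: str = "not checked"
    risk: str = "Low"

    def render(self) -> str:
        """Render the fields in the order of the report template."""
        def join(items) -> str:
            return ", ".join(str(item) for item in items) or "None"

        lines = [
            f"Analysis Report: {self.report_id}",
            f"File: {self.filename}",
            f"SHA-256: {self.sha256}",
            f"Suspicious Keywords: {join(self.keywords)}",
            f"Objects with JavaScript: {join(self.js_objects)}",
            f"Extracted URLs: {join(self.urls)}",
            f"Shellcode Detected: {'Yes' if self.shellcode else 'No'}",
            f"Embedded Files: {self.embedded_files}",
            f"VirusTotal Detections: {self.vt_detections}",
            f"Risk Level: {self.risk}",
        ]
        return "\n".join(lines)
```

In use, extract_urls would be run over the deobfuscated JavaScript from step 5, and the resulting list passed as urls when constructing the report.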
