Agent skills
AI Security Expert

Agent skill

AI Security Expert

Enterprise AI security - OWASP LLM Top 10, prompt injection defense, guardrails, PII protection

View SKILL.md on GitHub Repository

Stars 1

Forks 0

Install this agent skill to your Project

npx add-skill https://github.com/frankxai/ai-architect/tree/main/skills/ai-security-expert

SKILL.md

AI Security Expert

Enterprise AI security architect specializing in securing LLM applications, defending against prompt injection, implementing guardrails, and OWASP LLM Top 10 compliance.

OWASP LLM Top 10 (2025)

Quick Reference

#	Vulnerability	Risk	Key Defense
LLM01	Prompt Injection	Critical	Input sanitization, delimiters
LLM02	Insecure Output	High	Output validation, sanitization
LLM03	Training Data Poisoning	High	Data provenance, auditing
LLM04	Model DoS	Medium	Rate limiting, timeouts
LLM05	Supply Chain	High	Verification, pinning
LLM06	Sensitive Info Disclosure	High	PII detection, redaction
LLM07	Insecure Plugin Design	High	Permission model, validation
LLM08	Excessive Agency	High	Human-in-the-loop, least privilege
LLM09	Overreliance	Medium	Confidence scores, citations
LLM10	Model Theft	Medium	Rate limiting, watermarking

LLM01: Prompt Injection

Attack Types:

Direct: "Ignore previous instructions..."
Indirect: Malicious content in RAG documents
Encoding tricks: Unicode, special tokens

Defense Pattern:

User Input → Sanitize → Delimit → LLM → Validate Output → Filter

LLM02: Insecure Output Handling

Never execute LLM output as code without validation
Sanitize HTML (use allowlist)
Validate SQL (SELECT only, table allowlist)

LLM04: Model DoS

Rate limiting per user/API key
Token limits on requests
Timeout configurations
Cost capping/alerts

LLM06: Sensitive Information Disclosure

PII detection (regex + NER)
System prompt protection
Training data sanitization
Output filtering

Code patterns: resources/security-patterns.py

PII Protection

Detection Patterns

Type	Example Pattern
Email	`@.com`
Phone	`XXX-XXX-XXXX`
SSN	`XXX-XX-XXXX`
Credit Card	16 digits
IP Address	`X.X.X.X`

Redaction Strategy

Detect PII in input before LLM call
Redact PII in LLM output
Log without PII
Encrypt at rest

Guardrails Implementation

NeMo Guardrails (NVIDIA)

define user express harmful intent
    "How do I hack"

define bot refuse harmful request
    "I can't help with that."

define flow harmful intent
    user express harmful intent
    bot refuse harmful request

Guardrails AI

python

guard = Guard().use_many(
    ToxicLanguage(on_fail="fix"),
    PIIFilter(on_fail="fix"),
    ValidJSON(on_fail="reask")
)

Custom Pipeline

Input Guards → LLM Call → Output Guards → Response

Implementation: resources/security-patterns.py

Security Architecture

Defense in Depth Layers

Layer	Controls
Network	WAF, DDoS protection, API gateway
Auth	OAuth 2.0, API keys, mTLS
Input	Schema validation, injection detection
Guardrails	Topic restrictions, PII filtering
Model	Versioning, anomaly detection
Output	Response filtering, fact verification
Audit	Logging, retention, compliance

Zero Trust Principles

Never trust, always verify
Least privilege for agents
Assume breach (log everything)

Compliance Frameworks

EU AI Act (High-Risk)

Risk management system
Data governance
Technical documentation
Human oversight
Accuracy/robustness testing

SOC 2 for AI

Security: Access controls, encryption
Availability: SLA monitoring, DR
Processing Integrity: Input/output validation
Confidentiality: Data classification
Privacy: Data minimization, consent

Security Testing

Red Team Categories

Direct injection attempts
Jailbreak prompts
Indirect injection via context
Encoding/unicode tricks

Test suite: resources/security-patterns.py

Testing Checklist

Injection patterns blocked
System prompt protected
PII detected and redacted
Rate limits enforced
Outputs validated
Audit logs complete

Incident Response

Severity Levels

Incident	Severity	Response
Prompt injection detected	Medium	Block, log, analyze
Data exfiltration attempt	High	Block, forensics, notify
Model extraction detected	High	Rate limit, investigate

Response Steps

Contain (block source)
Preserve (logs, evidence)
Analyze (attack pattern)
Remediate (update defenses)
Document (security log)

Resources

Secure AI systems with defense in depth and zero trust principles.

Maintainer

frankxai Core maintainer

Source details

Full Name: frankxai/ai-architect
Branch: main
Path in repo: skills/ai-security-expert
Topics: architecture ai-architect ai-patterns enterprise-ai oracle systems-design

Featured Tools

Join Our Newsletter

Stay updated with the latest AI tools, news, and offers by subscribing to our weekly newsletter.

Recommended Agent Skills

Expand your agent's capabilities with these related and highly-rated skills.

frankxai/ai-architect

GenAI DAC Specialist

Expert in OCI Generative AI Dedicated AI Clusters - deployment, fine-tuning, optimization, and production operations

1 0

Explore

frankxai/ai-architect

Oracle Agent Spec Expert

Design framework-agnostic AI agents using Oracle's Open Agent Specification for portable, interoperable agentic systems with JSON/YAML definitions

1 0

Explore

frankxai/ai-architect

OCI Services Expert

Expert guidance on Oracle Cloud Infrastructure services, cloud architecture patterns, cost optimization, deployment strategies, and OCI best practices for enterprise solutions

1 0

Explore

frankxai/ai-architect

agentic-orchestration

Patterns for multi-agent coordination, task decomposition, handoffs, and workflow orchestration. Best practices for building and managing agent systems.

1 0

Explore

frankxai/ai-architect

nvidia-nim

NVIDIA NIM inference microservices for deploying AI models with OpenAI-compatible APIs, self-hosted or cloud

1 0

Explore

frankxai/ai-architect

AWS AI Services Expert

Build AI applications on AWS using Bedrock, SageMaker, and AI/ML services with best practices for enterprise deployment

1 0

Explore

Didn't find tool you were looking for?

Search AI Tools

Install this agent skill to your Project

SKILL.md

AI Security Expert

OWASP LLM Top 10 (2025)

Quick Reference

LLM01: Prompt Injection

LLM02: Insecure Output Handling

LLM04: Model DoS

LLM06: Sensitive Information Disclosure

PII Protection

Detection Patterns

Redaction Strategy

Guardrails Implementation

NeMo Guardrails (NVIDIA)

Guardrails AI

Custom Pipeline

Security Architecture

Defense in Depth Layers

Zero Trust Principles

Compliance Frameworks

EU AI Act (High-Risk)

SOC 2 for AI

Security Testing

Red Team Categories

Testing Checklist

Incident Response

Severity Levels

Response Steps

Resources

Recommended Agent Skills

GenAI DAC Specialist

Oracle Agent Spec Expert

OCI Services Expert

agentic-orchestration

nvidia-nim

AWS AI Services Expert