Agent skill
document-docx
Create/edit .docx files with styles, tables, and templates. Use when asked to generate Word reports, contracts, proposals, or extract text.
Stars
50
Forks
11
Install this agent skill to your Project
npx add-skill https://github.com/vasilyu1983/AI-Agents-public/tree/main/frameworks/shared-skills/skills/document-docx
SKILL.md
Document DOCX Skill - Quick Reference
This skill enables creation, editing, and analysis of .docx files for reports, contracts, proposals, documentation, and template-driven outputs.
Modern best practices (2026):
- Prefer templates + styles over manual formatting.
- Treat
.docxas the editable source; treat PDF as a release artifact. - If distributing externally, include basic accessibility hygiene (headings, table headers, alt text).
Quick Reference
| Task | Tool/Library | Language | When to Use |
|---|---|---|---|
| Create DOCX | python-docx | Python | Reports, contracts, proposals |
| Create DOCX | docx | Node.js | Server-side document generation |
| Convert to HTML | mammoth.js | Node.js | Web display, content extraction |
| Parse DOCX | python-docx | Python | Extract text, tables, metadata |
| Template fill | docxtpl | Python | Mail merge, template-based generation |
| Review workflow | Word compare, comments/highlights | Any | Human review without OOXML surgery |
| Tracked changes | OOXML inspection, docx4j/OpenXML SDK/Aspose | Any | True redlines or parsing tracked changes |
Tool Selection
- Prefer
docxtplwhen non-developers must edit layout/design in Word. - Prefer
python-docxfor structural edits (paragraphs/tables/headers/footers) when formatting complexity is moderate. - Prefer
docx(Node.js) for server-side generation in TypeScript-heavy stacks. - Prefer
mammothfor text-first extraction or DOCX-to-HTML (best effort; may drop some layout fidelity).
Known Limits (Plan Around These)
.doc(legacy) is not supported by these libraries; convert to.docxfirst (e.g., LibreOffice).python-docxcannot reliably create true tracked changes; use Word compare or specialized OOXML tooling.- Tables of Contents and many fields are placeholders until opened/updated in Word.
Core Operations
Create Document (Python - python-docx)
python
from docx import Document
from docx.shared import Inches, Pt
from docx.enum.text import WD_ALIGN_PARAGRAPH
doc = Document()
# Title
title = doc.add_heading('Document Title', 0)
title.alignment = WD_ALIGN_PARAGRAPH.CENTER
# Paragraph with formatting
para = doc.add_paragraph()
run = para.add_run('Bold and ')
run.bold = True
run = para.add_run('italic text.')
run.italic = True
# Table
table = doc.add_table(rows=3, cols=3)
table.style = 'Table Grid'
for i, row in enumerate(table.rows):
for j, cell in enumerate(row.cells):
cell.text = f'Row {i+1}, Col {j+1}'
# Image
doc.add_picture('image.png', width=Inches(4))
# Save
doc.save('output.docx')
Create Document (Node.js - docx)
typescript
import { Document, Packer, Paragraph, TextRun, Table, TableRow, TableCell } from 'docx';
import * as fs from 'fs';
const doc = new Document({
sections: [{
properties: {},
children: [
new Paragraph({
children: [
new TextRun({ text: 'Bold text', bold: true }),
new TextRun({ text: ' and normal text.' }),
],
}),
new Table({
rows: [
new TableRow({
children: [
new TableCell({ children: [new Paragraph('Cell 1')] }),
new TableCell({ children: [new Paragraph('Cell 2')] }),
],
}),
],
}),
],
}],
});
Packer.toBuffer(doc).then((buffer) => {
fs.writeFileSync('output.docx', buffer);
});
Template-Based Generation (Python - docxtpl)
python
from docxtpl import DocxTemplate
doc = DocxTemplate('template.docx')
context = {
'company_name': 'Acme Corp',
'date': '2025-01-15',
'items': [
{'name': 'Widget A', 'price': 100},
{'name': 'Widget B', 'price': 200},
]
}
doc.render(context)
doc.save('filled_template.docx')
Extract Content (Python - python-docx)
python
from docx import Document
doc = Document('input.docx')
# Extract all text
full_text = []
for para in doc.paragraphs:
full_text.append(para.text)
# Extract tables
for table in doc.tables:
for row in table.rows:
row_data = [cell.text for cell in row.cells]
print(row_data)
Styling Reference
| Element | Python Method | Node.js Class |
|---|---|---|
| Heading 1 | add_heading(text, 1) |
HeadingLevel.HEADING_1 |
| Bold | run.bold = True |
TextRun({ bold: true }) |
| Italic | run.italic = True |
TextRun({ italics: true }) |
| Font size | run.font.size = Pt(12) |
TextRun({ size: 24 }) (half-points) |
| Alignment | WD_ALIGN_PARAGRAPH.CENTER |
AlignmentType.CENTER |
| Page break | doc.add_page_break() |
new PageBreak() |
Do / Avoid (Dec 2025)
Do
- Use consistent heading levels and a table of contents for long docs.
- Capture decisions and action items with owners and due dates.
- Store docs in a versioned, searchable system.
Avoid
- Manual formatting instead of styles (breaks consistency).
- Docs with no owner or review cadence (stale quickly).
- Copy/pasting without updating definitions and links.
Output Quality Checklist
- Structure: consistent heading hierarchy, styles, and (when needed) an auto-generated table of contents.
- Decisions: decisions/actions captured with owner + due date (not buried in prose).
- Versioning: doc ID + version + change summary; review cadence defined.
- Accessibility hygiene: headings/reading order are correct; table headers are marked; alt text for non-decorative images.
- Reuse: use
assets/doc-template-pack.mdfor decision logs and recurring doc types.
Optional: AI / Automation
Use only when explicitly requested and policy-compliant.
- Summarize meeting notes into decisions/actions; humans verify accuracy.
- Draft first-pass docs from outlines; do not invent facts or quotes.
Navigation
Resources
- references/docx-patterns.md - Advanced formatting, styles, headers/footers
- references/template-workflows.md - Mail merge, batch generation
- references/tracked-changes.md - Tracked changes: what is feasible, and what is not
- references/accessibility-compliance.md - WCAG 2.2 AA, reading order, alt text, EU EAA
- references/cross-platform-compatibility.md - Rendering across Word, Google Docs, LibreOffice
- references/document-automation-pipelines.md - CI/CD batch generation, quality gates
- data/sources.json - Library documentation links
Scripts
scripts/docx_inspect_ooxml.py- Dependency-free OOXML inspection (including tracked changes signals)scripts/docx_extract.py- Extract text/tables to JSON (requirespython-docx)scripts/docx_render_template.py- Render adocxtpltemplate (requiresdocxtpl)scripts/docx_to_html.mjs- Convert.docxto HTML (requiresmammoth)
Templates
- assets/report-template.md - Standard report structure
- assets/contract-template.md - Legal document structure
- assets/doc-template-pack.md - Decision log, meeting notes, changelog templates
Related Skills
- ../document-pdf/SKILL.md - PDF generation and conversion
- ../docs-codebase/SKILL.md - Technical writing patterns
Fact-Checking
- Use web search/web fetch to verify current external facts, versions, pricing, deadlines, regulations, or platform behavior before final answers.
- Prefer primary sources; report source links and dates for volatile information.
- If web access is unavailable, state the limitation and mark guidance as unverified.
Didn't find tool you were looking for?