New DevExpress PDF Document API: Generate and Edit Tagged PDF Documents
A tagged PDF document defines a structure tree, a logical representation of the element hierarchy (headings, paragraphs, tables, and so on). Assistive technologies, such as screen readers, use the structure tree to determine the reading order and the role of each element.
This article describes how to manage a structure tree in a new or existing PDF document:
- Create a Structure Tree for a New PDF Document
- Map Custom Structure Tags to Standard PDF Roles
- Edit the Structure Tree of an Existing PDF Document
- Remove Elements from a Structure Tree
Create a Structure Tree for a New PDF Document
The PdfDocument.StructureTree property returns the structure tree of a PDF document. Each node of the structure tree is a StructureElement object. Call the StructureTree.AddChildElement(StructureTypeDescriptor) method to add a child element to the structure tree. To nest elements, call AddChildElement on the StructureElement that should become the parent. The following structures expose the predefined type descriptors that you can pass to these methods:
- Pdf17StructureTypeDescriptor - Lists the structure types defined in the PDF 1.7 specification.
- Pdf20StructureTypeDescriptor - Lists the structure types defined in the PDF 2.0 specification.
The following code snippet creates a new tagged PDF document.
using DevExpress.Docs.Pdf;
using DevExpress.Drawing.Printing;
using System.Drawing;
using System.IO;

using (PdfDocument document = new PdfDocument())
{
    Page page = document.Pages.Add(DXPaperKind.A4);

    // Create the root Document element.
    StructureElement doc = document.StructureTree
        .AddChildElement(Pdf17StructureTypeDescriptor.Document);

    // Add a section to the document.
    StructureElement section =
        doc.AddChildElement(Pdf17StructureTypeDescriptor.Sect);

    // Add a heading.
    StructureElement heading =
        section.AddChildElement(Pdf17StructureTypeDescriptor.H1);
    heading.AddTextFragment(page, new TextFragment
    {
        Text = "Invoice",
        Location = new PointF(50, 800),
        Font = new TextFont("Arial", TextFontStyle.Bold),
        FontSize = 24
    });

    // Create a table with layout attributes.
    StructureElement table =
        section.AddChildElement(Pdf17StructureTypeDescriptor.Table);
    table.Attributes.Add(new LayoutAttribute
    {
        Placement = LayoutPlacement.Block,
        WritingMode = WritingMode.LeftToRight
    });
    table.Attributes.Add(
        new TableAttribute { Summary = "Invoice items" });

    StructureElement header =
        table.AddChildElement(Pdf17StructureTypeDescriptor.THead);
    StructureElement headerRow =
        header.AddChildElement(Pdf17StructureTypeDescriptor.TR);

    // Add table header cells.
    StructureElement th1 =
        headerRow.AddChildElement(Pdf17StructureTypeDescriptor.TH);
    th1.Attributes.Add(
        new TableAttribute { Scope = TableScope.Column });
    th1.AddTextFragment(page, new TextFragment
    {
        Text = "Item",
        Location = new PointF(50, 700),
        Font = new TextFont("Arial", TextFontStyle.Bold),
        FontSize = 12
    });

    StructureElement th2 =
        headerRow.AddChildElement(Pdf17StructureTypeDescriptor.TH);
    th2.Attributes.Add(
        new TableAttribute { Scope = TableScope.Column });
    th2.AddTextFragment(page, new TextFragment
    {
        Text = "Qty",
        Location = new PointF(250, 700),
        Bold = true
    });

    StructureElement th3 =
        headerRow.AddChildElement(Pdf17StructureTypeDescriptor.TH);
    th3.Attributes.Add(
        new TableAttribute { Scope = TableScope.Column });
    th3.AddTextFragment(page, new TextFragment
    {
        Text = "Price",
        Location = new PointF(320, 700),
        Bold = true
    });

    StructureElement th4 =
        headerRow.AddChildElement(Pdf17StructureTypeDescriptor.TH);
    th4.Attributes.Add(
        new TableAttribute { Scope = TableScope.Column });
    th4.AddTextFragment(page, new TextFragment
    {
        Text = "Total",
        Location = new PointF(400, 700),
        Bold = true
    });

    // Add table body with a data row.
    StructureElement body =
        table.AddChildElement(Pdf17StructureTypeDescriptor.TBody);
    StructureElement dataRow =
        body.AddChildElement(Pdf17StructureTypeDescriptor.TR);

    StructureElement td1 =
        dataRow.AddChildElement(Pdf17StructureTypeDescriptor.TD);
    td1.AddTextFragment(page, new TextFragment
    {
        Text = "Product A",
        Location = new PointF(50, 670)
    });

    StructureElement td2 =
        dataRow.AddChildElement(Pdf17StructureTypeDescriptor.TD);
    td2.AddTextFragment(page, new TextFragment
    {
        Text = "2",
        Location = new PointF(250, 670)
    });

    StructureElement td3 =
        dataRow.AddChildElement(Pdf17StructureTypeDescriptor.TD);
    td3.AddTextFragment(page, new TextFragment
    {
        Text = "$25.00",
        Location = new PointF(320, 670)
    });

    StructureElement td4 =
        dataRow.AddChildElement(Pdf17StructureTypeDescriptor.TD);
    td4.AddTextFragment(page, new TextFragment
    {
        Text = "$50.00",
        Location = new PointF(400, 670)
    });

    // Add a paragraph with the total amount.
    StructureElement pTotal =
        section.AddChildElement(Pdf17StructureTypeDescriptor.P);
    pTotal.AddTextFragment(page, new TextFragment
    {
        Text = "Total: $50.00",
        Location = new PointF(50, 620),
        Bold = true
    });

    // Save the tagged document.
    using (FileStream stream =
        File.Create("TaggedInvoice.pdf"))
    {
        document.Save(stream);
    }
}
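To picture the hierarchy that the snippet above builds, it can help to model the tags as a plain tree and print it as an outline. The following is a minimal, library-free sketch: TagNode and its members are hypothetical names introduced only for illustration and are not part of the DevExpress API.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Rebuild the invoice hierarchy with a hypothetical TagNode type
// (a stand-in for illustration, not a DevExpress class).
var doc = new TagNode("Document");
var sect = doc.Add("Sect");
sect.Add("H1");
var table = sect.Add("Table");
var headRow = table.Add("THead").Add("TR");
for (int i = 0; i < 4; i++) headRow.Add("TH");
var bodyRow = table.Add("TBody").Add("TR");
for (int i = 0; i < 4; i++) bodyRow.Add("TD");
sect.Add("P");

// Print one tag per line, indented by depth.
Console.Write(TagNode.Outline(doc));

sealed class TagNode
{
    public string Tag { get; }
    public List<TagNode> Children { get; } = new List<TagNode>();

    public TagNode(string tag) => Tag = tag;

    // Appends a child and returns it, so calls can be chained.
    public TagNode Add(string tag)
    {
        var child = new TagNode(tag);
        Children.Add(child);
        return child;
    }

    // Renders the subtree as an indented outline (two spaces per level).
    public static string Outline(TagNode node, int depth = 0)
    {
        var sb = new StringBuilder();
        sb.Append(' ', depth * 2).Append(node.Tag).Append('\n');
        foreach (var child in node.Children)
            sb.Append(Outline(child, depth + 1));
        return sb.ToString();
    }
}
```

The outline starts with Document, followed by Sect one level deeper, then H1, Table, and P as children of the section. A screen reader walks the real structure tree in a comparable top-down order.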
Map Custom Structure Tags to Standard PDF Roles
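The StructureTree.RoleMap property allows you to map custom structure tags to standard PDF roles. A role map lets PDF readers and assistive technologies interpret each custom tag as its standard equivalent, which helps maintain accessibility and PDF/UA compatibility without changing document layout or tagging conventions.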
The following code snippet maps custom structure tags to standard PDF roles in the structure tree of a new PDF document.
using DevExpress.Docs.Pdf;
using DevExpress.Drawing;
using DevExpress.Drawing.Printing;
using System.Drawing;
using System.IO;

using (PdfDocument document = new PdfDocument())
{
    Page page = document.Pages.Add(DXPaperKind.A4);

    // Define type descriptors for the custom tags.
    var customHeading = new Pdf17StructureTypeDescriptor("CompanyTitle");
    var customBody = new Pdf17StructureTypeDescriptor("ContentBlock");

    StructureTree structureTree = document.StructureTree;

    // Map custom tags to standard PDF roles.
    structureTree.RoleMap["CompanyTitle"] = "H1";
    structureTree.RoleMap["ContentBlock"] = "P";
    structureTree.RoleMap["Disclaimer"] = "Span";

    StructureElement doc =
        structureTree.AddChildElement(Pdf17StructureTypeDescriptor.Document);

    // Use the custom "CompanyTitle" tag mapped to H1.
    StructureElement heading = doc.AddChildElement(customHeading);
    heading.AddTextFragment(page, new TextFragment
    {
        Text = "Acme Corporation",
        Location = new PointF(50, 800),
        Font = new DXFont("Arial", DXFontStyle.Bold),
        FontSize = 24
    });

    // Use the custom "ContentBlock" tag mapped to P.
    StructureElement body = doc.AddChildElement(customBody);
    body.AddTextFragment(page, new TextFragment
    {
        Text = "Welcome to Acme Corporation's annual report.",
        Location = new PointF(50, 750)
    });

    using (FileStream stream = File.Create("TaggedWithRoleMap.pdf"))
    {
        document.Save(stream);
    }
}
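The snippet above only stores the mappings. To illustrate how a PDF consumer might use them, the following library-free sketch resolves a custom tag to a standard role with a plain dictionary. RoleMapSketch, TryResolve, and the short list of standard types are assumptions made for this example and do not describe DevExpress internals. The sketch also follows chains of custom tags (one custom tag mapped to another) and gives up on circular mappings, which is one plausible way to handle indirect entries.

```csharp
using System;
using System.Collections.Generic;

// The same mappings as in the article's snippet, held in a plain dictionary.
var roleMap = new Dictionary<string, string>
{
    ["CompanyTitle"] = "H1",
    ["ContentBlock"] = "P",
    ["Disclaimer"] = "Span"
};

if (RoleMapSketch.TryResolve(roleMap, "CompanyTitle", out string role))
    Console.WriteLine($"CompanyTitle -> {role}");

if (!RoleMapSketch.TryResolve(roleMap, "Sidebar", out string unused))
    Console.WriteLine("Sidebar has no standard role");

static class RoleMapSketch
{
    // A small, illustrative subset of the standard structure types.
    static readonly HashSet<string> Standard = new HashSet<string>
    {
        "Document", "Sect", "H1", "H2", "P", "Span", "Table", "TR", "TH", "TD"
    };

    // Follows role map entries until a standard type is reached.
    // Returns false if the tag is unmapped or the chain loops back on itself.
    public static bool TryResolve(
        IDictionary<string, string> roleMap, string tag, out string role)
    {
        var seen = new HashSet<string>();
        string current = tag;
        while (!Standard.Contains(current))
        {
            if (!seen.Add(current) || !roleMap.TryGetValue(current, out string next))
            {
                role = string.Empty;
                return false;
            }
            current = next;
        }
        role = current;
        return true;
    }
}
```

A tag that is already a standard type resolves to itself, so standard and custom tags can go through the same lookup.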
Edit the Structure Tree of an Existing PDF Document
Use the StructureTree.Elements collection to review and edit the structure tree of an existing PDF document. Call the StructureElementCollection.Find method to locate a specific element. You can modify element properties or add new child elements.
The following code snippet adds a new heading element to the structure tree of an existing PDF document:
using DevExpress.Docs.Pdf;
using System.Drawing;
using System.IO;

using (PdfDocument pdfDocument =
    new PdfDocument(File.OpenRead("Document.pdf")))
{
    // Get the structure tree.
    StructureTree structureTree =
        pdfDocument.StructureTree;

    // Find the first section element.
    StructureElement section =
        structureTree.Elements.Find(
            e => e is StructureElement se
                && se.Type == "Sect")
        as StructureElement;

    if (section != null)
    {
        // Add a new heading element to the section.
        StructureElement heading =
            section.AddChildElement(
                Pdf17StructureTypeDescriptor.H2);
        Page page = pdfDocument.Pages[0];
        heading.AddTextFragment(page, new TextFragment
        {
            Text = "New Heading",
            Location = new PointF(50, 700),
            Font =
                new DevExpress.Drawing.DXFont("Arial", 18),
            Bold = true
        });

        // Save changes to a new file.
        using (FileStream stream =
            File.Create("UpdatedDocument.pdf"))
        {
            pdfDocument.Save(stream);
        }
    }
}
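The snippet above looks up a Sect element through the tree's top-level collection, which suggests that Find also searches nested elements. The article does not state the search order, so the following library-free sketch simply assumes a depth-first, document-order search that returns the first match. Node and FindFirst are hypothetical names used for illustration only.

```csharp
using System;
using System.Collections.Generic;

// Build a small hypothetical tree: Document > (Sect > H1), Sect.
var root = new Node("Document");
var firstSect = root.Add("Sect");
firstSect.Add("H1");
root.Add("Sect");

// The first "Sect" in document order is returned; the second is never visited.
Node match = Node.FindFirst(new[] { root }, n => n.Tag == "Sect");
Console.WriteLine(ReferenceEquals(match, firstSect));

sealed class Node
{
    public string Tag { get; }
    public List<Node> Children { get; } = new List<Node>();

    public Node(string tag) => Tag = tag;

    public Node Add(string tag)
    {
        var child = new Node(tag);
        Children.Add(child);
        return child;
    }

    // Depth-first, pre-order search: test a node, then its descendants,
    // then its next sibling. Returns null when nothing matches.
    public static Node FindFirst(IEnumerable<Node> nodes, Func<Node, bool> predicate)
    {
        foreach (var node in nodes)
        {
            if (predicate(node)) return node;
            var nested = FindFirst(node.Children, predicate);
            if (nested != null) return nested;
        }
        return null;
    }
}
```

Because a search like this can return null, the null check in the article's snippet before calling AddChildElement matters for documents that contain no section.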
Remove Elements from a Structure Tree
Call StructureElementCollection.Remove or StructureElementCollection.RemoveAt to remove an element from the structure tree.
The following code snippet removes an element from the structure tree of an existing PDF document:
using DevExpress.Docs.Pdf;
using System;
using System.IO;

using (PdfDocument pdfDocument =
    new PdfDocument(File.OpenRead("Document.pdf")))
{
    StructureTree structureTree =
        pdfDocument.StructureTree;

    StructureItem target = structureTree.Elements
        .Find(e => e is StructureElement se
            && se.Descriptor.Value == "Private", out StructureElement parent);

    if (target != null)
    {
        // Remove the element from the structure tree.
        parent.Elements.Remove(target);
    }

    // Save the updated document.
    using (FileStream stream =
        File.Create("UpdatedDocument.pdf"))
    {
        pdfDocument.Save(stream);
    }
}
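The removal pattern above depends on knowing which collection holds the target. The following library-free sketch models a search that also reports the parent, and it handles the case where the match is a top-level node with no parent element. Whether the DevExpress Find overload reports a null parent in that case is not stated in this article, so treat that behavior as an assumption. Node and TreeOps are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

// Top-level nodes of a hypothetical tree (not a DevExpress type).
var topLevel = new List<Node>();
var doc = new Node("Document");
topLevel.Add(doc);
doc.Children.Add(new Node("Sect"));
doc.Children.Add(new Node("Private"));

Node target = TreeOps.Find(topLevel, n => n.Tag == "Private", out Node parent);
bool removed = TreeOps.Remove(topLevel, target, parent);
Console.WriteLine($"{removed}, children left: {doc.Children.Count}");

sealed class Node
{
    public string Tag { get; }
    public List<Node> Children { get; } = new List<Node>();
    public Node(string tag) => Tag = tag;
}

static class TreeOps
{
    // Depth-first search. 'parent' is null when the match is a top-level node
    // or when nothing matches.
    public static Node Find(List<Node> topLevel, Func<Node, bool> predicate, out Node parent)
        => Search(topLevel, null, predicate, out parent);

    static Node Search(List<Node> siblings, Node owner,
        Func<Node, bool> predicate, out Node parent)
    {
        foreach (var node in siblings)
        {
            if (predicate(node)) { parent = owner; return node; }
            var nested = Search(node.Children, node, predicate, out parent);
            if (nested != null) return nested;
        }
        parent = null;
        return null;
    }

    // Removes the target from its parent's children, or from the
    // top-level list when it has no parent.
    public static bool Remove(List<Node> topLevel, Node target, Node parent)
    {
        if (target == null) return false;
        var siblings = parent != null ? parent.Children : topLevel;
        return siblings.Remove(target);
    }
}
```

In this model the example prints "True, children left: 1": the Private node is removed from the Document node's children and the Sect node remains.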