PdfDocumentProcessor.Text Property
Provides access to the PDF text.
Namespace: DevExpress.Pdf
Assembly: DevExpress.Docs.v19.1.dll
Declaration
Remarks
This tutorial describes how to extract the text of a PDF file at runtime using the PDF Document API.
To extract the text of a PDF file, do the following:
- Create a PdfDocumentProcessor.
- To open a PDF file, pass the file path, or a stream that contains the document data, to the PdfDocumentProcessor.LoadDocument method.
- After the document is loaded, read its plain text from the PdfDocumentProcessor.Text property.
The following code implements this functionality.
Note
A complete sample project is available at https://github.com/DevExpress-Examples/how-to-operate-a-pdf-content-at-runtime-e5025
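// Requires: using DevExpress.Pdf;
string ExtractTextFromPDF(string filePath) {
    string documentText = "";
    try {
        // The processor is disposable, so release it as soon as the text is read.
        using (PdfDocumentProcessor documentProcessor = new PdfDocumentProcessor()) {
            documentProcessor.LoadDocument(filePath);
            documentText = documentProcessor.Text;
        }
    }
    // Any failure (missing file, invalid PDF, and so on) is ignored here,
    // so the method returns an empty string. Log or rethrow in production code.
    catch { }
    return documentText;
}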
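The steps above mention loading a document from a stream, while the sample passes a file path. The following is a minimal sketch of the stream-based variant, assuming the LoadDocument overload that accepts a Stream. The PdfTextExample class and its two methods are hypothetical names used only for illustration and are not part of the DevExpress API.

```csharp
using System.IO;
using DevExpress.Pdf;

// Hypothetical helper class (illustration only, not part of the DevExpress API).
static class PdfTextExample {
    // Reads the plain text of a PDF document from a stream that holds the document data.
    public static string ExtractTextFromStream(Stream pdfStream) {
        using (PdfDocumentProcessor documentProcessor = new PdfDocumentProcessor()) {
            documentProcessor.LoadDocument(pdfStream);
            return documentProcessor.Text;
        }
    }

    // Opens the file as a read-only stream and keeps it open until the text has been read.
    public static string ExtractTextFromFile(string filePath) {
        using (FileStream stream = File.OpenRead(filePath)) {
            return ExtractTextFromStream(stream);
        }
    }
}
```

Unlike the sample above, this sketch does not catch exceptions, so a missing file or an invalid document surfaces to the caller instead of producing an empty string.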