How to: Extract Text from a Document

Important

The Universal Subscription or an additional Document Server Subscription is required to use this example in production code. Please refer to the DevExpress Subscription page for pricing information.

This tutorial describes how to extract the text of a PDF file at runtime using the PDF Document Processor.

To extract the text of a PDF file, do the following:

  1. Create a PdfDocumentProcessor instance.
  2. To open a PDF file, pass the file path, or a stream that contains the document data, to the PdfDocumentProcessor.LoadDocument method.
  3. After the document is loaded, read its plain text from the PdfDocumentProcessor.Text property.

The following code implements this functionality.

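using System;
using DevExpress.Pdf;

// Returns the plain text of the PDF file at filePath,
// or an empty string if the document cannot be loaded.
string ExtractTextFromPDF(string filePath) {
    string documentText = "";
    try {
        // Dispose of the processor as soon as the text has been read.
        using (PdfDocumentProcessor documentProcessor = new PdfDocumentProcessor()) {
            documentProcessor.LoadDocument(filePath);
            documentText = documentProcessor.Text;
        }
    }
    catch (Exception) {
        // Loading failed (for example, a missing file or invalid PDF data).
        // The error is deliberately ignored and an empty string is returned.
    }
    return documentText;
}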
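
Step 2 also allows a stream as the document source. The following is a minimal sketch of that variant, assuming the PdfDocumentProcessor.LoadDocument overload that accepts a System.IO.Stream. The PdfTextSketch class and its two methods are hypothetical names used only for illustration and are not part of the DevExpress API. Unlike the example above, this sketch lets exceptions propagate to the caller.

```csharp
using System.IO;
using DevExpress.Pdf;

public static class PdfTextSketch {
    // Hypothetical helper: reads the plain text of a PDF supplied as a stream.
    // Assumes the stream is readable and stays open until this method returns.
    public static string ExtractTextFromStream(Stream pdfStream) {
        using (PdfDocumentProcessor documentProcessor = new PdfDocumentProcessor()) {
            documentProcessor.LoadDocument(pdfStream);
            return documentProcessor.Text;
        }
    }

    // Hypothetical wrapper: opens the file read-only and delegates to the stream version.
    // The file stream is closed only after the processor has been disposed.
    public static string ExtractTextFromFile(string filePath) {
        using (FileStream stream = File.OpenRead(filePath)) {
            return ExtractTextFromStream(stream);
        }
    }
}
```

A stream-based helper can be useful when the PDF data comes from a database, an upload, or a network response rather than from a file on disk.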