PdfDocumentProcessor.GetText(PdfDocumentArea) Method
Retrieves the text found in the specified document area.
Namespace: DevExpress.Pdf
Assembly: DevExpress.Docs.v24.2.dll
NuGet Package: DevExpress.Document.Processor
#Declaration
public string GetText(
    PdfDocumentArea area
)
#Parameters
Name | Type | Description |
---|---|---|
area | PdfDocumentArea | A PdfDocumentArea object that specifies the document area from which to retrieve text. |
#Returns
Type | Description |
---|---|
String | The content retrieved from the specified area. |
#Remarks
The overloaded GetText method uses the page coordinate system. Refer to the following help topic for more details: Coordinate Systems.
This overload takes only the document area. To extract text without clipping the content to the crop box, call the GetText overload that also accepts a PdfTextExtractionOptions object.
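The code sample below retrieves text from a rectangular area on the first page of the document. The rectangle spans the full height of the crop box and runs horizontally from one third of its left edge to one third of its right edge.
using System;
using DevExpress.Pdf;

using (PdfDocumentProcessor processor = new PdfDocumentProcessor())
{
    processor.LoadDocument("TextExtraction.pdf");
    PdfPage page = processor.Document.Pages[0];

    // PdfRectangle arguments are (left, bottom, right, top) in page coordinates.
    // When the crop box starts at x = 0, this rectangle covers the left third of the page.
    PdfRectangle pdfRectangle = new PdfRectangle(page.CropBox.Left / 3, page.CropBox.Bottom, page.CropBox.Right / 3, page.CropBox.Top);

    // The area targets page number 1, the same page as Pages[0] above.
    PdfDocumentArea pageArea = new PdfDocumentArea(1, pdfRectangle);
    string pageText = processor.GetText(pageArea);
    Console.WriteLine(pageText);
}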
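The remark about the crop box can be illustrated with a short sketch. It assumes that the options-based overload has the form GetText(PdfDocumentArea, PdfTextExtractionOptions) and that PdfTextExtractionOptions exposes a ClipToCropBox property; neither detail is stated on this page, so verify both against the API reference for your installed version. The file name is reused from the sample above, and the media box is used here only as an example of an area that may extend beyond the crop box.

```csharp
using System;
using DevExpress.Pdf;

class Program
{
    static void Main()
    {
        using (PdfDocumentProcessor processor = new PdfDocumentProcessor())
        {
            processor.LoadDocument("TextExtraction.pdf");
            PdfPage page = processor.Document.Pages[0];

            // Use the media box as the target area; it can be larger than the crop box.
            PdfDocumentArea area = new PdfDocumentArea(1, page.MediaBox);

            // Assumption: ClipToCropBox controls whether extracted text is limited to the crop box.
            PdfTextExtractionOptions options = new PdfTextExtractionOptions
            {
                ClipToCropBox = false
            };

            // Assumption: this overload accepts the area followed by the options object.
            string text = processor.GetText(area, options);
            Console.WriteLine(text);
        }
    }
}
```

If the crop box and media box of the page coincide, this call should return the same text as the single-parameter overload for the same area.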