How to Search and Highlight Text in PDF Documents
The ExpressPDFViewer Suite provides an API for performing document search operations both at the PDF Viewer control and internal document representation levels:
The control provides three overloaded TextSearch.Find method variants that allow you to search for a specific text string in one of two modes: find all matches in the entire document, or find only the next (or previous, depending on the passed search options) match at a time;
The PDF document object has four overloaded FindText method variants, three of which provide the same functionality as the control’s search methods. The additional (fourth) overloaded variant allows you to search for text matches on a specific document page.
Most of the search methods that the PDF Viewer control and the PDF Document object provide are functions returning a TdxPDFDocumentTextSearchResult value that stores the search result status and a text range corresponding to the text match’s actual position within the document (the text range is empty if the text search is unsuccessful).
You can highlight the returned text range using standard or custom colors by calling the control’s Highlights.Add procedure. A highlight is a colored rectangle with an outline, painted over a text range within a document. The PDF Viewer’s Find Panel uses the same highlighting system to mark the text matches it finds, using standard colors that depend on the current look & feel settings.
To highlight a text range with the standard fill and outline colors, pass the dxacDefault global constant as both the ABackColor and AFrameColor parameters of the Highlights.Add procedure. For custom highlight colors, pass the required TdxAlphaColor values instead. Since the highlight rectangles are painted over (that is, overlap) the document layer, the background fill color must be at least partially transparent so that the underlying text remains visible.
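As a minimal sketch of the standard-color case described above, the fragment below highlights the first match found on the first page using dxacDefault for both colors. It assumes a form with a dxPDFViewer1 control and reuses only the calls that appear elsewhere in this topic. The page index 0 and the search term are illustrative, and the sketch also assumes that dxacDefault is declared in the dxCoreGraphics unit.

```pascal
uses
  dxCoreGraphics;  // Declares TdxAlphaColor; assumed to declare dxacDefault as well
//...
var
  AResult: TdxPDFDocumentTextSearchResult;  // Stores the search result
begin
  if not dxPDFViewer1.IsDocumentLoaded then Exit;  // Do nothing without a loaded document
  // Searches for a match starting from the first page (index 0 is an illustrative choice)
  AResult := dxPDFViewer1.Document.FindText('Dx', TdxPDFDocumentTextSearchOptions.Default, 0);
  // Highlights the match only if it was found and is located on the requested page
  if (AResult.Status = tssFound) and (AResult.Range.PageIndex = 0) then
  begin
    // dxacDefault for both parameters selects the standard fill and outline colors
    dxPDFViewer1.Highlights.Add(AResult.Range, dxacDefault, dxacDefault);
    dxPDFViewer1.Highlights.Visible := bTrue;  // Makes the highlight visible
  end;
end;
```

With dxacDefault, the highlight should match the colors that the Find Panel uses for its own matches, so custom TdxAlphaColor values are needed only when the highlight must stand out from the built-in search results.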
The following code example searches for a specific term (“Dx”) on the third document page and highlights the matches using custom colors:
uses
  dxCoreGraphics; // Required for working with TdxAlphaColor values
//...
var
  ABackColor, AFrameColor: TdxAlphaColor; // The two colors used to highlight the text matches
  APageIndex: Integer; // Stores the target document page index
  AResult: TdxPDFDocumentTextSearchResult; // Stores the current search result
  AStop: Boolean; // The flag indicating if the search operation must be stopped
begin
  if not dxPDFViewer1.IsDocumentLoaded then Exit; // Do nothing if the PDF Viewer has no loaded document
  APageIndex := 2; // The zero-based index of the page (here, the third page) on which the search and highlight operations are performed
  ABackColor := dxMakeAlphaColor(100, 200, 100, 100); // Creates a semi-transparent background fill color for document highlights (the first argument is the alpha value)
  AFrameColor := dxMakeAlphaColor(255, 200, 100, 100); // Creates an opaque outline color for document highlights
  dxPDFViewer1.BeginUpdate; // Disables the control's repainting
  try
    repeat // Cycles through all potential text matches
      AResult := dxPDFViewer1.Document.FindText('Dx', TdxPDFDocumentTextSearchOptions.Default, APageIndex); // Searches for the next text match on the specified page
      case AResult.Status of // Chooses an appropriate action depending on the search result status
        tssFound: // If a text match is found
          begin
            AStop := AResult.Range.PageIndex <> APageIndex; // Ensures that the found match is on the target page (that is, sets the AStop flag to True if the match is on the next page)
            if not AStop then // If the match is still on the target page
              dxPDFViewer1.Highlights.Add(AResult.Range, ABackColor, AFrameColor); // Highlights the found text range
          end;
        tssNotFound, tssFinished: // Stop searching once there are no more matches to process
          AStop := True;
      end;
    until AStop; // The search cycle continues while the AStop flag is False
    dxPDFViewer1.Highlights.Visible := bTrue; // Makes all highlights visible
  finally
    dxPDFViewer1.EndUpdate; // Re-enables the control's repainting and displays all text highlights, even if an exception occurs
  end;
end;