GroupDocs.Parser for Java

Extract hyperlinks from RTF with Java

Extract hyperlinks from RTF, PDF, Word, Excel, and other documents using GroupDocs.Parser in your Java environment.

How to extract hyperlinks from RTF in Java

GroupDocs.Parser simplifies hyperlink extraction from RTF files in Java applications with these basic steps:

1. Open the RTF file using an instance of Parser.
2. Ensure hyperlink extraction is available for the file format.
3. Extract all hyperlinks using the appropriate method.
4. Loop through the results and process each link as needed.

import com.groupdocs.parser.Parser;
import com.groupdocs.parser.data.PageHyperlinkArea;

// Open the RTF file; try-with-resources closes the parser when the block exits
try (Parser parser = new Parser("input.rtf")) {

    // Check whether the document format supports hyperlink extraction
    if (!parser.getFeatures().isHyperlinks()) {
        System.out.println("Hyperlink extraction is not available for the file");
        return;
    }

    // Extract all hyperlinks from the document
    Iterable<PageHyperlinkArea> hyperlinks = parser.getHyperlinks();

    // Print the visible text and the target URL of each link
    for (PageHyperlinkArea h : hyperlinks) {
        System.out.println(h.getText());
        System.out.println(h.getUrl());
    }
}
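
The last step, processing each link, depends on the application. As a minimal sketch, the code below groups extracted URLs by host and drops duplicates. It uses only the JDK and assumes the URLs were already collected into a list of strings from `getUrl()`; `LinkGrouper`, `groupByHost`, and `hostOf` are hypothetical names for illustration, not part of GroupDocs.Parser.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not a GroupDocs.Parser class
class LinkGrouper {

    static final String NO_HOST = "(no host)";

    // Groups distinct URLs by lower-cased host, preserving first-seen order
    static Map<String, Set<String>> groupByHost(List<String> urls) {
        Map<String, Set<String>> byHost = new LinkedHashMap<>();
        for (String url : urls) {
            if (url == null) {
                continue; // skip links that carry no URL
            }
            byHost.computeIfAbsent(hostOf(url), k -> new LinkedHashSet<>()).add(url);
        }
        return byHost;
    }

    // Returns the host of a URL, or NO_HOST for malformed or host-less URLs (e.g. mailto:)
    static String hostOf(String url) {
        try {
            String host = new URI(url.trim()).getHost();
            return host == null ? NO_HOST : host.toLowerCase(Locale.ROOT);
        } catch (URISyntaxException e) {
            return NO_HOST;
        }
    }

    public static void main(String[] args) {
        // Sample values standing in for URLs read from the document
        List<String> urls = List.of(
            "https://docs.example.com/parser",
            "https://docs.example.com/parser",
            "mailto:team@example.com");
        groupByHost(urls).forEach((host, set) -> System.out.println(host + " -> " + set));
    }
}
```

Grouping by host is only one option; the same loop could just as well validate links or write them to a report.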

<dependencies>
    <dependency>
        <groupId>com.groupdocs</groupId>
        <artifactId>groupdocs-parser</artifactId>
        <version>24.9</version>
    </dependency>
</dependencies>
<repositories>
    <repository>
        <id>repository.groupdocs.com</id>
        <name>GroupDocs Repository</name>
        <url>https://repository.groupdocs.com/repo/</url>
    </repository>
</repositories>

Comprehensive document parsing tools

Along with extracting hyperlinks, GroupDocs.Parser enables you to collect other useful content like plain text, embedded media, and structured data for use in automated workflows.

Hyperlink extraction and document analysis

Accurate link detection

Capture all types of hyperlinks from different document layouts, including clickable text and hidden URLs.

Works with documents and web content

Pull links from PDF, DOCX, XLSX, HTML, and image files that contain embedded hyperlinks.

Custom extraction behavior

Refine how hyperlinks are extracted using options like page ranges, link types, or content filters.
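
Filtering by link type can also be done on the caller's side after extraction. The sketch below is an assumption about what such a filter might look like, not a GroupDocs.Parser option: it keeps only URLs whose scheme is in an allowed set, using the JDK's `java.net.URI`. `SchemeFilter`, `keepSchemes`, and `schemeOf` are hypothetical names.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not a GroupDocs.Parser class
class SchemeFilter {

    // Returns the URLs whose scheme (compared case-insensitively) is allowed, in input order
    static List<String> keepSchemes(List<String> urls, Set<String> allowed) {
        List<String> kept = new ArrayList<>();
        for (String url : urls) {
            String scheme = schemeOf(url);
            if (scheme != null && allowed.contains(scheme)) {
                kept.add(url);
            }
        }
        return kept;
    }

    // Lower-cased scheme, or null for relative, missing, or malformed URLs
    static String schemeOf(String url) {
        if (url == null) {
            return null;
        }
        try {
            String scheme = new URI(url.trim()).getScheme();
            return scheme == null ? null : scheme.toLowerCase(Locale.ROOT);
        } catch (URISyntaxException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // Sample values standing in for URLs read from a document
        List<String> urls = List.of(
            "https://example.com/a",
            "mailto:team@example.com",
            "HTTP://example.com/b",
            "#section-2");
        // Keep web links only; mail links and in-document anchors are dropped
        System.out.println(keepSchemes(urls, Set.of("http", "https")));
    }
}
```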

Example: extracting hyperlinks from a PDF with custom options

This sample extracts links from a PDF file, using PageAreaOptions to limit extraction to a rectangular area of the page.

Java

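import com.groupdocs.parser.Parser;
import com.groupdocs.parser.data.PageHyperlinkArea;
import com.groupdocs.parser.data.Point;
import com.groupdocs.parser.data.Rectangle;
import com.groupdocs.parser.data.Size;
import com.groupdocs.parser.options.PageAreaOptions;

// Open the PDF using the Parser class
try (Parser parser = new Parser("input.pdf")) {
    // Verify that hyperlink extraction is supported for this document
    if (!parser.getFeatures().isHyperlinks()) {
        return;
    }

    // Limit extraction to a rectangle at position (380, 90), 150 wide and 50 high
    PageAreaOptions options = new PageAreaOptions(new Rectangle(new Point(380, 90), new Size(150, 50)));

    // Extract only the hyperlinks found in that area
    Iterable<PageHyperlinkArea> hyperlinks = parser.getHyperlinks(options);

    // Print the visible text and the target URL of each link
    for (PageHyperlinkArea h : hyperlinks) {
        System.out.println(h.getText());
        System.out.println(h.getUrl());
    }
}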
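
To make the numbers in the sample concrete, the sketch below works out which coordinates a rectangle at position (380, 90) with size 150 x 50 spans. It uses `java.awt.Rectangle` from the JDK as a stand-in, not the GroupDocs types. Whether the library matches links that lie fully inside the area or merely overlap it is not stated here, so both checks are shown as assumptions; `AreaSketch`, `fullyInside`, and `overlaps` are hypothetical names.

```java
import java.awt.Rectangle;

// Hypothetical illustration using JDK geometry, not GroupDocs.Parser classes
class AreaSketch {

    // Same numbers as the sample: position (380, 90), width 150, height 50
    static final Rectangle AREA = new Rectangle(380, 90, 150, 50);

    // Assumption 1: a link matches only if its bounds lie entirely within the area
    static boolean fullyInside(Rectangle linkBounds) {
        return AREA.contains(linkBounds);
    }

    // Assumption 2: a link matches if its bounds overlap the area at all
    static boolean overlaps(Rectangle linkBounds) {
        return AREA.intersects(linkBounds);
    }

    public static void main(String[] args) {
        // The area spans x from 380 to 380 + 150 and y from 90 to 90 + 50
        System.out.println("x: " + AREA.x + " to " + (AREA.x + AREA.width));
        System.out.println("y: " + AREA.y + " to " + (AREA.y + AREA.height));

        // Made-up link bounds that cross the right edge of the area
        Rectangle link = new Rectangle(500, 100, 80, 12);
        System.out.println("fully inside: " + fullyInside(link));
        System.out.println("overlaps: " + overlaps(link));
    }
}
```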

About GroupDocs.Parser for Java API

GroupDocs.Parser is a content extraction API for Java developers. It extracts hyperlinks, structured data, images, and text from popular formats such as DOCX, XLSX, PDF, and HTML, without requiring any external plugins.