GroupDocs.Parser for Java

Retrieve table data from XML using Java

Seamlessly detect and extract tables from formats like PDF, DOCX, and XLSX with GroupDocs.Parser in your Java workflows.

How to retrieve tables from XML in Java

To parse tables from XML documents using GroupDocs.Parser, follow these easy steps in your Java environment:

1. Create a Parser instance and load the target XML file.
2. Verify that the file supports structured table extraction.
3. Use the API to retrieve table elements from the document.
4. Use the extracted data in analytics, reporting, or automation systems.

// Load the input document with Parser that includes table elements
try (Parser parser = new Parser("input.xml"))
{
    // Verify that the document type allows table recognition
    if (!parser.getFeatures().isTables()) {
        // Add your own handling here for files that don’t support tables
        System.out.println("Table extraction isn’t supported for this document");
        return;
    }

    // Define rules for interpreting table structure
    TemplateTableLayout layout = new TemplateTableLayout(
            java.util.Arrays.asList(new Double[]{50.0, 95.0, 275.0, 415.0, 485.0, 545.0}),
            java.util.Arrays.asList(new Double[]{325.0, 340.0, 365.0, 395.0}));

    // Set parameters to extract tables
    PageTableAreaOptions options = new PageTableAreaOptions(layout);

    // Run table extraction on the loaded document
    Iterable<PageTableArea> tables = parser.getTables(options);

    // Process each extracted table from the result
    for (PageTableArea t : tables) {
        // Read cells here, for example with t.getCell(row, column)
    }
}

<dependencies>
    <dependency>
        <groupId>com.groupdocs</groupId>
        <artifactId>groupdocs-parser</artifactId>
        <version>24.9</version>
    </dependency>
</dependencies>
<repositories>
    <repository>
        <id>repository.groupdocs.com</id>
        <name>GroupDocs Repository</name>
        <url>https://repository.groupdocs.com/repo/</url>
    </repository>
</repositories>

Advanced content extraction tools

Beyond reading tables, GroupDocs.Parser supports capturing plain text, visual elements, embedded metadata, and structured objects to enhance document processing tasks.

Extracting structured content and tabular data

Precise table parsing across formats

Extract tables from standard document types such as PDF, Word, Excel, and HTML with high accuracy.

Read tabular structures from diverse sources

Retrieve table data from spreadsheets, documents, and reports while preserving the structure and alignment.

Customizable table extraction settings

Control layout detection, manage headers and footers, and fine-tune extraction with flexible configuration options.
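As a rough illustration of layout configuration, the sample code on this page passes two lists of coordinates to TemplateTableLayout. The sketch below assumes the first list holds vertical separator positions (column boundaries) and the second holds horizontal separator positions (row boundaries); check that against the API reference. The `evenSeparators` helper is hypothetical and not part of GroupDocs.Parser. It only shows one way to generate evenly spaced boundaries with the JDK.

```java
import java.util.ArrayList;
import java.util.List;

class SeparatorSketch {

    // Hypothetical helper (not a GroupDocs.Parser API): returns cells + 1
    // evenly spaced boundary coordinates between start and end, inclusive.
    static List<Double> evenSeparators(double start, double end, int cells) {
        if (cells < 1 || end <= start) {
            throw new IllegalArgumentException("need cells >= 1 and end > start");
        }
        List<Double> separators = new ArrayList<>();
        double step = (end - start) / cells;
        for (int i = 0; i <= cells; i++) {
            separators.add(start + i * step);
        }
        return separators;
    }

    public static void main(String[] args) {
        // Five equal-width columns between x = 50 and x = 545
        System.out.println(evenSeparators(50.0, 545.0, 5));
        // prints [50.0, 149.0, 248.0, 347.0, 446.0, 545.0]
    }
}
```

Under the assumption above, the resulting lists could be passed where the sample uses hard-coded values. Six boundaries describe five cells, which matches the six-value and four-value lists in the sample.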

Sample: extract tables from an Excel document

This example shows how to extract and loop through table content in an Excel (XLSX) file using GroupDocs.Parser.

Java

// Initialize Parser with the Excel file
try (Parser parser = new Parser("input.xlsx")) {
    // Exit if table extraction isn’t supported for this document
    if (!parser.getFeatures().isTables()) {
        return;
    }

    // Apply rules to locate table layout
    TemplateTableLayout layout = new TemplateTableLayout(
            java.util.Arrays.asList(new Double[]{50.0, 95.0, 275.0, 415.0, 485.0, 545.0}),
            java.util.Arrays.asList(new Double[]{325.0, 340.0, 365.0, 395.0}));

    // Configure settings for table extraction
    PageTableAreaOptions options = new PageTableAreaOptions(layout);

    // Invoke the extraction process
    Iterable<PageTableArea> tables = parser.getTables(options);

    // Loop over all parsed table structures
    for (PageTableArea t : tables) {
        // Iterate over each row within the table
        for (int row = 0; row < t.getRowCount(); row++) {
            // Process each cell in the current row
            for (int column = 0; column < t.getColumnCount(); column++) {
                // Access and read the current cell's content
                PageTableAreaCell cell = t.getCell(row, column);
                if (cell != null) {
                    // Output the textual value of each table cell
                    System.out.print(cell.getText());
                    System.out.print(" | ");
                }
            }
        }
    }
}
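The sample above prints each cell to the console. To pass the same data to a reporting or analytics tool, one option is to collect the cell text into rows and serialize them as CSV. The sketch below uses only the JDK and assumes the text from `cell.getText()` has already been gathered into a `List<List<String>>`. The `TableCsvSketch` class and its methods are illustrative names, not part of GroupDocs.Parser.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class TableCsvSketch {

    // Quote a cell when it contains a comma, a double quote, or a line break;
    // embedded double quotes are doubled. A null cell becomes an empty field.
    static String escape(String cell) {
        if (cell == null) {
            return "";
        }
        boolean needsQuotes = cell.contains(",") || cell.contains("\"")
                || cell.contains("\n") || cell.contains("\r");
        String escaped = cell.replace("\"", "\"\"");
        return needsQuotes ? "\"" + escaped + "\"" : escaped;
    }

    // Join each row with commas and end every row with a newline.
    static String toCsv(List<List<String>> rows) {
        StringBuilder csv = new StringBuilder();
        for (List<String> row : rows) {
            List<String> fields = new ArrayList<>();
            for (String cell : row) {
                fields.add(escape(cell));
            }
            csv.append(String.join(",", fields)).append("\n");
        }
        return csv.toString();
    }

    public static void main(String[] args) {
        // Stand-in data; in practice each inner list would hold one table row
        List<List<String>> rows = Arrays.asList(
                Arrays.asList("Item", "Qty"),
                Arrays.asList("Widget, large", "2"));
        System.out.print(toCsv(rows));
    }
}
```

In the extraction loop, this would mean adding `cell.getText()` (or `null` for a missing cell) to a per-row list instead of printing it, then calling `toCsv` once the table has been read.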

Introduction to GroupDocs.Parser for Java API

GroupDocs.Parser is a content extraction API for Java platforms. It lets developers parse tables, text, graphics, links, and structured data from PDFs, Word documents, Excel sheets, PowerPoint presentations, and more, without requiring third-party plugins.

Ready to get started?

Download GroupDocs.Parser for free or get a trial license for full access!

Useful resources

Explore documentation, code samples, and community support to enhance your experience.


Documentation
API reference
Code samples
Free support
Paid support

Document types supported for table extraction

GroupDocs.Parser provides reliable table detection across multiple file types. Here’s a list of the most widely supported document formats for extracting tables.

Parse PDF (Portable Document Format)
Parse DOCX (Office 2007+ Word Document)
Parse PPTX (Open XML Presentation Format)
Parse XLSX (Open XML Workbook)
Parse TXT (Text File)
Parse RTF (Rich Text Format)
Parse EPUB (Open eBook File)
