How to Extract Tables from DOC files via .NET API?

A table is a collection of cells arranged in rows and columns. Tables play an important role in storing and organizing detailed or complicated data so that users can read and view it easily. They can be used in many ways, such as making lists, comparing information, aligning data, grouping information, and highlighting trends or patterns in data.

GroupDocs.Parser for .NET is a useful API that allows software programmers to develop solutions for extracting tables, text, and images from various supported document formats, such as PDF, ebooks, Word (DOC, DOCX), PowerPoint (PPT, PPTX), Excel (XLS, XLSX), emails (EML, MSG), and many more. The .NET API includes several important features for working with tables, such as extracting all tables from a document, extracting tables from a particular page, getting table cell data, getting the total number of rows and columns in a table, getting row height, printing the data of a table, and many more.

Extract tables from DOC in .NET

GroupDocs.Parser for .NET lets C# developers extract tables from a DOC file in a few steps.

Instantiate a Parser object for the source document;
Check if the document supports table extraction;
Instantiate the TemplateTableLayout and PageTableAreaOptions classes to set the layout of tables;
Call the GetTables method and obtain a collection of PageTableArea objects.

Learn more about table extraction

How to extract tables from a DOC file using C#: example code

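// Extract tables from a DOC file using the GroupDocs.Parser API
// Namespaces used by this snippet
using System;
using System.Collections.Generic;
using GroupDocs.Parser;
using GroupDocs.Parser.Data;
using GroupDocs.Parser.Options;
using GroupDocs.Parser.Templates;

// Create an instance of the Parser class (filePath is the path to the DOC file)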
using (Parser parser = new Parser(filePath)) {
    // Check if the document supports table extraction
    if (!parser.Features.Tables) {
        Console.WriteLine("Document doesn't support table extraction.");
        return;
    }
    // Create the layout of tables
    TemplateTableLayout layout = new TemplateTableLayout(
        new double[] { 50, 95, 275, 415, 485, 545 },
        new double[] { 325, 340, 365, 395 });
    // Create the options for table extraction
    PageTableAreaOptions options = new PageTableAreaOptions(layout);
    // Extract tables from the document.
    IEnumerable<PageTableArea> tables = parser.GetTables(options);
    // Iterate over tables
    foreach (PageTableArea t in tables) {
        // Iterate over rows
        for (int row = 0; row < t.RowCount; row++) {
            // Iterate over columns
            for (int column = 0; column < t.ColumnCount; column++) {
                // Get the table cell
                PageTableAreaCell cell = t[row, column];
                if (cell != null) {
                    // Print the table cell text
                    Console.Write(cell.Text);
                    Console.Write(" | ");
                }
            }
            Console.WriteLine();
        }
        Console.WriteLine();
    }
}
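
The two arrays passed to TemplateTableLayout in the example above are the vertical and horizontal separators of the table. The following sketch does not use GroupDocs.Parser at all; it only illustrates how such separators can be read, assuming that each pair of adjacent separators bounds one column (vertical) or one row (horizontal). The SeparatorGrid class and its ToSpans helper are hypothetical names introduced for this illustration and are not part of the API.

```csharp
using System;
using System.Collections.Generic;

// Illustration only: this class is hypothetical and not part of GroupDocs.Parser.
public static class SeparatorGrid
{
    // Turns ordered separator coordinates into (start, end) spans.
    // Assumption: adjacent separators bound exactly one column or row.
    public static List<(double Start, double End)> ToSpans(double[] separators)
    {
        var spans = new List<(double Start, double End)>();
        for (int i = 0; i + 1 < separators.Length; i++)
        {
            spans.Add((separators[i], separators[i + 1]));
        }
        return spans;
    }

    public static void Main()
    {
        // Same values as in the TemplateTableLayout example
        double[] vertical = { 50, 95, 275, 415, 485, 545 };
        double[] horizontal = { 325, 340, 365, 395 };

        var columns = ToSpans(vertical);   // 6 separators give 5 spans
        var rows = ToSpans(horizontal);    // 4 separators give 3 spans

        Console.WriteLine($"{rows.Count} rows x {columns.Count} columns");

        // List the bounds of every cell in the resulting grid
        for (int r = 0; r < rows.Count; r++)
        {
            for (int c = 0; c < columns.Count; c++)
            {
                Console.WriteLine(
                    $"cell[{r},{c}]: x {columns[c].Start}-{columns[c].End}, " +
                    $"y {rows[r].Start}-{rows[r].End}");
            }
        }
    }
}
```

Under that assumption, the layout in the example describes a grid of 3 rows and 5 columns, which is a quick way to sanity-check separator values before passing them to PageTableAreaOptions.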

System Requirements

GroupDocs.Parser for .NET is supported on all major platforms and operating systems. Before running the code above, please make sure that you have the following prerequisites installed on your system.

Operating Systems: Microsoft Windows, Linux, macOS
Development Environments: Microsoft Visual Studio, Xamarin, MonoDevelop
Frameworks
Download the latest version of GroupDocs.Parser for .NET from NuGet