A table is a collection of cells arranged in rows and columns. Tables play an important role in storing and organizing detailed or complicated data so that users can read and view it easily. They can be used in many ways, such as making lists, comparing information, aligning data, grouping information, and highlighting trends or patterns in data. GroupDocs.Parser for .NET is a useful API that allows software programmers to develop solutions for extracting tables, text, and images from various supported document formats, such as PDF, ebooks, Word (DOC, DOCX), PowerPoint (PPT, PPTX), Excel (XLS, XLSX), email (EML, MSG), and many more. The .NET API includes several important features for working with tables, such as extracting all tables from a document, extracting tables from a particular page, getting table cell data, getting the total number of table rows and columns, getting row height, printing table data, and more.
GroupDocs.Parser for .NET makes it easy for C# developers to extract tables from a DOC file by implementing a few easy steps.
// Extract tables from DOC file using GroupDocs.Parser API
// Requires: using System; using System.Collections.Generic;
// using GroupDocs.Parser; using GroupDocs.Parser.Data;
// using GroupDocs.Parser.Options; using GroupDocs.Parser.Templates;
// filePath is the path to the source document
// Create an instance of Parser class
using (Parser parser = new Parser(filePath)) {
    // Check if the document supports table extraction
    if (!parser.Features.Tables) {
        Console.WriteLine("Document doesn't support table extraction.");
        return;
    }
    // Create the layout of tables
    TemplateTableLayout layout = new TemplateTableLayout(
        new double[] { 50, 95, 275, 415, 485, 545 },
        new double[] { 325, 340, 365, 395 });
    // Create the options for table extraction
    PageTableAreaOptions options = new PageTableAreaOptions(layout);
    // Extract tables from the document
    IEnumerable<PageTableArea> tables = parser.GetTables(options);
    // Iterate over tables
    foreach (PageTableArea t in tables) {
        // Iterate over rows
        for (int row = 0; row < t.RowCount; row++) {
            // Iterate over columns
            for (int column = 0; column < t.ColumnCount; column++) {
                // Get the table cell
                PageTableAreaCell cell = t[row, column];
                if (cell != null) {
                    // Print the table cell text followed by a separator
                    Console.Write(cell.Text);
                    Console.Write(" | ");
                }
            }
            // End the current table row
            Console.WriteLine();
        }
        // Blank line between tables
        Console.WriteLine();
    }
}
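The two arrays passed to TemplateTableLayout can be hard to read at a glance. The sketch below is a minimal, library-free illustration, assuming the first array holds vertical separators (column boundaries) and the second holds horizontal separators (row boundaries), so that each pair of adjacent separators bounds one column or row. LayoutSketch and GridSize are hypothetical names introduced here for illustration and are not part of GroupDocs.Parser.

```csharp
using System;

public static class LayoutSketch
{
    // Hypothetical helper (not a GroupDocs.Parser API): derives the grid size
    // implied by a set of separators, assuming N separators bound N - 1 cells.
    public static (int Columns, int Rows) GridSize(
        double[] verticalSeparators, double[] horizontalSeparators)
    {
        int columns = Math.Max(0, verticalSeparators.Length - 1);
        int rows = Math.Max(0, horizontalSeparators.Length - 1);
        return (columns, rows);
    }

    public static void Main()
    {
        // Same values as in the extraction sample
        double[] vertical = { 50, 95, 275, 415, 485, 545 };
        double[] horizontal = { 325, 340, 365, 395 };

        var (columns, rows) = GridSize(vertical, horizontal);
        Console.WriteLine($"{columns} columns x {rows} rows");

        // List the horizontal span assumed for each column
        for (int c = 0; c < columns; c++)
        {
            Console.WriteLine($"column {c}: x from {vertical[c]} to {vertical[c + 1]}");
        }
    }
}
```

Under that assumption, the six vertical and four horizontal separators in the sample describe a grid of 5 columns and 3 rows.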
GroupDocs.Parser for .NET APIs are supported on all major platforms and operating systems. Before executing the code above, please make sure that the required prerequisites are installed on your system.
A .NET document parsing and table extraction API for file formats and images. It extracts data from popular file formats, including those listed below.
(Microsoft Word 2007 Macro File)
(Office 2007+ Word Document)
(Microsoft Word Template Files)
(Microsoft Word 2007+ Template File)
(Microsoft Word Template File)
(Open eBook File)
(Hyper Text Markup Language)
(MHTML Web Archive)
(Web Page Archive Format)
(OpenDocument Presentation Format)
(OpenDocument Spreadsheet)
(OpenDocument Text File Format)
(OneNote Document)
(OpenDocument Standard Format)
(OpenDocument Standard Format)
(Portable Document Format)