GroupDocs.Parser for Java is an API for extracting text, images, and metadata. It supports more than 50 popular document types and helps you build business applications that parse raw, structured, and formatted text. It also supports parsing documents with predefined templates, which allows complex data to be extracted from invoices and other typical documents quickly and accurately. GroupDocs.Parser for Java can also extract text and metadata from password-protected files in all popular formats, including Word processing documents, Excel spreadsheets, PowerPoint presentations, OneNote, PDF files, and ZIP archives.
The following is an overview of GroupDocs.Parser for Java:
GroupDocs.Parser for Java supports the following document file formats:
GroupDocs.Parser for Java supports the following operating systems, frameworks, and package managers:
With GroupDocs.Parser for Java, you can apply various formatters to extracted text and HTML. The Plain Text Formatter returns plain text in both Simple and ASCII modes. The HTML Formatter returns the text as HTML and applies formatting to paragraphs, hyperlinks, fonts, headings, lists, and tables.
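
To make the formatter idea concrete, here is a minimal conceptual sketch of how the same extracted structure can be rendered by a plain-text formatter and by an HTML formatter. Every name in it (`FormatterSketch`, `Formatter`, `PlainFormatter`, `HtmlFormatter`, `render`) is hypothetical and chosen for illustration; it is not the GroupDocs.Parser API, and the library's own formatters may behave differently.

```java
import java.util.Arrays;
import java.util.List;

// Conceptual sketch only: these types are hypothetical, NOT the GroupDocs.Parser API.
final class FormatterSketch {

    /** Hypothetical contract: one method per structural element of a document. */
    interface Formatter {
        String heading(String text);
        String paragraph(String text);
        String hyperlink(String text, String url);
        String list(List<String> items);
    }

    /** Plain-text rendering: keeps the words, drops the markup. */
    static final class PlainFormatter implements Formatter {
        public String heading(String text) { return text + "\n"; }
        public String paragraph(String text) { return text + "\n"; }
        public String hyperlink(String text, String url) { return text + " (" + url + ")"; }
        public String list(List<String> items) {
            StringBuilder sb = new StringBuilder();
            for (String item : items) sb.append("- ").append(item).append("\n");
            return sb.toString();
        }
    }

    /** HTML rendering: wraps each element in a tag and escapes its content. */
    static final class HtmlFormatter implements Formatter {
        public String heading(String text) { return "<h1>" + escape(text) + "</h1>\n"; }
        public String paragraph(String text) { return "<p>" + escape(text) + "</p>\n"; }
        public String hyperlink(String text, String url) {
            return "<a href=\"" + escape(url) + "\">" + escape(text) + "</a>";
        }
        public String list(List<String> items) {
            StringBuilder sb = new StringBuilder("<ul>\n");
            for (String item : items) sb.append("<li>").append(escape(item)).append("</li>\n");
            return sb.append("</ul>\n").toString();
        }
    }

    /** Escapes the characters that are significant in HTML text and attributes. */
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }

    /** Renders one fixed sample document with whichever formatter is supplied. */
    static String render(Formatter f) {
        return f.heading("Sample")
             + f.paragraph("Fish & chips")
             + f.hyperlink("Docs", "https://example.com") + "\n"
             + f.list(Arrays.asList("one", "two"));
    }

    public static void main(String[] args) {
        System.out.print(render(new PlainFormatter()));
        System.out.print(render(new HtmlFormatter()));
    }
}
```

The point of the sketch is the separation of concerns: the document structure is walked once, and the choice of formatter alone decides whether the result is plain text or HTML.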