Convert PDF to XML, Set Image Prefix & Fonts in PDF to HTML Conversion


What's new in this release?

The latest version of Aspose.Pdf for .NET (8.9.0) has been released. One of the features we implemented for Aspose.Pdf for .NET early on was converting an XML file to PDF. The API also provides the facility to transform custom XML files to PDF with the help of XSLT. With the release of Aspose.Pdf for .NET 8.9.0, we have introduced a new feature which lets you convert PDF documents to XML (MobiXML). As usual, we try to keep things simple, so this conversion can be accomplished with two simple lines of code.

Aspose.Pdf for .NET also offers the ability to determine whether a source PDF file is signed. As shown on the blog page, you can verify whether the source PDF file is signed without providing the signature name.

PDF to HTML conversion is one of our prominent features, and with every new release we make it more powerful by adding new capabilities. Along with new features, there are many enhancements and bug fixes related to PDF to DOC conversion, customization of saving and referencing images during PDF to HTML conversion, PDF to image conversion, PDF to PDF/A conversion, image to PDF conversion and much more. The list of important new and improved features is given below:

- PDF to XML Conversion
- Provide support to create and fill rows dynamically
- HtmlSaveOptions class should have the property to specify Image save folder path
- Certify a PDF document
- PDF to HTML - add callback to images during conversion
- PDF to SVG - Save all images/assets in a separate folder
- PDF to HTML - specify prefix for images
- PDF to HTML - set a URL prefix for fonts in the style.css file
- PDF to HTML - set prefix for URLs of SVG files
- Determine if the source PDF is signed
- PageLayout should support "Two Page Up"
- PDF to DOC/DOCX conversion improvement
- Customize saving and referencing of images during PDF to HTML conversion
- PDF to HTML - specify the pages range
- XML to PDF now properly working
- Empty PDF file generated when converting ASPX to PDF is now fixed
- HTML to PDF: <Font> tag's "size" attribute being ignored or applied incorrectly is now fixed
- PDF to Image: conversion is corrected
- PDF to PDFA1b: inability to convert eForms to PDFA1b is now fixed
- Removing stamp from PDF files is now enhanced
- Problem converting signed PDF to PDF/A1b is resolved
- PDF to DOCX: conversion issues with underline and strikethrough lines are resolved
- PDF to PNG: information loss is fixed
- PdfExtractor object extracting a rotated image is now fixed
- PDF to TIFF conversion returning a blank image is now fixed
- CreateWebLink() incorrectly placing the link above matching text is now fixed
- PDF to DOC - Strikeout and double strikeout chars are now converted correctly
- Hebrew text now appears in form field
- PDF to JPEG - resultant image issues are resolved
- Enhanced creating PDF file containing images
- HTML to PDF - formatting issues are resolved in resultant file
- PDF to PNG: missing text paragraphs are fixed
- HTML to PDF - image now appears in the resultant PDF when using the file: tag for the path
- Page break creating a corrupt output document is now fixed
- PDF to Word Conversion - Underlined Text Recognition Issue is resolved
- Page deletion order of a Document results in different output
- TIFF to PDF - Resultant PDF issue is resolved
- PDF to PDFA1a conversion resulting in a corrupt output PDFA1a file is now fixed
- Page extraction issue is resolved
- Setting user password disabling the print button is now fixed
- PDF to HTML converter - bad right-hand adjustment of text is corrected
- PDF to HTML converter - table of contents format: page numbers are now aligned
- PDF to HTML converter - right-hand edges of words in text columns were not aligned; justification is corrected
- PDF to HTML converter - bullets shifted from their text are fixed
- Splitting the PDF to pages produces same size files
- Annotation being removed when creating N-Up of a PDF file is fixed
- HTML to PDF: Table is now rendered correctly
- PDF to TIFF conversion: bad quality output is enhanced
- All icons in a PDF file appearing as a single object is now fixed
- Unable to remove form fields with same name
- Image to PDF: mirror image issue is resolved
- Setting PDF creator information is now corrected
- Encrypted PDF file recognition issue is fixed
- ComboBox field is now filled
- HTML to PDF - UL tags are now displayed in PDF

Other recent bug fixes are also included in this release.

Newly added documentation pages and articles

Some new tips and articles have been added to the Aspose.Pdf for .NET documentation. They briefly show how to use Aspose.Pdf for tasks such as the following.

- Convert PDF to XML
- PDF to HTML - Set Prefix for URLs of SVG Files

Overview: Aspose.Pdf for .NET

Aspose.Pdf is a .NET PDF component for the creation and manipulation of PDF documents without using Adobe Acrobat. Create PDF files through the API, XML templates and XSL-FO files. It supports form field creation, PDF compression options, table creation and manipulation, graph objects, extensive hyperlink functionality, extended security controls, custom font handling, adding or removing bookmarks, TOC, attachments and annotations, importing or exporting PDF form data, and much more. It also converts HTML, XSL-FO and MS Word to PDF.

More about Aspose.Pdf for .NET

- Homepage of Aspose.Pdf for .NET C#
- Online Demo for Aspose.Pdf for .NET
- Download Aspose.Pdf for .NET
- Read online documentation of Aspose.Pdf for .NET
- Post your technical questions/queries to Aspose.Pdf for .NET Forum
- Receive notifications about latest news and supported features by subscribing to Aspose.Pdf for .NET blog

Contact Information
Aspose Pty Ltd, Suite 163,
79 Longueville Road
Lane Cove, NSW, 2066
Aspose - Your File Format Experts
Phone: 888.277.6734
Fax: 866.810.9465