PDF to HTML

Convert PDF to HTML using .NET REST API.

PDF documents are a popular choice for sharing information because their formatting stays consistent across devices and platforms. On a website, however, a PDF is not always the most user-friendly way to display content. Converting PDF files to HTML lets web developers and content creators show the content directly in a web page. It also improves content presentation and accessibility, and it makes the content indexable by search engines, all of which matter for online success.

In this article, we explain how to convert PDF to HTML using the .NET REST API.

REST API for PDF to HTML Conversion

Aspose.PDF Cloud SDK for .NET makes PDF to HTML conversion straightforward. The SDK lets you integrate PDF conversion into your .NET applications and workflows, and converting a PDF document to HTML for web display takes only a few lines of code. The REST API also provides options to control the conversion process: you can customize the output HTML by specifying the default font name, document type, layout, image resolution and other settings.

To get started, add the SDK reference to your project: search for Aspose.PDF-Cloud in the NuGet package manager inside Visual Studio and click the Add Package button. You also need to obtain your client credentials from the cloud dashboard. If you do not have an account, create a free one by following the instructions in the quick start guide.

Convert PDF to HTML using C# .NET

Now execute the following code snippet to perform the conversion so that the PDF can be rendered on a website.


PDF to HTML conversion preview.

The key lines of the above code snippet are explained below.

PdfApi pdfApi = new PdfApi(clientSecret, clientID);

First, create an object of the PdfApi class, passing the client secret and client ID as input arguments.

var pdfFile = System.IO.File.OpenRead(inputFile);

Read the content of the PDF file from the local drive.

pdfApi.PutPdfInRequestToHtml("converted.html", documentType: "Html5", splitIntoPages: true, rasterImagesSavingMode: "AsPngImagesEmbeddedIntoSvg", outputFormat: "Zip", file: pdfFile);

Call the API to convert the PDF from the input stream to HTML format. Setting splitIntoPages to true saves each PDF page to an individual HTML file.

Please visit PutPdfInRequestToHtml for a complete list of the arguments supported by this API call and their details.

The input PDF document used in the above example can be downloaded from Binder1.pdf.

PDF to HTML Online using cURL Commands

Converting PDF to HTML using cURL commands with Aspose.PDF Cloud is another versatile approach. cURL talks directly to the RESTful endpoints, so you can call the Aspose.PDF Cloud API from scripts and applications and automate the PDF to HTML conversion. To display a PDF as HTML in a browser, we only need to call a few cURL commands, which reduces development time and effort.

The first step in this approach is to generate a JWT access token. To do so, execute the following command:

curl -v "https://api.aspose.cloud/connect/token" \
 -X POST \
 -d "grant_type=client_credentials&client_id=bb959721-5780-4be6-be35-ff5c3a6aa4a2&client_secret=4d84d5f6584160cbd91dba1fe145db14" \
 -H "Content-Type: application/x-www-form-urlencoded" \
 -H "Accept: application/json"
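The token request returns a JSON document, and the later commands need only the token string from it. The following is a minimal sketch of one way to capture that string in a shell variable. It assumes the response carries the token in an access_token field, as OAuth2 client-credentials responses normally do. The function name extract_token and the environment variables CLIENT_ID, CLIENT_SECRET and ACCESS_TOKEN are hypothetical names chosen for this illustration.

```shell
#!/bin/sh
# Print the value of the "access_token" field from a JSON token response
# read on stdin. Assumption: the value contains no escaped quotes,
# which is the case for JWTs.
extract_token() {
  sed -n 's/.*"access_token"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Request a token only when credentials are supplied through the
# (hypothetical) CLIENT_ID and CLIENT_SECRET environment variables.
if [ -n "${CLIENT_ID:-}" ] && [ -n "${CLIENT_SECRET:-}" ]; then
  ACCESS_TOKEN=$(curl -s "https://api.aspose.cloud/connect/token" \
    -X POST \
    -d "grant_type=client_credentials&client_id=${CLIENT_ID}&client_secret=${CLIENT_SECRET}" \
    -H "Content-Type: application/x-www-form-urlencoded" \
    -H "Accept: application/json" | extract_token)
  echo "Received a token of ${#ACCESS_TOKEN} characters"
fi
```

Reading the credentials from environment variables keeps them out of the script itself. If a JSON tool such as jq is available, it is a more robust choice than sed for parsing the response.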

Now execute the following cURL command, which loads the PDF file from cloud storage, converts the whole document to HTML format and saves the output as a .ZIP archive on the local drive (the name is specified with the -o argument).

curl -v "https://api.aspose.cloud/v3.0/pdf/{inputPDF}/convert/html?compressSvgGraphicsIfAny=false&documentType=Html5&fixedLayout=true&splitIntoPages=false&rasterImagesSavingMode=AsPngImagesEmbeddedIntoSvg&removeEmptyAreasOnTopAndBottom=true&flowLayoutParagraphFullWidth=true" \
-X GET \
-H  "accept: multipart/form-data" \
-H  "authorization: Bearer {accessToken}" \
-o "Converted.zip"

Replace inputPDF with the name of an input PDF document available in cloud storage, and accessToken with the JWT access token generated above.
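The two substitutions can also be scripted rather than made by hand. The sketch below builds the request URL from a file name and then runs the same GET request, using a subset of the query parameters shown above. The function build_convert_url, the variables INPUT_PDF and ACCESS_TOKEN, and the output folder converted_html are hypothetical names used only for this illustration. ACCESS_TOKEN is assumed to hold the JWT access token generated in the first step.

```shell
#!/bin/sh
# Build the conversion URL for a PDF held in cloud storage.
#   $1: file name in cloud storage (assumed to need no URL encoding)
#   $2: value for splitIntoPages, "true" or "false" (defaults to "false")
build_convert_url() {
  printf 'https://api.aspose.cloud/v3.0/pdf/%s/convert/html?documentType=Html5&fixedLayout=true&splitIntoPages=%s' \
    "$1" "${2:-false}"
}

# Run the conversion only when both (hypothetical) variables are set.
if [ -n "${ACCESS_TOKEN:-}" ] && [ -n "${INPUT_PDF:-}" ]; then
  # -f makes curl return a non-zero status on an HTTP error, so the
  # archive is unpacked only after a successful download.
  curl -sf "$(build_convert_url "$INPUT_PDF")" \
    -X GET \
    -H "accept: multipart/form-data" \
    -H "authorization: Bearer ${ACCESS_TOKEN}" \
    -o "Converted.zip" \
    && unzip -o "Converted.zip" -d converted_html
fi
```

For example, build_convert_url Binder1.pdf true produces a URL that requests one HTML file per page. File names containing spaces or other reserved characters would need to be URL-encoded first, which this sketch does not do.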

Conclusion

In conclusion, Aspose.PDF Cloud offers two ways to convert PDF to HTML: the .NET SDK for integration into applications, and cURL commands for direct calls to the REST API. Both give access to the same customization options, such as the document type, layout and image handling, so the output HTML can preserve the layout and formatting of the original PDF. Additionally, embedding the converted PDF content in HTML pages supports interactive web applications and improves accessibility and user engagement.
