PDF to HTML – Convert PDF to HTML in C#


Portable Document Format (PDF) is widely used because document formatting is preserved when the file is viewed on any platform. The fidelity of the document is not compromised whether it is viewed on a desktop or a mobile device. However, in order to view a PDF file, you need a specific viewer application, which may incur installation and setup costs. The HTML format is a viable way to overcome this shortcoming, because every platform has a web browser and HTML is the native language of web browsers. The documents can therefore be viewed without any additional effort or cost.

In this article, we are going to discuss the following topics in detail.

PDF processing REST API

We have developed Aspose.PDF Cloud, which is based on REST architecture and offers the capabilities to create, edit, and transform PDF files to other supported document formats, including HTML, JPEG, DOCX, PPTX, XLSX, SVG, etc. In this article, our focus is on the steps to convert a PDF file to HTML format within C# .NET code. This can be accomplished using Aspose.PDF Cloud SDK for .NET, which is a wrapper around Aspose.PDF Cloud, so that you get all the features of the REST API within your .NET application.

Installation

In order to use the Cloud SDK, the first step is to install it on the local system. For the user’s convenience, we have published the SDK on NuGet and GitHub. To install it from NuGet, please execute the following command in the NuGet Package Manager console:

Install-Package Aspose.Pdf-Cloud

You may also install the SDK directly within the Visual Studio project as a NuGet package. All you need to do is expand the project tree in Solution Explorer, right-click the Packages folder, and select the Manage NuGet Packages… option from the context menu.

Image 1:- Manage NuGet packages.

Now search for Aspose.PDF Cloud in the search field, enable the checkbox beside the package name, and click the Add Package button.

Image 2:- Aspose.PDF Cloud SDK in NuGet packages.

Notice that Aspose.Pdf-Cloud.dll appears under the Packages folder.

Image 3:- Aspose.Pdf-Cloud.dll under solution explorer.

Licensing

Our payment plan is quite flexible and our customers do not need to pay any upfront cost. They are only required to pay as per their usage, which can be as low as $0.005 / API call. You may visit the pricing page for further details. However, before you opt for licensing, you may create a free account by visiting the Aspose.Cloud dashboard and test our APIs with up to 150 free document processing calls.

You may sign in using your existing GitHub or Google account, or click the Create a new Account button and provide the required information. Then log in to the dashboard using your credentials, expand the Applications section, and scroll down to the Client Credentials section to see the Client ID and Client Secret details.

Image 4:- Client credentials on the dashboard.

PDF to HTML – output in response

With just a few lines of code, you can convert a PDF available in Cloud storage to HTML format. The resultant file is returned as stream content in the response, which can then be saved. In order to perform this conversion, we are going to use the GetPdfInStorageToHtml API. Please follow the steps given below to perform the PDF to HTML conversion operation.

  • The first step is to create string variables holding the Client ID and Client Secret details
  • Secondly, create an instance of PdfApi while passing the Client ID and Client Secret variables as arguments
  • Thirdly, read the content of the PDF file and load it into a Stream instance
  • Then upload the PDF file to Cloud storage using the UploadFile(…) method of the PdfApi class
  • Finally, call the GetPdfInStorageToHtml(…) method to perform the conversion
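
The steps above can be sketched roughly as follows. This is a minimal illustration, not a definitive implementation: the file names input.pdf and output.zip and the credential placeholders are hypothetical, and the sketch assumes that the PdfApi constructor takes the Client Secret first and the Client ID second, and that GetPdfInStorageToHtml(…) returns the result as a System.IO.Stream. Please verify both against the SDK version you have installed.

```csharp
using System;
using System.IO;
using Aspose.Pdf.Cloud.Sdk.Api;

class PdfToHtmlInResponse
{
    static void Main()
    {
        // Placeholders: replace with the values shown on your dashboard.
        string clientId = "YOUR_CLIENT_ID";
        string clientSecret = "YOUR_CLIENT_SECRET";

        // Assumption: the constructor expects the secret first, then the ID.
        var pdfApi = new PdfApi(clientSecret, clientId);

        // Hypothetical input file located next to the executable.
        string inputFile = "input.pdf";

        // Read the local PDF and upload it to Cloud storage under the same name.
        using (FileStream pdfStream = File.OpenRead(inputFile))
        {
            pdfApi.UploadFile(inputFile, pdfStream);
        }

        // Convert the stored PDF; the result is assumed to come back as a stream.
        using (Stream result = pdfApi.GetPdfInStorageToHtml(inputFile))
        using (FileStream output = File.Create("output.zip")) // extension is an assumption
        {
            result.CopyTo(output);
        }

        Console.WriteLine("Conversion finished; response saved locally.");
    }
}
```

The response stream is copied to a local file here because the API call itself does not persist the output anywhere.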

PDF to HTML – result in cloud storage

Another approach can be the conversion of PDF files to HTML format and saving the output in cloud storage. Please follow the steps given below to accomplish this requirement:

  • Firstly, create an instance of the PdfApi class by passing the Client ID and Client Secret details as arguments
  • Secondly, read the input PDF file into a Stream instance and specify the output file name with a .zip extension
  • Thirdly, upload the PDF file to Cloud storage using the UploadFile(…) method
  • Next, call the PutPdfInStorageToHtml(…) method, which takes the input PDF file name and the resultant file name as arguments
  • Finally, print the response code in the console
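
A rough sketch of these steps is given below. The file names and credential placeholders are hypothetical. The sketch also assumes that the PdfApi constructor takes the Client Secret before the Client ID, and that the response object returned by PutPdfInStorageToHtml(…) exposes Code and Status properties; treat these as assumptions to check against your installed SDK.

```csharp
using System;
using System.IO;
using Aspose.Pdf.Cloud.Sdk.Api;

class PdfToHtmlInStorage
{
    static void Main()
    {
        // Placeholders: replace with the values shown on your dashboard.
        string clientId = "YOUR_CLIENT_ID";
        string clientSecret = "YOUR_CLIENT_SECRET";

        // Assumption: the constructor expects the secret first, then the ID.
        var pdfApi = new PdfApi(clientSecret, clientId);

        // Hypothetical names: local input PDF and resultant archive in Cloud storage.
        string inputFile = "input.pdf";
        string resultantFile = "converted.zip";

        // Upload the local PDF to Cloud storage.
        using (FileStream pdfStream = File.OpenRead(inputFile))
        {
            pdfApi.UploadFile(inputFile, pdfStream);
        }

        // Convert the stored PDF and keep the output in Cloud storage.
        var response = pdfApi.PutPdfInStorageToHtml(inputFile, resultantFile);

        // Code and Status are assumed members of the returned response object.
        Console.WriteLine("Response code: " + response.Code + ", status: " + response.Status);
    }
}
```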

Convert local PDF to HTML – output in Cloud storage

In this section, we are going to discuss the steps to load a local PDF file, perform the conversion, and save the output in Cloud storage.

  • The first step is to create an instance of PdfApi while passing the Client ID and Client Secret as arguments
  • Secondly, define string variables for the input PDF and the resultant .zip to be stored in Cloud storage
  • Thirdly, load the input PDF into a Stream instance
  • Finally, call the PutPdfInRequestToHtml(…) method, which takes the resultant file name and the stream holding the PDF as arguments. The output is saved in Cloud storage
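
These steps might look like the sketch below. Note that the PDF is not uploaded beforehand; it travels in the request body. The file names and credential placeholders are hypothetical, and the sketch assumes that the stream parameter of PutPdfInRequestToHtml(…) is named file and that the returned object exposes a Code property, so please confirm the exact signature in your SDK version.

```csharp
using System;
using System.IO;
using Aspose.Pdf.Cloud.Sdk.Api;

class LocalPdfToHtmlInStorage
{
    static void Main()
    {
        // Placeholders: replace with the values shown on your dashboard.
        string clientId = "YOUR_CLIENT_ID";
        string clientSecret = "YOUR_CLIENT_SECRET";

        // Assumption: the constructor expects the secret first, then the ID.
        var pdfApi = new PdfApi(clientSecret, clientId);

        // Hypothetical names: local input PDF and resultant archive in Cloud storage.
        string inputFile = "input.pdf";
        string resultantFile = "converted.zip";

        // Load the local PDF and send it directly in the conversion request.
        using (FileStream pdfStream = File.OpenRead(inputFile))
        {
            // Assumption: the stream is passed through a parameter named "file".
            var response = pdfApi.PutPdfInRequestToHtml(resultantFile, file: pdfStream);

            // Code is an assumed member of the returned response object.
            Console.WriteLine("Response code: " + response.Code);
        }
    }
}
```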

Conclusion

In this article, we have discussed the steps to convert a PDF file to HTML using various approaches, including converting a PDF file already stored in Cloud storage and loading it from the local system. We have learned that the conversion can be performed with just a few lines of code without compromising the fidelity of the document. Please note that Aspose.PDF Cloud SDK for .NET is distributed under an MIT license and its complete source code is available for download on GitHub. In case you encounter any issue while using the API, or you have any related query, please feel free to contact us through the Free product support forum.
