
Convert Word to HTML in Java

Microsoft Word documents (DOC/DOCX) are a routine part of both personal and official work, and we often need to share them over the internet. To open or view them, however, the recipient needs a particular application such as MS Word or OpenOffice. Some restrictive environments also do not permit installing additional applications, and in such scenarios converting Word to HTML is a viable solution: the recipient can open the document in a web browser without installing any additional software. This article explains how to convert Word to HTML using the Java Cloud SDK.

Word to HTML Conversion REST API

Aspose.Words Cloud is a REST-based solution for programmatically creating, editing, and transforming MS Word documents into a variety of supported formats. For this article we use the Aspose.Words Cloud SDK for Java, which makes these Word document conversion capabilities available in a Java application. To use the SDK, add its reference to your Java project by including the following information in pom.xml (Maven build type project).

<repositories> 
    <repository>
        <id>aspose-cloud</id>
        <name>artifact.aspose-cloud-releases</name>
        <url>http://artifact.aspose.cloud/repo</url>
    </repository>   
</repositories>

<dependencies>
    <dependency>
        <groupId>com.aspose</groupId>
        <artifactId>aspose-words-cloud</artifactId>
        <version>22.12.0</version>
    </dependency>
</dependencies>

The next step is to obtain your client credentials from the Cloud Dashboard. If you are not already registered, first register a free account with a valid email address and then obtain your credentials.

Convert Word to HTML in Java

The steps below outline how to convert Word to HTML in Java code.

  • Create a WordsApi object, passing your personalized client credentials as arguments
  • Load the input Word document content using the readAllBytes(…) method, which returns it as a byte[] array
  • Create an object of the ConvertDocumentRequest class, which takes the input Word file, the HTML format, and the resultant file name as arguments
  • Finally, call the convertDocument(…) method to perform the Word to HTML conversion. After a successful conversion, the resultant HTML document is stored in cloud storage
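The steps above can be sketched roughly as follows. Because the full parameter lists of the WordsApi and ConvertDocumentRequest constructors may differ between SDK versions, this sketch keeps the SDK call behind a small hypothetical interface (DocumentConverter, not part of the SDK) and shows the SDK wiring only as comments; the class name, the helper methods, and the rule for deriving the output name are likewise assumptions made for illustration.

```java
import java.nio.file.Files;
import java.nio.file.Path;

class WordToHtmlSteps {

    /** Hypothetical seam standing in for the SDK call (steps 3 and 4); not part of the SDK. */
    @FunctionalInterface
    interface DocumentConverter {
        void convert(byte[] document, String format, String outPath) throws Exception;
    }

    /** Derives the resultant file name, e.g. "report.docx" becomes "report.html". */
    static String htmlNameFor(String wordFileName) {
        int dot = wordFileName.lastIndexOf('.');
        String base = dot > 0 ? wordFileName.substring(0, dot) : wordFileName;
        return base + ".html";
    }

    /** Step 2 (read the bytes), then steps 3 and 4 delegated to the converter. */
    static String run(Path input, DocumentConverter converter) throws Exception {
        byte[] document = Files.readAllBytes(input);
        String outPath = htmlNameFor(input.getFileName().toString());
        converter.convert(document, "html", outPath);
        return outPath;
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.err.println("usage: java WordToHtmlSteps <path-to-word-file>");
            return;
        }
        // With the SDK on the classpath, step 1 and the converter would look roughly like
        // this (argument lists abbreviated; check them against your SDK version):
        //   WordsApi wordsApi = new WordsApi(clientId, clientSecret, ...);
        //   DocumentConverter viaSdk = (doc, format, out) ->
        //       wordsApi.convertDocument(new ConvertDocumentRequest(doc, format, out, ...));
        // The dry-run converter below only reports what would be sent.
        DocumentConverter dryRun = (doc, format, out) ->
            System.out.println(doc.length + " bytes -> " + out + " (" + format + ")");
        run(Path.of(args[0]), dryRun);
    }
}
```

Keeping the SDK call behind one interface also makes it easy to exercise the file handling without network access or credentials.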

Image: Word to HTML document conversion preview

The sample Word document used in the above example can be downloaded from test_multi_pages.docx.

DOCX to HTML using cURL Commands

REST APIs are easy to access via cURL commands on any platform, so this section explains how to convert DOCX to HTML using cURL. The first step is to generate a JWT access token (based on your client credentials) using the following command.

curl -v "https://api.aspose.cloud/connect/token" \
-X POST \
-d "grant_type=client_credentials&client_id=bb959721-5780-4be6-be35-ff5c3a6aa4a2&client_secret=4d84d5f6584160cbd91dba1fe145db14" \
-H "Content-Type: application/x-www-form-urlencoded" \
-H "Accept: application/json"
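For readers who prefer to stay in Java, the same token request can be sketched with the JDK's built-in java.net.http client (Java 11 or later). This is a minimal illustration rather than the SDK's own mechanism: the class name, the environment variable names ASPOSE_CLIENT_ID and ASPOSE_CLIENT_SECRET, and the regex-based extraction of the access_token field (the standard OAuth 2.0 response field, assumed here) are all choices made for this example.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.Objects;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TokenRequestSketch {
    static final String TOKEN_URL = "https://api.aspose.cloud/connect/token";

    /** Builds the form body sent with -d in the cURL command. */
    static String formBody(String clientId, String clientSecret) {
        return "grant_type=client_credentials"
            + "&client_id=" + URLEncoder.encode(clientId, StandardCharsets.UTF_8)
            + "&client_secret=" + URLEncoder.encode(clientSecret, StandardCharsets.UTF_8);
    }

    /** Mirrors the cURL call: POST with form content type and JSON accept header. */
    static HttpRequest tokenRequest(String clientId, String clientSecret) {
        return HttpRequest.newBuilder(URI.create(TOKEN_URL))
            .header("Content-Type", "application/x-www-form-urlencoded")
            .header("Accept", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(formBody(clientId, clientSecret)))
            .build();
    }

    /** Simplified extraction; a real project would use a JSON library instead of a regex. */
    static String extractAccessToken(String json) {
        Matcher m = Pattern.compile("\"access_token\"\\s*:\\s*\"([^\"]+)\"").matcher(json);
        if (!m.find()) {
            throw new IllegalStateException("no access_token field in response");
        }
        return m.group(1);
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical environment variable names; keep credentials out of source code.
        String id = Objects.requireNonNull(System.getenv("ASPOSE_CLIENT_ID"), "ASPOSE_CLIENT_ID");
        String secret = Objects.requireNonNull(System.getenv("ASPOSE_CLIENT_SECRET"), "ASPOSE_CLIENT_SECRET");
        HttpResponse<String> response = HttpClient.newHttpClient()
            .send(tokenRequest(id, secret), HttpResponse.BodyHandlers.ofString());
        System.out.println(extractAccessToken(response.body()));
    }
}
```

Reading the credentials from the environment avoids committing them to version control along with the code.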

Next, execute the following command to perform the Word to HTML conversion. The input Word document is expected to be available in cloud storage, and after the conversion the resultant HTML document is saved to the local drive.

curl -v -X GET "https://api.aspose.cloud/v4.0/words/test_multi_pages.docx?format=html" \
-H  "accept: application/octet-stream" \
-H  "Authorization: Bearer <JWT Token>" \
-o "newOutput.html"

We can also save the resultant file directly in cloud storage. To do so, simply provide a value for the outPath parameter, as shown below.

curl -v -X GET "https://api.aspose.cloud/v4.0/words/test_multi_pages.docx?format=html&outPath=output.html" \
-H  "accept: application/octet-stream" \
-H  "Authorization: Bearer <JWT Token>"
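The two conversion calls above can also be issued from plain Java. The sketch below mirrors the cURL commands using the JDK's java.net.http client (Java 11 or later); the class and method names, and the ASPOSE_JWT environment variable holding the token, are assumptions made for this example. Passing null as outPath reproduces the first command (HTML returned in the response body), while a non-null value reproduces the second (HTML kept in cloud storage).

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;

class DocxToHtmlRequestSketch {
    static final String BASE = "https://api.aspose.cloud/v4.0/words/";

    /** URLEncoder targets form data, so spaces become '+'; swap them to %20 for URL use. */
    static String encode(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8).replace("+", "%20");
    }

    /** Builds the conversion URL; a null outPath means the HTML comes back in the response. */
    static URI conversionUri(String storedName, String outPath) {
        String url = BASE + encode(storedName) + "?format=html";
        if (outPath != null) {
            url += "&outPath=" + encode(outPath);
        }
        return URI.create(url);
    }

    /** Mirrors the cURL call: GET with the accept and bearer authorization headers. */
    static HttpRequest conversionRequest(String storedName, String outPath, String jwt) {
        return HttpRequest.newBuilder(conversionUri(storedName, outPath))
            .header("Accept", "application/octet-stream")
            .header("Authorization", "Bearer " + jwt)
            .GET()
            .build();
    }

    public static void main(String[] args) throws Exception {
        String jwt = System.getenv("ASPOSE_JWT"); // hypothetical variable holding the JWT token
        HttpClient client = HttpClient.newHttpClient();

        // Variant 1: download the converted HTML to the local drive.
        HttpResponse<Path> saved = client.send(
            conversionRequest("test_multi_pages.docx", null, jwt),
            HttpResponse.BodyHandlers.ofFile(Path.of("newOutput.html")));
        System.out.println("HTTP " + saved.statusCode() + " -> " + saved.body());

        // Variant 2: leave the converted HTML in cloud storage.
        HttpResponse<Void> stored = client.send(
            conversionRequest("test_multi_pages.docx", "output.html", jwt),
            HttpResponse.BodyHandlers.discarding());
        System.out.println("HTTP " + stored.statusCode());
    }
}
```

Note that the file body handler writes whatever the server returns, so it is worth checking the status code before treating newOutput.html as a valid conversion result.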

Conclusion

This article covered how to programmatically convert Word to HTML using Java, as well as how to convert DOCX to HTML via cURL commands. For quick tests, you may also access the API through SwaggerUI in a web browser, and the Product Documentation is a useful source of further information.

If you need to download and modify the source code of the Cloud SDK, it is freely available on GitHub (published under the MIT license). If you encounter any issues while using the API or have a related query, you can reach us through the free product support forum.
