PDF is one of the most widely used file formats for sharing information. It is popular because it preserves document fidelity across all platforms and devices (desktop, mobile, etc.). However, making changes to a PDF file requires specific applications that can open and edit PDF documents. When a document needs a large number of updates, converting the PDF file to a Word document is a viable solution, and for bulk conversion, a programming SDK is an effective option. In this article, we discuss how to convert PDF to Word using the Python SDK.
Word Processing API
Aspose.Words Cloud is our award-winning REST-based API offering the capabilities to create, edit, and transform Word files to HTML, JPEG, PNG, and other supported file formats. It can also load PDF documents and render them to MS Word (DOCX, DOC, DOT, RTF, DOCM) or OpenDocument (ODT, OTT) formats. No third-party software download or installation is required: the entire conversion is performed by our document processing engine in the Cloud. To implement the document conversion within a Python application, use Aspose.Words Cloud SDK for Python, which is a wrapper around the Cloud API. The SDK can be installed with pip:
pip install aspose-words-cloud
If you are using PyCharm IDE, you may directly add the SDK as a dependency in your project.
File -> Settings -> Project -> Python Interpreter -> asposewordscloud
Convert PDF to Word in Python
Please follow the steps below to convert a PDF file to Word format.
- First, create an ApiClient object, passing the ClientID and ClientSecret details as arguments
- Secondly, create an instance of WordsApi, passing the ApiClient instance as an argument
- Thirdly, upload the PDF file to Cloud storage using an UploadFileRequest(..) object
- Now create a SaveOptionsData object in which docx is defined as the export format
- Next, create an instance of SaveAsRequest, which takes the PDF file name and the SaveOptionsData object as arguments
- Finally, call the save_as(..) method of the WordsApi class to perform the conversion
PDF to Word using cURL Command
Like other REST APIs, Aspose.Words Cloud can also be accessed via cURL commands. Before accessing the API, however, we need to generate a JWT access token from the Client Credentials shown on the Aspose.Cloud dashboard. Please execute the following cURL command to generate the JWT access token.
curl -v "https://api.aspose.cloud/connect/token" \
  -X POST \
  -d "grant_type=client_credentials&client_id=4ccf1790-accc-41e9-8d18-a78dbb2ed1aa&client_secret=caac6e3d4a4724b2feb53f4e460eade3" \
  -H "Content-Type: application/x-www-form-urlencoded" \
  -H "Accept: application/json"
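For readers who prefer to stay in Python, the same token request can be sketched with the standard library alone. This is only an illustration: the helper names build_token_request and fetch_access_token are hypothetical, and reading the token from an access_token field assumes the response follows the usual OAuth 2.0 client-credentials shape.

```python
import json
import urllib.parse
import urllib.request

# Token endpoint taken from the cURL command above.
TOKEN_URL = "https://api.aspose.cloud/connect/token"


def build_token_request(client_id: str, client_secret: str) -> urllib.request.Request:
    """Build (but do not send) the POST request that asks for a JWT token."""
    body = urllib.parse.urlencode(
        {
            "grant_type": "client_credentials",
            "client_id": client_id,
            "client_secret": client_secret,
        }
    ).encode("ascii")
    return urllib.request.Request(
        TOKEN_URL,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/x-www-form-urlencoded",
            "Accept": "application/json",
        },
    )


def fetch_access_token(client_id: str, client_secret: str) -> str:
    """Send the request and return the token string.

    Assumption: the JSON response carries the token in an
    "access_token" field, as is usual for OAuth 2.0.
    """
    request = build_token_request(client_id, client_secret)
    with urllib.request.urlopen(request) as response:
        payload = json.load(response)
    return payload["access_token"]
```

Calling fetch_access_token with your own Client ID and Client Secret performs the network request; build_token_request on its own is handy for inspecting what would be sent.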
Now we can use the following command to convert PDF files available in Cloud storage to Word format. In the following command, we have used the -o parameter to save output on the local drive.
curl -X GET "https://api.aspose.cloud/v4.0/words/awesome_table_in_pdf.pdf?format=docx" \
  -H "accept: application/octet-stream" \
  -H "Authorization: Bearer <JWT Token>" \
  -o Converted.docx
Please use the following command if you need to save the output Word document directly in Cloud storage. Note the outPath request parameter in the command.
curl -X GET "https://api.aspose.cloud/v4.0/words/awesome_table_in_pdf.pdf?format=docx&outPath=newResultant.docx" \
  -H "accept: application/octet-stream" \
  -H "Authorization: Bearer <JWT Token>"
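The two conversion calls differ only in the optional outPath query parameter, which a short Python sketch makes explicit. It uses only the standard library and mirrors the endpoint, headers, and parameters of the cURL commands; the helper names build_convert_request and download_converted are hypothetical and not part of any Aspose SDK.

```python
import urllib.parse
import urllib.request
from typing import Optional

# Base endpoint taken from the cURL commands above.
API_BASE = "https://api.aspose.cloud/v4.0/words"


def build_convert_request(
    pdf_name: str,
    jwt_token: str,
    out_format: str = "docx",
    out_path: Optional[str] = None,
) -> urllib.request.Request:
    """Build (but do not send) the GET request that converts a stored PDF.

    Without out_path the converted file comes back in the response body;
    with out_path the result is saved in Cloud storage instead.
    """
    query = {"format": out_format}
    if out_path is not None:
        query["outPath"] = out_path
    url = f"{API_BASE}/{urllib.parse.quote(pdf_name)}?{urllib.parse.urlencode(query)}"
    return urllib.request.Request(
        url,
        method="GET",
        headers={
            "Accept": "application/octet-stream",
            "Authorization": f"Bearer {jwt_token}",
        },
    )


def download_converted(pdf_name: str, jwt_token: str, local_path: str) -> None:
    """Send the request and write the response body to a local file."""
    request = build_convert_request(pdf_name, jwt_token)
    with urllib.request.urlopen(request) as response, open(local_path, "wb") as out:
        out.write(response.read())
```

For example, download_converted("awesome_table_in_pdf.pdf", token, "Converted.docx") corresponds to the cURL call that uses -o, while passing out_path="newResultant.docx" to build_convert_request reproduces the Cloud storage variant.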
In this article, we have explored how Aspose.Words Cloud converts PDF files to Word format. To test the API, you may access it directly within a web browser using the Swagger interface. Furthermore, the Cloud SDK is developed under the MIT license, so its complete source code is available in the GitHub repository.
In case you encounter any issues while using the API or have any related queries, please contact us via the free product support forum.