In this article, we discuss how to convert PDF to PDF/A using Python code. PDF/A is an ISO-standardized version of the Portable Document Format (PDF) specialized for the archiving and long-term preservation of electronic documents. It differs from PDF by prohibiting features unsuitable for long-term archiving, such as font linking (as opposed to font embedding) and encryption. In return, a PDF/A file contains everything needed to display it and nothing that could negatively affect the display.
PDF Processing SDK
Aspose.PDF Cloud SDK for Python is a wrapper around Aspose.PDF Cloud (a REST-based API). It offers the capabilities to create, edit and transform PDF files to popular file formats including PDF_A_3A, XLSX, PPTX, DOCX, HTML, SVG and JPEG. With a few lines of code, you can perform all of these conversions with great fidelity. No additional software download or installation is required.
To use the SDK, we first need to install it on the system. It is available for free download from PIP and the GitHub repository. Execute the following command in the terminal/command prompt to install the latest version of the SDK.
pip install asposepdfcloud
Free Cloud Dashboard Account
After the installation, the next major step is a free subscription to our cloud services via the Aspose.Cloud dashboard. The purpose of this subscription is to allow only authorized persons to access our file processing services. If you have a GitHub or Google account, simply Sign Up; otherwise, click the Create a new Account button and provide the required information. Then log in to the dashboard using your credentials, expand the Applications section and scroll down to the Client Credentials section to see your Client ID and Client Secret.
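Once you have the Client ID and Client Secret, one way to keep them out of your source code is to read them from environment variables. The following is a minimal sketch of that idea; the variable names ASPOSE_CLIENT_ID and ASPOSE_CLIENT_SECRET and the helper load_credentials are hypothetical names chosen for this example, not something the SDK or dashboard defines.

```python
import os


def load_credentials(env=os.environ):
    """Return (client_id, client_secret) read from environment variables."""
    # ASPOSE_CLIENT_ID and ASPOSE_CLIENT_SECRET are hypothetical names picked
    # for this sketch; use whatever names suit your own setup.
    try:
        return env["ASPOSE_CLIENT_ID"], env["ASPOSE_CLIENT_SECRET"]
    except KeyError as missing:
        raise RuntimeError(
            f"Missing environment variable: {missing.args[0]}"
        ) from None
```

The mapping is passed in as a parameter so the helper can be exercised with a plain dictionary, for example load_credentials({"ASPOSE_CLIENT_ID": "id", "ASPOSE_CLIENT_SECRET": "secret"}).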
PDF to PDF/A using Python
Please follow the steps below to convert a PDF file to PDF/A format. Note that you can select one of the PDF/A compliance levels during conversion (PDF/A-1a, PDF/A-1b, PDF/A-3a).
- Firstly, create an instance of the ApiClient class, providing the Client ID and Client Secret as arguments
- Secondly, create an instance of the PdfApi class, which takes the ApiClient object as an input argument
- Thirdly, create variables holding the names of the input PDF file and the resultant PDF/A file
- Finally, call the put_pdf_in_storage_to_pdf_a(..) method of the PdfApi class to convert the PDF to PDF/A and save the resultant file to cloud storage
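The steps above can be sketched roughly as follows. This is an illustration rather than the official sample: the import paths, the order of the ApiClient arguments (secret first, then ID) and the keyword names name, out_path and type are assumptions based on the SDK's published usage and should be checked against the version you installed. The helper names build_pdf_api and convert_to_pdfa and the output file name are made up for this example.

```python
def build_pdf_api(client_id, client_secret):
    """Create a PdfApi instance from the dashboard credentials."""
    # Imported lazily so this module still loads where the SDK is absent.
    # Assumption: these module paths match the installed asposepdfcloud package.
    from asposepdfcloud.api_client import ApiClient
    from asposepdfcloud.apis.pdf_api import PdfApi

    # Assumption: the constructor expects the Client Secret first and the
    # Client ID second; verify against the SDK README for your version.
    return PdfApi(ApiClient(client_secret, client_id))


def convert_to_pdfa(pdf_api, input_name, output_name, pdfa_type="PDFA1A"):
    """Convert a PDF already in cloud storage and save the PDF/A there."""
    # "PDFA1A" is the type value used in the cURL example of this article.
    return pdf_api.put_pdf_in_storage_to_pdf_a(
        name=input_name, out_path=output_name, type=pdfa_type
    )


# Example usage (needs real credentials and a file uploaded to cloud storage):
#   api = build_pdf_api("YOUR_CLIENT_ID", "YOUR_CLIENT_SECRET")
#   print(convert_to_pdfa(api, "awesomeTable.pdf", "PDF-A-Converted.pdf"))
```

Keeping the conversion call in its own function that receives the PdfApi object makes it easy to try the logic with a stand-in object before spending API calls.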
Convert PDF to PDF/A using cURL Command
REST APIs can easily be accessed from a command-line terminal using cURL commands. Since Aspose.PDF Cloud is built on REST architecture, we can also call the Cloud API from the command prompt and convert PDF files to PDF/A format. However, a prerequisite is to generate a JSON Web Token (JWT) based on your individual client credentials from the Aspose.Cloud dashboard. This is mandatory because our APIs are only accessible to registered users. Please execute the following command to generate the JWT.
curl -v "https://api.aspose.cloud/connect/token" \
  -X POST \
  -d "grant_type=client_credentials&client_id=bbf94a2c-6d7e-4020-b4d2-b9809741374e&client_secret=1c9379bb7d701c26cc87e741a29987bb" \
  -H "Content-Type: application/x-www-form-urlencoded" \
  -H "Accept: application/json"
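For readers who prefer to stay in Python, the same token request can be sketched with the standard library alone. The URL, form fields and headers mirror the cURL command above; the function names build_token_request and fetch_token are hypothetical, and reading the token from an "access_token" field is an assumption based on the usual shape of OAuth 2.0 client-credentials responses.

```python
import json
import urllib.parse
import urllib.request

TOKEN_URL = "https://api.aspose.cloud/connect/token"


def build_token_request(client_id, client_secret):
    """Build the POST request that asks for a JWT."""
    body = urllib.parse.urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
    }).encode("ascii")
    return urllib.request.Request(
        TOKEN_URL,
        data=body,
        headers={
            "Content-Type": "application/x-www-form-urlencoded",
            "Accept": "application/json",
        },
        method="POST",
    )


def fetch_token(client_id, client_secret):
    """Send the request and return the token string."""
    request = build_token_request(client_id, client_secret)
    with urllib.request.urlopen(request) as response:
        payload = json.load(response)
    # Assumption: the token is returned under the "access_token" key.
    return payload["access_token"]
```

Building the request separately from sending it lets you inspect the URL, body and headers without making a network call.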
Now that the JWT is generated, please execute the following command to convert a PDF file already available in cloud storage to PDF/A-1a format. Because the resultant file is returned as a response stream, we can save it to the local drive using the -o argument.
curl -v -X GET "https://api.aspose.cloud/v3.0/pdf/awesomeTable.pdf/convert/pdfa?type=PDFA1A" \
  -H "accept: multipart/form-data" \
  -H "authorization: Bearer <JWT Token>" \
  -o Converted.pdf
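The conversion call can likewise be sketched in Python with the standard library. The endpoint, query parameter and headers are taken from the cURL command above; the function names build_convert_request and download_pdfa are hypothetical, and the sketch assumes the response body can be written to disk as-is, which is what the -o argument does in the cURL version.

```python
import shutil
import urllib.parse
import urllib.request

API_BASE = "https://api.aspose.cloud/v3.0/pdf"


def build_convert_request(token, pdf_name, pdfa_type="PDFA1A"):
    """Build the GET request that converts a stored PDF to PDF/A."""
    # Percent-encode the file name so spaces or slashes cannot break the path.
    path = urllib.parse.quote(pdf_name, safe="")
    query = urllib.parse.urlencode({"type": pdfa_type})
    return urllib.request.Request(
        f"{API_BASE}/{path}/convert/pdfa?{query}",
        headers={
            "Accept": "multipart/form-data",
            "Authorization": f"Bearer {token}",
        },
        method="GET",
    )


def download_pdfa(token, pdf_name, target_path, pdfa_type="PDFA1A"):
    """Stream the converted document to a local file."""
    request = build_convert_request(token, pdf_name, pdfa_type)
    with urllib.request.urlopen(request) as response, \
            open(target_path, "wb") as target:
        shutil.copyfileobj(response, target)
```

With a valid token, download_pdfa(token, "awesomeTable.pdf", "Converted.pdf") would correspond to the cURL command shown above.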
In this article, we have discussed the steps for converting a PDF file to PDF/A format. We have seen that the whole operation can be accomplished with a few lines of Python code, and that a cURL command can also convert a PDF document to PDF/A-1a format. Please note that you can perform up to 150 document conversion/processing requests under a free license. Once you are satisfied with our services, you may opt for a license subscription, which can be as low as $0.005 / API call.
The complete source code of Aspose.PDF Cloud SDK for Python is available for download under the MIT license on GitHub.