PDF forms are a popular way to collect information: you can enter text, select items from drop-down boxes, and tick check boxes as needed. PDF currently supports two methods for integrating data with PDF forms: AcroForms (also known as Acrobat forms) and XML Forms Architecture (XFA) forms. To fill in a form, you can either complete it online or save a copy to your computer and fill it in with Adobe Acrobat Reader. Once the form has been filled in, you will often want to reuse the entered data, and for that purpose you may need to export it to XML, FDF, or XFDF format. In this article, we discuss how to programmatically export PDF form data to XML using Python, as well as to the other supported formats.

PDF Processing API

Aspose.PDF Cloud is an award-winning REST API for creating and editing PDF files and for converting various file formats to PDF. It also supports exporting PDF files to XLSX, PPTX, DOCX, EPUB, HTML, and other formats. In addition, it lets you work with PDF forms: you can get all form fields from a PDF document, create a form field in a PDF document, update a form field in a PDF document, and so on. Because the API is REST-based, it can be accessed from any platform and any type of application (desktop, mobile, web, hybrid, etc.). To make this easier, we have created programming language-specific SDKs so that you get all the PDF processing capabilities within the language of your choice.

For Python applications, we have created Aspose.PDF Cloud SDK for Python, which is a wrapper around Aspose.PDF Cloud, so you get all the PDF processing capabilities within your Python application. The first step is installation. The SDK is available for free via pip and from the GitHub repository. Execute the following command in the terminal/command prompt to install the latest version of the SDK.

pip install asposepdfcloud

Free Cloud Dashboard Account

The next step is a free subscription to our cloud services via the Aspose.Cloud dashboard. The subscription ensures that only authorized users can access our file processing services. If you have a GitHub or Google account, simply sign up with it; otherwise, click the Create a new Account button and provide the required information. Then log in to the dashboard, expand the Applications section, and scroll down to the Client Credentials section to see your Client ID and Client Secret.

Image 1:- Client Credentials on Aspose.Cloud Dashboard.

Export PDF to XML using Python

Please follow the instructions below to export PDF form data to XML using Python. We provide two APIs for this requirement:

URL | Type | API | Description
/pdf/{name}/export/xml | GET | GetExportFieldsFromPdfToXmlInStorage | Export fields from PDF in storage to XML file.
/pdf/{name}/export/xml | PUT | PutExportFieldsFromPdfToXmlInStorage | Export fields from PDF in storage to an XML file in storage.

  • First, create an instance of the ApiClient class, providing the Client ID and Client Secret as arguments
  • Second, create an instance of the PdfApi class, which takes the ApiClient object as its input argument
  • Now call the put_export_fields_from_pdf_to_xml_in_storage(..) method to export the PDF form data to XML and save the resultant file to Cloud storage
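
The steps above can be sketched roughly as follows. This is a minimal illustration, not the SDK's official sample: the helper name export_form_fields_to_xml is hypothetical, and passing the source name and output path positionally is an assumption about the method signature. The PdfApi object is passed in as a parameter so the helper itself has no import-time dependency on the SDK.

```python
def export_form_fields_to_xml(pdf_api, source_pdf, output_xml):
    """Export the form fields of a PDF in Cloud storage to an XML file in storage.

    `pdf_api` is expected to be an asposepdfcloud PdfApi instance (or any
    object exposing the same method). Passing the two paths positionally is
    an assumption about the SDK signature.
    """
    return pdf_api.put_export_fields_from_pdf_to_xml_in_storage(source_pdf, output_xml)


# Usage sketch (requires `pip install asposepdfcloud` and valid credentials).
# The import paths below and the order of the ApiClient arguments are
# assumptions; check them against the SDK version you have installed.
#
#   from asposepdfcloud.api_client import ApiClient
#   from asposepdfcloud.apis.pdf_api import PdfApi
#   api_client = ApiClient(...)  # pass your Client ID and Client Secret here
#   pdf_api = PdfApi(api_client)
#   export_form_fields_to_xml(pdf_api, "FormData-Filled.pdf", "FormDataExported.xml")
```

Both files are addressed by their paths in Cloud storage, so the source PDF needs to be uploaded there before the call.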
Image 2:- PDF data exported as XML.

Export PDF to FDF using Python

Please follow the instructions below to export PDF form data to FDF format and save the output in Cloud storage.

  • Create an instance of the ApiClient class, providing the Client ID and Client Secret as arguments
  • Now create an instance of the PdfApi class, which takes the ApiClient object as its input argument
  • Finally, call the put_export_fields_from_pdf_to_fdf_in_storage(..) method to export the PDF form data to FDF and save the resultant file to Cloud storage

The source FormData-Filled.pdf and the exported exportedData.fdf used in this example are available for download.

Export PDF to XFDF using Python

XFDF (XML Forms Data Format) is an XML-based file format that stores form data usable by a PDF file, so the data in an XFDF file can be inserted directly into a PDF form. This makes it useful when you need to export data from one PDF form and use it to fill in other PDF forms. In this section, we discuss the steps to export PDF form data to XFDF format.

  • Create an instance of the ApiClient class, providing the Client ID and Client Secret as arguments
  • Now create an instance of the PdfApi class, which takes the ApiClient object as its input argument
  • Finally, call the put_export_fields_from_pdf_to_xfdf_in_storage(..) method to export the PDF form data to XFDF and save the resultant file to Cloud storage
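
Since the XML, FDF, and XFDF exports differ only in the method that is called, one possible way to organize them is a small dispatcher. The sketch below uses the three method names mentioned in this article; the helper name export_form_fields and the EXPORT_METHODS table are hypothetical, and the positional (source name, output path) call is an assumption about the SDK signatures.

```python
# Method names as given in this article, one per export format.
EXPORT_METHODS = {
    "xml": "put_export_fields_from_pdf_to_xml_in_storage",
    "fdf": "put_export_fields_from_pdf_to_fdf_in_storage",
    "xfdf": "put_export_fields_from_pdf_to_xfdf_in_storage",
}


def export_form_fields(pdf_api, source_pdf, output_path, fmt):
    """Export form fields to XML, FDF, or XFDF via the matching PdfApi method."""
    key = fmt.lower()
    if key not in EXPORT_METHODS:
        raise ValueError(f"unsupported export format: {fmt!r}")
    method = getattr(pdf_api, EXPORT_METHODS[key])
    # Assumption: each method accepts (source name, output path) positionally.
    return method(source_pdf, output_path)


# Usage sketch, given a PdfApi instance created as described in the steps:
#   export_form_fields(pdf_api, "FormData-Filled.pdf", "exportedData.xfdf", "xfdf")
```

Rejecting unknown formats up front gives a clearer error than an AttributeError raised from inside the SDK object.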

The sample output generated by the steps above can be downloaded from exportedData.xfdf.

Image 3:- PDF data exported to XFDF

PDF to XML using cURL Commands

Because Aspose.PDF Cloud follows the REST architecture, its APIs can also be accessed via cURL commands from a terminal on any platform. However, before performing any operation, you need to generate a JSON Web Token (JWT) based on the client credentials shown on the Aspose.Cloud dashboard. This is mandatory because our APIs are accessible only to registered users. Please execute the following command to generate the JWT.

curl -v "https://api.aspose.cloud/connect/token" \
-X POST \
-d "grant_type=client_credentials&client_id=bbf94a2c-6d7e-4020-b4d2-b9809741374e&client_secret=1c9379bb7d701c26cc87e741a29987bb" \
-H "Content-Type: application/x-www-form-urlencoded" \
-H "Accept: application/json"
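
For readers who prefer to stay in Python, the same token request can be sketched with the standard library alone. The function name build_token_request is hypothetical; the URL, form fields, and headers mirror the cURL command above. The function only builds the request and does not send it.

```python
import urllib.parse
import urllib.request

TOKEN_URL = "https://api.aspose.cloud/connect/token"


def build_token_request(client_id, client_secret):
    """Build (but do not send) the POST request that asks for a JWT."""
    body = urllib.parse.urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
    }).encode("ascii")
    return urllib.request.Request(
        TOKEN_URL,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/x-www-form-urlencoded",
            "Accept": "application/json",
        },
    )


# Sending it (network access and valid credentials required). Reading the
# token from the "access_token" field is an assumption based on the usual
# OAuth 2.0 client-credentials response format:
#
#   import json
#   with urllib.request.urlopen(build_token_request(cid, secret)) as resp:
#       jwt_token = json.load(resp)["access_token"]
```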

Once the JWT token is generated, please execute the following command to export PDF form data to XML format.

curl -v -X PUT "https://api.aspose.cloud/v3.0/pdf/FormData-Filled.pdf/export/xml?xmlOutputFilePath=FormDataExported.xml" \
-H "accept: application/json" \
-H "authorization: Bearer <JWT Token>" \
-d "{}"
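
As a rough Python counterpart to the cURL command above, the export request can also be built with the standard library. The function name build_export_xml_request and the API_BASE constant are hypothetical; the endpoint path, the xmlOutputFilePath query parameter, and the headers are taken from the cURL command. The request is only constructed here, not sent.

```python
import urllib.parse
import urllib.request

API_BASE = "https://api.aspose.cloud/v3.0"


def build_export_xml_request(jwt_token, source_pdf, output_xml):
    """Build (but do not send) the PUT request that exports form fields to XML."""
    name = urllib.parse.quote(source_pdf, safe="")  # escape the file name for the URL path
    query = urllib.parse.urlencode({"xmlOutputFilePath": output_xml})
    return urllib.request.Request(
        f"{API_BASE}/pdf/{name}/export/xml?{query}",
        data=b"{}",  # same empty JSON body as the cURL command
        method="PUT",
        headers={
            "Accept": "application/json",
            "Authorization": f"Bearer {jwt_token}",
        },
    )


# Sending it (network access and a valid JWT required):
#   with urllib.request.urlopen(
#       build_export_xml_request(jwt_token, "FormData-Filled.pdf", "FormDataExported.xml")
#   ) as resp:
#       print(resp.status)
```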

Conclusion

In this article, we have explored the steps to export PDF form data to XML, FDF, and XFDF formats. These requirements can be accomplished using Python code snippets as well as cURL commands. We also recommend exploring the Developer Guide to learn about other features offered by the API. Furthermore, the complete source code of Aspose.PDF Cloud SDK for Python is available for download on GitHub. If you encounter any issues while using the API or have any further queries, please feel free to contact us via the Free product support forum.