PDF Form Parser, Form Recognizer, PDF Filling with REST API

One of the major advantages of the PDF format is its “What You See Is What You Get” (WYSIWYG) approach, which sets a high standard for rendering and viewing. Whatever appears on your screen appears the same way on other users’ devices, no matter which operating system, screen resolution, or software they are using. In short, a PDF looks consistent across platforms and devices.

Beyond viewing, the fidelity of the contents is also preserved when PDF files are printed. Once data has been filled into a document, its security becomes even more important: data integrity must be ensured, and only the right person should have access to the data. For electronic documents to be used as evidence, they must be in an unaltered original version; only then do such electronic documents or data messages become admissible in court. In addition, a plethora of PDF form parser applications is available for further processing.

Now consider the HTML format, which is similar to PDF in terms of cross-platform functionality. HTML forms, however, need to be tested on various platforms and browsers to ensure that they function well on each of them. This applies not just to viewing but also to the printing feature. Validating a form against each browser therefore requires a huge amount of testing. As for data security, it always remains an issue with HTML forms.

PDF AcroForms
Adobe XFA forms
Aspose.PDF Cloud
Read Form Fields
Create or Replace PDF form fields

PDF AcroForms

AcroForms are PDF files that contain form fields. The technique involves adding the form fields as an overlay on top of the image of a form. Data can be entered into these fields by end users or by the author of the form. Internally, AcroForms are annotations or fields applied to a PDF document. They can easily be filled using a Forms Data Format (FDF) file (a formatted ASCII file containing key: value pairs). Adobe later introduced XFA Forms (sometimes called Designer Forms) with PDF 1.5 and Acrobat 6 in 2003. Both XFA Forms and AcroForms are supported in Acrobat 6 and above. At the moment, AcroForms are also widely supported by many third-party PDF viewer applications.
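To make the FDF idea concrete, here is a minimal sketch that prints a small FDF document from name=value pairs. In FDF, each pair is stored as a `/T` (field name) and `/V` (field value) entry. The function name `make_fdf` and the field names in the usage line are hypothetical, and the sketch assumes that names and values contain no parentheses or backslashes, which would need escaping.

```shell
#!/bin/sh
# make_fdf: print a minimal FDF document for the given name=value pairs.
# Assumption: names and values contain no parentheses or backslashes,
# which would have to be escaped inside FDF string literals.
make_fdf() {
  printf '%%FDF-1.2\n'
  printf '1 0 obj\n<< /FDF << /Fields [\n'
  for pair in "$@"; do
    name=${pair%%=*}   # text before the first "="
    value=${pair#*=}   # text after the first "="
    printf '<< /T (%s) /V (%s) >>\n' "$name" "$value"
  done
  printf '] >> >>\nendobj\n'
  printf 'trailer\n<< /Root 1 0 R >>\n%%%%EOF\n'
}

# Usage (hypothetical field names):
#   make_fdf "FirstName=Jane" "City=Oslo" > data.fdf
```

The resulting file can then be imported by a viewer or tool that supports FDF to populate the matching AcroForm fields.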

Adobe XFA forms

XFA Forms (XML Forms Architecture) represent a significant change in direction for Adobe from the popular FDF and XFDF methodologies. XFA Forms use XML throughout. Since XML is the backbone for all structured documents, there are distinct advantages to opting for XFA Forms. XFA Forms should not be confused with XForms, the W3C standard for XML-based forms. Adobe’s XFA Forms is a closed standard that competes with the fully open W3C XForms standard. While both are XML-based, the XForms standard specifies only the data and not the appearance of the form, whereas XFA Forms specify both the form’s appearance and its data.

Aspose.PDF Cloud as PDF form Parser

Forms are one of the most intuitive ways to get input from end users. However, when working with a large set of PDF documents, filling in and manipulating data manually can be cumbersome. To let users process PDF forms programmatically, we have developed the Aspose.PDF Cloud API. It enables you to create, update, and manipulate PDF forms using cURL commands. You may also use one of the SDKs developed for individual programming languages.

Apart from serving as a PDF form parser, it provides features to:

Add text or image watermarks
Concatenate PDF files
Set and update Annotations
Download PDF attachments
Add or retrieve text from PDF
Replace single or multiple text instances
Render PDF files to the other supported formats listed below

EPUB, HTML, LaTeX, MHT, PCL, DOC, DOCX, MOBIXML, PDFA, PPTX, SVG, TIFF, XLS, XLSX, XML, XPS, PS, XSLFO, BMP, EMF, GIF, JPEG, and PNG.

All of the above-mentioned operations can be performed without installing any specific software.

Read Form Fields

With only a few lines of code, you can read the details of PDF form fields (PDF form parser). You can read all the fields inside the document, restrict the read to the fields on a particular page number, or access a specific field by providing its name. Furthermore, if you do not want to use any specific programming language, you can perform field operations with cURL commands from the command prompt. The following form field types can be processed using the PDF REST API:

Listbox
Combobox
Checkbox
Radiobutton
Textbox
Signature

The GetDocumentTextBoxFields method reads text fields from PDF documents. It takes the input file name as its parameter and returns the list of fields along with their attributes.

In order to read the form field details, please try using the following cURL command.

cURL command

# First get the Access Token
# Get Client Id and Client Secret from https://dashboard.aspose.cloud/

curl -v "https://api.aspose.cloud/connect/token" \
-X POST \
-d 'grant_type=client_credentials&client_id=CLIENT_ID&client_secret=CLIENT_SECRET' \
-H "Content-Type: application/x-www-form-urlencoded" \
-H "Accept: application/json"

# Then read the text box fields, passing the token from the previous call
curl -X GET "https://api.aspose.cloud/v3.0/pdf/FormDataTextBox.pdf/fields/textbox" \
-H "accept: application/json" \
-H "authorization: Bearer <JWT Token>"

Request URL

https://api.aspose.cloud/v3.0/pdf/FormDataTextBox.pdf/fields/textbox
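The two cURL calls above can be chained in a small script so that the token does not have to be copied by hand. The sketch below is one possible way to do it. The function names are hypothetical, and it assumes the token response is JSON with an `access_token` string, as is usual for OAuth 2.0 client-credentials responses.

```shell
#!/bin/sh
# extract_token: read the JSON token response on stdin and print the token.
# Assumption: the response contains an "access_token" string value
# without escaped quotes.
extract_token() {
  sed -n 's/.*"access_token"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# get_token CLIENT_ID CLIENT_SECRET: request a token and print it.
get_token() {
  curl -s "https://api.aspose.cloud/connect/token" \
    -X POST \
    -d "grant_type=client_credentials&client_id=$1&client_secret=$2" \
    -H "Content-Type: application/x-www-form-urlencoded" \
    -H "Accept: application/json" | extract_token
}

# read_textbox_fields TOKEN FILE_NAME: list the text box fields of a document.
read_textbox_fields() {
  curl -s -X GET "https://api.aspose.cloud/v3.0/pdf/$2/fields/textbox" \
    -H "accept: application/json" \
    -H "authorization: Bearer $1"
}

# Usage:
#   token=$(get_token "$CLIENT_ID" "$CLIENT_SECRET")
#   read_textbox_fields "$token" "FormDataTextBox.pdf"
```

Using `sed` keeps the sketch free of extra dependencies; a JSON-aware tool would be more robust if one is available.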

The sample file used in the above examples, FormDataTextBox.pdf, is available for download.

To read text fields from a specific page of the document, use the GetPageTextBoxFields method, which takes the input file name and pageNumber as arguments.

To get the details of one particular text field, use the GetTextBoxField method, which takes the input file name and fieldName as arguments.
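As a rough illustration of the page-level and single-field reads, the sketch below builds the request URLs and sends them with cURL. The route shapes are assumptions modeled on the document-level URL shown earlier, so please verify them against the API reference. The function names and the `Email` field name are hypothetical.

```shell
#!/bin/sh
BASE_URL="https://api.aspose.cloud/v3.0/pdf"

# page_textbox_url FILE_NAME PAGE_NUMBER
# Assumption: the page-level route adds a page/{pageNumber} segment.
page_textbox_url() {
  printf '%s/%s/page/%s/fields/textbox\n' "$BASE_URL" "$1" "$2"
}

# field_textbox_url FILE_NAME FIELD_NAME
# Assumption: the single-field route appends the field name.
# Field names with spaces or special characters must be URL-encoded first.
field_textbox_url() {
  printf '%s/%s/fields/textbox/%s\n' "$BASE_URL" "$1" "$2"
}

# get_json TOKEN URL: send an authorized GET request and print the response.
get_json() {
  curl -s -X GET "$2" \
    -H "accept: application/json" \
    -H "authorization: Bearer $1"
}

# Usage (token obtained as in the earlier cURL command):
#   get_json "$TOKEN" "$(page_textbox_url FormDataTextBox.pdf 1)"
#   get_json "$TOKEN" "$(field_textbox_url FormDataTextBox.pdf Email)"
```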

Create or Replace PDF Form fields

The API can also add new fields to a document or replace existing ones. The PostTextBoxFields method adds new textBox fields to the PDF document. To use it, provide the input file name and a field array defining the properties of the fields to be added.

To replace an existing textBox field, use the PutTextBoxField method, which takes the input document name, the fieldName to be replaced, and a field object defining the properties of the new field.
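The add and replace operations could be sketched with cURL as follows. This is a sketch under stated assumptions, not the definitive request format: the routes are modeled on the read URL shown earlier, and the JSON property names (`PartialName`, `Value`, `PageIndex`, `Rect` with `LLX`/`LLY`/`URX`/`URY`) as well as the rectangle coordinates are illustrative and should be checked against the API reference. All function names are hypothetical.

```shell
#!/bin/sh
BASE_URL="https://api.aspose.cloud/v3.0/pdf"

# textbox_json NAME VALUE PAGE: print one field object.
# Assumption: the property names and the fixed rectangle are illustrative.
# NAME and VALUE must not contain double quotes or backslashes.
textbox_json() {
  printf '{"PartialName":"%s","Value":"%s","PageIndex":%s,"Rect":{"LLX":100,"LLY":100,"URX":300,"URY":120}}' "$1" "$2" "$3"
}

# add_textbox_fields TOKEN FILE_NAME FIELD_JSON
# Sends the field wrapped in a JSON array (PostTextBoxFields takes an array).
add_textbox_fields() {
  curl -s -X POST "$BASE_URL/$2/fields/textbox" \
    -H "accept: application/json" \
    -H "Content-Type: application/json" \
    -H "authorization: Bearer $1" \
    -d "[$3]"
}

# replace_textbox_field TOKEN FILE_NAME FIELD_NAME FIELD_JSON
# Sends a single field object that replaces the named field.
replace_textbox_field() {
  curl -s -X PUT "$BASE_URL/$2/fields/textbox/$3" \
    -H "accept: application/json" \
    -H "Content-Type: application/json" \
    -H "authorization: Bearer $1" \
    -d "$4"
}

# Usage (hypothetical field name and value):
#   add_textbox_fields "$TOKEN" FormDataTextBox.pdf "$(textbox_json Email a@b.c 1)"
#   replace_textbox_field "$TOKEN" FormDataTextBox.pdf Email "$(textbox_json Email x@y.z 1)"
```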

Please try our API, and if you encounter any issue, feel free to post your queries in the Aspose.PDF Cloud product support forum.
