Form Parser, Form Recogniser, PDF Filling, Automate form processing with web REST API


One of the major advantages of the PDF format is its “What You See Is What You Get” (WYSIWYG) approach, which gives it high standards of rendering and viewing. Whatever appears on your screen appears in the same manner on other users’ devices, regardless of operating system, screen resolution, or viewing software (provided the software properly follows Adobe's standard specifications). The same fidelity of content is preserved when PDF files are printed. Once data has been filled into a document, security becomes even more important: the integrity of the data must be protected, and only the right person should have access to it. When electronic documents are used as evidence, they must be an unaltered original version of the electronic document or data message to be admissible in court.

Now consider HTML, which is similar to PDF in terms of cross-platform functionality. HTML forms, however, need to be tested on various platforms and browsers (even individual browser versions) to ensure that they function well on each of them. Viewing is not the only concern: printing must also be tested to confirm that forms print properly on every platform. Validating the claim that a form has been designed correctly for each browser therefore requires a huge amount of testing. As for data security, depending on how an HTML form is saved or exported, security always remains an issue.

PDF AcroForms

AcroForms are PDF files that contain form fields. The technique involves adding form fields as an overlay on top of the image of a form. Adobe later introduced XFA Forms (sometimes called Designer Forms) with PDF 1.5 and Acrobat 6 in 2003. Both XFA Forms and AcroForms are supported in Acrobat 6 and above, while, at the moment, AcroForms are widely supported by many third-party PDF viewer applications. Data can be entered into the fields by end users or by the author of the form. Internally, AcroForms are annotations or fields applied to a PDF document, and they can easily be filled using a Forms Data Format (FDF) file: a formatted ASCII file containing key/value pairs that define the field names and their associated values.
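To make the FDF idea concrete, here is a minimal Python sketch that turns a dictionary of field names and values into FDF text. It assumes the commonly used layout in which each field is a small dictionary with a /T (name) and /V (value) entry. The helper names `escape_pdf_string` and `build_fdf` are made up for this illustration, and the sketch only handles plain ASCII text values.

```python
from typing import Dict


def escape_pdf_string(text: str) -> str:
    """Escape backslashes and parentheses so the text is safe inside ( ... )."""
    return (
        text.replace("\\", "\\\\")
        .replace("(", "\\(")
        .replace(")", "\\)")
    )


def build_fdf(fields: Dict[str, str]) -> str:
    """Return minimal FDF text with one /T (name) and /V (value) pair per field."""
    entries = "\n".join(
        "<< /T ({}) /V ({}) >>".format(
            escape_pdf_string(name), escape_pdf_string(value)
        )
        for name, value in fields.items()
    )
    return (
        "%FDF-1.2\n"
        "1 0 obj\n"
        "<< /FDF << /Fields [\n" + entries + "\n] >> >>\n"
        "endobj\n"
        "trailer\n"
        "<< /Root 1 0 R >>\n"
        "%%EOF\n"
    )


if __name__ == "__main__":
    # Field names here are illustrative, not taken from a real form.
    print(build_fdf({"name": "Ada (Countess)", "city": "London"}))
```

A file produced this way could then be imported into a matching AcroForm by any tool that accepts FDF input. Field names must match the names defined in the PDF exactly.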

Adobe XFA forms

XFA Forms (XML Forms Architecture) represent a significant change in direction for Adobe from the popular FDF and XFDF methodologies. XFA Forms utilize XML throughout. Since XML is the backbone for all types of structured documents, there are distinct advantages to be considered when opting for XFA Forms. XFA Forms should not be confused with XForms, the W3C standard for XML-based forms. Adobe’s XFA Forms is a closed standard that competes with the fully open W3C XForms standard. While both are XML-based, the XForms standard only specifies the data and not the appearance of the form, whereas XFA Forms specify both the form’s appearance and its data.

Aspose.PDF Cloud

Forms are one of the most intuitive ways to get input from end users, but when working with a large set of PDF documents, manual data filling and manipulation can be cumbersome. To let our users process PDF forms programmatically, we have developed the Aspose.PDF Cloud REST API, which lets you create, update, and manipulate PDF forms using cURL commands or one of the SDKs developed for individual programming languages. Apart from PDF forms processing, it can add text or image watermarks, concatenate PDF files, set and update annotations, download PDF attachments, add or retrieve text from a PDF, replace single or multiple text instances, and render PDF files to other supported formats such as EPUB, HTML, LaTeX, MHT, PCL, DOC, DOCX, MOBIXML, PDFA, PPTX, SVG, TIFF, XLS, XLSX, XML, XPS, PS, XSLFO, BMP, EMF, GIF, JPEG, and PNG.

All of the above-mentioned operations can be performed without installing any specific software.

Read Form Fields

With just a few lines of code, you can read the details of PDF form fields. You can read all the fields inside the document, restrict the read to the fields on a particular page number, or access a specific field by providing its name. Furthermore, if you do not want to use any specific programming language, you can perform field manipulation operations with cURL commands from the command prompt. Given below are the types of form fields that can be processed using Aspose.PDF Cloud:

  • Listbox
  • Combobox
  • Checkbox
  • Radiobutton
  • Textbox
  • Signature

The GetDocumentTextBoxFields method reads text fields from a PDF document. It takes the input file name as its parameter and returns the list of fields along with their attributes.

To read the form field details, please try the following cURL commands.

cURL command

# First get the access token.
# Get the Client Id and Client Secret from https://dashboard.aspose.cloud/

curl -v "https://api.aspose.cloud/connect/token" \
-X POST \
-d 'grant_type=client_credentials&client_id=CLIENT_ID&client_secret=CLIENT_SECRET' \
-H "Content-Type: application/x-www-form-urlencoded" \
-H "Accept: application/json"

# Then read the text box fields, replacing <JWT Token> with the token returned above.
curl -X GET "https://api.aspose.cloud/v3.0/pdf/FormDataTextBox.pdf/fields/textbox" \
-H  "accept: application/json" \
-H  "authorization: Bearer <JWT Token>"

Request URL

https://api.aspose.cloud/v3.0/pdf/FormDataTextBox.pdf/fields/textbox
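The same two-step flow (token request, then field read) can be sketched in Python using only the standard library. The endpoints and headers are taken from the cURL commands above. The function names are made up for this illustration, and the assumption that the token arrives under an `access_token` key follows the usual OAuth2 client-credentials convention rather than anything stated here. Treat this as a rough sketch, not the official SDK usage.

```python
import json
import urllib.parse
import urllib.request

TOKEN_URL = "https://api.aspose.cloud/connect/token"
API_BASE = "https://api.aspose.cloud/v3.0/pdf"


def build_token_request(client_id: str, client_secret: str) -> urllib.request.Request:
    """Build the form-encoded POST request that asks for an access token."""
    body = urllib.parse.urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
    }).encode("ascii")
    return urllib.request.Request(
        TOKEN_URL,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/x-www-form-urlencoded",
            "Accept": "application/json",
        },
    )


def build_textbox_fields_request(file_name: str, token: str) -> urllib.request.Request:
    """Build the GET request that lists the text box fields of a stored PDF."""
    url = API_BASE + "/" + urllib.parse.quote(file_name) + "/fields/textbox"
    return urllib.request.Request(
        url,
        method="GET",
        headers={"Accept": "application/json", "Authorization": "Bearer " + token},
    )


def fetch_json(request: urllib.request.Request):
    """Send a request and decode the JSON reply (performs a network call)."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def read_textbox_fields(client_id: str, client_secret: str, file_name: str):
    """Run both steps: obtain a token, then read the text box fields."""
    token_reply = fetch_json(build_token_request(client_id, client_secret))
    # Assumption: the token sits under the conventional OAuth2 "access_token" key.
    token = token_reply["access_token"]
    return fetch_json(build_textbox_fields_request(file_name, token))
```

Calling `read_textbox_fields("CLIENT_ID", "CLIENT_SECRET", "FormDataTextBox.pdf")` with real credentials would perform both requests. Keeping the request builders separate from `fetch_json` makes them easy to inspect without touching the network.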


The sample file used in the above examples, FormDataTextBox.pdf, is available for download.

To read text fields from a particular page of the document, please try the GetPageTextBoxFields method, which takes the input file name and pageNumber as arguments.

If you need the details of one particular text field, please try the GetTextBoxField method, which takes the input file name and fieldName as arguments.
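The three read scopes (whole document, single page, single field) differ only in the request URL, which the following Python sketch tries to illustrate. Only the document-level route is confirmed by the cURL sample earlier in this post. The page-level and field-level route shapes are assumptions made for illustration and should be checked against the API reference before use. The helper name `textbox_url` is also made up.

```python
from typing import Optional
from urllib.parse import quote

API_BASE = "https://api.aspose.cloud/v3.0/pdf"


def textbox_url(
    file_name: str,
    page_number: Optional[int] = None,
    field_name: Optional[str] = None,
) -> str:
    """Return the URL for reading text box fields at the requested scope."""
    if page_number is not None and field_name is not None:
        raise ValueError("pass either page_number or field_name, not both")

    document_url = API_BASE + "/" + quote(file_name, safe="")

    if page_number is not None:
        # Assumed route shape for GetPageTextBoxFields.
        return "{}/page/{}/fields/textbox".format(document_url, int(page_number))
    if field_name is not None:
        # Assumed route shape for GetTextBoxField.
        return "{}/fields/textbox/{}".format(document_url, quote(field_name, safe=""))
    # Route used by the cURL sample for GetDocumentTextBoxFields.
    return document_url + "/fields/textbox"


if __name__ == "__main__":
    print(textbox_url("FormDataTextBox.pdf"))
    print(textbox_url("FormDataTextBox.pdf", page_number=1))
    print(textbox_url("FormDataTextBox.pdf", field_name="Petitioner"))
```

Percent-encoding the file and field names keeps spaces and other special characters from breaking the URL.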

Create or Replace PDF Form fields

The API can also add new fields to a document or replace existing ones. The PostTextBoxFields method adds new textBox fields to a PDF document. To use it, provide the input file name and a field array defining the properties of the fields to be added.

To replace an existing textBox field, please try the PutTextBoxField method, which takes the input document name, the fieldName to be replaced, and a field object defining the properties of the new field.
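As a rough sketch of how the add and replace calls differ, the Python snippet below builds (but does not send) the two requests: a POST carrying an array of fields, and a PUT carrying a single field addressed by name. The route shapes, the helper name `build_field_write_request`, and the property names inside the example field (`PartialName`, `Value`) are assumptions made for illustration. The actual field schema should be taken from the API reference.

```python
import json
import urllib.request
from typing import Optional
from urllib.parse import quote

API_BASE = "https://api.aspose.cloud/v3.0/pdf"


def build_field_write_request(
    file_name: str,
    token: str,
    field: dict,
    replace_field_name: Optional[str] = None,
) -> urllib.request.Request:
    """Build a POST (add fields) or PUT (replace one field) request."""
    document_url = API_BASE + "/" + quote(file_name, safe="")
    if replace_field_name is None:
        # Adding: the source says a field array is expected (assumed route).
        url = document_url + "/fields/textbox"
        method = "POST"
        payload = [field]
    else:
        # Replacing: a single field addressed by its name (assumed route).
        url = document_url + "/fields/textbox/" + quote(replace_field_name, safe="")
        method = "PUT"
        payload = field
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        method=method,
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json",
            "Authorization": "Bearer " + token,
        },
    )


if __name__ == "__main__":
    # Hypothetical field properties, shown only to illustrate the payload shape.
    example_field = {"PartialName": "Email", "Value": "user@example.com"}
    request = build_field_write_request("FormDataTextBox.pdf", "<JWT Token>", example_field)
    print(request.get_method(), request.full_url)
```

Passing the resulting object to `urllib.request.urlopen` would send it. The token is obtained the same way as for the read operations.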

Please try our API, and if you encounter any issue, feel free to post your queries in the Aspose.PDF Cloud product support forum.