A REST API Solution to Convert PDF to XML with Python


XML is one of the most widely used languages for sharing data between people and computers. It provides portable, well-structured information that makes it easier for applications and devices of all kinds to use, store, transmit, and display data. In your daily routine, you may come across the need to convert different file formats to XML for data sharing or processing. PDF, meanwhile, is one of the most reliable file formats for exchanging and distributing documents. So in this post, I will walk you through how to convert PDF to XML with Python using Aspose.PDF Cloud.

Aspose.PDF Cloud is a complete PDF file processing REST API solution, the choice of many Fortune 100 companies across 114 countries. It enables you to create, convert, split, merge, annotate, sign, stamp, watermark and protect PDF files on any platform without installing any third-party plugin or software. It converts PDF documents to various industry-standard file formats. In this post, however, we will focus on PDF to XML conversion with Aspose.PDF Cloud SDK for Python. The API is not limited to the Python SDK; SDKs for other popular programming languages are available as well.

Let’s get started…

Step 1

First things first: install the Aspose.PDF Cloud SDK for Python package from PyPI.

pip install asposepdfcloud

Step 2

Sign up for free with aspose.cloud to get your AppSID and AppKey.

Step 3

Create a Python module and copy and paste the following code into it. The code uploads the source PDF document to Aspose default storage and converts the PDF to XML.

Step 4

Run the code in your favorite IDE. The output file is saved to Aspose default storage.

We look forward to your feedback. Feel free to drop us a comment sharing your thoughts about the Aspose.PDF Cloud API, or let us know if you have any suggestions or need any particular features from our REST API.

Posted in Aspose.PDF Cloud Product Family

Introducing Aspose.Cells Cloud API Version V3


Aspose is pleased to announce a new version of Aspose.Cells Cloud. Aspose.Cells Cloud 19.9 includes the new API Version V3 and built-in storage APIs, along with many other improvements that make it a more reliable and stable spreadsheet REST API solution. For complete details of the new features and enhancements, please check the release notes of this version. Read on to see what’s new.

V3 API Version

The major change in this release is the introduction of API version V3. We’ve implemented Aspose.Cells Cloud as a microservice with the new API Version V3, which has a more optimized and refined internal architecture. The new base URI is as follows. Please note that the legacy v1.1 API version will remain available with older releases, but all future changes and updates will be made in the latest API version (V3).

https://api.aspose.cloud/v3.0/cells/

For improved security, we have introduced JWT (JSON Web Token) authentication in this release. The OAuth2 and URL signing authentication methods are no longer supported in API Version V3. Let us show you how to get a JWT access token.

Get Authentication Token
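
As a rough illustration of the token step, here is a minimal Python sketch that only builds the request and does not send it. The token URL and the form field names (a standard client-credentials form) are assumptions that are not stated in this post, so please confirm them against the official documentation before relying on them.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumption: the token endpoint is not given in this post.
TOKEN_URL = "https://api.aspose.cloud/connect/token"


def build_token_request(app_sid: str, app_key: str) -> Request:
    """Build (but do not send) a client-credentials request for a JWT."""
    # Assumption: field names follow the usual client-credentials form.
    form = urlencode({
        "grant_type": "client_credentials",
        "client_id": app_sid,
        "client_secret": app_key,
    }).encode("ascii")
    return Request(
        TOKEN_URL,
        data=form,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


def bearer_header(access_token: str) -> dict:
    """Header to attach to later API calls once a token has been issued."""
    return {"Authorization": f"Bearer {access_token}"}
```

Passing the request to `urllib.request.urlopen` would perform the actual call; the returned token then goes into the `Authorization` header of every subsequent request.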

Storage APIs

This version makes it easy to work with cloud storage. You no longer need to use a separate storage API for the purpose. The API includes methods for performing different storage-related operations, for a better user experience and unification:

  • File API – Methods to upload, download, copy, move and delete files (input documents and rendering results) in the cloud storage
  • Folder API – Methods to create, copy, move and delete folders in the cloud storage
  • Storage API – Methods for getting storage information

How it Works

Let me give you a quick overview of how the new API Version V3 works, and of the difference between V3 and V1.1, by converting an XLSX document to PDF. I’m using cURL to consume the REST APIs. However, Aspose.Cells Cloud also provides SDKs for all popular programming languages via GitHub and external package managers, so you can implement Aspose.Cells Cloud directly on your favorite platform with ease.

Here we go. We will follow these steps:

  • Get Access token for authentication
  • Upload source XLSX document to Storage
  • Convert XLSX to PDF
  • Download PDF document to local drive

Aspose.Cells Cloud V1.1

Aspose.Cells Cloud V3

We Want to Hear from You

Feel free to drop us a comment sharing your thoughts about the new version of Aspose.Cells Cloud API, or let us know if you have any suggestions or need any particular features from our REST API. And if you’ve not already had a chance to try our REST API, simply start a free trial today. All you need to do is sign up with aspose.cloud. Once you’ve signed up, you’re ready to try the powerful file processing features offered by aspose.cloud.

Posted in Aspose.Cells Cloud Product Family

Add Text Box Field in PDF Document with Aspose.PDF Cloud


We’re happy to announce Aspose.PDF Cloud 19.9. With the new release, the API’s support for PDF form fields is improved. You can now add a Text Box field to a PDF document, read Text Box fields from a whole PDF document, from a page or by name, and update a Text Box field in a PDF document with simple HTTP requests. In the subsequent sections, I’ll give you an overview of these exciting features. You can check the release notes for the new version to get a complete list of new features and fixes.

Working with Text Box Field

The Text Box field allows the user to enter variable information in a PDF document: for example, information that is not constant or that cannot be predetermined with radio button choices, such as a name, department, or phone number. You can also create a text area where users can add multiline comments by setting the Multiline property to true.

Aspose.PDF Cloud API supports the following operations with Text Box fields.

Now, let me show you how easily you can add a Text Box field to a PDF document and how to read Text Box fields from PDF documents. I am using Aspose.PDF Cloud SDK for .NET in this post. If you’re using another programming language, you can pick the SDK of your choice from our GitHub repository. It contains SDKs for all popular programming languages, with the complete source code of each SDK along with working examples.

To use Aspose.PDF Cloud SDK for .NET, we just need to install it in our project from the NuGet Package Manager, and off we go.

Add Text Box field

Get Text Box fields from PDF document

Get Text Box fields from PDF page

Get Text Box field by name

To learn more about Aspose.PDF Cloud using a free trial, all you need to do is sign up with aspose.cloud. Once you’ve signed up, you may go through the following useful resources of Aspose.PDF Cloud.

We look forward to your comments below, or you can post a question or suggestion in the support forum. It helps us continually improve and refine our API.

Posted in Aspose.PDF Cloud Product Family

Convert scanned PDFs to searchable PDF using cURL


PDF is the de facto file type for presenting documents, including text formatting and images, in a manner independent of application software, hardware, and operating systems. Aspose.PDF Cloud provides a number of operations that work seamlessly with your existing PDF documents, allowing you to convert to and from PDF formats, extract document information and manipulate your PDF documents on the cloud storage of your choice.

There are two main types of PDF documents: those created electronically using PDF creation software, and those created from a scanner or other photo-imaging equipment. PDF creation software builds a PDF document with an internal structure that denotes characters, fonts and positions, although the raw information makes little sense to the human eye. A scanned PDF is basically just a flat image of a document, so scanning a page of text results in a picture of words on the screen. To take information from this sort of scanned PDF, OCR technology is required so that each character can be optically recognized and then represented as text.

Aspose.PDF Cloud provides a powerful inbuilt OCR engine that allows you to recognize and extract text tokens from PDF Documents. Using Aspose.PDF Cloud you can embed OCR layers in a PDF Document, allowing you to search and index your scanned PDF Documents.

Aspose.PDF Cloud OCR support

Aspose.PDF Cloud provides the following API for OCR support with PDF documents:

Type | Resource URL | Description
PUT | /pdf/{name}/ocr | Generate OCR layer for images in the input PDF document

The above resource accepts the following arguments:

Parameter Name | Description
name | The PDF document to add an OCR layer to
lang | Language for the OCR engine

The lang parameter supports the following language codes: eng (English), ara (Arabic), bel (Belarusian), ben (Bengali), bul (Bulgarian), ces (Czech), dan (Danish), deu (German), ell (Greek), fin (Finnish), fra (French), heb (Hebrew), hin (Hindi), ind (Indonesian), isl (Icelandic), ita (Italian), jpn (Japanese), kor (Korean), nld (Dutch), nor (Norwegian), pol (Polish), por (Portuguese), ron (Romanian), rus (Russian), spa (Spanish), swe (Swedish), tha (Thai), tur (Turkish), ukr (Ukrainian), vie (Vietnamese), chi_sim (Chinese Simplified), chi_tra (Chinese Traditional), or a combination of them, e.g. eng,rus.
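
To make the resource and its lang parameter concrete, here is a small Python sketch that validates the language codes listed above and assembles the request URL for the PUT call. The base URL is an assumption (it is not stated in this post), and build_ocr_url is a hypothetical helper, not part of any SDK.

```python
from urllib.parse import quote, urlencode

# Language codes taken from the list above.
SUPPORTED_LANGS = {
    "eng", "ara", "bel", "ben", "bul", "ces", "dan", "deu", "ell", "fin",
    "fra", "heb", "hin", "ind", "isl", "ita", "jpn", "kor", "nld", "nor",
    "pol", "por", "ron", "rus", "spa", "swe", "tha", "tur", "ukr", "vie",
    "chi_sim", "chi_tra",
}

# Assumption: base URL of the PDF API; check the documentation.
BASE_URL = "https://api.aspose.cloud/v3.0"


def build_ocr_url(name: str, langs: list, base_url: str = BASE_URL) -> str:
    """Return the URL for PUT /pdf/{name}/ocr with a validated lang value."""
    if not langs:
        raise ValueError("at least one language code is required")
    unknown = [code for code in langs if code not in SUPPORTED_LANGS]
    if unknown:
        raise ValueError(f"unsupported language codes: {unknown}")
    # Several languages are combined with commas, e.g. "eng,rus".
    query = urlencode({"lang": ",".join(langs)})
    return f"{base_url}/pdf/{quote(name, safe='')}/ocr?{query}"
```

Validating the codes on the client side is a design choice for this sketch: it catches a typo such as "en" before a request is ever made.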

Using cURL to add an OCR Layer for embedded images

For testing purposes, we are using a simple PDF with a single image on the first page.

PDF containing text as an image

Read text from OCR Layer

Now that an OCR layer has been added, we can read all text items from the PDF document. You can see that the response contains tokens from our embedded image above. Please note this is a partial response.

Have Any Questions?

Feel free to drop us a comment below sharing your thoughts about Aspose.PDF Cloud REST API, or let us know if you have any suggestions or need any particular features from our REST API.

Try It Out

And if you’ve not already had a chance to try our REST API, simply start a free trial today. All you need to do is sign up with aspose.cloud. Once you’ve signed up, you’re ready to try the powerful file processing features offered by aspose.cloud.

Posted in Aspose.PDF Cloud Product Family

Convert PowerPoint to PDF Document with Aspose.Slides Cloud SDK for .NET


Ever since its launch, Microsoft PowerPoint has remained the first choice of users for presentations. There are a number of use cases in which you want to share or distribute your presentation as a PDF. Aspose.Slides Cloud is a Microsoft PowerPoint file format solution that enables you to convert your presentation to PDF on any platform without depending on Microsoft Office or Adobe Acrobat/DC.

In this post, I will show you how effortlessly you can convert a PPTX file to PDF using Aspose.Slides Cloud SDK for .NET. The API is not limited to .NET; it also provides SDKs for other popular programming languages, i.e. Java, PHP, Ruby, Python, Node.js and Go. Please visit the GitHub repository for a complete list of SDKs. We have not only provided the complete source code of the SDKs and examples in the GitHub repository, but also published the SDKs on the respective package manager websites.

Now, I will show you the sample C# code to convert a Presentation to PDF. We will create a new project and follow these steps:

Install NuGet package

From the command line:

nuget install Aspose.Slides-Cloud

From Package Manager:

PM> Install-Package Aspose.Slides-Cloud

Code

Feel free to drop us a comment sharing your thoughts about Aspose.Slides Cloud REST API, or let us know if you have any suggestions or need any particular features from our REST API.

And if you’ve not already had a chance to try our REST API, simply start a free trial today. All you need to do is sign up with aspose.cloud. Once you’ve signed up, you’re ready to try the powerful file processing features offered by aspose.cloud.

Posted in Aspose.Slides Cloud Product Family

How to Convert Microsoft Word Document to EPUB in Python


Do you need to convert a Microsoft Word document to EPUB, the eBook format? The EPUB file format looks good and provides a pleasant reading experience on eReader devices such as the Kindle, Nook, Sony Reader and tablets. Microsoft Word and PDF documents are also good for sharing and distributing online, but these file formats are not eReader friendly. You can use Aspose.Words Cloud to convert a Microsoft Word document to EPUB with high fidelity.

Aspose.Words Cloud is an easy-to-use and powerful REST API solution that works on any platform. It can convert industry-standard file formats to the EPUB format. The content, formatting, images, hyperlinks, metadata and navigation of the resultant EPUB work in any EPUB-compliant eReader.

Now, I will show you how easily you can convert a Microsoft Word document to EPUB using the Python SDK. If you are using another programming language, you can pick the SDK of your choice from our GitHub repository. It contains the complete source code of the SDK along with working examples.

While converting a document to EPUB, you can control the output with the related save options. Please check the EpubSaveOptions request parameter for more details. We will follow these steps to convert a Microsoft Word document to EPUB:

  • Install Python package
  • Upload source document to Storage
  • Convert document to EPUB

Install Python package

Install aspose-words-cloud with pip from PyPI:

pip install aspose-words-cloud

Code

Got a question or found a bug? Please feel free to drop us a comment below or post a question in the support forum. It helps us continually improve and refine our API.

Posted in Aspose.Words Cloud Product Family

Duplicate Image Detection with Cloud REST API


Reverse image search is a technique that helps you find visually similar images based on a sample image. There are many use cases for a reverse image search engine. The most common are the following:

  • Search for duplicate images and remove the duplicates
  • Search for images with similar content
  • Search for inappropriate content
  • Search for digitally signed images

There are many applications available for image searching based on the reverse image search technique. However, if you’re looking for a REST API solution for reverse image search, then you’ve landed at the right place. Aspose.Imaging Cloud provides a powerful search engine that helps developers add a reverse image search feature to their applications on any platform seamlessly. It compares the source image set, containing at least one image, with several other images. As a result of this comparison, you get a list of the most similar images according to the following conditions:

  • Degree of similarity
  • The minimal threshold of similarity
  • Algorithm of comparison
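
To illustrate how a degree of similarity and a minimal threshold interact, here is a deliberately simplified Python sketch. It is not the Aspose.Imaging Cloud algorithm: it assumes each image has already been reduced to a hypothetical 64-bit fingerprint and scores two images by the share of matching bits, whereas the service works with extracted image features.

```python
from itertools import combinations


def similarity(fp_a: int, fp_b: int, bits: int = 64) -> float:
    """Degree of similarity in [0, 1]: the share of fingerprint bits that match."""
    differing = bin(fp_a ^ fp_b).count("1")
    return 1.0 - differing / bits


def find_duplicates(fingerprints: dict, threshold: float = 0.9) -> list:
    """Return (name_a, name_b, score) for every pair scoring at or above the threshold."""
    pairs = []
    for (name_a, fp_a), (name_b, fp_b) in combinations(sorted(fingerprints.items()), 2):
        score = similarity(fp_a, fp_b)
        if score >= threshold:
            pairs.append((name_a, name_b, score))
    # Most similar pairs first.
    return sorted(pairs, key=lambda pair: pair[2], reverse=True)


if __name__ == "__main__":
    # Made-up fingerprints, for illustration only.
    sample = {
        "a.jpg": 0xFFFFFFFFFFFFFFFF,
        "b.jpg": 0xFFFFFFFFFFFFFFFE,  # differs from a.jpg in one bit
        "c.jpg": 0x0000000000000000,
    }
    print(find_duplicates(sample, threshold=0.9))
```

Raising the threshold returns fewer, closer matches; lowering it returns more candidates that then need manual review.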

Currently, the Aspose.Imaging Cloud search engine supports content-based image search, duplicate image search, image search by custom registered tags, image comparison and similarity detection, and image feature extraction.

Here, we’ll give you a quick walkthrough of finding duplicate images using Aspose.Imaging Cloud REST API. You can check the Aspose.Imaging Cloud documentation for a complete list of features and their details.

How to Find Duplicate Images?

Duplicate image detection is the most common use of the reverse image search engine. Many customers need to sort out their photo libraries by finding similar photos, keeping one or several shots and deleting the rest.

We’ll show you how easily you can find duplicate images using Aspose.Imaging Cloud API. In the example, we’re using the AKAZE algorithm for feature detection and the RandomBinaryTree algorithm for feature matching. We’ll follow these steps to find the duplicate images:

  • Upload source images to storage
  • Create search context
  • Extract image features
  • Find duplicate images

Upload images to storage

Create search context

Extract image features

Find duplicate images

You can check the complete example code in the Aspose.Imaging Cloud SDK for .NET GitHub repository.

To learn more about Aspose.Imaging Cloud using a free trial, all you need to do is sign up with aspose.cloud. Once you’ve signed up, you may go through the following useful resources of Aspose.Imaging Cloud.

Got a question or found a bug? Please feel free to drop us a comment below or post a question in the support forum. It helps us continually improve and refine our API.

Posted in Aspose.Imaging Cloud Product Family

A Quick REST API Solution to Certify a PDF Document


Hello guys! We’re back with another important release of Aspose.PDF Cloud REST API, 19.8. In this blog post, I’ll get you familiar with the brand new features introduced in this release.

This release includes a number of useful features that expand what you can do with the API. Two of the most important are support for certifying a PDF document and conversion of Markdown files to PDF documents. In the subsequent sections, I will give you an overview of these exciting features. You can check the release notes for the new version to get a complete list of new features and fixes.

How to Certify PDF Document?

You may be wondering: what is the difference between signing and certifying a PDF document? And why should you certify a document?

Signing a PDF document or form means that you approve the document contents and the requested changes. Certifying a PDF document or form, however, ensures its integrity and authenticity. You allow recipients to make only specific changes to the published PDF document or form. For example, in the case of PDF forms, you can specify that the recipient can fill in the form fields without invalidating the document. However, if a user tries to add or remove a form field or a page, the certification will be invalidated.

With this release, we took working with PDF documents and forms to the next level by introducing the Certify API. A new parameter, docMdpAccessPermissionType, is used to grant access permissions for the PDF. Currently, the following access types can be granted:

  • NoChanges
  • FillingInForms
  • AnnotationModification

Let me show you how it works. Suppose you want to share a company policy document and do not want recipients to modify it. You can certify the document and set the docMdpAccessPermissionType parameter to NoChanges, so that users cannot modify it without invalidating the certification.

Here we go, check the sample code:

Have you noticed? I have shared a C# code snippet. However, if you are working in another programming language, there is no need to worry. We have SDKs for all popular programming languages, and you can pick the SDK of your choice.
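
The effect of the three access types can be pictured with the following Python sketch. It models only the rules described above and is not SDK code: the numeric values follow the DocMDP permission levels of the PDF specification (1, 2 and 3), the assumption that the API names map onto those levels is mine, and the change labels ("fill_form", "annotate", "add_page") are hypothetical names used for illustration.

```python
from enum import IntEnum


class DocMdpAccessPermissionType(IntEnum):
    """Access types named in this post, numbered like the PDF DocMDP levels (assumed mapping)."""
    NoChanges = 1
    FillingInForms = 2
    AnnotationModification = 3


# Hypothetical change labels; each level is assumed to include the one below it.
_ALLOWED_CHANGES = {
    DocMdpAccessPermissionType.NoChanges: set(),
    DocMdpAccessPermissionType.FillingInForms: {"fill_form"},
    DocMdpAccessPermissionType.AnnotationModification: {"fill_form", "annotate"},
}


def keeps_certification_valid(permission: DocMdpAccessPermissionType, change: str) -> bool:
    """Return True if the given change leaves the certification intact."""
    return change in _ALLOWED_CHANGES[permission]
```

For the policy document scenario, `keeps_certification_valid(DocMdpAccessPermissionType.NoChanges, "fill_form")` evaluates to False: any edit invalidates the certification.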

Convert Markdown to PDF with Python

You probably come across Markdown files in your daily work, and their usage is increasing day by day, because Markdown is a lightweight markup language with a plain-text formatting syntax that is easy to maintain. However, there are scenarios in which you need to convert it to a fixed file format. We have provided a Markdown to PDF conversion feature in Aspose.PDF Cloud REST API to meet this requirement.

We have introduced the following two APIs for the conversion:

GET /pdf/create/markdown

Convert Markdown file to PDF document from storage and return resulting file in response

POST /pdf/{name}/create/markdown

Convert Markdown file to PDF document from storage and save resulting file to storage
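
As a sketch of how the two resources differ, the Python helpers below return the HTTP method and URL for each call without sending anything. The base URL and the srcPath query parameter name are assumptions that this post does not state, and both helper functions are hypothetical, so please check the API reference before using them.

```python
from urllib.parse import quote, urlencode

# Assumption: base URL of the PDF API.
BASE_URL = "https://api.aspose.cloud/v3.0"


def markdown_to_pdf_response(src_path: str) -> tuple:
    """GET /pdf/create/markdown: the resulting PDF comes back in the response."""
    # Assumption: the Markdown file in storage is identified by "srcPath".
    query = urlencode({"srcPath": src_path})
    return "GET", f"{BASE_URL}/pdf/create/markdown?{query}"


def markdown_to_pdf_storage(name: str, src_path: str) -> tuple:
    """POST /pdf/{name}/create/markdown: the resulting PDF is saved to storage as {name}."""
    query = urlencode({"srcPath": src_path})
    return "POST", f"{BASE_URL}/pdf/{quote(name, safe='')}/create/markdown?{query}"
```

The first form suits a one-off download; the second keeps the result in cloud storage for further processing.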

Let me give you a quick demonstration of Markdown to PDF conversion using Aspose.PDF Cloud SDK for Python. You may check available SDKs and use the SDK of your choice to include the PDF document processing feature in your application without worrying about underlying API calls.

Tell Us What You Think

We would love to hear from you. What do you think about Aspose.PDF Cloud REST API? If you have any suggestions or need any particular features from our REST API, please feel free to drop us a comment below or in the support forum.

If you’ve not already tried our REST API, we encourage you to head over to Aspose.PDF Cloud and start a free trial today. All you need to do is sign up with aspose.cloud. Once you’ve signed up, you may go through the following useful resources of Aspose.PDF Cloud.

Posted in Aspose.PDF Cloud Product Family

What’s New in Aspose.Imaging Cloud 19.7


Looking for an update on the latest improvements and enhancements in Aspose.Imaging Cloud 19.7? This post will get you familiar with what has been added in the new release of Aspose.Imaging Cloud.

We continue to improve our API. In this release, we have added some new features, along with fixes for issues reported in previous versions of the API, to help it meet modern image processing requirements. We have introduced OTG (OpenDocument graphics template) file format support, a new SDK for Python, and many fixes that improve PSD to PDF conversion and EMF to PNG conversion, among other notable improvements. Please check the release notes of this version for a complete list of improvements and fixes.

Here is a preview of the new features along with sample code.

How to Manipulate OTG (OpenOffice Graphic Template)

Aspose.Imaging Cloud supports almost all industry-standard image formats. With this release, we extend the list to the OTG file format, an OpenOffice Graphic Template created using the OASIS OpenDocument standard. It contains the default layout and drawing information for a vector graphic, stored in a file package using XML formatting. OTG files can be used as a starting point for creating multiple ODG drawings in OpenDocument-compatible programs, such as Apache OpenOffice (formerly OpenOffice.org).

Now, OTG image cropping, resizing, rotation, flipping and updating tasks can be accomplished with a simple set of HTTP requests in your application. Currently, the API supports converting the OTG file format to BMP, GIF, JPEG, JPEG2000, PSD, TIFF, WEBP, PNG and PDF.

Let us demonstrate how easily you can convert OTG to BMP using Aspose.Imaging Cloud API. We are using cURL commands for the purpose; however, you can also use the Aspose.Imaging Cloud SDK of your choice. Please refer to the complete list of available SDKs to use Aspose.Imaging Cloud API directly on your favorite platform.

Get JWT Authentication Token

Convert OTG Image to BMP

Image Manipulation in Python

If you work with Python, you may be looking for a reliable API for image manipulation in your application. We have introduced Aspose.Imaging Cloud SDK for Python, which allows Python developers to work with Aspose.Imaging Cloud REST APIs quickly and easily, gaining all the benefits of strong types and IDE highlights. It is a wrapper around the REST APIs and supports all the features introduced in Aspose.Imaging Cloud API, so you do not have to worry about the underlying REST API calls. The distribution is available on GitHub, with numerous API test cases provided in the SDK to help you understand how the Cloud API works and how its features are implemented. It supports Python 2.7 and later versions.

Installation

You can get the complete source code of the SDK from the GitHub repository. Aspose.Imaging Cloud SDK for Python is also available as a released package on PyPI (the Python Package Index). You can bypass the source code repository and depend directly on the released package by installing it from PyPI:

pip install aspose-imaging-cloud

Getting Started

Once you are done installing the packages and dependencies in your project, you can easily call the API from your Python code. Here is sample code that demonstrates how Aspose.Imaging Cloud API works with the Python SDK. Please follow the installation procedure and then run the following Python code:

Tell Us What You Think

Got a question or found a bug? Please feel free to drop us a comment below or post a question in the support forum. It helps us continually improve and refine our API.

Still haven’t tried Aspose.Imaging Cloud? The free trial is right here waiting for you to give it a try and explore the power of the Comparison REST API. All you need is to sign up with the aspose.cloud.

Posted in Aspose.Imaging Cloud Product Family

Importing and Exporting PDF Form Data with Aspose.PDF Cloud 19.7


Hello guys! We are excited to release Aspose.PDF Cloud 19.7. With the latest release of Aspose.PDF Cloud, our feature list expands with support for data import and export in PDF Forms. Another important enhancement is the batch processing of pages in PDF to HTML conversion. We have also refactored the Aspose.PDF Cloud code in this release, which makes it a more stable and reliable PDF manipulation API.

I will give you an overview of some of the new features in the following sections. For complete details of the new features and enhancements, please check the release notes of this version.

Working with PDF Form Data

There are business scenarios where you need to collect data from users using interactive forms. A PDF Form is a good, reliable option for the purpose. You gather the data in the form and put it into your database so you can use it later. But you do not need the entire form submitted back, only the data that users entered. Or maybe you need to do it the other way around and populate a PDF Form with data from your database.

In both these cases, data files (XML, FDF and XFDF) are used for importing and exporting data in PDF Forms. This technique is an efficient way to transmit and archive data because the data files are smaller than PDF files. You can send the data files as a response to a URL or an email address using a submit button on the form.

Aspose.PDF Cloud already supports creating PDF Forms. From this release, you can also import and export data files in PDF Forms, which makes Aspose.PDF Cloud a complete solution for working with PDF Forms.

Aspose.PDF Cloud supports three different file types to store the values of form fields. The FDF (Forms Data Format) file contains the values of the form fields as key/value pairs. The XFDF file is an XML-encoded version of FDF; it stores the values of the form fields in a hierarchical manner using XML. And the XML file is usually used by PDF Forms created with Adobe LiveCycle.

Let me give you a quick overview of importing and exporting data in PDF forms using Aspose.PDF Cloud with cURL commands. However, you can use the SDK of your favorite language in your application without worrying about the underlying REST API calls.
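
To give a feel for what such a data file contains, here is a minimal Python sketch that writes and reads a flat XFDF document using only the standard library. It is independent of Aspose.PDF Cloud, it covers only simple name/value fields (real XFDF files can nest fields and carry more elements), and the field names in the usage note are made up.

```python
import xml.etree.ElementTree as ET

XFDF_NS = "http://ns.adobe.com/xfdf/"


def build_xfdf(field_values: dict) -> str:
    """Serialize {field name: value} pairs as a minimal XFDF document."""
    ET.register_namespace("", XFDF_NS)  # emit XFDF as the default namespace
    root = ET.Element(f"{{{XFDF_NS}}}xfdf")
    fields = ET.SubElement(root, f"{{{XFDF_NS}}}fields")
    for name, value in field_values.items():
        field = ET.SubElement(fields, f"{{{XFDF_NS}}}field", {"name": name})
        ET.SubElement(field, f"{{{XFDF_NS}}}value").text = str(value)
    return ET.tostring(root, encoding="unicode")


def read_xfdf(xfdf_text: str) -> dict:
    """Read the top-level field values back out of an XFDF document."""
    ns = {"x": XFDF_NS}
    root = ET.fromstring(xfdf_text)
    return {
        field.get("name"): field.findtext("x:value", default="", namespaces=ns)
        for field in root.findall("x:fields/x:field", ns)
    }
```

For example, `build_xfdf({"name": "Jane", "department": "R&D"})` produces a document that `read_xfdf` turns back into the same dictionary, which is the round trip that form data export and import rely on.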

Export Data from PDF Form

You can export filled-in PDF form data as data-only files using Aspose.PDF Cloud and use these data files in your workflow.

Here are the sample cURL commands to export different types of data files:

Export Data to FDF

Export Data to XFDF

Export Data to XML

Import Data into PDF Form

To view the exported data files of PDF forms, you can import these files back into the corresponding PDF form.

Here are the sample cURL commands to import different types of data files into a PDF Form:

Import FDF Data

Import XFDF Data

Import XML Data

Try Aspose.PDF Cloud

Are you new to Aspose.PDF Cloud and haven’t tried it yet? The free trial is right here waiting for you. Just sign up with aspose.cloud and visit our developer resources for a quick start.

Share Your Feedback

Don’t forget to share your feedback. It helps shape our roadmap, because it is important to us to always deliver a product that satisfies your needs. Please feel free to drop us a comment below or in the support forum.

Posted in Aspose.PDF Cloud Product Family