Optical Character Recognition (OCR) is a smart way to recognize text in raster images. It becomes even more useful when you need to preserve old archival literature in a digital format: books that are thousands of years old can be preserved by converting them into digital libraries using OCR. Over the years, this need has become ubiquitous. To meet it, you can either use out-of-the-box software or, when you need to perform bulk operations without human intervention, use a programming API, which is the quickest and easiest route. In the rest of this article, we explain how to perform OCR on images using the Python REST API.
- Overview of REST API
- Supported languages
- Supported file formats
- Perform OCR using Python
- Concluding statements
Overview of REST API
Aspose.OCR Cloud SDK for Python is a remarkable tool for optical character recognition over raster images (BMP, JPEG, GIF, PNG, TIFF). When performing OCR operations, it lets you read the characters as well as the font information. You may perform OCR on the whole image or on a specific portion by providing X and Y coordinates. After recognition is complete, the response is returned in XML or JSON format, and the extracted text can be saved in TXT, PDF, and HOCR formats. Some of its high-level features are listed below:
- Automatic skew correction
- Automatic and manual document layout detection
- Advanced automated image pre-processing
- Supports multiple international languages
- High speed with no local hardware resources required
Supported languages
In addition to English, the API can recognize text in French, German, Italian, Portuguese, and Spanish.
Supported file formats
As noted in the overview, the REST API currently accepts raster images in the BMP, JPEG, GIF, PNG, and TIFF formats for OCR operations.
How to Perform OCR using Python
Our API is completely independent of your operating system, database system, and development language. You can use any language and platform that supports HTTP to interact with it. However, this article focuses on Aspose.OCR Cloud SDK for Python. Please note that it’s easy to get started with Aspose.OCR Cloud SDK for Python as there is nothing to install. Create an account at the Aspose Cloud dashboard, get your application information, and you are ready to use the SDK.
The cURL command is a flexible way to access REST APIs from the console. To use Aspose.OCR Cloud through cURL, you first need to generate a JWT token. For further details, please visit How to Obtain JWT token using a Client ID and Client Secret key.
curl -v "https://api.aspose.cloud/oauth2/token" \
  -X POST \
  -d "grant_type=client_credentials&client_id=xxxxx-xxxx-xxx-xxxx-&client_secret=xxxxxxxxx" \
  -H "Content-Type: application/x-www-form-urlencoded" \
  -H "Accept: application/json"
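If you would rather script this step in Python than in the shell, the same token request can be sketched with the standard library alone. This is a minimal sketch and not the SDK's own API: the helper names `build_token_request` and `fetch_token` are hypothetical, and reading the token from an `access_token` field assumes the usual OAuth 2.0 client-credentials response shape.

```python
import json
import urllib.parse
import urllib.request

# Token endpoint taken from the cURL command above.
TOKEN_URL = "https://api.aspose.cloud/oauth2/token"


def build_token_request(client_id, client_secret):
    """Build the POST request that mirrors the cURL command (hypothetical helper)."""
    form = urllib.parse.urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
    }).encode("ascii")
    return urllib.request.Request(
        TOKEN_URL,
        data=form,
        method="POST",
        headers={
            "Content-Type": "application/x-www-form-urlencoded",
            "Accept": "application/json",
        },
    )


def fetch_token(client_id, client_secret):
    """Send the request and return the JWT. Needs network access and valid credentials."""
    request = build_token_request(client_id, client_secret)
    with urllib.request.urlopen(request) as response:
        payload = json.load(response)
    # Assumption: the token is returned under "access_token",
    # as in a standard OAuth 2.0 client-credentials response.
    return payload["access_token"]
```

Keeping the request construction separate from the network call makes it possible to inspect the URL, headers, and form body without contacting the service.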
Once you have the JWT token, use the following command to perform OCR on an image that is stored in cloud storage and contains English text.
curl -X GET "https://api.aspose.cloud/v3.0/ocr/downsize.jpeg/recognize?language=1" -H "accept: application/json" -H "authorization: Bearer <JWT Token>"
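The same recognition call can be sketched in Python with the standard library. This is only an illustration under stated assumptions: `build_recognize_request` is a hypothetical helper rather than part of the SDK, and the URL layout and the `language=1` value are copied from the cURL command above.

```python
import urllib.parse
import urllib.request

# Base path taken from the cURL command above.
API_BASE = "https://api.aspose.cloud/v3.0/ocr"


def build_recognize_request(image_name, jwt_token, language=1):
    """Build the GET request for an image already in cloud storage (hypothetical helper)."""
    # Percent-encode the file name so spaces and other special characters survive in the path.
    path = urllib.parse.quote(image_name, safe="")
    query = urllib.parse.urlencode({"language": language})
    return urllib.request.Request(
        f"{API_BASE}/{path}/recognize?{query}",
        method="GET",
        headers={
            "Accept": "application/json",
            "Authorization": f"Bearer {jwt_token}",
        },
    )
```

To send it, pass the returned object to `urllib.request.urlopen` and decode the JSON body. The response schema is not described in this article, so inspect the raw payload before relying on any particular field.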
Perform OCR on local file
The following code snippet provides steps on how to load an image from local storage
Perform OCR on Image in cloud storage
Apart from loading images from a local drive, the API can also perform OCR on images stored in cloud storage.
OCR operation on Image from URL
If you need to perform Optical Character Recognition on an image available at a web URL, the API supports this as well. Use the post_recognize_from_url method of the API for this purpose.
If you are interested in making changes to the API code, you can find it in the GitHub repository. The repository also comes with free demos. To run them, follow the steps given below: