Optical Character Recognition (OCR) is a smart way to recognize text in raster images. It becomes even more useful when you need to preserve old archival literature in a digital format: books that are thousands of years old can be preserved by converting them into digital libraries using OCR. Over the years, this need has become ubiquitous. To meet it, you can either use off-the-shelf software or, if you need to perform bulk operations without human intervention, use a programming API, which is the quickest and easiest option. In the rest of the article, we explain the steps to perform OCR on images using the Python REST API.
OCR Online REST API
Aspose.OCR Cloud SDK for Python is well suited to optical character recognition over raster images (BMP, JPEG, GIF, PNG, TIFF). When performing OCR operations, it enables you to read the characters as well as font information. You may perform OCR on the whole image or on a specific portion by providing X and Y coordinates. After recognition is completed, the response is returned in XML or JSON format, and the extracted text can be saved in TXT, PDF, and HOCR formats. Some high-level features are listed below:
- Automatic skew correction
- Automatic and manual document layout detection
- Advanced automated image pre-processing
- Supports multiple international languages
- High speed without requiring local hardware resources
Supported languages
Along with English, the API can recognize text in French, German, Italian, Portuguese, and Spanish.
Supported file formats
The REST API currently supports the following file formats for OCR operations:
.bmp, .dib, .jpeg, .jpg, .jpe, .jp2, .png, .webp, .pbm, .pgm, .ppm, .pxm, .pnm, .pfm, .sr, .ras, .tiff, .tif, .exr, .hdr, .pic
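Before sending a file for recognition, it can help to check its extension against the list above. The following is a minimal sketch of such a check; the names `SUPPORTED_EXTENSIONS` and `is_supported_image` are hypothetical helpers, not part of the SDK, and the case-insensitive comparison is an assumption about how you would want to treat file names.

```python
from pathlib import Path

# Extensions copied from the supported-formats list above.
SUPPORTED_EXTENSIONS = {
    ".bmp", ".dib", ".jpeg", ".jpg", ".jpe", ".jp2", ".png", ".webp",
    ".pbm", ".pgm", ".ppm", ".pxm", ".pnm", ".pfm", ".sr", ".ras",
    ".tiff", ".tif", ".exr", ".hdr", ".pic",
}


def is_supported_image(file_name: str) -> bool:
    """Return True if the file name ends with one of the listed extensions.

    Hypothetical helper: it only inspects the extension (case-insensitively)
    and does not look at the file contents.
    """
    return Path(file_name).suffix.lower() in SUPPORTED_EXTENSIONS
```

For example, `is_supported_image("downsize.jpeg")` passes the check, while a name such as `notes.docx` does not.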
OCR Online using Python
Our APIs follow the REST architecture, so in this section we explore image-to-text conversion using cURL commands, which are a flexible way of accessing REST APIs from the console. One prerequisite is to generate a JWT token. For further details, please visit How to Obtain JWT token using a Client ID and Client Secret key.
curl -v "https://api.aspose.cloud/oauth2/token" -X POST -d "grant_type=client_credentials&client_id=xxxxx-xxxx-xxx-xxxx-&client_secret=xxxxxxxxx" -H "Content-Type: application/x-www-form-urlencoded" -H "Accept: application/json"
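The same token request can be sketched in Python using only the standard library. This is a minimal sketch rather than the SDK's own implementation: the function names `build_token_request`, `extract_access_token`, and `fetch_token` are hypothetical, and reading the token from an `access_token` field assumes the usual OAuth 2.0 client-credentials response shape.

```python
import json
import urllib.parse
import urllib.request

# Endpoint taken from the cURL command above.
TOKEN_URL = "https://api.aspose.cloud/oauth2/token"


def build_token_request(client_id: str, client_secret: str) -> urllib.request.Request:
    """Build (but do not send) the POST request used to obtain a JWT token."""
    form = urllib.parse.urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
    }).encode("ascii")
    return urllib.request.Request(
        TOKEN_URL,
        data=form,
        method="POST",
        headers={
            "Content-Type": "application/x-www-form-urlencoded",
            "Accept": "application/json",
        },
    )


def extract_access_token(response_body: bytes) -> str:
    """Assumption: the JSON response carries the token under 'access_token'."""
    return json.loads(response_body)["access_token"]


def fetch_token(client_id: str, client_secret: str) -> str:
    """Send the request and return the token (performs a network call)."""
    request = build_token_request(client_id, client_secret)
    with urllib.request.urlopen(request) as response:
        return extract_access_token(response.read())
```

Building the request separately from sending it makes it possible to inspect the URL, headers, and form body before any network call is made.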
Once you have the JWT token, please try using the following command to perform an OCR operation on an image located on cloud storage, containing English text.
curl -X GET "https://api.aspose.cloud/v3.0/ocr/downsize.jpeg/recognize?language=1" -H "accept: application/json" -H "authorization: Bearer <JWT Token>"
Image to Text Conversion on Local Image
In this section, we are going to perform an OCR operation on an image loaded from a local drive.
Python OCR on Image from Cloud Storage
In this section, we load an image from cloud storage and perform OCR on it using a Python code snippet.
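One way to approach this without the SDK is to mirror the cURL recognition command shown earlier. The sketch below uses only the Python standard library and is not the SDK's own code: `build_recognize_request` and `recognize_from_storage` are hypothetical names, the image is assumed to already exist in cloud storage, and `language=1` is used for English as in the cURL example.

```python
import json
import urllib.parse
import urllib.request

# Base path taken from the cURL recognition command shown earlier.
BASE_URL = "https://api.aspose.cloud/v3.0/ocr"


def build_recognize_request(image_name: str, jwt_token: str,
                            language: int = 1) -> urllib.request.Request:
    """Build (but do not send) the GET request for an image in cloud storage.

    image_name is treated as a single path segment, e.g. "downsize.jpeg".
    """
    path = urllib.parse.quote(image_name, safe="")
    query = urllib.parse.urlencode({"language": language})
    return urllib.request.Request(
        f"{BASE_URL}/{path}/recognize?{query}",
        method="GET",
        headers={
            "Accept": "application/json",
            "Authorization": f"Bearer {jwt_token}",
        },
    )


def recognize_from_storage(image_name: str, jwt_token: str,
                           language: int = 1) -> dict:
    """Send the request and return the decoded JSON response (network call)."""
    request = build_recognize_request(image_name, jwt_token, language)
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

A call such as `recognize_from_storage("downsize.jpeg", token)` would return the parsed JSON response; the exact fields of that response are not assumed here.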
Image OCR on URL
If you need to perform OCR on an image available at a web URL, the API supports this as well. The post_recognize_from_url method of the API can be used for this purpose.
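At the REST level, a request of this kind might be built as follows. This is a hedged sketch only: the `/v3.0/ocr/recognize` path, the `url` query parameter, and the empty POST body are assumptions modeled on the cloud storage endpoint in the cURL section, not details confirmed by this article, so please check them against the API reference. The function name `build_recognize_from_url_request` is hypothetical.

```python
import urllib.parse
import urllib.request

# Assumed endpoint; verify against the API reference before relying on it.
RECOGNIZE_URL = "https://api.aspose.cloud/v3.0/ocr/recognize"


def build_recognize_from_url_request(image_url: str, jwt_token: str,
                                     language: int = 1) -> urllib.request.Request:
    """Build (but do not send) a POST request that points the API at a web image.

    Assumption: the image location is passed in a 'url' query parameter and
    the request body is empty.
    """
    query = urllib.parse.urlencode({"url": image_url, "language": language})
    return urllib.request.Request(
        f"{RECOGNIZE_URL}?{query}",
        data=b"",
        method="POST",
        headers={
            "Accept": "application/json",
            "Authorization": f"Bearer {jwt_token}",
        },
    )
```

Note that `urlencode` percent-encodes the image address, so characters such as `:` and `/` in the web URL do not interfere with the request URL itself.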
Conclusion
In this article, we have learned how to perform OCR online using cURL commands as well as through a Python code snippet. Since our Cloud SDKs are released under the MIT license, you may consider downloading the complete source code from the GitHub repository. The repository also comes with free demos. To run them, please follow the steps given below:
- Check out the SDK or install it from pip (pip install aspose-ocr-cloud)
- Set your Client ID and Client Secret
- Run the Python console demo or the unit tests