Perform OCR Online. Image to Text using Python SDK

Optical Character Recognition (OCR) is a technique for recognizing text in raster images. It is especially useful when you need to preserve old archival literature in a digital format: books that are thousands of years old can be preserved by converting them into digital libraries using OCR. Over the years, this need has become widespread. To meet it, you can either use off-the-shelf software or, when you need to perform bulk operations without human intervention, use a programming API, which is the quickest and easiest route. In the rest of this article, we explain how to perform OCR on images using the Python SDK for the REST API.

OCR Online REST API
- Supported languages
- Supported file formats
OCR Online using Python

OCR Online REST API

Aspose.OCR Cloud SDK for Python performs optical character recognition on raster images (BMP, JPEG, GIF, PNG, TIFF). When performing OCR operations, it enables you to read the characters as well as font information. You may perform OCR on the whole image or on a specific portion by providing X and Y coordinates. After recognition is complete, the response is returned in XML or JSON format, and the extracted text can be saved in TXT, PDF, and HOCR formats. Some high-level features are listed below.

Automatic skew correction
Automatic and manual document layout detection
Advanced automated image pre-processing
Supports multiple international languages
High speed without requiring dedicated hardware resources

Supported languages

In addition to English, the API can recognize text in French, German, Italian, Portuguese, and Spanish.

Supported file formats

The following is the complete list of file formats currently supported by the REST API for OCR operations.

.bmp, .dib, .jpeg, .jpg, .jpe, .jp2, .png, .webp, .pbm, .pgm, .ppm, .pxm, .pnm, .pfm, .sr, .ras, .tiff, .tif, .exr, .hdr, .pic
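
Before sending a file for recognition, it can help to check its extension against the list above. The following is a minimal sketch of such a check; the helper name is_supported_image is hypothetical and not part of the SDK, and the check only looks at the file name, not at the actual image content.

```python
from pathlib import Path

# Extensions copied from the supported-formats list above.
SUPPORTED_EXTENSIONS = frozenset({
    ".bmp", ".dib", ".jpeg", ".jpg", ".jpe", ".jp2", ".png", ".webp",
    ".pbm", ".pgm", ".ppm", ".pxm", ".pnm", ".pfm", ".sr", ".ras",
    ".tiff", ".tif", ".exr", ".hdr", ".pic",
})


def is_supported_image(file_name: str) -> bool:
    """Return True if the file extension appears in the supported list.

    Hypothetical helper: the comparison is case-insensitive and is based
    on the file name only.
    """
    return Path(file_name).suffix.lower() in SUPPORTED_EXTENSIONS


if __name__ == "__main__":
    for name in ("data/5.png", "scan.TIFF", "report.pdf"):
        print(name, is_supported_image(name))
```

A check like this only avoids obviously unsupported uploads; the service remains the final authority on whether a given file can be processed.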

OCR Online using Python

Our APIs are built on REST architecture, so in this section we first explore Image to Text conversion using cURL commands, which offer a flexible way to access REST APIs from the console. One prerequisite is to generate a JWT token. For further details, please visit How to Obtain JWT token using a Client ID and Client Secret.

curl -v "https://api.aspose.cloud/oauth2/token" -X POST -d "grant_type=client_credentials&client_id=xxxxx-xxxx-xxx-xxxx-&client_secret=xxxxxxxxx" -H "Content-Type: application/x-www-form-urlencoded" -H "Accept: application/json"
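
The same token request can be prepared from Python with the standard library alone. The sketch below mirrors the cURL command above (same endpoint, form fields, and headers); the function name build_token_request is hypothetical, and the function only builds the request without sending it.

```python
import urllib.parse
import urllib.request

# Endpoint taken from the cURL command above.
TOKEN_URL = "https://api.aspose.cloud/oauth2/token"


def build_token_request(client_id: str, client_secret: str) -> urllib.request.Request:
    """Build (but do not send) the POST request that asks for a JWT token."""
    body = urllib.parse.urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
    }).encode("ascii")
    return urllib.request.Request(
        TOKEN_URL,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/x-www-form-urlencoded",
            "Accept": "application/json",
        },
    )


if __name__ == "__main__":
    # Placeholder credentials; replace them with your own before sending.
    request = build_token_request("my-client-id", "my-client-secret")
    print(request.get_method(), request.full_url)
```

To actually obtain the token, the request can be passed to urllib.request.urlopen and the JSON body decoded. Under the usual OAuth 2.0 client-credentials convention the token appears in an access_token field, but this is an assumption that should be verified against the real response.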

Once you have the JWT token, use the following command to perform an OCR operation on an image that is located in cloud storage and contains English text.

curl -X GET "https://api.aspose.cloud/v3.0/ocr/downsize.jpeg/recognize?language=1" -H "accept: application/json" -H "authorization: Bearer <JWT Token>"
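
For readers who prefer to issue this call from Python without the SDK, here is a minimal sketch that builds the equivalent GET request. The URL layout, the language=1 value, and the headers are copied from the cURL command above; the function name build_recognize_request is hypothetical, and the request is built but not sent.

```python
import urllib.parse
import urllib.request

# Base path taken from the cURL command above.
BASE_URL = "https://api.aspose.cloud/v3.0/ocr"


def build_recognize_request(image_name: str, jwt_token: str,
                            language: int = 1) -> urllib.request.Request:
    """Build (but do not send) the GET request for an image in cloud storage.

    language=1 is the value the cURL example uses for English text.
    """
    # Percent-encode the file name so spaces and similar characters are safe.
    path = urllib.parse.quote(image_name, safe="")
    query = urllib.parse.urlencode({"language": language})
    return urllib.request.Request(
        f"{BASE_URL}/{path}/recognize?{query}",
        method="GET",
        headers={
            "Accept": "application/json",
            "Authorization": f"Bearer {jwt_token}",
        },
    )


if __name__ == "__main__":
    request = build_recognize_request("downsize.jpeg", "<JWT Token>")
    print(request.full_url)
```

Sending the request with urllib.request.urlopen should return the JSON response described earlier; the exact fields of that response are not shown in this article, so inspect the payload before relying on any particular key.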

Image to Text Conversion on Local Image

In this section, we perform an OCR operation on an image loaded from a local drive.

	# For complete examples and data files, please go to https://github.com/aspose-ocr-cloud/aspose-ocr-cloud-python/
	import json
	import os

	from asposeocrcloud.api.ocr_api import OcrApi
	from asposeocrcloud.configuration import Configuration
	from asposeocrcloud.rest import ApiException


	class RecognizeFromContent(object):

	    def __init__(self):
	        # Set up the OCR API client with credentials read from config.json
	        with open("config.json") as f:
	            server_file_info = json.load(f)

	        config = Configuration(apiKey=server_file_info['AppKey'],
	                               appSid=server_file_info['AppSid'])
	        self.ocr_api = OcrApi(config)

	    def recognize_text(self):
	        # Build the absolute path to the local image
	        file_name = "5.png"
	        src = os.path.join(os.path.abspath("data/"), file_name)
	        try:
	            # Send the local image for recognition
	            res = self.ocr_api.post_recognize_from_content(src)  # type: asposeocrcloud.models.OcrResponse
	            return res.text
	        except ApiException as ex:
	            print("Exception")
	            print("Info: " + str(ex))
	            raise


	obj = RecognizeFromContent()
	print(obj.recognize_text())
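
The example above reads its credentials from a config.json file that is expected to contain the AppKey and AppSid keys, for instance {"AppKey": "xxxxx", "AppSid": "xxxxx"}. As a minimal sketch, the loading step could be factored into a small helper that fails early when a key is missing; the name load_credentials is hypothetical and not part of the SDK.

```python
import json
from pathlib import Path

# Key names read by the example above.
REQUIRED_KEYS = ("AppKey", "AppSid")


def load_credentials(path: str = "config.json") -> dict:
    """Load the credentials from a JSON file and fail early on missing keys."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    # Treat absent and empty values the same way: both are unusable.
    missing = [key for key in REQUIRED_KEYS if not data.get(key)]
    if missing:
        raise KeyError(f"{path} is missing required keys: {', '.join(missing)}")
    return {key: data[key] for key in REQUIRED_KEYS}
```

With this helper, the constructor could call load_credentials() once and pass the two values to Configuration, so that a misconfigured file is reported before any request is made.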

Python OCR on Image from Cloud Storage

In this section, we upload an image to cloud storage and then perform OCR on it using a Python code snippet.

	# For complete examples and data files, please go to https://github.com/aspose-ocr-cloud/aspose-ocr-cloud-python/
	import json
	import os

	import asposeocrcloud.api.storage_api
	from asposeocrcloud.api.ocr_api import OcrApi
	from asposeocrcloud.configuration import Configuration


	class RecognizeFromStorage(object):

	    def __init__(self):
	        # Set up the OCR and Storage API clients with credentials read from config.json
	        with open("config.json") as f:
	            server_file_info = json.load(f)
	        config = Configuration(apiKey=server_file_info['AppKey'],
	                               appSid=server_file_info['AppSid'])
	        self.ocr_api = OcrApi(config)
	        self.storage_api = asposeocrcloud.api.storage_api.StorageApi(config)

	    def recognize_text(self):
	        # Upload the local image to cloud storage, then run OCR on the stored copy
	        self.storage_api.upload_file("5.png", os.path.join("data", "5.png"))
	        res = self.ocr_api.get_recognize_from_storage("5.png")
	        return res.text


	obj = RecognizeFromStorage()
	print(obj.recognize_text())

Image OCR on URL

If you need to perform Optical Character Recognition on an image available at a web URL, the API supports this as well. The post_recognize_from_url method of the API can be used for this purpose.

	# For complete examples and data files, please go to https://github.com/aspose-ocr-cloud/aspose-ocr-cloud-python/
	import json

	from asposeocrcloud.api.ocr_api import OcrApi
	from asposeocrcloud.configuration import Configuration
	from asposeocrcloud.rest import ApiException


	class RecognizeFromURL(object):

	    def __init__(self):
	        # Set up the OCR API client with credentials read from config.json
	        with open("config.json") as f:
	            server_file_info = json.load(f)
	        config = Configuration(apiKey=server_file_info['AppKey'],
	                               appSid=server_file_info['AppSid'])
	        self.ocr_api = OcrApi(config)

	    def recognize_text(self):
	        url = "https://upload.wikimedia.org/wikipedia/commons/2/2f/Book_of_Abraham_FirstPage.png"
	        try:
	            # Ask the service to fetch the image from the URL and recognize its text
	            res = self.ocr_api.post_recognize_from_url(url)  # type: asposeocrcloud.models.OcrResponse
	            return res.text
	        except ApiException as ex:
	            print("Exception")
	            print("Info: " + str(ex))
	            raise


	obj = RecognizeFromURL()
	print(obj.recognize_text())

Conclusion

In this article, we have learned how to perform OCR online using cURL commands as well as Python code snippets. Because our Cloud SDKs are released under the MIT license, you may download the complete source code from the GitHub repository. The repository also comes with free demos; to run them, please follow the steps given below.

Check out the SDK or install it from pip (pip install aspose-ocr-cloud)
Set your Client ID and Client Secret
Run the Python console demo or the unit tests
