How to Convert Image to Text in Python

09/12/2021

In this article, you will learn how to convert an image to text in Python.

Convert Image to Text in Python

You can use the Tesseract engine from Python, through the pytesseract wrapper library, to convert an image to text. Tesseract is an open-source optical character recognition (OCR) engine that was originally developed at Hewlett-Packard and later sponsored by Google. It can recognize text in many languages and scripts.

Here is sample code that converts an image to text using Tesseract in Python:

import pytesseract
from PIL import Image

# Load the image from disk
image = Image.open("example.png")
# Run OCR and get the recognized text as a string
text = pytesseract.image_to_string(image)

print(text)

In this code, you first open the image using the Image.open function from the PIL library. Then, you pass the image to the image_to_string function from the pytesseract library, which returns the recognized text. Finally, you print the text.

Note that you need to install both pytesseract and Pillow (the maintained fork of PIL, which provides the PIL module) before running the above code. You can install them using pip as follows:

pip install pytesseract
pip install pillow

Also, make sure that you have Tesseract OCR installed on your system. You can download it from the official Tesseract OCR GitHub repository: https://github.com/tesseract-ocr/tesseract
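
Because pytesseract only wraps the Tesseract executable, it can help to check that the executable is reachable before running OCR. The following is a minimal sketch that uses only the standard library; the function names find_tesseract and describe_tesseract_setup are hypothetical helpers, and the sketch assumes the executable is named tesseract.

```python
import shutil


def find_tesseract(binary_name="tesseract"):
    """Return the full path of the Tesseract executable, or None if it is not on PATH."""
    return shutil.which(binary_name)


def describe_tesseract_setup(binary_name="tesseract"):
    """Return a short human-readable message about whether Tesseract was found."""
    path = find_tesseract(binary_name)
    if path is None:
        return f"'{binary_name}' was not found on PATH; install Tesseract first."
    return f"Tesseract found at {path}"
```

If the executable is installed but not on PATH, pytesseract lets you point to it by assigning the full path to pytesseract.pytesseract.tesseract_cmd before calling image_to_string.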

Here’s a bit more information that may be useful when working with the Tesseract library in Python.

The image_to_string function has several optional parameters that allow you to customize the OCR process. For example, you can specify the language of the text in the image with the lang parameter:

text = pytesseract.image_to_string(image, lang='eng')

This will specify that the text in the image is in English. You can find a list of supported languages and their codes in the Tesseract documentation.
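
Tesseract also accepts several language codes joined with a plus sign (for example eng+fra) when an image mixes languages. The sketch below shows one way to build that string; build_lang and ocr_multilingual are hypothetical helper names, and each language's trained data is assumed to be installed alongside Tesseract.

```python
def build_lang(codes):
    """Join Tesseract language codes with '+', e.g. ['eng', 'fra'] -> 'eng+fra'."""
    cleaned = [code.strip() for code in codes if code.strip()]
    if not cleaned:
        raise ValueError("at least one language code is required")
    return "+".join(cleaned)


def ocr_multilingual(image, codes):
    """Run OCR on a PIL image using every language code in `codes`."""
    # Imported here so build_lang can be used without pytesseract installed.
    import pytesseract

    return pytesseract.image_to_string(image, lang=build_lang(codes))
```

For instance, ocr_multilingual(image, ["eng", "fra"]) would pass lang='eng+fra' to image_to_string.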

You can also change the configuration of the OCR engine by passing a string of command-line options to the config parameter. For example, you can use the --psm option to specify the page segmentation mode:

text = pytesseract.image_to_string(image, config='--psm 11')

This will set the page segmentation mode to 11, the sparse text mode, which finds as much text as possible in no particular order and is useful when text is scattered across the image. If the image contains a single line of text, use --psm 7 instead. You can find a list of configuration options and their descriptions in the Tesseract documentation.
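
Since the config value is just a string of command-line options, a small helper can assemble and validate it. This is a minimal sketch, assuming Tesseract 4 or later, where page segmentation modes run from 0 to 13 and OCR engine modes (--oem) from 0 to 3; build_config is a hypothetical helper, not part of pytesseract.

```python
def build_config(psm=None, oem=None):
    """Build a Tesseract config string such as '--psm 7 --oem 1'."""
    parts = []
    if psm is not None:
        # Page segmentation modes are numbered 0 through 13.
        if not 0 <= psm <= 13:
            raise ValueError("psm must be between 0 and 13")
        parts.append(f"--psm {psm}")
    if oem is not None:
        # OCR engine modes are numbered 0 through 3.
        if not 0 <= oem <= 3:
            raise ValueError("oem must be between 0 and 3")
        parts.append(f"--oem {oem}")
    return " ".join(parts)
```

The result can be passed straight through, for example pytesseract.image_to_string(image, config=build_config(psm=7)) to treat the image as a single line of text.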

Finally, the accuracy of the OCR process depends on the quality of the image and the clarity of the text in it. If the text is small or the image is noisy, you may need to pre-process the image before passing it to Tesseract. For example, you can use image processing techniques such as thresholding or morphological operations to improve the quality of the image.
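
As an illustration of thresholding, the sketch below converts an image to grayscale and maps every pixel to pure black or pure white using Pillow's convert and point methods. The helper names and the fixed cutoff of 128 are assumptions chosen for the example; a good cutoff depends on the image, and adaptive methods may work better on uneven lighting.

```python
def binarize_pixel(value, threshold=128):
    """Map a grayscale value (0-255) to white (255) if above the threshold, else black (0)."""
    return 255 if value > threshold else 0


def threshold_image(image, threshold=128):
    """Return a black-and-white copy of a PIL image using a fixed global threshold."""
    # "L" is Pillow's 8-bit grayscale mode.
    gray = image.convert("L")
    # point() applies the function to every possible pixel value.
    return gray.point(lambda value: binarize_pixel(value, threshold))
```

The thresholded image can then be passed to OCR in place of the original, for example pytesseract.image_to_string(threshold_image(Image.open("example.png"))).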