Optical Character Recognition (OCR)
Text Recognition with Tesseract
RASPBERRY PI COURSE GUIDE
thingsRoam Academy Contact: +92-308-1222240 academy.thingsroam.com
Email: academy@thingsroam.com
Optical Character Recognition (OCR):
OCR stands for Optical Character Recognition. In other words, an OCR system transforms a two-dimensional image of text, which may contain machine-printed or handwritten characters, from its image representation into machine-readable text. To perform as accurately as possible, OCR generally consists of several sub-processes (a minimal preprocessing sketch follows the list):
Preprocessing of the Image
Text Localization
Character Segmentation
Character Recognition
Post Processing
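As an illustration of the preprocessing step, the short sketch below converts an input image to grayscale and binarizes it with Otsu's threshold using OpenCV, before any recognition takes place. The file name sample.jpg is only a placeholder; replace it with any scanned page.

import cv2

# Placeholder file name; replace with your own scanned image
img = cv2.imread('sample.jpg', cv2.IMREAD_COLOR)

# Convert to grayscale, then binarize with Otsu's threshold
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
_, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

# Save the cleaned-up image for the later localization and recognition stages
cv2.imwrite('sample_preprocessed.jpg', binary)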
What is Tesseract OCR?
Tesseract is an open-source text recognition (OCR) engine, available under the Apache 2.0 license. It can be used directly from the command line or, for programmers, through an API to extract printed text from images. It supports a wide variety of languages.
It can be used with its built-in layout analysis to recognize text within a large document, or it can be used in conjunction with an external text detector to recognize text from an image of a single text line.
Tesseract 4.00 includes a new neural network subsystem configured as a text line
recognizer. To recognize an image containing a single character, we typically use a
Convolutional Neural Network (CNN). Text of arbitrary length is a sequence of characters, and such sequence problems are solved using Recurrent Neural Networks (RNNs); the LSTM is a popular form of RNN.
Legacy Tesseract 3.x depended on a multi-stage process in which we can distinguish the following steps:
Word finding
Line finding
Character classification
To install Tesseract on a laptop, use the following commands in the Anaconda Command Prompt. Make sure you are in the same environment in which OpenCV is installed.
conda install -c conda-forge tesseract
conda install -c conda-forge pytesseract
To install Tesseract on a Raspberry Pi, type the following commands in the Raspberry Pi terminal (CLI). Make sure you are in the same environment in which OpenCV is installed.
sudo apt install tesseract-ocr
sudo apt install libtesseract-dev
sudo pip install pytesseract
To check Tesseract's installation, type the following command in the terminal:
tesseract --version
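As an additional sanity check from Python (a small sketch, assuming pytesseract is already installed), the snippet below asks pytesseract which version of the Tesseract binary it can see; it raises TesseractNotFoundError if the binary is not on the system PATH.

import pytesseract

# Prints the version of the tesseract binary that pytesseract found
print(pytesseract.get_tesseract_version())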
Code for Text Recognition from a Saved Picture:
import pytesseract
from PIL import Image
import cv2

# Read the saved image and convert it to grayscale to reduce detail
img = cv2.imread('para.jpg', cv2.IMREAD_COLOR)
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Smooth noise while preserving edges (text strokes)
gray = cv2.bilateralFilter(gray, 11, 17, 17)

# Run Tesseract on the preprocessed image and print the recognized text
original = pytesseract.image_to_string(gray, config='')
print(original)
Before running the above code, make sure that you have saved a JPEG image named para.jpg in your working folder, since the code reads 'para.jpg' with cv2.imread().
If we want to convert the recognized text into speech, we need a text-to-speech (TTS) converter. For that we can install pyttsx3 with the following steps:
1. Go to the Anaconda prompt and type conda install pip. This will install pip in the current conda environment.
2. After step 1, type pip install pyttsx3.
To check the installation, run the code below in your Jupyter Notebook; you should hear a voice saying 'I will speak this text'.
import pyttsx3

engine = pyttsx3.init()                # initialize the text-to-speech engine
engine.say("I will speak this text")   # queue the phrase to be spoken
engine.runAndWait()                    # block until speaking finishes
Now, by adding a few extra lines of code, we can convert the recognized text into speech, combining the OCR and TTS techniques.
import pytesseract
from PIL import Image
import cv2
import pyttsx3

# Initialize the text-to-speech engine
engine = pyttsx3.init()

# Read the saved image and convert it to grayscale to reduce detail
img = cv2.imread('para.jpg', cv2.IMREAD_COLOR)
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
gray = cv2.bilateralFilter(gray, 11, 17, 17)

# Recognize the text, print it, then speak it aloud
original = pytesseract.image_to_string(gray, config='')
print(original)
engine.say(original)
engine.runAndWait()
You can pass three important flags to Tesseract: -l, --oem, and --psm.
The -l flag controls the language of the input text.
The --oem argument, or OCR Engine Mode, controls the type of algorithm used by Tesseract.
The --psm argument controls the automatic Page Segmentation Mode used by Tesseract.
These flags are passed through the config argument of pytesseract's image_to_string() method (the call used in the second-to-last line of the first code example):
config = "-l eng --oem 1 --psm 7"
original = pytesseract.image_to_string(gray, config=config)
By default, Tesseract expects a page of text when it segments an image. If you are only seeking to OCR a small region, try a different segmentation mode using the --psm argument. There are 14 modes available, which are listed below. By default, Tesseract fully automates the page segmentation but does not perform orientation and script detection.
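For example, the sketch below crops a small region of the grayscale image with NumPy slicing and recognizes it as a single text line using --psm 7. The crop coordinates are placeholders chosen only for illustration; adjust them to the region you actually want to read.

import cv2
import pytesseract

img = cv2.imread('para.jpg', cv2.IMREAD_COLOR)
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Placeholder coordinates: rows 50-100, columns 0-400 of the image
roi = gray[50:100, 0:400]

# --psm 7 treats the cropped region as a single line of text
line_text = pytesseract.image_to_string(roi, config="--psm 7")
print(line_text)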
There is also one more important argument, OCR engine mode (oem). Tesseract 4 has
two OCR engines — Legacy Tesseract engine and LSTM engine. There are four modes
of operation chosen using the --oem option.
OEM Mode:
0 Legacy engine only.
1 Neural nets LSTM engine only.
2 Legacy + LSTM engines.
3 Default, based on what is available.
Page segmentation modes
There are several ways a page of text can be analysed. The Tesseract API provides several page segmentation modes if you want to run OCR on only a small region, in different orientations, etc.
Here is a list of the page segmentation modes supported by Tesseract:
0 Orientation and script detection (OSD) only.
1 Automatic page segmentation with OSD.
2 Automatic page segmentation, but no OSD, or OCR.
3 Fully automatic page segmentation, but no OSD. (Default)
4 Assume a single column of text of variable sizes.
5 Assume a single uniform block of vertically aligned text.
6 Assume a single uniform block of text.
7 Treat the image as a single text line.
8 Treat the image as a single word.
9 Treat the image as a single word in a circle.
10 Treat the image as a single character.
11 Sparse text. Find as much text as possible in no particular order.
12 Sparse text with OSD.
13 Raw line. Treat the image as a single text line, bypassing hacks that are Tesseract-specific.
To change your page segmentation mode, change the --psm argument in your custom
config string to any of the above mentioned mode codes.
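As a small illustrative sketch, the lines below switch the same recognition call to sparse-text mode (--psm 11), which searches for as much text as possible in no particular order; the image name para.jpg is reused from the earlier examples.

import cv2
import pytesseract

gray = cv2.cvtColor(cv2.imread('para.jpg'), cv2.COLOR_BGR2GRAY)

# --psm 11: sparse text, find as much text as possible in no particular order
sparse_text = pytesseract.image_to_string(gray, config="--psm 11")
print(sparse_text)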
Code for Text Recognition with Raspberry Pi Camera:
import cv2
import pytesseract
from picamera.array import PiRGBArray
from picamera import PiCamera

# Set up the Pi camera
camera = PiCamera()
camera.resolution = (640, 480)
camera.framerate = 30
rawCapture = PiRGBArray(camera, size=(640, 480))

# Capture frames continuously from the camera
for frame in camera.capture_continuous(rawCapture, format="bgr", use_video_port=True):
    image = frame.array
    cv2.imshow("Frame", image)
    key = cv2.waitKey(1) & 0xFF

    # Clear the stream for the next frame
    rawCapture.truncate(0)

    # Press 's' to run OCR on the current frame
    if key == ord("s"):
        text = pytesseract.image_to_string(image)
        print(text)
        cv2.imshow("Frame", image)
        cv2.waitKey(0)
        break

cv2.destroyAllWindows()