Question

How can I extract text from images?

asked Fri, May 13, 2022, 6:10:20 · answers: 1

I am not talking about scanned files, but garden-variety images, such as a high-definition photo of a neatly handwritten blackboard taken in class, or a photographed page from a recipe book whose recipe you want in text format.

Any free and open software for that?

I tried tesseract, and the results were awful.

Answers

gliroopy

answered 2 Years ago reangi · Accepted Answer

Extracting text from images is called OCR (optical character recognition), and Ubuntu has a wiki page dedicated to it. From that page:

Available OCR tools

The Ubuntu Universe repositories contain the following OCR tools:

gocr - A command line OCR

fuzzyocr - spamassassin plugin to check image attachments

libhocr0 - Hebrew OCR

ocrad - Optical Character Recognition program

ocrfeeder - Document layout analysis and optical character recognition system

ocropus - document analysis and OCR system

tesseract-ocr

The Ubuntu Multiverse repositories also contain:

cuneiform - multi-language OCR system

Some of these packages are outdated, but fresher unofficial builds can be found in the Alex_P PPA (to add it, use ppa:alex-p/notesalexp). If you have never used a PPA, first check how to add software from a PPA.

Edit:
As noted in a comment, Clara OCR exists too, but its package got stuck at Hardy (Ubuntu 8.04), and its website was last updated in 2009.
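
For anyone who wants to try the packaged tools listed above from a script, here is a minimal Python sketch. It is not from the wiki page. It assumes the executables are named tesseract, gocr, ocrad and cuneiform (the tesseract-ocr package installs a binary called tesseract), and that the English language data is installed for tesseract. The function names are just illustrative.

```python
import shutil
import subprocess

# Executable names assumed to be provided by the packages listed above
# (the tesseract-ocr package installs a binary called "tesseract").
CANDIDATE_BINARIES = ["tesseract", "gocr", "ocrad", "cuneiform"]


def available_ocr_tools(candidates=CANDIDATE_BINARIES):
    """Return the candidate OCR executables that are found on PATH."""
    return [name for name in candidates if shutil.which(name)]


def tesseract_command(image_path, lang="eng"):
    """Build a tesseract invocation that writes recognized text to stdout."""
    return ["tesseract", image_path, "stdout", "-l", lang]


def ocr_with_tesseract(image_path, lang="eng"):
    """Run tesseract on one image and return the recognized text."""
    result = subprocess.run(
        tesseract_command(image_path, lang),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if tesseract fails
    )
    return result.stdout
```

Calling available_ocr_tools() shows which of the tools are installed, and ocr_with_tesseract("recipe.jpg") (a hypothetical file name) returns whatever tesseract recognizes in that image. The quality of the result still depends on the image, so this only makes it easier to compare tools, it does not fix poor recognition.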