Wednesday, May 1, 2024
answers: 1 / hits: 47477 / Fri, May 13, 2022, 6:10:20

How can I extract text from images?



I am not talking about scanned files, but garden-variety images, such as a high-definition photo of a blackboard in class with neat handwriting, or a photographed page from a recipe book whose recipe you want in text format.



Is there any free and open-source software for that?



I tried tesseract, and the results were awful.



 Answers
5

Extracting text from images is called OCR (optical character recognition), and Ubuntu has a wiki page dedicated to it. From that page:



Available OCR tools



The Ubuntu Universe repositories contain the following OCR tools:




  1. gocr - A command line OCR

  2. fuzzyocr - spamassassin plugin to check image attachments

  3. libhocr0 - Hebrew OCR

  4. ocrad - Optical Character Recognition program

  5. ocrfeeder - Document layout analysis and optical character recognition system

  6. ocropus - document analysis and OCR system

  7. tesseract-ocr



The Ubuntu Multiverse repositories also contain:




  1. cuneiform - multi-language OCR system
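To see which of the listed tools are actually installed and to try one on a photo, something like the following Python sketch may help. The executable names in `OCR_EXECUTABLES` and the helper names (`available_ocr_tools`, `build_command`, `run_ocr`) are assumptions for illustration, not part of the wiki page; only the three tools that print recognized text to standard output are wired up.

```python
import shutil
import subprocess

# Package name -> executable name. These executable names are assumptions
# based on how the packages are usually installed; verify them locally.
OCR_EXECUTABLES = {
    "gocr": "gocr",
    "ocrad": "ocrad",
    "tesseract-ocr": "tesseract",
    "cuneiform": "cuneiform",
}


def available_ocr_tools():
    """Return {package: path} for every listed executable found on PATH."""
    found = {}
    for package, executable in OCR_EXECUTABLES.items():
        path = shutil.which(executable)
        if path is not None:
            found[package] = path
    return found


def build_command(package, image_path):
    """Build a command line that prints the recognized text to stdout."""
    if package == "tesseract-ocr":
        # "stdout" as the output base makes tesseract print the text.
        return ["tesseract", image_path, "stdout"]
    if package == "gocr":
        # gocr expects PNM-family input (pbm/pgm/ppm) unless converters exist.
        return ["gocr", "-i", image_path]
    if package == "ocrad":
        # ocrad also expects PNM-family input.
        return ["ocrad", image_path]
    raise ValueError(f"no command template for {package!r}")


def run_ocr(package, image_path):
    """Run the chosen tool and return whatever text it printed."""
    result = subprocess.run(
        build_command(package, image_path),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    # Only report what is installed; nothing is run against an image here.
    for package, path in available_ocr_tools().items():
        print(f"{package}: {path}")
```

Calling `run_ocr("tesseract-ocr", "recipe.png")` would then return the recognized text, where `recipe.png` is a hypothetical file name. Running the same photo through more than one installed tool is an easy way to compare results.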



Some packages are outdated, but fresher unofficial builds can be found in Alex_P's PPA (to add it, use ppa:alex-p/notesalexp). If you have never used a PPA, check how to add software from a PPA.
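For reference, adding a PPA usually comes down to three commands: add the repository, refresh the package lists, and install the package. The Python sketch below only builds and prints those commands by default. The helper names (`ppa_install_commands`, `run_commands`) are hypothetical, and installing `tesseract-ocr` from this PPA is an assumption used for illustration.

```python
import re
import subprocess

# Rough shape of a PPA identifier: ppa:<owner>/<archive>.
PPA_PATTERN = re.compile(r"^ppa:[a-z0-9][a-z0-9.+-]*/[a-z0-9][a-z0-9.+-]*$")


def ppa_install_commands(ppa, packages):
    """Return the commands that add a PPA and install packages from it."""
    if not PPA_PATTERN.match(ppa):
        raise ValueError(f"not a PPA identifier: {ppa!r}")
    return [
        ["sudo", "add-apt-repository", ppa],
        ["sudo", "apt-get", "update"],
        ["sudo", "apt-get", "install", *packages],
    ]


def run_commands(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for command in commands:
        print(" ".join(command))
        if not dry_run:
            subprocess.run(command, check=True)


if __name__ == "__main__":
    # Dry run: shows what would be executed without changing the system.
    run_commands(ppa_install_commands("ppa:alex-p/notesalexp", ["tesseract-ocr"]))
```

Leaving `dry_run=True` lets you read the commands before running them yourself in a terminal.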



Edit:
As noted in a comment, Clara OCR exists too, but its package got stuck at Hardy (Ubuntu 8.04) and its website was last updated in 2009.


[#43672] Friday, May 13, 2022
gliroopy
