Last Updated on May 22, 2022
This is a series highlighting best-of-breed utilities. We cover a wide range of utilities including tools that boost your productivity, help you manage your workflow, and lots more besides. There’s a complete list of the tools in this series in the Summary section.
Optical Character Recognition (OCR) is a visual recognition process that turns printed or written text into an electronic character-based file. This makes the document searchable and offers the ability to copy-paste its contents.
PDF is generally considered to be an excellent format for storing and exchanging scanned documents. Unfortunately, PDFs aren’t trivial to modify. OCRmyPDF makes it easy to apply image processing and OCR to existing PDFs. The program adds an OCR text layer to scanned PDF files. It’s a command-line-only affair.
Let’s get an important distinction out of the way. If you create a PDF document from an electronic source, it will already contain a text layer, so no OCR is needed. Native PDF files have an internal structure that can be read and interpreted. These “generated” PDF documents already contain characters that have an electronic character designation. The most popular office suite for Linux is LibreOffice. That suite automatically includes a text layer in documents exported to the PDF format. For this scenario, you don’t need OCRmyPDF.
PDF documents are also created by scanning a paper document into an electronic format. Typically, this is done with a flatbed scanner. The scanner takes a “snapshot” of the paper document. This snapshot is turned into a PDF (or another format such as JPG or TIFF). This is a “scanned” PDF document, which often won’t have a text layer. Want to add that text layer? Step forward OCRmyPDF.
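One way to tell a “generated” PDF from a “scanned” one is to try extracting text from it. The following is a minimal sketch, not a definitive test. It assumes `pdftotext` (part of poppler) is installed, and `scan.pdf` and `scan-ocr.pdf` are hypothetical file names. The helper `needs_ocr` is a name made up for this example.

```shell
# Sketch only: scan.pdf and scan-ocr.pdf are hypothetical file names.
# Assumes pdftotext (part of poppler) is available on the system.

# Succeeds (exit status 0) when no text can be extracted from the PDF,
# which suggests it is a scanned document without a text layer.
needs_ocr() {
    text=$(pdftotext "$1" - 2>/dev/null | tr -d '[:space:]')
    [ -z "$text" ]
}

# Only act if the example file exists and ocrmypdf is installed
if [ -f scan.pdf ] && command -v ocrmypdf >/dev/null 2>&1; then
    if needs_ocr scan.pdf; then
        # Add a text layer and write the result to a new file
        ocrmypdf scan.pdf scan-ocr.pdf
    else
        echo "scan.pdf already contains text; no OCR needed"
    fi
fi
```

The check is deliberately crude: a PDF that mixes scanned and generated pages would count as already having text, so treat the result as a hint rather than a verdict.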
Installation
The installation procedure depends on the Linux distro you’re using. On my Arch-based system, installation is trivial, as there’s a package in the Arch User Repository.
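As a rough illustration, the sketch below prints a plausible install command for the current system without running it. The choices are assumptions: `yay` is just one AUR helper you may or may not have, the Debian/Ubuntu package is assumed to be named `ocrmypdf`, and `pip` is offered as a distro-neutral fallback. The function name `install_cmd` is made up for this example.

```shell
# Sketch only: prints a suggested command, it does not install anything.
# Assumptions: yay is an AUR helper that is already installed (Arch),
# and the package is called ocrmypdf on Debian/Ubuntu and on PyPI.
install_cmd() {
    if command -v yay >/dev/null 2>&1; then
        echo "yay -S ocrmypdf"
    elif command -v apt >/dev/null 2>&1; then
        echo "sudo apt install ocrmypdf"
    else
        # pip installs only the Python package; Tesseract and the other
        # external programs would still need to be installed separately
        echo "pip install --user ocrmypdf"
    fi
}

echo "Suggested install command: $(install_cmd)"
```

Once the package is installed, running `ocrmypdf --version` is a quick way to confirm the program is on your path.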
Installing the package pulls in a number of other programs including tesseract, img2pdf, pngquant, unpaper, and various Python packages.
You’ll also need a language pack.
I’m using the English language pack for Tesseract, but Tesseract supports more than 100 languages. Just install the relevant language pack(s) for your requirements. There’s also support for multilingual documents.
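For multilingual documents, OCRmyPDF’s `-l` option takes Tesseract language codes joined with `+`. The sketch below shows one way to build that value. The helper `lang_arg` and the file names are hypothetical, and the example assumes the English and German packs are installed (on Arch these should be the `tesseract-data-eng` and `tesseract-data-deu` packages).

```shell
# Sketch only: scan.pdf and scan-ocr.pdf are hypothetical file names.

# Join Tesseract language codes with "+", e.g. "eng deu" -> "eng+deu"
lang_arg() {
    # The subshell keeps the IFS change from leaking into the caller
    (IFS='+'; printf '%s\n' "$*")
}

# List the language packs Tesseract can currently see
if command -v tesseract >/dev/null 2>&1; then
    tesseract --list-langs
fi

# OCR a document containing both English and German text
if command -v ocrmypdf >/dev/null 2>&1 && [ -f scan.pdf ]; then
    ocrmypdf -l "$(lang_arg eng deu)" scan.pdf scan-ocr.pdf
fi
```

If a code passed to `-l` is missing from the `tesseract --list-langs` output, install the matching language pack first.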
Complete list of articles in this series:
Excellent Utilities | |
---|---|
AES Crypt | Encrypt files using the Advanced Encryption Standard |
Ananicy | Shell daemon created to manage processes’ IO and CPU priorities |
broot | Next gen tree explorer and customizable launcher |
Cerebro | Fast application launcher |
cheat.sh | Community driven unified cheat sheet |
CopyQ | Advanced clipboard manager |
croc | Securely transfer files and folders from the command-line |
Deskreen | Live streaming your desktop to a web browser |
duf | Disk usage utility with more polished presentation than the classic df |
eza | A turbo-charged alternative to the venerable ls command |
Extension Manager | Browse, install and manage GNOME Shell Extensions |
fd | Wonderful alternative to the venerable find |
fkill | Kill processes quick and easy |
fontpreview | Quickly search and preview fonts |
horcrux | File splitter with encryption and redundancy |
Kooha | Simple screen recorder |
KOReader | Document viewer for a wide variety of file formats |
Imagine | A simple yet effective image optimization tool |
LanguageTool | Style and grammar checker for 30+ languages |
Liquid Prompt | Adaptive prompt for Bash & Zsh |
lnav | Advanced log file viewer for the small-scale; great for troubleshooting |
lsd | Like exa, lsd is a turbo-charged alternative to ls |
Mark Text | Simple and elegant Markdown editor |
McFly | Navigate through your bash shell history |
mdless | Formatted and highlighted view of Markdown files |
navi | Interactive cheatsheet tool |
noti | Monitors a command or process and triggers a notification |
Nushell | Flexible cross-platform shell with a modern feel |
nvitop | GPU process management for NVIDIA graphics cards |
OCRmyPDF | Add OCR text layer to scanned PDFs |
Oh My Zsh | Framework to manage your Zsh configuration |
Paperwork | Designed to simplify the management of your paperwork |
pastel | Generate, analyze, convert and manipulate colors |
PDF Mix Tool | Perform common editing operations on PDF files |
peco | Simple interactive filtering tool that's remarkably useful |
ripgrep | Recursively search directories for a regex pattern |
Rnote | Sketch and take handwritten notes |
scrcpy | Display and control Android devices |
Sticky | Simulates the traditional “sticky note” style stationery on your desktop |
tldr | Simplified and community-driven man pages |
tmux | A terminal multiplexer that offers a massive boost to your workflow |
Tusk | An unofficial Evernote client with bags of potential |
Ulauncher | Sublime application launcher |
Watson | Track the time spent on projects |
Whoogle Search | Self-hosted and privacy-focused metasearch engine |
Zellij | Terminal workspace with batteries included |