KNIME is a coherent and comprehensive open source visual platform for data integration, processing, analysis, reporting and exploration.
Category: Documents
RapidMiner – data science platform
RapidMiner (formerly known as YALE) is a flexible Java environment for knowledge discovery in databases, machine learning, and data mining.
BIRT Project – technology platform used to create data visualizations and reports
The Business Intelligence and Reporting Tools Project is software that provides reporting and business intelligence capabilities.
NormCap – OCR powered screen-capture tool to capture information
NormCap is an OCR powered screen-capture tool to capture information instead of images. Free and open source software.
Frog – intuitive text extraction tool (OCR) for GNOME
Frog is an intuitive text extraction tool for the GNOME desktop. Frog is free and open source software written in Python.
TextShot – Python tool for grabbing text via screenshot
TextShot lets you take a screenshot and copy its text content to the clipboard. It’s free and open source.
TextSnatcher – perform OCR operations
TextSnatcher is a simple front-end that lets you copy text from images. It uses Tesseract OCR 4.x for character recognition.
Tesseract – optical character recognition engine
Tesseract is an optical character recognition engine that runs from the command line and converts an image of text into plain text.
OCRFeeder – document layout analysis and optical character recognition system
OCRFeeder is a free open source software desktop OCR suite for the GNOME desktop environment. It features a GTK+ graphical user interface.
OCRopus – Python-based tools for document analysis and OCR
ocropy (also referred to as OCRopus) is an OCR system written in Python, using NumPy and SciPy, that focuses on large-scale machine learning.
gscan2pdf – GUI to produce PDFs or DjVus from scanned documents
gscan2pdf is a graphical user interface to produce PDFs or DjVus from scanned documents. gscan2pdf is free and open source software.
gImageReader – simple Gtk/Qt front-end to Tesseract
gImageReader is a simple Gtk/Qt front-end to Tesseract, a popular optical character recognition engine. Free and open source.
Lios – OCR software
linux-intelligent-ocr-solution (Lios) is free and open source software for converting print into text using either a scanner or a camera.
hocr-tools – manipulate and evaluate hOCR format
hocr-tools is a set of tools for manipulating and evaluating the hOCR format for representing multi-lingual OCR results.
Ocrad – OCR (Optical Character Recognition) software based on a feature extraction method
Ocrad is an OCR (Optical Character Recognition) program based on a feature extraction method. Free and open source software.