An easy-to-use desktop tool to recognise and extract text from images, PDFs and Google Books. It uses the Tesseract OCR engine, combined with modern and efficient preprocessing and analysis pipelines, to produce high-quality output in plain text, hOCR and searchable PDF formats. The tool was built with a focus on OCR of historical printed works, but it also includes modern language options and works well on modern printed works.
Added support for PDFs with embedded rotated images, created a new icon, added the version number to macOS and Windows builds, fixed a possible crash on Windows caused by temporary JPEGs not being closed before removal, fixed an issue where an invalid PDF could be created in some cases, and fixed an issue with the Flatpak build where not all trainings were available.