Meet ImgHog - a powerful image processing and text extraction engine dedicated to customized text reading from images. ImgHog is the heart of all CustomOCR solutions.

Unlike OCR of ordinary scanned documents, reading text from complex image types requires a more sophisticated and diverse approach.

Over the past years we have been developing ImgHog, a powerful collection of algorithms that makes it possible to use regular OCR systems for reading text from complex images.

ImgHog Wraps Every Call To The OCR Engine With Its Own Routines.

ImgHog parts

  • Before OCR (Preprocessing)
    To convert input images into the form that OCR systems handle best.
  • After OCR (Postprocessing)
    To process and improve OCR results.
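
The wrapping idea above can be illustrated with a minimal sketch. This is not ImgHog's code: the function `wrap_ocr`, the toy `invert`, `fake_ocr` and `strip_whitespace` steps, and the representation of an image as a 2D list of grayscale values are all assumptions made for illustration.

```python
from typing import Callable, List

# ASSUMPTION: an "image" is a 2D list of grayscale values (0..255).
Image = List[List[int]]


def wrap_ocr(
    ocr_engine: Callable[[Image], str],
    preprocessors: List[Callable[[Image], Image]],
    postprocessors: List[Callable[[str], str]],
) -> Callable[[Image], str]:
    """Return a reader that runs preprocessing, then OCR, then postprocessing."""

    def read_text(image: Image) -> str:
        for step in preprocessors:       # before OCR
            image = step(image)
        text = ocr_engine(image)         # the wrapped OCR call
        for step in postprocessors:      # after OCR
            text = step(text)
        return text

    return read_text


# Toy stand-ins (hypothetical) so the sketch runs without a real OCR engine.
def invert(image: Image) -> Image:
    return [[255 - p for p in row] for row in image]


def fake_ocr(image: Image) -> str:
    return " he1lo " if image[0][0] == 255 else ""


def strip_whitespace(text: str) -> str:
    return text.strip()


reader = wrap_ocr(fake_ocr, [invert], [strip_whitespace])
result = reader([[0, 0], [0, 0]])
```

Keeping the steps as plain callables means a different pipeline can be assembled for each image type without touching the OCR engine itself.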

ImgHog Is A Mixture Of Image Processing, Computer Vision, Linguistics And Other Techniques.

Image Processing

  • Scaling, rotation, cropping
  • Noise removal
  • Contrast enhancement
  • Color correction
  • Binarization
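
As an example of the last item, binarization is commonly done with Otsu's method, which picks the threshold that best separates dark and light pixels. The sketch below is a pure-Python illustration under that assumption; it is not necessarily the algorithm ImgHog uses, and the function names are hypothetical.

```python
from typing import List


def otsu_threshold(pixels: List[int]) -> int:
    """Return the threshold t (0..255) that maximizes between-class variance.

    Pixels <= t form one class, pixels > t the other.
    """
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels in the lower class
    sum0 = 0    # sum of intensities in the lower class
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue            # lower class still empty
        w1 = total - w0
        if w1 == 0:
            break               # upper class empty from here on
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        between_var = w0 * w1 * (mean0 - mean1) ** 2
        if between_var > best_var:
            best_t, best_var = t, between_var
    return best_t


def binarize(image: List[List[int]]) -> List[List[int]]:
    """Map a grayscale image (2D list, values 0..255) to pure black and white."""
    t = otsu_threshold([p for row in image for p in row])
    return [[255 if p > t else 0 for p in row] for row in image]


binary = binarize([[10, 12, 200], [11, 210, 205]])
```

A global threshold like this works for evenly lit images; unevenly lit ones usually need a locally adaptive variant.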

Image Understanding

  • Background removal
  • Perspective correction
  • 3D reconstruction
  • Text detection
  • Object and text segmentation


  • Linguistics and dictionaries
  • Heuristics
  • Application-specific knowledge
  • Structured data extraction
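
A simple illustration of dictionary-based postprocessing: OCR output often confuses look-alike characters (such as `0` and `o`), and a vocabulary lookup can repair many of these. The sketch uses Python's standard `difflib`; the function `correct_words`, the sample vocabulary and the cutoff value are assumptions for illustration, not ImgHog's implementation.

```python
import difflib
from typing import Iterable, List


def correct_words(
    words: Iterable[str], vocabulary: List[str], cutoff: float = 0.75
) -> List[str]:
    """Replace each word with its closest vocabulary entry, if one is close enough."""
    corrected = []
    for word in words:
        if word in vocabulary:
            corrected.append(word)
            continue
        matches = difflib.get_close_matches(word, vocabulary, n=1, cutoff=cutoff)
        # Keep the original word when nothing in the vocabulary is similar.
        corrected.append(matches[0] if matches else word)
    return corrected


vocabulary = ["invoice", "total", "amount"]  # hypothetical application vocabulary
fixed = correct_words(["inv0ice", "tota1", "xyz"], vocabulary)
```

In practice the vocabulary would come from application-specific knowledge, for example the field names expected on a particular document type.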

With ImgHog We Can Deliver The Best Text Reading Solutions.

High Accuracy

  • The large collection of algorithms is an excellent technical means for building solutions for virtually any image type.
  • Every single routine of ImgHog is carefully implemented in the most efficient way. Fine-tuned and debugged in multiple projects, ImgHog’s code boasts great accuracy and speed.
  • Our broad experience with various text recognition problems allows us to design each solution in an optimal way.
  • Our unique CATT approach to Tesseract font training lets us get much better OCR results than naïve/amateur approaches.
  • Using our RATT process for testing and quality control, we steadily bring overall solution accuracy up to the highest possible levels.

Easy Integration

  • Our solutions are packaged as Linux/Windows command line executables.
  • No other software is needed: all necessary preprocessing, OCR and postprocessing are included in the solution.
  • The workflow is based on a simple concept: image file in – text file out.


  • Most images require no more than a few seconds to process.
  • Built-in performance monitoring makes it possible to find and eliminate bottlenecks.
  • The solution executable has very lightweight startup code.
  • Multiple instances can be run in parallel, allowing unlimited scalability and easy implementation of load balancing.
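
The "image file in, text file out" workflow and the parallel instances described above could be driven from a script along these lines. This is a sketch under stated assumptions: the helper names `run_one` and `run_batch` are hypothetical, and the argument order (input image, then output text file) is assumed, since the actual command line of a delivered solution is not documented here.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import List, Sequence


def run_one(command: Sequence[str], image: Path, out_dir: Path) -> Path:
    """Run one instance of the solution: image file in, text file out."""
    out_file = out_dir / (image.stem + ".txt")
    # ASSUMPTION: the executable is called as <command> <input image> <output text>.
    subprocess.run([*command, str(image), str(out_file)], check=True)
    return out_file


def run_batch(
    command: Sequence[str], images: Sequence[Path], out_dir: Path, workers: int = 4
) -> List[Path]:
    """Process several images with up to `workers` instances running at once."""
    out_dir.mkdir(parents=True, exist_ok=True)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Threads only wait on the child processes, so they add little overhead.
        return list(pool.map(lambda img: run_one(command, img, out_dir), images))


# Usage with a hypothetical executable name:
# run_batch(["./customocr"], [Path("a.png"), Path("b.png")], Path("out"))
```

Because each instance is an independent process, the same pattern extends to several machines behind a job queue or load balancer.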