Deskew library for document image processing

Deskewing is the most commonly required pre-processing step for scanned document images before data extraction. This library provides an easy-to-use wrapper around Galfar’s Deskew Tools for use in UiPath document processing workflows.

One notable limitation of Galfar’s tools is that they do not support multi-page TIF files: only the first page is processed. The library deskews every page by splitting the file into individual pages, deskewing each page separately, and then reassembling them. The original page order is preserved in the resulting file.

Galfar’s tools cannot handle PDF files, so the library adds that capability. In a similar manner, it splits the PDF into one JPEG file per page, deskews each file, and then reassembles them in the correct page order.
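The split, deskew, and reassemble flow described above can be sketched roughly as follows. This is only an illustration in Python, not the library's actual implementation: `deskew_document` and the `deskew_page` callback are hypothetical names, and the pages are treated as raw bytes so the sketch stays independent of any imaging package.

```python
from pathlib import Path
from tempfile import TemporaryDirectory
from typing import Callable, List


def deskew_document(pages: List[bytes], deskew_page: Callable[[Path], None]) -> List[bytes]:
    """Write each page to its own file, deskew it, and reassemble in order.

    `pages` stands in for the already-extracted page images, and
    `deskew_page` is a hypothetical callback that deskews one file in place
    (for example by calling an external tool). Neither is part of the library.
    """
    with TemporaryDirectory() as tmp:
        paths = []
        for index, data in enumerate(pages, start=1):
            # Zero-padded names keep file-name order equal to page order.
            path = Path(tmp) / f"page_{index:04d}.img"
            path.write_bytes(data)
            paths.append(path)

        # Deskew every page individually.
        for path in paths:
            deskew_page(path)

        # Read the files back in the original page order.
        return [path.read_bytes() for path in paths]
```

Keeping the page files in an ordered list (rather than relying on a directory listing) is one simple way to guarantee that the output pages come back in the same order as the input.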

The library requires the following prerequisites on the robot machine:

Supported file formats:

  • Input: BMP, JPG, PNG, JNG, GIF, DDS, TGA, PBM, PGM, PPM, PAM, PFM, TIF, PSD, PDF
  • Output: BMP, JPG, PNG, JNG, GIF, DDS, TGA, PGM, PPM, PAM, PFM, TIF, PSD, PDF
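As a rough illustration, the two lists above could be checked before a file is handed to the library. The sketch below is in Python and is not part of the library; `check_formats` is a hypothetical helper, and it assumes the file extension matches the format name exactly as listed (variants such as `.jpeg` or `.tiff` are not covered).

```python
from pathlib import Path

# Format names copied from the lists above; PBM appears as an input format only.
INPUT_FORMATS = {
    "BMP", "JPG", "PNG", "JNG", "GIF", "DDS", "TGA", "PBM",
    "PGM", "PPM", "PAM", "PFM", "TIF", "PSD", "PDF",
}
OUTPUT_FORMATS = INPUT_FORMATS - {"PBM"}


def check_formats(input_path: str, output_path: str) -> None:
    """Raise ValueError if either extension is missing from the lists above."""
    in_ext = Path(input_path).suffix.lstrip(".").upper()
    out_ext = Path(output_path).suffix.lstrip(".").upper()
    if in_ext not in INPUT_FORMATS:
        raise ValueError(f"Unsupported input format: {in_ext or '(none)'}")
    if out_ext not in OUTPUT_FORMATS:
        raise ValueError(f"Unsupported output format: {out_ext or '(none)'}")
```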

Limitations:

  • Absolute paths must be provided to this library. Relative paths may lead to exceptions.
  • The PDF text layer is removed. The resulting PDF contains images only.
  • DOS windows will open and close on the screen while deskewing is in progress.
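To guard against the absolute-path limitation, a path can be normalised before it is passed on. The following is a minimal Python sketch of the idea, with `to_absolute` as a hypothetical helper name; in a UiPath workflow the equivalent step would be done with whatever path handling the workflow already uses.

```python
import os


def to_absolute(path_text: str) -> str:
    """Return path_text as an absolute path, resolved against the current directory."""
    # abspath leaves an already-absolute path unchanged apart from normalisation.
    return os.path.abspath(os.path.expanduser(path_text))
```

Note that a relative path is resolved against the current working directory of the process, which may not be the folder the caller had in mind, so passing an explicit absolute path remains the safer choice.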

Note. Galfar’s Deskew Tools are licensed under MPL 2.0. The library is published under the same license.

DeskewLib.1.0.1.nupkg (618.6 KB)
