Business Problem

Documents are pervasive in business today. Almost everything we do or work on is documented in some form. However, these documents are written for humans to read, not for machines to understand. Hence, whenever specific information is needed from them, it has to be searched for manually and then transferred by hand to a database or spreadsheet. This has to be repeated from scratch for every document, even when documents follow the same format. This makes the process cumbersome, expensive and time-consuming, and it greatly increases the risk of human error during data transfer.

Our Solution

The workflow from any document to structured data can be automated and run at the push of a button with our Extractor system. The Extractor combines state-of-the-art Computer Vision and Natural Language Processing algorithms with hand-crafted rules informed by extensive research on document formats across industries.

Fig 1: Extractor Process

The system can process both digital and physical documents. Digital documents can be fed directly into the system, while physical documents can be captured with scanners or phone cameras and then sent for processing. The Extractor then analyses the textual content and layout features to convert the document into a structured data format, which can be exported in any desired format.
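To make the idea of going from document text to exportable structured data concrete, here is a minimal sketch in Python. It is not the Extractor's implementation: it assumes the text has already been read from the document, and the field names, the regular expressions in `FIELD_RULES`, and the sample check text are all hypothetical stand-ins for the much richer vision, language and layout analysis described above.

```python
import csv
import io
import json
import re

# Hypothetical hand-crafted rules: one regular expression per field.
# A real system would also use layout features and learned models.
FIELD_RULES = {
    "payee": re.compile(r"Pay to the order of:\s*(.+)"),
    "amount": re.compile(r"Amount:\s*\$?([\d,]+\.\d{2})"),
    "date": re.compile(r"Date:\s*(\d{4}-\d{2}-\d{2})"),
}


def extract_fields(text):
    """Apply each rule to the text and collect the first match per field."""
    record = {}
    for name, pattern in FIELD_RULES.items():
        match = pattern.search(text)
        record[name] = match.group(1).strip() if match else None
    return record


def to_json(record):
    """Export a single record as a JSON string."""
    return json.dumps(record, indent=2)


def to_csv(records):
    """Export a list of records as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(FIELD_RULES))
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


# Made-up sample text standing in for the recognised content of a check.
sample_text = """Date: 2023-04-01
Pay to the order of: Jane Doe
Amount: $1,250.00
"""

record = extract_fields(sample_text)
print(to_json(record))
print(to_csv([record]))
```

Keeping the extracted record as a plain dictionary is what makes the export step cheap here: the same record can be written as JSON, CSV, or any other target format without touching the extraction rules.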

The system can be integrated into your workflow using Robotic Process Automation (RPA) to fetch documents in real time from anywhere in your digital ecosystem (mail, document folders, CRM/ERP tools, etc.). Automatic actions can also be set up to trigger when custom criteria are met.
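As an illustration of criteria-triggered actions, the sketch below pairs each criterion with an action and runs every action whose criterion holds for an extracted record. The `Trigger` class, the `run_triggers` helper, the thresholds and the record fields are all assumptions made for this example; they do not describe the Extractor's actual configuration or any specific RPA tool.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Record = Dict[str, object]


@dataclass
class Trigger:
    """Hypothetical pairing of a custom criterion with an automatic action."""
    name: str
    criterion: Callable[[Record], bool]
    action: Callable[[Record], str]


def run_triggers(record: Record, triggers: List[Trigger]) -> List[str]:
    """Run the action of every trigger whose criterion the record meets."""
    return [t.action(record) for t in triggers if t.criterion(record)]


# Example rules; the threshold and messages are made up for illustration.
triggers = [
    Trigger(
        name="large_amount",
        criterion=lambda r: float(r["amount"]) > 10000,
        action=lambda r: f"flag for review: {r['payee']}",
    ),
    Trigger(
        name="missing_date",
        criterion=lambda r: not r.get("date"),
        action=lambda r: "request rescan",
    ),
]

results = run_triggers(
    {"payee": "Jane Doe", "amount": 12500.0, "date": None}, triggers
)
print(results)
```

In practice the actions would call out to other systems (sending a notification, updating a CRM entry) rather than return strings; returning strings keeps the sketch self-contained.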

Sample Results

Fig 2: Extractor Results on Sample Check

Fig 3: Extractor Results on Sample Bank Statement