Document Processing

Manual document processing in the back office is time-consuming, sometimes accounting for as much as 95% of the entire sales cycle. Storing paper documentation is another significant cost, especially as real estate prices soar. Financial institutions have been gradually automating and digitizing document processing, with the objective of lowering costs and making the overall process faster and smoother.

However, automated document processing comes with challenges, especially for non-standardized documents such as account statements and proofs of address, income, and employment. The structure of these documents usually varies by issuer, so a “one algorithm fits all” solution is not cost-effective. Moreover, typical automated solutions struggle with hand-written sections and signature recognition, which has been an obstacle to their adoption.

Benefits of Automated Document Processing

Increased Speed and Accuracy

Shorter Sales Cycles

Reduced Storage Cost for Physical Documents

Lower Human Resource Costs

Adastra’s Automated Document Processing Solution

Adastra’s approach automates the processing of both standardized and non-standardized documents using Artificial Intelligence solutions and tools that minimize the margin of error.

To begin with, the solution evaluates the overall image quality and localizes the document in the picture using known templates, edge detection, and other techniques. Next, it localizes areas of interest in the document and processes each area separately, while taking the other areas into account, to remove noise and artefacts.

The tool can be trained to recognize which images are relevant and which are not (for example, when an incorrect document has been submitted). Pre-processing also involves determining the orientation of the document in the image, identifying common rotation angles, and rotating the document back accordingly. Using dilation, erosion, and adaptive thresholding techniques, watermarks and unnecessary noise can be removed from the image to make it clearer and more suitable for processing.
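
To illustrate the adaptive thresholding step mentioned above, here is a minimal, dependency-free sketch. It is not Adastra's implementation: the function name `adaptive_threshold`, the `block_size` and `offset` parameters, and the list-of-rows image format are assumptions chosen for illustration, and a production pipeline would more likely rely on an image-processing library. Each pixel is compared with the mean brightness of its own neighbourhood rather than with one global cut-off, which is what lets the method cope with uneven lighting.

```python
def adaptive_threshold(image, block_size=3, offset=10):
    """Binarize a grayscale image given as a list of rows (values 0-255).

    A pixel becomes 255 (background) when it is brighter than the mean of
    its local neighbourhood minus `offset`; otherwise it becomes 0 (ink).
    """
    height, width = len(image), len(image[0])
    radius = block_size // 2
    result = []
    for y in range(height):
        row = []
        for x in range(width):
            # Clip the neighbourhood window at the image borders.
            ys = range(max(0, y - radius), min(height, y + radius + 1))
            xs = range(max(0, x - radius), min(width, x + radius + 1))
            neighbours = [image[j][i] for j in ys for i in xs]
            local_mean = sum(neighbours) / len(neighbours)
            row.append(255 if image[y][x] > local_mean - offset else 0)
        result.append(row)
    return result


# Hypothetical 3x3 scan: a dark ink pixel on a light background.
scan = [
    [200, 200, 200],
    [200, 50, 200],
    [200, 200, 200],
]
binary = adaptive_threshold(scan)
```

In this sketch the `offset` keeps pixels that are only slightly darker than their surroundings on the background side, while clearly darker pixels are kept as ink.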

For Natural Language Processing, the tools we use can recognize many entities (name, date, location, organization, etc.) using pre-trained models. Additionally, techniques such as Levenshtein distance and pattern recognition can be used to identify entities.
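
As a rough illustration of how Levenshtein distance can help identify entities in imperfectly read text, the sketch below matches a noisy label against a list of expected labels. The label list `KNOWN_LABELS`, the helper `closest_label`, and the `max_distance` cut-off are hypothetical and not taken from the solution itself.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the minimum number of single-character edits turning a into b."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or keep) the character
            ))
        previous = current
    return previous[-1]


# Hypothetical field labels expected on a document.
KNOWN_LABELS = ["account number", "date of birth", "employer", "address"]


def closest_label(ocr_text, labels=KNOWN_LABELS, max_distance=3):
    """Return the known label nearest to the OCR text, or None if too far."""
    text = ocr_text.strip().lower()
    best = min(labels, key=lambda label: levenshtein(text, label))
    return best if levenshtein(text, best) <= max_distance else None
```

For example, `closest_label("Acc0unt numbr")` resolves to `"account number"` because only two edits separate the strings, while text that resembles none of the labels returns `None`.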

The solution then uses OCR to read the text, post-processes the values it reads, and evaluates the correctness of the result and its parts, determining a confidence value for each field.
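
The per-field confidence idea can be sketched as follows. This assumes, purely for illustration, that the OCR engine returns a confidence between 0 and 1 for every word; the `FieldResult` type, the `score_field` helper, the 0.85 threshold, and the choice of taking the weakest word as the field's confidence are all assumptions rather than details of the actual solution.

```python
from dataclasses import dataclass


@dataclass
class FieldResult:
    name: str
    text: str
    confidence: float
    needs_review: bool


def score_field(name, words, threshold=0.85):
    """Combine per-word OCR confidences into one value for a field.

    `words` is a list of (text, confidence) pairs with confidence in [0, 1].
    The field is only as trustworthy as its weakest word, so the minimum is
    used; fields below `threshold` are flagged for manual review.
    """
    if not words:
        return FieldResult(name, "", 0.0, True)
    text = " ".join(word for word, _ in words)
    confidence = min(conf for _, conf in words)
    return FieldResult(name, text, confidence, confidence < threshold)


# Hypothetical OCR output for two fields.
name_field = score_field("name", [("Jane", 0.99), ("Doe", 0.97)])
iban_field = score_field("iban", [("CZ65", 0.62)])
```

Here `name_field` would pass straight through, while `iban_field` would be routed to a reviewer because its confidence falls below the threshold.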

A feedback mechanism trains the OCR engine to prevent mistakes in text recognition, and an entity recognition engine determines what each field represents. The solution then updates the database via a web interface.

The solution also makes it possible to automate document summarization by determining the underlying themes of relevant documents and mapping text to the identified themes. Each paragraph is categorized and scored on its alignment, and the paragraphs with the highest scores are highlighted. It also allows extrapolation by clustering low-scoring text together into potential new categories.
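
One simple way to picture the scoring step is keyword overlap, shown in the sketch below. It is a stand-in for whatever model the solution actually uses: the `THEMES` dictionary, the `min_score` threshold, and the function names are hypothetical. Paragraphs that match no theme well are collected separately, as the pool to which a clustering step could later be applied.

```python
import re

# Hypothetical themes, each described by a few keywords.
THEMES = {
    "fees": {"fee", "charge", "cost"},
    "termination": {"terminate", "cancel", "notice"},
}


def tokenize(text):
    """Lower-case the text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def score_paragraph(paragraph, themes=THEMES):
    """Return (best_theme, score): the share of theme keywords present."""
    words = tokenize(paragraph)
    scores = {theme: len(words & kw) / len(kw) for theme, kw in themes.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


def summarize(paragraphs, themes=THEMES, min_score=0.3):
    """Keep the top-scoring paragraph per theme; pool low-scoring text."""
    highlighted, unassigned = {}, []
    for paragraph in paragraphs:
        theme, score = score_paragraph(paragraph, themes)
        if score < min_score:
            unassigned.append(paragraph)  # candidates for new categories
        elif score > highlighted.get(theme, ("", 0.0))[1]:
            highlighted[theme] = (paragraph, score)
    return highlighted, unassigned


paragraphs = [
    "A monthly fee and a late charge apply.",
    "Either party may cancel with written notice.",
    "The weather was pleasant.",
]
highlighted, unassigned = summarize(paragraphs)
```

With these inputs, the first two paragraphs are highlighted under their themes and the third lands in the unassigned pool.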

Want to learn more about Adastra’s Document Processing Solutions? Schedule a free consultation with our experts.
