Information Extraction

We applied several techniques to large PDF documents to extract custom entities specific to our business needs. The goal is to automate the extraction of business-related information from these documents. To do this, we compared and used a range of state-of-the-art NLP approaches, including BERT, spaCy, and attention-based models.
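To make "custom entity extraction" concrete, here is a minimal sketch of a rule-based baseline. It is not the pipeline described above: it swaps in plain regular expressions for the BERT, spaCy, and attention-based models, and it assumes the PDF text has already been extracted into a string. The entity labels (`INVOICE_ID`, `AMOUNT`, `DATE`), their patterns, and the `extract_entities` helper are all hypothetical, chosen only for illustration.

```python
import re
from dataclasses import dataclass
from typing import List

# Hypothetical entity labels and patterns, for illustration only.
# A real project would define labels that match its own business needs.
PATTERNS = {
    "INVOICE_ID": re.compile(r"\bINV-\d{4,}\b"),
    "AMOUNT": re.compile(r"\b(?:USD|EUR)\s?\d+(?:,\d{3})*(?:\.\d{2})?"),
    "DATE": re.compile(r"\b\d{4}-\d{2}-\d{2}\b"),
}


@dataclass(frozen=True)
class Entity:
    label: str  # entity type, e.g. "DATE"
    text: str   # matched surface string
    start: int  # character offset where the match begins
    end: int    # character offset just past the match


def extract_entities(text: str) -> List[Entity]:
    """Return every pattern match in `text`, sorted by start offset."""
    found = [
        Entity(label, match.group(0), match.start(), match.end())
        for label, pattern in PATTERNS.items()
        for match in pattern.finditer(text)
    ]
    return sorted(found, key=lambda entity: entity.start)


if __name__ == "__main__":
    # Assumed input: text already pulled out of a PDF page.
    sample = "Invoice INV-20231 dated 2023-04-17 totals USD 1,250.00."
    for entity in extract_entities(sample):
        print(entity.label, repr(entity.text), entity.start, entity.end)
```

A baseline like this mainly serves as a point of comparison: a learned model is expected to handle entities that do not follow a fixed surface pattern, which regular expressions cannot capture.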
