Trapped data

In a world where real-time data access, business intelligence, security, and efficiency are critical to success, many companies realize that valuable data is trapped in their documents. These documents may be paper, email, or standard electronic office files, and the data in them must be read, tracked, routed, processed, and reported manually. In fact, more than 80 percent of this information is trapped in unstructured content, which means that less than 20 percent is structured and can be easily searched and retrieved from a relational database.

Document capture technology is nothing new, but the industry has come up with innovative tools and capabilities that allow companies to do more than scan documents. Technology now enables enterprises to categorize, learn from, and extract meaning from their documents. Through automation, companies can organize and use all of their data, both structured and unstructured.

Advanced document capture technologies are not only critical to improving efficiency and reducing operational costs, but also lead to better business processes through classification and data extraction.

Step 1: collect and capture data

There are many ways to capture data: scanners, multifunction peripherals (MFPs), UNC folders (network folders), fax, email, content services or document repositories, mobile devices, or a business process outsourcing (BPO) provider.

Step 2: image processing

Files and images are standardized, cleaned, and rotated to prepare them for classification. The system uses speckle removal and deskewing filters to improve image quality. It then generates easy-to-recognize files from which data can be reliably extracted.
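As a rough illustration of speckle removal, the sketch below clears isolated ink pixels from a small black-and-white image. It assumes the page is already available as a grid of 0s and 1s (1 = ink); the function name despeckle and the min_neighbors parameter are made up for this example and do not come from any particular capture product, which would normally use far more sophisticated filters.

```python
def despeckle(image, min_neighbors=1):
    """Return a copy of a binary image with isolated ink pixels cleared.

    image is a list of rows; each pixel is 1 (ink) or 0 (background).
    An ink pixel is kept only if at least min_neighbors of its eight
    surrounding pixels are also ink. (Hypothetical helper for illustration.)
    """
    height = len(image)
    width = len(image[0]) if height else 0
    cleaned = [row[:] for row in image]  # work on a copy, leave the input intact

    for y in range(height):
        for x in range(width):
            if image[y][x] != 1:
                continue
            neighbors = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    if dy == 0 and dx == 0:
                        continue  # skip the pixel itself
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < height and 0 <= nx < width:
                        neighbors += image[ny][nx]
            if neighbors < min_neighbors:
                cleaned[y][x] = 0  # lone speck: treat it as noise
    return cleaned


page = [
    [0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0],  # a single stray pixel
    [0, 0, 0, 1, 1],  # part of a real stroke
    [0, 0, 0, 1, 1],
]
cleaned_page = despeckle(page)
```

Here the stray pixel in the second row is removed, while the connected block in the lower right is left alone.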

Step 3: categorize

The prepared files are sorted into document types, so that each document can be routed to the appropriate extraction rules in the next step.
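One simple way to picture the categorize step is keyword matching on the recognized text. The sketch below is only an assumption about how such a classifier might look: the categories, the keyword lists, and the function name categorize are all invented for illustration, and real capture systems typically rely on trained models rather than fixed word lists.

```python
import re

# Hypothetical document types and trigger words, for illustration only.
CATEGORY_KEYWORDS = {
    "invoice": {"invoice", "amount", "due", "total"},
    "contract": {"agreement", "party", "term", "signature"},
}


def categorize(text, keywords=CATEGORY_KEYWORDS):
    """Return the category whose keywords appear most often in text.

    Falls back to "unknown" when no keyword matches at all.
    """
    words = set(re.findall(r"[a-z]+", text.lower()))
    scores = {category: len(words & kws) for category, kws in keywords.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "unknown"


doc_type = categorize("Invoice total: 120.00, amount due in 30 days")
```

A document that matches no keywords comes back as "unknown", which would be a natural candidate for manual sorting.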

Step 4: extract

This is the process of identifying metadata in a document. Metadata is data that describes other data and provides additional information about it. For files, metadata can be used to organize and find documents and to supply them to other business systems. The system is configured to extract data through database lookups and fuzzy logic, according to the company's business rules and the information it requires.
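To make the idea of lookup-based extraction more concrete, the sketch below matches a slightly misread value against a list of known values. It uses approximate string matching from Python's standard difflib module as a stand-in for the fuzzy logic mentioned above, and an in-memory list in place of a real database. The vendor names, the lookup_vendor function, and the 0.8 cutoff are assumptions made for this example.

```python
import difflib

# Stand-in for a database table of known vendors (hypothetical values).
KNOWN_VENDORS = ["Acme Corporation", "Globex Industries", "Initech LLC"]


def lookup_vendor(ocr_value, known=KNOWN_VENDORS, cutoff=0.8):
    """Return the known vendor closest to ocr_value, or None if nothing is close.

    cutoff is the minimum similarity (0.0 to 1.0) required for a match.
    """
    matches = difflib.get_close_matches(ocr_value, known, n=1, cutoff=cutoff)
    return matches[0] if matches else None


# "Corporatlon" simulates a recognition error (l read instead of i).
vendor = lookup_vendor("Acme Corporatlon")
```

Values that do not resemble any known entry return None, so they can be handed to the verification step instead of being stored with a wrong match.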

Step 5: verify

If the results for any file fall below the default tolerance level, the file is highlighted for human review. This can happen, for example, when there are smudges, overflows, blurred characters, or fields that the system may have missed. Users are alerted to these documents so that they can validate and correct them manually.
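The tolerance check can be sketched as a simple filter over extracted fields. This example assumes that the recognition engine reports a confidence score between 0 and 1 for each field; the ExtractedField class, the needs_review function, and the 0.85 threshold are hypothetical and chosen only to show the idea.

```python
from dataclasses import dataclass


@dataclass
class ExtractedField:
    name: str
    value: str
    confidence: float  # 0.0 to 1.0, assumed to come from the recognition engine


def needs_review(fields, threshold=0.85):
    """Return the fields whose confidence falls below the tolerance threshold."""
    return [field for field in fields if field.confidence < threshold]


fields = [
    ExtractedField("total", "120.00", 0.97),
    ExtractedField("date", "2O21-01-05", 0.61),  # blurred digit read as the letter O
]
flagged = needs_review(fields)
```

Only the flagged fields need to be shown to a person, so clean documents can pass straight through to export.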

Step 6: export and deliver

Exported documents and data can be stored on local servers or on cloud-based storage, such as Alfresco.