There are 2 basic types of PDF invoices that businesses receive from their suppliers.
1. An electronic PDF generated directly out of an accounting software system and received electronically (email attachment, portal download).
2. A scan or fax which originated in an accounting software system but is now an image rather than an electronic document.
How can you tell the difference?
If you can use a mouse to select/highlight the text in a PDF, it is electronic.
An electronic PDF typically won’t have any handwriting on it.
An electronic PDF won’t be skewed, like when a piece of paper goes through a scanner a little bit crooked.
Printing and scanning typically introduce little blemishes onto the page, which an electronic PDF won’t have.
Why would you care about the difference?
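The select-the-text test above can also be approximated in code. The following is a minimal sketch, not a definitive check: it assumes the third-party pypdf package is available for reading the text layer, and the function names and the 20-character threshold are illustrative choices, not anything InvoiceSmash prescribes.

```python
from typing import List


def looks_electronic(page_texts: List[str], min_chars_per_page: int = 20) -> bool:
    """Return True if every page carries a meaningful amount of selectable text.

    The threshold is an arbitrary illustrative value, not a standard.
    """
    if not page_texts:
        return False
    return all(len(text.strip()) >= min_chars_per_page for text in page_texts)


def extract_page_texts(path: str) -> List[str]:
    """Read the text layer of each page (assumes the pypdf package is installed)."""
    # Imported lazily so looks_electronic() can be used without the dependency.
    from pypdf import PdfReader

    reader = PdfReader(path)
    # A scanned page with no text layer typically yields an empty string.
    return [page.extract_text() or "" for page in reader.pages]
```

Some scanners add a hidden text layer by running OCR themselves, so a scan can occasionally pass this test. The visual cues (handwriting, skew, blemishes) are still worth checking.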
When the data is received electronically (type 1 above), it can be automated into an accounting system using a tool like InvoiceSmash with minimal oversight. One-click processing is possible because the data starts out electronic, is transmitted electronically and is entered into your accounting system electronically. For example, let’s say you receive regular invoices from a supplier, XYC company. The first time you process an XYC company invoice using InvoiceSmash, you set up all the rules for how it should be entered into your accounting software: which GL accounts to use, which inventory items and so on. Thereafter, each time you process an invoice from XYC company, the software will automatically resolve it for you and you will be presented with a one-click submission screen.
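The set-up-once, reuse-thereafter flow described above can be pictured as a lookup keyed by supplier. This is only a sketch of the idea and not InvoiceSmash’s actual implementation: `SupplierRule`, `Invoice`, `resolve` and the account and item names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class SupplierRule:
    gl_account: str      # general ledger account to post to (hypothetical value)
    inventory_item: str  # inventory item the invoice maps to (hypothetical value)


@dataclass
class Invoice:
    supplier: str
    total: float


# Rules saved the first time each supplier's invoice is processed.
rules: Dict[str, SupplierRule] = {}


def resolve(invoice: Invoice) -> Optional[dict]:
    """Return a ready-to-submit entry, or None if the supplier has no rule yet."""
    rule = rules.get(invoice.supplier)
    if rule is None:
        # First invoice from this supplier: a person has to set up the rule.
        return None
    return {
        "supplier": invoice.supplier,
        "total": invoice.total,
        "gl_account": rule.gl_account,
        "inventory_item": rule.inventory_item,
    }
```

In this sketch the first XYC company invoice resolves to `None`. Once a rule is stored under `rules["XYC company"]`, every later invoice from that supplier comes back as a complete entry, which is what would sit behind a one-click submission screen.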
OCR sucks
On the other hand, if you are receiving invoices as scans or faxes (type 2 above), you have a conversion problem: the data has to be converted from ‘paper’ to electronic. This can be done using OCR (Optical Character Recognition) technology. Basically, the computer looks at the picture of a character or word and uses probability to decide that ‘0’ is a zero and not the letter ‘O’, or that ‘i’ is a lowercase ‘I’ and not an ‘L’. The problem is that even the best OCR software sometimes gets it wrong (accuracy of around 98%), which means someone has to maintain oversight and vigilance, verifying the data and repairing it where needed. Hopefully this is the responsibility of specialist bookkeepers and accounts staff, but quite often we see even high-level staff having to do their own data entry.
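To see why 98% still means constant checking, here is a rough back-of-the-envelope calculation. It assumes the 98% figure is a per-character accuracy and that errors on different characters are independent. Neither assumption is stated above, so treat the numbers as illustrative only.

```python
def chance_field_is_clean(per_char_accuracy: float, field_length: int) -> float:
    """Probability that every character in a field is read correctly,
    assuming each character is recognized independently."""
    return per_char_accuracy ** field_length


# Illustrative lengths: a short invoice number, an address line, a block of text.
for length in (10, 50, 200):
    print(length, round(chance_field_is_clean(0.98, length), 3))
```

Under these assumptions, a 10-character field comes through without a single error only about 82% of the time, and the odds fall quickly as the amount of text grows. That is why a person still has to verify OCR output.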
OK, so OCR is bad, but why do you care?
It’s simple: by accessing the data electronically, you save yourself most or all of the data entry and data oversight effort (i.e. the conversion problem). It’s also an easy change to make. Just ask your suppliers to send their invoices as email attachments and you’re done. Most suppliers are happy to oblige, and many bigger companies have portals where you can access the files yourself.
But wait a minute, InvoiceSmash does OCR, doesn’t it?
Yes, but we’re doing it begrudgingly. We really want you to process your invoices electronically so as to avoid the data validation headache. But of course, we’re living in the real world, and even companies that are aggressively embracing automation still have the occasional piece of paper floating around. They can use InvoiceSmash too. (OCR assist is due to go online in June 2013.)