
You're right, the format is somewhat standardized. First there's a list of items with prices and then comes the total. So far so good.

But depending on the length of the receipt, the total ends up in a different place every time. Another problem was that Tesseract's recognition rate for whole words was pretty low to begin with, so searching for exact words like "total" or "sum" didn't work. Even if you give the parser a rough area to look for the total, you still need to do some fuzzy matching on "total" and "sum" to get it right.
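To make the fuzzy matching idea concrete, here is a minimal sketch using only Python's standard library. It is not the actual parser: the function name `find_total`, the 0.7 threshold, the "search the bottom half" heuristic, and the amount regex are all assumptions made for illustration. It takes lines of OCR output, looks only at the lower part of the receipt, and picks the line whose label is closest to "total" or "sum".

```python
import re
from difflib import SequenceMatcher

# Assumed keywords; a real receipt parser would likely need more variants.
KEYWORDS = ("total", "sum")
# Assumed amount format: digits, a '.' or ',' separator, two decimals, at line end.
AMOUNT_RE = re.compile(r"(\d+[.,]\d{2})\s*$")


def similarity(a, b):
    """Similarity ratio in [0, 1] between two strings."""
    return SequenceMatcher(None, a, b).ratio()


def find_total(lines, threshold=0.7, search_fraction=0.5):
    """Return the amount (as a string) on the line that best matches a keyword.

    Only the last `search_fraction` of the lines is searched, as a stand-in
    for "a rough area to look for the total". Returns None if nothing scores
    at or above `threshold`.
    """
    start = int(len(lines) * (1 - search_fraction))
    best_score, best_amount = 0.0, None
    for line in lines[start:]:
        match = AMOUNT_RE.search(line)
        if not match:
            continue
        # Everything before the amount is treated as the label.
        label = line[:match.start()].lower()
        for token in label.split():
            token = re.sub(r"[^a-z0-9]", "", token)  # drop punctuation noise
            if not token:
                continue
            score = max(similarity(token, kw) for kw in KEYWORDS)
            if score >= threshold and score > best_score:
                best_score = score
                best_amount = match.group(1).replace(",", ".")
    return best_amount


receipt = ["Milk 1.29", "Bread 2.49", "T0TAL 3.78", "Cash 5.00"]
print(find_total(receipt))
```

One caveat with a threshold this loose: "subtotal" also scores above 0.7 against "total", so the sketch relies on the real total line scoring higher. That is the kind of case where it stops being trivial.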

In the end it's most important that it works for as many cases as possible. Getting to 60% was trivial but after that it got interesting. ;-)



If you are only interested in the sum, can't you ask the user to take a close-up shot of the sum instead of the whole receipt? Worse UX, but that should be easier to parse.


The cool thing for me was to get it working without any user interaction. Scan and forget. Sure, you could take a close-up, but then you could also just type it into a form field, right? It works surprisingly well without many tweaks. That's what I like about it.


If you're doing that, why not simply display the image and have the user pick the sum for you? Though, to be honest, that sounds like it'd defeat the purpose of this.



