
How do these compare to traditional commercial and open source OCR tools? What about things like the Apple Vision APIs?


Yeah, I find it strange that these discussions don't refer to the LSTM-based OCR engines in common use today, like the latest version of Tesseract. I feel like these are the baseline.


> Apple Vision APIs

Can we even OCR PDFs using the Apple Vision APIs?


I've used ocrit, which uses those APIs. https://github.com/insidegui/ocrit

There are also:

* swiftocr - https://github.com/fny/swiftocr

* macos-vision-ocr - https://github.com/bytefer/macos-vision-ocr


You can[1].

I’ve been vibe coding a little macOS OCR app since last weekend, and I’m really happy with the results so far. This is my first app, so fingers crossed. If it becomes feature-complete and polished enough, I’ll consider open sourcing it. There’s still a long way to go, though.

[1] https://developer.apple.com/documentation/vision/vnrecognize...


I don't think you can. There is also a big difference between plain old OCR, which just gets all the text out of an image, and document processing, which asks whether you can extract only the relevant information, in a structure that can be pushed directly into a database.
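
To make that distinction concrete, here is a rough Python sketch. Everything in it is made up for illustration: the sample text stands in for what a plain OCR pass might return, and the field names, regex patterns, and `invoices` table are assumptions, not anything from a real tool. Plain OCR stops at the raw string; the "document processing" part is the step that turns it into a record you can insert.

```python
import re
import sqlite3

# Hypothetical raw text, standing in for the output of a plain OCR pass.
RAW_OCR_TEXT = """ACME Corp
Invoice No: INV-1042
Date: 2024-03-15
Total due: $1,299.50
"""

# Assumed field layout; real documents would need far more robust handling.
PATTERNS = {
    "invoice_no": r"Invoice No:\s*(\S+)",
    "date": r"Date:\s*(\d{4}-\d{2}-\d{2})",
    "total": r"Total due:\s*\$([\d,]+\.\d{2})",
}


def extract_invoice_fields(text):
    """Pull the fields named in PATTERNS out of raw text (None if missing)."""
    record = {}
    for name, pattern in PATTERNS.items():
        match = re.search(pattern, text)
        record[name] = match.group(1) if match else None
    if record["total"] is not None:
        # Strip thousands separators so the amount can be stored as a number.
        record["total"] = float(record["total"].replace(",", ""))
    return record


def store(record, conn):
    """Insert one extracted record into a hypothetical invoices table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS invoices "
        "(invoice_no TEXT, date TEXT, total REAL)"
    )
    conn.execute(
        "INSERT INTO invoices VALUES (:invoice_no, :date, :total)", record
    )


record = extract_invoice_fields(RAW_OCR_TEXT)
conn = sqlite3.connect(":memory:")
store(record, conn)
```

The regex step is the fragile part in practice: it only works when the layout is known in advance, which is roughly why structured extraction is treated as a separate problem from getting the text out.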



