Why Use Tesseract OCR with Actix-Web?
This setup works well for invoice OCR, document scanning, and other text extraction tasks.
Setting Up the OCR API
Step 1: Dependencies
If you don't have Tesseract OCR installed, follow this tutorial to install the latest release or any specific version.
Add these dependencies to your Cargo.toml:
Step 2: The OCR API Code
Here's the complete working code for our OCR API:
To use a different language, visit this link and look up the language code you need.
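Tesseract language codes are short identifiers such as `eng`, `deu`, or `hin`, and several can be combined with `+` (for example `eng+deu`) when a document mixes languages. As a minimal sketch, the helper below builds that string from a list of codes. The function name `tess_lang` is hypothetical and not part of leptess; it uses only the standard library, and the resulting string is what you would pass wherever the OCR engine expects its language argument.

```rust
/// Hypothetical helper (not part of leptess): joins Tesseract language
/// codes with '+', the separator Tesseract uses for multi-language OCR.
/// Returns `None` when no usable code is supplied.
pub fn tess_lang(codes: &[&str]) -> Option<String> {
    // Trim whitespace and drop empty entries so "eng, ,deu" style input is safe.
    let cleaned: Vec<&str> = codes
        .iter()
        .map(|c| c.trim())
        .filter(|c| !c.is_empty())
        .collect();

    if cleaned.is_empty() {
        None
    } else {
        Some(cleaned.join("+"))
    }
}
```

Returning `Option` rather than an empty string lets the caller decide on a fallback, for instance defaulting to `eng`, instead of handing Tesseract an empty language.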
Key Features of This Implementation
Running the API
In a second terminal, run this command to test the API with an image:
In this tutorial, we explored the leptess crate, a Rust wrapper around Tesseract OCR. In our use it performed noticeably better than rusty-tesseract.
We used PSM (page segmentation mode) value 12 for our use case, which is invoice parsing. However, if you want to use Tesseract OCR in production, you will need to do a lot of preprocessing and document handling before applying Tesseract.
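For reference, PSM 12 is Tesseract's "sparse text with OSD" mode, which looks for scattered text rather than assuming a single uniform block. The sketch below models a few of the documented modes as a Rust enum so the number is not a magic constant in your code. The `PageSegMode` type and its methods are hypothetical names for illustration, not leptess's API; the string it produces is the value you would assign to Tesseract's `tessedit_pageseg_mode` variable.

```rust
/// Hypothetical enum covering a subset of Tesseract's page segmentation
/// modes. The discriminants are the numeric PSM values Tesseract documents.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PageSegMode {
    Auto = 3,
    SingleColumn = 4,
    SingleBlock = 6,
    SparseText = 11,
    SparseTextOsd = 12,
}

impl PageSegMode {
    /// The numeric mode as a string, the form Tesseract config variables take.
    pub fn as_config_value(self) -> String {
        (self as u8).to_string()
    }

    /// Short human-readable description of the mode.
    pub fn description(self) -> &'static str {
        match self {
            PageSegMode::Auto => "Fully automatic page segmentation, no OSD",
            PageSegMode::SingleColumn => "Single column of text of variable sizes",
            PageSegMode::SingleBlock => "Single uniform block of text",
            PageSegMode::SparseText => "Sparse text",
            PageSegMode::SparseTextOsd => "Sparse text with OSD",
        }
    }
}
```

Which mode works best depends on the layout: invoices with text scattered across the page tend to suit the sparse modes, while a clean single-column scan may do better with `SingleColumn` or `SingleBlock`, so it is worth trying more than one on your own samples.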
Thank You.