Explanation:
- Include Tesseract and Leptonica Headers:
#include <tesseract/baseapi.h>
: Includes the Tesseract API for OCR.
#include <leptonica/allheaders.h>
: Includes Leptonica library headers for image processing.
- Initialize Tesseract API:
tesseract::TessBaseAPI *api = new tesseract::TessBaseAPI();
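: Creates an instance of the Tesseract API.
api->Init(NULL, "eng");
: Initializes the Tesseract API with the English language. Init returns 0 on success and a nonzero value on failure, so check the result before using the API. If the trained data files are not in the default location, pass the path to the tessdata directory as the first argument instead of NULL.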
- Load Image:
Pix *image = pixRead("image.png");
: Loads the image from the file. Replace "image.png" with the path to your image file. The program then checks that the image was loaded successfully; pixRead returns NULL if the file cannot be read.
- Set Image for OCR:
api->SetImage(image);
: Sets the loaded image for OCR processing.
- Perform OCR:
char *text = api->GetUTF8Text();
: Extracts text from the image. The result is a UTF-8 encoded string.
- Output Extracted Text: Prints the extracted text to the console.
- Clean Up:
delete[] text;
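: Frees the memory allocated for the extracted text. GetUTF8Text allocates the string with new[], so it must be released with delete[].
pixDestroy(&image);
: Frees the memory used by the image.
api->End();
: Cleans up and releases resources used by the Tesseract API. Because api was allocated with new, follow this with delete api; to free the object itself.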
Prerequisites:
- Tesseract OCR: Ensure Tesseract OCR is installed on your system.
- Leptonica Library: Ensure Leptonica (a library used by Tesseract for image processing) is installed.
Compilation and Execution:
Compile with Tesseract and Leptonica:
Run the Program: