.TRAINEDDATA File Extension

Tesseract OCR Model

Developer: Tesseract OCR Community
Popularity: 4.2 | 5 votes

What is a TRAINEDDATA file?

A TRAINEDDATA file is an optical character recognition (OCR) model created by Tesseract, a multiplatform open-source OCR engine. It contains data used to automatically recognize and record text contained in images. Each TRAINEDDATA file is typically used to recognize text written in only one language and is named for that language (e.g. eng.traineddata is used to recognize English text).

More Information

Optical character recognition is the process of converting text found in images into machine-encoded text. Tesseract is an OCR engine that was originally developed by Hewlett-Packard and is now maintained as an open-source project, long sponsored by Google. Developers can use Tesseract to train OCR models, which are then used to recognize and convert text found in images. These models are saved as TRAINEDDATA files.

Each TRAINEDDATA file has been "trained" on a series of images that contain relevant text. Tesseract includes many default TRAINEDDATA files, and developers can also create their own. These files are typically stored in the ~/Tesseract-OCR/tessdata directory, though the exact location depends on the platform and on how Tesseract was installed.
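
Because each model is named for its language, the contents of a tessdata directory show which languages are available. Below is a minimal Python sketch of that idea. The function name installed_languages is hypothetical, and the directory in the usage line is only the typical location mentioned above, so adjust it to match your installation.

```python
from pathlib import Path


def installed_languages(tessdata_dir):
    """Return the sorted language codes of all *.traineddata files in a directory."""
    directory = Path(tessdata_dir).expanduser()
    # Path.stem drops the final suffix, so "eng.traineddata" yields "eng".
    return sorted(p.stem for p in directory.glob("*.traineddata") if p.is_file())


if __name__ == "__main__":
    # Assumed location; change it to wherever your tessdata directory lives.
    print(installed_languages("~/Tesseract-OCR/tessdata"))
```

Only the top level of the directory is scanned, so models kept in subdirectories would not be listed by this sketch.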

How to open a TRAINEDDATA file

TRAINEDDATA files are not meant to be opened. Developers reference TRAINEDDATA files in code that calls Tesseract and uses it to analyze text included in images.
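
As a rough illustration of referencing a model from code, the Python sketch below builds a call to the tesseract command-line program. The -l option names the language (and therefore the TRAINEDDATA file to load), and --tessdata-dir points at the directory that holds it. The helper names build_tesseract_command and recognize, and the example file scan.png, are hypothetical; the sketch assumes the tesseract executable is on your PATH.

```python
import shutil
import subprocess


def build_tesseract_command(image_path, lang="eng", tessdata_dir=None):
    """Build a tesseract CLI call that prints recognized text to stdout."""
    # "-l eng" makes Tesseract load eng.traineddata from its tessdata directory.
    cmd = ["tesseract", str(image_path), "stdout", "-l", lang]
    if tessdata_dir is not None:
        # Point Tesseract at a specific directory of .traineddata files.
        cmd += ["--tessdata-dir", str(tessdata_dir)]
    return cmd


def recognize(image_path, lang="eng", tessdata_dir=None):
    """Run Tesseract on an image and return the text it recognizes."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract executable not found on PATH")
    result = subprocess.run(
        build_tesseract_command(image_path, lang, tessdata_dir),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if Tesseract reports a failure
    )
    return result.stdout


if __name__ == "__main__":
    # Hypothetical input image; replace it with a real file before running.
    print(recognize("scan.png", lang="eng"))
```

Keeping the command construction separate from the subprocess call makes it possible to inspect or log the exact invocation without having Tesseract installed.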


Programs that open or reference TRAINEDDATA files

Windows: Tesseract (Free)
Mac: Tesseract (Free)
Linux: Tesseract (Free)

Category: Developer Files

Updated: March 14, 2022
