Optical Character Recognition (OCR) recognizes text in media, including text that appears in images and video, and text embedded in PDF files and Office documents.
Configuration Parameter | Description |
---|---|
Blacklist | Characters to exclude from the character set used for recognition. |
CharacterTypes | The types of characters to include in the character set used for recognition. |
ContextCheck | Specifies whether to use context checking to improve OCR results. |
DetectAlphabet | Specifies whether to detect the alphabet for each image or page. |
FontType | The basic character type of the text that you want to recognize. |
HollowText | Specifies whether to look for outlined text. |
Input | The image track to process. |
KeepOnly | Keep only particular types of words and discard all others. |
Languages | The languages to use, which affects the character set and dictionaries used. |
MaxInputQueueLength | Can be used to place a limit on latency. |
NumParallel | The maximum number of video frames to analyze simultaneously. |
OcrMode | The OCR mode to use when you ingest images or documents. |
Orientation | The orientation of text in the ingested media. |
ProcessTextElements | Specifies whether to merge the content of text elements into the OCR results. |
Region | A region of the image or video frame to restrict processing to. |
RegionUnit | The units that the Region parameter uses to specify the size and position of a region. |
RestrictToInputRegion | Specifies whether to analyze a region of the input image or video frame that is specified in the input record, instead of the entire image. |
SampleInterval | The interval at which frames are selected to be analyzed. |
Spacing | Specifies whether to allow multiple spaces between words in the output from OCR. |
Type | The analysis engine to use. Set this parameter to OCR. |
UserDictionary | A comma-separated list of dictionaries to use in addition to the standard dictionaries. |
Whitelist | Extra characters to add to the character set. |
WordRejectThreshold | The minimum confidence level required to include a word in the output. |
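
As an illustration of how a few of these parameters might be combined, the following minimal sketch generates a task section as text. It assumes an INI-style configuration file. The section name `OCR`, the track name `Default_Image`, the language value `en`, and the threshold `50` are hypothetical placeholders, not values taken from this page; only `Type=OCR` comes from the table above.

```python
from __future__ import annotations

import configparser
import io


def build_ocr_task(section: str, settings: dict[str, str]) -> str:
    """Render an OCR task section as INI-style text.

    The parameter names come from the table above; the values passed in
    are supplied by the caller and are not validated here.
    """
    parser = configparser.ConfigParser()
    # Keep parameter names exactly as written (the default lowercases them).
    parser.optionxform = str

    # Type must be OCR for this engine, so it cannot be overridden.
    merged = {"Type": "OCR"}
    merged.update((k, v) for k, v in settings.items() if k != "Type")
    parser[section] = merged

    buffer = io.StringIO()
    parser.write(buffer, space_around_delimiters=False)
    return buffer.getvalue()


# Hypothetical values, chosen only for illustration.
config_text = build_ocr_task(
    "OCR",
    {
        "Input": "Default_Image",       # assumed image track name
        "Languages": "en",              # assumed language code
        "WordRejectThreshold": "50",    # assumed confidence cutoff
    },
)
print(config_text)
```

Generating the section from a dictionary keeps the parameter names in one place, which may help when the same task is produced for several inputs.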
Output track | Type | Description | Output¹ |
---|---|---|---|
Data | OCRResult | Contains one record, describing the analysis results, per line of text, per video frame. | No |
DataWithSource | OCRResultAndImage | The same as the | No |
Result | OCRResult | Contains one record, describing the analysis results, for each line of text. When a line of text appears in many consecutive frames, Media Server produces a single result. | Yes |
ResultWithSource | OCRResultAndImage | The same as the | No |
CharResult | OCRDetail | Contains one record, describing the analysis results, for each line of text. However, the records in this track provide detail about individual characters rather than the whole line. This track is available only when you ingest images or documents. It is not available if the source is a video file or stream. | No |
WordResult | OCRResult | Contains one record, describing the analysis results, for each word. This track is available only when you ingest images or documents. It is not available if the source is a video file or stream. | No |
Start | OCRResult | The same as the | No |
End | OCRResult | The same as the | No |

¹ This column indicates whether the information contained in the track is included by default in the output created by an output task (when you don't set the Input parameter for the output task).
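
The difference between per-frame records and a single result for text that persists across consecutive frames can be pictured with a small sketch. This is only an illustration of the idea, not Media Server's actual merging logic: the `FrameText` and `MergedText` types, the frame index field, and the exact-match rule on the text are all assumptions made for the example.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class FrameText:
    """One line of text seen in one frame (hypothetical shape)."""
    frame: int
    text: str


@dataclass(frozen=True)
class MergedText:
    """One line of text and the consecutive frame range it covers."""
    text: str
    first_frame: int
    last_frame: int


def merge_consecutive(records: Iterable[FrameText]) -> list[MergedText]:
    """Collapse identical text seen in consecutive frames into one result."""
    open_runs: dict[str, tuple[int, int]] = {}  # text -> (first, last) frame
    merged: list[MergedText] = []

    for rec in sorted(records, key=lambda r: r.frame):
        run = open_runs.get(rec.text)
        if run is not None and rec.frame - run[1] <= 1:
            # Same text in the same or the next frame: extend the run.
            open_runs[rec.text] = (run[0], rec.frame)
        else:
            # A gap closes any earlier run of this text; start a new one.
            if run is not None:
                merged.append(MergedText(rec.text, run[0], run[1]))
            open_runs[rec.text] = (rec.frame, rec.frame)

    # Close the runs that were still open at the end of the input.
    merged.extend(MergedText(t, s, e) for t, (s, e) in open_runs.items())
    return sorted(merged, key=lambda m: (m.first_frame, m.text))


per_frame = [
    FrameText(0, "BREAKING NEWS"),
    FrameText(1, "BREAKING NEWS"),
    FrameText(1, "Weather"),
    FrameText(2, "BREAKING NEWS"),
    FrameText(5, "BREAKING NEWS"),
]
results = merge_consecutive(per_frame)
```

Here five per-frame records reduce to three merged results, because the text in frames 0 to 2 is treated as one line and its reappearance in frame 5 starts a new one.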
Field name | Type | Description |
---|---|---|
id | UUIDData | A unique identifier to identify the line of text. Every record in the |
text | TextData | The result of running OCR on the text. |
region | RectangleData | The location of the text in the frame. |
confidence | Integer | The confidence score from OCR, or 100 for text extracted from text elements. |
angle | Integer | The orientation of the text in degrees (rotated clockwise 0, 90, 180, or 270 degrees from upright). |
source | String | Specifies the origin of the text: static text from an image or video (image), text from video of a news ticker, with text scrolling from right to left (scroller, left), or a text element in a document (text). |
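
To make the record layout concrete, here is a minimal sketch of how a client might model these fields and filter records after receiving them. The class names, the `left`/`top`/`width`/`height` layout of the rectangle, and the 0 to 100 confidence range are assumptions for illustration; the field names themselves follow the table above.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Rectangle:
    """Assumed layout for a RectangleData value."""
    left: int
    top: int
    width: int
    height: int


@dataclass(frozen=True)
class OcrResult:
    """Client-side model of one record, using the field names in the table."""
    id: str
    text: str
    region: Rectangle
    confidence: int
    angle: int
    source: str


VALID_ANGLES = {0, 90, 180, 270}  # the four orientations listed for angle


def validate(record: OcrResult) -> None:
    """Raise ValueError if a record falls outside the documented values."""
    if record.angle not in VALID_ANGLES:
        raise ValueError(f"unexpected angle: {record.angle}")
    # Assumption: confidence is a score between 0 and 100 inclusive.
    if not 0 <= record.confidence <= 100:
        raise ValueError(f"unexpected confidence: {record.confidence}")


def keep_confident(records: list[OcrResult], min_confidence: int) -> list[OcrResult]:
    """Return only the records whose confidence meets the threshold."""
    return [r for r in records if r.confidence >= min_confidence]


box = Rectangle(left=10, top=20, width=200, height=30)
sample = [
    OcrResult("a", "Hello world", box, 92, 0, "image"),
    OcrResult("b", "H3ll0 w0r1d", box, 35, 0, "image"),
    OcrResult("c", "Embedded text", box, 100, 0, "text"),
]
confident = keep_confident(sample, 50)
```

Filtering on the client in this way is separate from the WordRejectThreshold parameter, which is applied during analysis.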
The same as OCRResult records, but with the following additional fields.
Field name | Type | Description |
---|---|---|
image | ImageData | The source frame. |
Field name | Type | Description |
---|---|---|
id | UUIDData | A unique identifier to identify the line of text. Every record in the |
character | OCRChar | There is a |