HeaderEntityN
The entities to extract for the header rows of input tables. This parameter allows you to extract entities from structured data.
When matching CSV or TSV input, Named Entity Recognition matches the first non-empty row of input against these configured header entities. You can use this to extract landmark values that describe something that you want to find in a table column. You specify the entity to extract from the other cells in that column by setting CellEntityN.
TIP: You can optionally configure Named Entity Recognition to search additional rows for the header entities by setting MaxSearchHeaderRow.
You can also match against structured data, such as the output from Media Server OCR. In this case, you must specify TableCellPath to provide the path to the cells in the structured data. Named Entity Recognition then matches against the header and cells in your structured data.
For example:
HeaderEntity0=pii/date/dob/landmark/all
CellEntity0=pii/date/nocontext/all
This example matches date of birth landmark values in the header, and for all subsequent rows in that column, it extracts any date values.
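The header-then-column behavior described above can be illustrated with a small Python sketch. This is not how Named Entity Recognition is implemented. The two regular expressions are hypothetical stand-ins for the header and cell entities, and the function name, sample table, and date format are assumptions made only for illustration.

```python
import csv
import io
import re

# Hypothetical stand-ins for the configured entities. These regular expressions
# are NOT the real pii/date/dob/landmark or pii/date/nocontext grammar definitions.
HEADER_PATTERN = re.compile(r"\b(date of birth|dob)\b", re.IGNORECASE)
CELL_PATTERN = re.compile(r"\b\d{1,2}/\d{1,2}/\d{4}\b")


def extract_from_table(text):
    """Return (header, value) pairs for cells that sit under a matching header."""
    rows = list(csv.reader(io.StringIO(text)))

    # Treat the first non-empty row as the header row.
    header_index = next(
        (i for i, row in enumerate(rows) if any(cell.strip() for cell in row)),
        None,
    )
    if header_index is None:
        return []

    header = rows[header_index]
    # Columns whose header cell matches the header (landmark) pattern.
    columns = [i for i, cell in enumerate(header) if HEADER_PATTERN.search(cell)]

    matches = []
    for row in rows[header_index + 1:]:
        for col in columns:
            if col < len(row):  # tolerate short rows
                for found in CELL_PATTERN.finditer(row[col]):
                    matches.append((header[col], found.group(0)))
    return matches


# Made-up sample data: only the "Date of Birth" column should produce matches.
SAMPLE = (
    "Name,Date of Birth,Joined\n"
    "A. Example,10/12/1985,01/02/2010\n"
    "B. Example,23/06/1991,\n"
)

for header_text, value in extract_from_table(SAMPLE):
    print(header_text, value)
```

In this sketch the dates in the Joined column are ignored, because that column's header does not match the header pattern, even though the cell pattern would match them.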
NOTE: The Named Entity Recognition PII Package, PHI Package, and PCI Package provide landmark entities in most grammars. To extract entities from tables with the Named Entity Recognition standard grammar files, you might need to create your own landmark entities.
You can specify multiple entities in a comma-separated list. If the table header matches any of the configured header entities, Named Entity Recognition matches the cell content against any of the configured cell entities. This option might be useful if you want to match a particular entity in multiple languages, or if you want to include a custom entity in addition to a standard one.
You can also use wildcard expressions in the entity names. The * wildcard matches any number of characters, and the ? wildcard matches a single character.
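As a rough illustration of how a comma-separated list with wildcards might select entity names, the following Python sketch uses the standard library `fnmatch` module, whose `*` and `?` behave as described above. It is an assumption that this mirrors the product's matching. Note also that `fnmatch` treats `[...]` as a character set, which the text above does not mention. The function name and the entity name marked as hypothetical are not taken from any grammar.

```python
from fnmatch import fnmatchcase


def select_entities(configured, available):
    """Return the available entity names that match any configured pattern.

    `configured` is a comma-separated list of entity names or wildcard patterns.
    """
    patterns = [p.strip() for p in configured.split(",") if p.strip()]
    return [
        name
        for name in available
        if any(fnmatchcase(name, pattern) for pattern in patterns)
    ]


AVAILABLE = [
    "pii/date/dob/landmark/all",
    "pii/date/dob/landmark/eng",  # hypothetical name, for illustration only
    "pii/date/nocontext/all",
]

# "*" matches any number of characters.
print(select_entities("pii/date/dob/landmark/*", AVAILABLE))

# "?" matches exactly one character.
print(select_entities("pii/date/dob/landmark/en?", AVAILABLE))

# A comma-separated list matches if any one of its patterns matches.
print(select_entities("pii/date/nocontext/all, pii/date/dob/landmark/a*", AVAILABLE))
```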
For more information about table extraction,
Type: | String |
Default: | None |
Required: | No |
Configuration Section: | |
Example: | HeaderEntity0=pii/date/dob/landmark/all CellEntity0=pii/date/nocontext/all |
See Also: |