KillDuplicates
This parameter determines how the Content component handles duplicate documents. It allows you to prevent the same document or document content from being stored in Content more than once.
Use one of the following options:
NONE
|
Allows duplicate documents in the Content index. Content does not replace or delete documents. |
REFERENCE
|
Replaces an existing document with the new document if the document to index has the same value in its DREREFERENCE field. |
REFERENCEMATCHN
|
Replaces the existing document with the new document if the content of the document is more than NOTE: This method can deduplicate only documents that are already synced in the Content component index. It cannot deduplicate similar documents in the same index job. |
FieldName
|
Replaces the existing document with the new document if the document to index contains a ReferenceType field named You can specify multiple ReferenceType fields in this option, separated by a plus symbol (+) or a space. In this case, Content deletes documents that contain any of the specified fields with identical content. You must percent-encode any punctuation characters in the field name. NOTE: You identify fields as ReferenceType fields by using field processes in the Content component configuration file. If you list multiple fields in the same PropertyFieldCSVs parameter where you list the |
ReferenceField,GREATER:VersionField
|
Replaces the existing document with the new document if the document to index contains a ReferenceType field named
NOTE: When you index IDX documents, for the version comparison to work correctly, the value in the field that you use as the #DREFIELD MyField="N" Content treats existing documents with a missing or non-numeric value in the |
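As a rough mental model of the first two options in the table, the following Python sketch treats an index as a plain list of documents and applies NONE and REFERENCE as described above. It is a toy illustration only, not how Content is implemented. The `index_document` function, the dictionary layout, and the sample values are hypothetical.

```python
def index_document(index, doc, kill_duplicates="NONE"):
    """Toy model of the NONE and REFERENCE options (hypothetical helper)."""
    if kill_duplicates == "REFERENCE":
        # Drop any existing document that has the same DREREFERENCE value,
        # so that the new document replaces it.
        index = [d for d in index if d["DREREFERENCE"] != doc["DREREFERENCE"]]
    elif kill_duplicates != "NONE":
        raise ValueError("this sketch models only NONE and REFERENCE")
    # With NONE, nothing is replaced or deleted; the document is just added.
    return index + [doc]


old = {"DREREFERENCE": "doc-1", "DRECONTENT": "first version"}
new = {"DREREFERENCE": "doc-1", "DRECONTENT": "second version"}

kept_both = index_document([old], new, "NONE")
replaced = index_document([old], new, "REFERENCE")
print(len(kept_both), len(replaced))
```

In this model, NONE leaves two documents with the same reference in the index, while REFERENCE leaves only the newer one.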
You can postfix any of these options with =2, to apply the KillDuplicates process to all Content databases (rather than only to the database into which the current IDX or XML file is being indexed).
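The following Python sketch shows one way to assemble a KillDuplicates value, including the =2 postfix and the percent-encoding of field names, and to place it in a DREADD index action. The host name, port, file path, and helper names are hypothetical, and the URL layout (file name first, then parameters) is an assumption about your deployment. Note that `urllib.parse.quote` leaves the characters `_`, `.`, `-`, and `~` unencoded.

```python
from urllib.parse import quote


def field_option(*field_names):
    """Join ReferenceType field names with '+', percent-encoding each name."""
    return "+".join(quote(name, safe="") for name in field_names)


def kill_duplicates_value(option, all_databases=False):
    """Return the option, with =2 appended to apply it to all databases."""
    return f"{option}=2" if all_databases else option


def dreadd_url(host, index_port, idx_path, kill_duplicates):
    """Build a DREADD index action URL (assumed layout, hypothetical server)."""
    return (f"http://{host}:{index_port}/DREADD?{idx_path}"
            f"&KillDuplicates={kill_duplicates}")


value = kill_duplicates_value("REFERENCE", all_databases=True)
print(dreadd_url("localhost", 9001, "/data/docs.idx", value))
```

For a field-based option, `field_option("MY/FIELD", "OTHER")` produces a value in which the slash is percent-encoded and the two names are joined with a plus symbol.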
If you do not set KillDuplicates, it defaults to the option specified for KillDuplicates in the [Server] section of the Content configuration file.
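The fallback described above can be sketched in Python as follows. This only illustrates the order of precedence (the action parameter first, then the [Server] section); it is not Content's own logic. The `effective_kill_duplicates` helper and the configuration fragment are hypothetical, and a real Content configuration file may contain constructs that `configparser` does not accept.

```python
import configparser


def effective_kill_duplicates(action_value, config_text):
    """Return the action's value if set, else the [Server] setting, else None."""
    if action_value:
        return action_value
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    # fallback=None covers a missing [Server] section or a missing parameter.
    return parser.get("Server", "KillDuplicates", fallback=None)


# Hypothetical configuration fragment, for illustration only.
CONFIG = """
[Server]
KillDuplicates=REFERENCE
"""

print(effective_kill_duplicates(None, CONFIG))
print(effective_kill_duplicates("NONE", CONFIG))
```

The first call falls back to the configuration file value; the second uses the value set on the action.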
NOTE: When you are using the DIH with DistributeSendMinimal mode, DIH sends a minimal representation to all child servers to allow deduplication. To deduplicate on FieldName, you must configure a field process in the DIH configuration file with the fields that you want to use to deduplicate. DIH then includes these fields in the representation it sends to its child servers. By default, it sends only the DREREFERENCE field.
For more information about configuring a field process, refer to the Knowledge Discovery Administration Guide.
Actions: | DREADD, DREADDDATA |
Type: | String |
Default: | |
Example: | KillDuplicates=REFERENCE |
See Also: | KeepExisting, KillDuplicatesDB |