Question:
How can I check a file cabinet for duplicates using Autoindex?
Answer:
Preparations
- Create a new text field, for example "Status", to track which of the following three states each document is in:
  - New: the document has not been processed by the duplicate check yet.
  - Original: the document has been processed and no other matching document was found.
  - (possible) Duplicate: the document has been processed and at least one other matching document was found.
- Designate a "marker text" for each of the three states. This text is written to the index field created in step 1. In this example, the following texts are used:
  - New → NEW
  - Original → ORIGINAL
  - (possible) Duplicate → DUPLICATE
- Decide whether only new documents should run through the duplicate check, or whether edited documents that are pushed back into the "New" state should be processed as well.
File Cabinet Event
- As the trigger condition, use "for new documents" and/or "if index entries of existing documents have changed", depending on the use case. This example assumes that both check boxes are selected.
- Filter the trigger condition to only documents in the "New" state. If you use the trigger condition "if index entries of existing documents have changed", also specify the filter "Has changed" for the index field.
- As external data source, use the file cabinet's database. Additionally, filter the external data source so it only includes documents in the "Original" state.
- As matchcode, use all index fields which, when combined together, identify a document uniquely (in this example: Invoice Type, Company, Billing Date and Invoice Number). Additionally, add the relation "Doc ID is not equal Doc ID" to the matchcode to prevent documents from being matched with themselves.
- Ensure that the iterator is set to "Use first data record for indexing".
- If a document has no matches in the external data source, it is not a duplicate and will have its state set to "Original".
- If a document has matches, it is a duplicate and will have its state set to "(possible) Duplicate".
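The matching logic configured in the steps above can be sketched in Python. This is only an illustration of the decision rule, not how Autoindex is implemented: the `Document` class, the field names in `MATCHCODE_FIELDS`, the sample values, and the `check_duplicate` function are all hypothetical stand-ins for the file cabinet, its index fields, and the Autoindex job.

```python
from dataclasses import dataclass

# Marker texts from the "Preparations" section.
NEW, ORIGINAL, DUPLICATE = "NEW", "ORIGINAL", "DUPLICATE"

# Hypothetical field names standing in for the matchcode fields
# (Invoice Type, Company, Billing Date, Invoice Number).
MATCHCODE_FIELDS = ("invoice_type", "company", "billing_date", "invoice_number")


@dataclass
class Document:
    doc_id: int
    invoice_type: str
    company: str
    billing_date: str
    invoice_number: str
    status: str = NEW


def matchcode(doc):
    """Return the combination of fields that identifies a document."""
    return tuple(getattr(doc, field) for field in MATCHCODE_FIELDS)


def check_duplicate(doc, cabinet):
    """Update doc.status following the rules described in the steps above."""
    if doc.status != NEW:  # trigger filter: only documents in the "New" state
        return doc.status
    has_match = any(
        other.status == ORIGINAL                # external data source filter
        and other.doc_id != doc.doc_id          # "Doc ID is not equal Doc ID"
        and matchcode(other) == matchcode(doc)  # all matchcode fields agree
        for other in cabinet
    )
    doc.status = DUPLICATE if has_match else ORIGINAL
    return doc.status


# Sample data (invented for illustration).
cabinet = [
    Document(1, "Incoming", "Example Corp", "2024-01-15", "INV-1001", ORIGINAL),
    Document(2, "Incoming", "Example Corp", "2024-01-15", "INV-1001"),
    Document(3, "Incoming", "Example Corp", "2024-02-01", "INV-1002"),
]
results = [check_duplicate(doc, cabinet) for doc in cabinet]
print(results)
```

In this sketch, document 2 shares all four matchcode fields with the already processed document 1 and is marked DUPLICATE, while document 3 has no match and becomes ORIGINAL. Document 1 is skipped because it is no longer in the "New" state.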
Scheduled
- Filter the trigger condition to only documents in the "New" state.
- Follow steps 3 through 7 of the file cabinet event version of this Autoindex configuration.