More exhaustive documentation, “DQE Custom UI”, will be sent to you.
All the elements presented below are customizable via our interface.
- Duplicates appear side by side; the first one is listed as "PARENT".
The standard rule selects the oldest record as the PARENT.
The "CHILD" records appear next.
The duplicates have the same group number.
The percentage indicates how closely the duplicates match. As mentioned before, the comparison between email, telephone and address is an “OR”.
The percentage retained and displayed is the lowest percentage.
Telephone fields are cross-matched: each telephone field of one record is compared with the telephone fields of the other.
In the comparison, an empty field compared to a filled-in field returns 100%.
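The cross-matching and empty-field behaviour described above can be sketched as follows. This is a minimal illustration only, not the product's implementation: the function names are hypothetical, `difflib.SequenceMatcher` stands in for whatever similarity measure is actually used, and keeping the best-scoring pair is an assumption.

```python
from difflib import SequenceMatcher
from itertools import product


def similarity(a: str, b: str) -> float:
    """Similarity as a percentage (stand-in metric, an assumption).

    An empty value compared to any other value returns 100%.
    """
    if not a or not b:
        return 100.0
    return 100.0 * SequenceMatcher(None, a, b).ratio()


def phone_similarity(phones_a: list[str], phones_b: list[str]) -> float:
    """Cross-match every phone field of one record with every phone field
    of the other, and keep the best-scoring pair (assumption)."""
    return max(
        (similarity(a, b) for a, b in product(phones_a, phones_b)),
        default=100.0,  # no phone fields at all: treated like empty fields
    )
```

With this sketch, a landline stored in the first phone field of one record and in the second phone field of another is still recognized as the same number.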
As part of the “merge” of duplicates, the following operations are performed on the records.
The PARENT is the oldest record created.
- For addresses:
Empty fields are completed with filled fields in other records.
If 2 fields are filled, we keep the one with the most recent update date.
- For Phone and Email: identical to address fields.
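The merge rules above can be sketched as follows. This is a minimal illustration under stated assumptions: the record layout (dictionaries with `created` and `updated` dates) and the field names in `FIELDS` are hypothetical, not the product's data model.

```python
from datetime import date

# Hypothetical field names, for illustration only.
FIELDS = ("address", "phone", "email")


def merge_group(records: list[dict]) -> dict:
    """Merge one group of duplicates into its PARENT."""
    # The PARENT is the oldest record created.
    parent = min(records, key=lambda r: r["created"])
    merged = dict(parent)
    # Walk the records from least to most recently updated, so that
    # the most recently updated filled value is the one that is kept.
    for record in sorted(records, key=lambda r: r["updated"]):
        for field in FIELDS:
            if record.get(field):  # empty values never overwrite
                merged[field] = record[field]
    return merged
```

For example, if the PARENT has an address but no phone, and a more recently updated CHILD has a phone and a different email, the merged record keeps the PARENT's address and takes the CHILD's phone and email.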
Our recommendations:
Define 3 or 4 rules at most. The goal is to find duplicates en masse.
A rule that reports only a single duplicate is useless.
In all rules, one of the fields must have the tolerance indicator at 100%.
This field is positioned first in the comparison.
This approach makes it possible to determine the “hash” keys and therefore to improve processing times.
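To illustrate why a field at 100% helps processing times, here is a minimal sketch of grouping records by a key built from that field, so that only records sharing the key are compared. It reads "100%" as an exact match, which is an assumption, and the function name and record layout are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def candidate_pairs(records: list[dict], exact_field: str):
    """Yield only the pairs of records that share the same value
    in the exact-match field (used here as the "hash" key)."""
    buckets = defaultdict(list)
    for record in records:
        key = (record.get(exact_field) or "").strip().lower()
        if key:  # records without a key are not grouped
            buckets[key].append(record)
    for bucket in buckets.values():
        # The remaining, tolerant comparisons run only inside a bucket.
        yield from combinations(bucket, 2)
```

Without such a key, every record would have to be compared with every other record; with it, the number of comparisons depends only on the size of each bucket.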
The tolerance indicator should not be too low.
Otherwise, response times deteriorate sharply and many false duplicates are discovered.
In general, the cursor is positioned at 10 or 15%, which represents the inversion of one letter out of 10.
Experimentation in the workshop with the customer makes it possible to find the optimal value.
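The "one letter out of 10" figure can be illustrated with a small calculation. The sketch below uses the Levenshtein edit distance as the measure of difference; this is an assumption made for illustration, as the metric actually used by the product is not described here, and the function names are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: insertions, deletions and substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution
            ))
        previous = current
    return previous[-1]


def difference_percent(a: str, b: str) -> float:
    """Difference between two values as a percentage of the longest one."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 0.0
    return 100.0 * edit_distance(a, b) / longest


def within_tolerance(a: str, b: str, tolerance_percent: float) -> bool:
    return difference_percent(a, b) <= tolerance_percent
```

Under this assumption, "washington" and "washingtan" (one letter changed out of 10) differ by 10% and are accepted with a cursor at 10 or 15%, whereas two changed letters give 20% and are rejected.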
Never match on Last name and First name alone.