This article explains all the possible values of queriedHash, matchedHash, queriedDateOfBirth and matchedDateOfBirth, the values that Connect ID returns whenever a potential duplicate is found. The explanations are useful when MA does not reply with person details and you can show the end user only the basic information coming from Connect ID, which does not contain personal information.
The "Value" column on the left contains the so-called transformation, which describes how the original word (name) was modified before being matched. More than one transformation may have been applied to a word, in which case the value of queriedHash or matchedHash is a concatenation, e.g. "NormalizeChars+Translate". Many combinations are possible, but the interpretation is straightforward: two (or more) transformations have been applied one after another. In the example given it would be: "Diacritics removed, name translated".
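As a rough illustration of how such a concatenated value could be turned into text for the end user, the sketch below splits the value on "+" and looks each part up in a small table. The function name `describe_hash` and the wording in `SHORT_DESCRIPTIONS` are hypothetical and only paraphrase a few rows of the table in this article; the "+" separator is taken from the example above, and Connect ID may return values that are not listed here.

```python
# Hypothetical short descriptions, paraphrasing a few rows of the table in this article.
SHORT_DESCRIPTIONS = {
    "Exact": "exact match",
    "NormalizeChars": "diacritics removed",
    "Translate": "name translated",
    "Transliterate": "name transliterated",
    "SwapNames": "first and last name swapped",
}


def describe_hash(value: str) -> str:
    """Describe a queriedHash or matchedHash value in plain words.

    Assumes that combined transformations are joined with "+",
    as in "NormalizeChars+Translate".
    """
    parts = [part.strip() for part in value.split("+") if part.strip()]
    # Unknown transformation names are passed through unchanged rather than guessed at.
    labels = [SHORT_DESCRIPTIONS.get(part, part) for part in parts]
    text = ", ".join(labels)
    # Capitalize only the first character so the rest of the wording is kept as written.
    return text[:1].upper() + text[1:]


print(describe_hash("NormalizeChars+Translate"))  # Diacritics removed, name translated
```

Passing unknown names through unchanged keeps the output honest when a value appears that the lookup table does not cover yet.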
Value | Description | Example |
Exact | First and last name are matching. The algorithm ignores letter case and white space, so names that differ only in casing or in extra white space still receive the maximum score (i.e. 1). | "John SMIth" -> "john smith"; " joHn smiTH " -> "john smith" |
Original | The original value of the date of birth has been used for duplicate matching, either in the input (queriedDateOfBirth) or in the record found (matchedDateOfBirth). If both are "Original", the date of birth has been matched exactly. The most common case is that one value is "Original" and the other is one of the date transformations explained below, e.g. "IncorrectDay"; in that case it is recommended to use the description of that date transformation. | |
CollapseSpaces or CollapseSpaces 2 (the space is actually there, so the proper name of the transformation is "Collapse Spaces 2") | Multiple consecutive spaces are changed into a single space. | "John   Smith" --> "John Smith" |
CharFolding or ICUCharFolding or ICUCharFolding 2 | First and last name are matching after some letters have been replaced with common alternatives. | Müller --> Mueller |
FirstNameVariants | First and last name are matching after the first name has been replaced with its variant (from the same or a different language). The last name is transliterated to the Latin alphabet if the resulting first name is also in Latin. | Botros [ARA], Peter [ENG], Peter [GER], Pierre [FRA], etc. |
RomanizedNames | First and last name are matching after either one or both have been replaced by their romanization. If a romanization exists only for the first name or only for the last name, the other is transliterated to the Latin alphabet if needed. | しめ -> Shime; المحطب -> Almehtab |
NormalizeChars | First and last name are matching after diacritics have been removed. | Dzierżawski -> Dzierzawski |
SelfLearning or SelfLearning 2 | First and last name are matching but there are spelling differences. The name is compared with the names already in the database: the transformation produces variants of the name that already exist in the DB and are "close" to the original in terms of Levenshtein distance. The more frequently a variant occurs in the self-learning DB, the higher the score. The algorithm performs the same operation for the first name and the last name separately. This transformation accounts for simple spelling mistakes. It uses a customized version of Levenshtein distance that takes characteristics of the input into consideration (e.g. the length of the name). | Tohmas -> Thomas; Mart -> Marta |
Tokenize | Some but not all parts of multiple names are matching. This applies only to multiple names (names made up of several parts). The transformation allows a name to be matched when it is a subset of another name. | "John Smith-Gunderson" --> "John Smith", "John Gunderson" |
SwapBirthDate | Date of birth is matching after days and months have been swapped. | 12/04/1978 -> 04/12/1978 |
IncorrectDay | Year and month of birth are matching but day is different. | 12/04/1978 --> xx/04/1978 (all days will match) |
IncorrectYear | Month and day of birth are matching but year is different. | 12/04/1978 --> 12/04/xx (all years will match) |
IncorrectMonth | Year and day of birth are matching but month is different. | 12/04/1978 --> 12/xx/1978 (all months will match) |
OneDayAfter | Dates of birth are 1 day apart. | 2001-05-31 --> 2001-06-01 |
OneDayBefore | Dates of birth are 1 day apart. | 2001-05-31 --> 2001-05-30 |
SwapNames | First and last name have been swapped. | John Smith -> Smith John |
Translate | First and last name are matching after translation. | Mateusz -> Matthew; אביחי -> Avihai |
Transliterate | First and last name are matching after transliteration into the Latin alphabet. | Андрей Печонкин -> Andrey Pechonkin |
MergeLetters | First and last name are matching after 2 consecutive letters have been merged into 1. | "Johhn Smith" --> "John Smith" |
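
The rule given in the "Original" row for choosing which date description to show can be sketched as follows. The function name `date_transformations_to_describe` is hypothetical; its inputs are the queriedDateOfBirth and matchedDateOfBirth values received from Connect ID. This article does not say what to show when neither value is "Original", so the sketch simply returns both names, which is an assumption.

```python
from __future__ import annotations


def date_transformations_to_describe(queried: str, matched: str) -> list[str]:
    """Return the transformation name(s) whose description should be shown.

    `queried` and `matched` are the queriedDateOfBirth and matchedDateOfBirth
    values. Illustrative only; this is not part of Connect ID.
    """
    if queried == "Original" and matched == "Original":
        # Both sides used the original date: the date of birth matched exactly.
        return ["Original"]
    if queried == "Original":
        # Describe the transformation applied on the matched side.
        return [matched]
    if matched == "Original":
        # Describe the transformation applied on the queried side.
        return [queried]
    # Neither side is "Original": not covered by this article (assumption: show both).
    return [queried, matched]


print(date_transformations_to_describe("Original", "IncorrectDay"))  # ['IncorrectDay']
```

The returned names can then be looked up in the table above to obtain the text shown to the end user.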