Maturity Tracking Algorithm

The insured maturity tracking algorithm estimates how closely an insured in ClariNet matches published obituary information provided by a third party. Details of the matching algorithm are provided below.

Significant ClariNet Fields

Insured Name(s)
- Insured details tab
- Name from the contact set on the Insured Contact Details field
- Insured name variations
Insured Date of Birth
Insured Date of Death
Insured Addresses
- All addresses from the contact set on the Insured Contact Details field

Fields in external feed

Name components (e.g. first, middle, last)
Recent address information (e.g. state, city)
Date of Birth
Date of Death
Obituary/Obituaries text
- Information only, not used for matching

Matching algorithm

Fields are considered in groups:

Address
Name
Date of Birth
Date of Death

The best (most similar) match in each group is taken when considering the strength of the match. For example, an Insured in ClariNet may have many name variations defined; the most similar name will be used when calculating confidence.

Missing data points are skipped. For example, if the Insured in ClariNet only has a last name, the first/middle names in the external feed will not be considered. If the record in the external feed has no last name or maiden name, it is skipped entirely.

Fields are weighted so that more specific fields contribute a greater amount to the overall confidence score. For example, a name is more specific than a city, so it carries more weight in the confidence calculation.

The overall confidence score is a measure of how close to a perfect match the data is. For example, if we have names and DoB then a perfect match will be the sum of the weights of those fields. If the fields don’t match perfectly, then the confidence will be the sum of the weighted field matches divided by the “perfect match” value. This is the total similarity. Using the two-field example:

\frac{(\text{Name Weight} \times \text{Name Similarity}) + (\text{DoB Weight} \times \text{DoB Similarity})}{\text{Name Weight} + \text{DoB Weight}}

That is to say:

\text{Total Similarity} = \frac{\sum (\text{Field Weight} \times \text{Field Similarity})}{\sum \text{Field Weight}}
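The calculation above can be sketched in a few lines of Python. This is only an illustration of the formula, not ClearLife's implementation: the weight values and the names `EXAMPLE_WEIGHTS` and `total_similarity` are assumptions made up for the example, since the real weightings are not published here.

```python
from typing import Mapping, Optional

# Hypothetical weights, for illustration only. The real values are
# controlled by ClearLife and are not given in this document.
EXAMPLE_WEIGHTS = {"name": 10.0, "dob": 6.0, "dod": 6.0, "address": 2.0}


def total_similarity(
    similarities: Mapping[str, Optional[float]],
    weights: Mapping[str, float],
) -> Optional[float]:
    """Weighted average of per-group similarities (each in the range 0..1).

    Groups whose similarity is None (missing data) are skipped, so they
    count towards neither the actual score nor the "perfect match" value.
    """
    present = {group: s for group, s in similarities.items() if s is not None}
    perfect = sum(weights[group] for group in present)
    if perfect == 0:
        return None  # nothing could be compared
    actual = sum(weights[group] * s for group, s in present.items())
    return actual / perfect


# Two-field example: name is a close match, DoB is an exact match.
score = total_similarity(
    {"name": 0.9, "dob": 1.0, "dod": None, "address": None},
    EXAMPLE_WEIGHTS,
)
print(score)  # (10 * 0.9 + 6 * 1.0) / (10 + 6)
```

Because missing groups are left out of both sums, a record with only a name and a DoB can still reach a total similarity of 1.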

Field weightings are controlled by ClearLife and have been calculated to slightly favour false positives rather than missing a potential match. In practice, this means that names are weighted much more strongly than fields such as address, which may be out of date.

Address match

Address components are matched individually. For each component, a similarity score between ClariNet and the external feed is calculated. This allows for data entry mistakes (for example, “Columbs” instead of “Columbus”) without producing too many false positive matches.
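As a rough illustration of component-wise fuzzy matching, the sketch below uses `difflib.SequenceMatcher` from the Python standard library. The document does not say which string similarity measure is actually used, so this choice is an assumption, as are the function names and the dictionary keys (`city`, `state`).

```python
from difflib import SequenceMatcher
from typing import Dict


def component_similarity(a: str, b: str) -> float:
    """Similarity in the range 0..1, ignoring case and surrounding spaces."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def address_similarities(clarinet: Dict[str, str], feed: Dict[str, str]) -> Dict[str, float]:
    """Score each address component that is present on both sides."""
    return {
        key: component_similarity(clarinet[key], feed[key])
        for key in clarinet.keys() & feed.keys()
        if clarinet[key] and feed[key]  # missing data points are skipped
    }


scores = address_similarities(
    {"city": "Columbs", "state": "OH"},   # ClariNet, with a typo in the city
    {"city": "Columbus", "state": "OH"},  # external feed
)
print(scores)
```

With this measure the misspelled city still scores above 0.9, while an unrelated city such as "Cleveland" scores far lower.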

Name match

Name matching follows a similar rule to addresses. However, common patterns are also tried; for example, the middle name is also tried as the first name.

The individual components of the name are not considered separately; a full name is generated from the components and compared.
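A minimal sketch of this idea follows, assuming the same standard-library `difflib.SequenceMatcher` measure as a stand-in for the real similarity function. Only the one pattern named above (middle name tried as first name) is generated; the function names are hypothetical and any other patterns the real algorithm tries are not shown.

```python
from difflib import SequenceMatcher
from typing import Iterable, List, Optional


def full_name(*parts: Optional[str]) -> str:
    """Join the non-empty name components into one normalised string."""
    return " ".join(p.strip().lower() for p in parts if p and p.strip())


def feed_name_candidates(first: Optional[str], middle: Optional[str], last: str) -> List[str]:
    """Full names to try for one record in the external feed."""
    candidates = [full_name(first, middle, last)]
    if middle:
        # Common pattern: the person is known by their middle name.
        candidates.append(full_name(middle, last))
    return candidates


def best_name_similarity(
    clarinet_names: Iterable[str],
    first: Optional[str],
    middle: Optional[str],
    last: str,
) -> float:
    """Best similarity over every ClariNet name variation and feed candidate."""
    candidates = feed_name_candidates(first, middle, last)
    return max(
        (
            SequenceMatcher(None, full_name(name), candidate).ratio()
            for name in clarinet_names
            for candidate in candidates
        ),
        default=0.0,
    )


# ClariNet holds "James Smith"; the feed lists Robert James Smith.
print(best_name_similarity(["James Smith", "Jim Smith"], "Robert", "James", "Smith"))
```

Taking the maximum over all pairs mirrors the rule that the best match within the name group is the one used for the confidence score.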

Date of Birth/Death match

Dates in the external feed may have limited information:

Year only
Year and Month only
Full date

Dates with limited information are considered with half the weight of a full date match.

Similarity is scored from the number of days between the two dates. Zero days difference constitutes a perfect match.
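The date rules can be sketched as below. Several details are assumptions, because the document does not specify them: how the score decays as the day difference grows (a linear fall-off to zero at a hypothetical `MAX_DAYS` is used here), and how a partial date is compared (here it is treated as the range of days it could represent, and the distance is measured to the nearest day in that range).

```python
import calendar
from datetime import date
from typing import Optional, Tuple

MAX_DAYS = 365  # hypothetical cut-off: differences this large or larger score 0


def feed_date_range(
    year: int, month: Optional[int] = None, day: Optional[int] = None
) -> Tuple[date, date, bool]:
    """Return (earliest, latest, is_full_date) for a possibly partial feed date."""
    if month is None:  # year only
        return date(year, 1, 1), date(year, 12, 31), False
    if day is None:  # year and month only
        last_day = calendar.monthrange(year, month)[1]
        return date(year, month, 1), date(year, month, last_day), False
    exact = date(year, month, day)
    return exact, exact, True


def date_similarity(
    clarinet: date, year: int, month: Optional[int] = None, day: Optional[int] = None
) -> Tuple[float, float]:
    """Return (similarity, weight_factor) for a ClariNet date against a feed date."""
    earliest, latest, is_full = feed_date_range(year, month, day)
    if clarinet < earliest:
        days = (earliest - clarinet).days
    elif clarinet > latest:
        days = (clarinet - latest).days
    else:
        days = 0  # the ClariNet date is consistent with the feed date
    similarity = max(0.0, 1.0 - days / MAX_DAYS)
    # Dates with limited information carry half the weight of a full date.
    return similarity, (1.0 if is_full else 0.5)


print(date_similarity(date(1940, 5, 17), 1940, 5, 17))  # full date, exact match
print(date_similarity(date(1940, 5, 17), 1940))         # year only
```

The returned weight factor would be multiplied into the field weight for Date of Birth or Date of Death before the total similarity is calculated.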

Match rank/confidence

Once the total similarity is calculated (the ratio of the actual match score to the perfect match score), any score below a configured minimum value is rejected (rank 0). Anything meeting the minimum similarity is assigned a rank of 1-4, labelled in order as follows:

Low
Medium
Medium
High

Visually, that looks like this:

[Figure: Maturity Tracking Algorithm]
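The ranking step can also be sketched in code. The minimum similarity and the boundaries between ranks 1-4 are configured values that are not stated in this document, so `MIN_SIMILARITY` and the equal-width bands below are purely hypothetical; the labels are taken in the order listed above.

```python
MIN_SIMILARITY = 0.6  # hypothetical configured minimum
RANK_LABELS = {1: "Low", 2: "Medium", 3: "Medium", 4: "High"}


def rank(total_similarity: float, minimum: float = MIN_SIMILARITY) -> int:
    """Map a total similarity (0..1) to a rank; 0 means the match is rejected."""
    if total_similarity < minimum:
        return 0
    # Assumption: the range from the minimum up to 1.0 is split into four equal bands.
    band_width = (1.0 - minimum) / 4
    return min(4, 1 + int((total_similarity - minimum) / band_width))


for value in (0.50, 0.65, 0.95):
    r = rank(value)
    print(value, r, RANK_LABELS.get(r, "Rejected"))
```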