Match*Pro is an application designed to find records that refer to the same entity across different data sources. This matching process, also known as record linkage, data matching, or entity resolution, compares identifiers such as name, social security number, or address between two or more data files. Match*Pro links files using a probabilistic record linkage framework based on the Fellegi and Sunter model. Probabilistic record linkage is a method that uses probabilities to determine when a given pair of records is a match.
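As a rough illustration of the Fellegi and Sunter idea, the sketch below scores a record pair by summing per-field agreement and disagreement weights, then sorts the pair into match, non-match, or manual review. It is not Match*Pro's implementation. The field names, the m- and u-probabilities, and the two thresholds are hypothetical values chosen only for the example.

```python
import math

# Hypothetical probabilities, for illustration only.
# m = P(field agrees | the two records are a true match)
# u = P(field agrees | the two records are not a match)
FIELD_PROBS = {
    "last_name": (0.95, 0.01),
    "birth_date": (0.98, 0.003),
    "zip_code": (0.90, 0.05),
}


def field_weight(m, u, agrees):
    """Return the log2 likelihood-ratio weight for one field comparison."""
    if agrees:
        return math.log2(m / u)              # positive: evidence for a match
    return math.log2((1 - m) / (1 - u))      # negative: evidence against


def pair_score(rec_a, rec_b, probs=FIELD_PROBS):
    """Sum the field weights for a pair using exact-equality comparison."""
    total = 0.0
    for field, (m, u) in probs.items():
        total += field_weight(m, u, rec_a[field] == rec_b[field])
    return total


def classify(score, lower=0.0, upper=10.0):
    """Apply two hypothetical thresholds to a pair score."""
    if score >= upper:
        return "match"
    if score <= lower:
        return "non-match"
    return "review"


a = {"last_name": "SMITH", "birth_date": "1980-01-02", "zip_code": "20850"}
b = {"last_name": "SMITH", "birth_date": "1980-01-20", "zip_code": "20850"}
print(classify(pair_score(a, b)))
```

Pairs that fall between the two thresholds are the uncertain matches that would typically go to manual review.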
Match*Pro includes tools to assess the quality of the linkage data (e.g., data validation). It provides pre-defined field validators (e.g., name, date, SSN, telephone number) and also lets users define custom validators. Validating data files and resolving data errors before performing a record linkage increases the chances of locating high-quality record matches.
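The general idea of field validation can be sketched as a set of per-field checks run over each record before linkage. The example below is an assumption-laden illustration, not Match*Pro's validator logic: the function names, the field names, the SSN format check, and the `YYYY-MM-DD` date format are all hypothetical.

```python
import re
from datetime import datetime


def valid_ssn(value):
    """Format check only: nine digits, with optional dashes."""
    return re.fullmatch(r"\d{3}-?\d{2}-?\d{4}", value) is not None


def valid_date(value, fmt="%Y-%m-%d"):
    """True if the value parses as a real calendar date in the given format."""
    try:
        datetime.strptime(value, fmt)
        return True
    except ValueError:
        return False


# Hypothetical mapping of field name to validator; a custom validator
# is just another function added to this dictionary.
VALIDATORS = {"ssn": valid_ssn, "birth_date": valid_date}


def validate_record(record, validators=VALIDATORS):
    """Return the names of the fields that fail their validator."""
    return [name for name, check in validators.items()
            if not check(record.get(name, ""))]


record = {"ssn": "123-45-6789", "birth_date": "1980-02-30"}
print(validate_record(record))
```

Here the date fails because February 30 is not a calendar date, which is the kind of error worth resolving before linkage.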
Match*Pro has a user-friendly, tabbed interface to configure linkages and manually review uncertain matches. The linkage configuration feature is flexible and allows users to specify blocking and matching methods, adjust the blocking sensitivity, define unknown values, set weights, and perform substitutions of matching fields. The manual review screen is color-coded to quickly show the match status of each linked pair. The application also includes tools to filter, categorize, and export the linkage results.
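Blocking, in general terms, limits comparisons to record pairs that share a coarse key, so that not every record in one file is compared with every record in the other. The sketch below shows that general technique only. The blocking key (first letter of last name plus birth year) and the field names are hypothetical and do not reflect Match*Pro's blocking methods.

```python
from collections import defaultdict


def block_key(record):
    """Hypothetical blocking key: last-name initial plus birth year."""
    return (record["last_name"][:1].upper(), record["birth_date"][:4])


def candidate_pairs(file_a, file_b, key=block_key):
    """Yield only the pairs whose records fall in the same block."""
    index = defaultdict(list)
    for rec in file_b:
        index[key(rec)].append(rec)
    for rec_a in file_a:
        for rec_b in index.get(key(rec_a), []):
            yield rec_a, rec_b


file_a = [
    {"last_name": "Smith", "birth_date": "1980-01-02"},
    {"last_name": "Jones", "birth_date": "1975-05-05"},
]
file_b = [
    {"last_name": "Smyth", "birth_date": "1980-03-04"},
    {"last_name": "Stone", "birth_date": "1981-01-01"},
    {"last_name": "Jones", "birth_date": "1975-05-05"},
]
pairs = list(candidate_pairs(file_a, file_b))
print(len(pairs), "candidate pairs instead of", len(file_a) * len(file_b))
```

A looser key produces more candidate pairs and fewer missed matches, while a stricter key does the opposite, which is the trade-off a blocking sensitivity setting controls.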
The software can link data using either raw personally identifiable information (PII) or privacy-preserving record linkage (PPRL) techniques through hashed tokens, which allows datasets to be linked while minimizing the exposure of sensitive information. Users may create their own tokens for linkage via PPRL, or they can use a pre-defined token set that is built into the application. The built-in token set was designed to produce very few, if any, false positives, is HIPAA-compliant, and meets the Expert Determination Standard of HIPAA Privacy Rule §164.514(b)(1). A recent rigorous evaluation of the built-in token set demonstrated that the potential re-identification risk is very low [Kantarcioglu et al., 2025].
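To give a sense of how hashed tokens allow linkage without exchanging raw identifiers, the sketch below builds a keyed hash from normalized fields, so that two parties holding the same secret key produce the same token for the same person. This is a generic illustration of the technique, not the built-in Match*Pro token set, and it makes no claim about HIPAA compliance. The field names, the normalization rule, and the key are hypothetical.

```python
import hashlib
import hmac


def normalize(value):
    """Uppercase and keep only letters and digits so formatting differences vanish."""
    return "".join(ch for ch in value.upper() if ch.isalnum())


def make_token(record, fields, secret_key):
    """Return a hex HMAC-SHA256 token over the normalized, joined fields."""
    message = "|".join(normalize(record[f]) for f in fields)
    return hmac.new(secret_key, message.encode("utf-8"), hashlib.sha256).hexdigest()


# Hypothetical shared key and token definition, for illustration only.
KEY = b"example-shared-secret"
FIELDS = ("first_name", "last_name", "birth_date")

site_a = {"first_name": "Mary-Ann", "last_name": "O'Neil", "birth_date": "1980-01-02"}
site_b = {"first_name": "mary ann", "last_name": "ONEIL", "birth_date": "19800102"}

# The two sites would exchange only tokens, never the underlying fields.
print(make_token(site_a, FIELDS, KEY) == make_token(site_b, FIELDS, KEY))
```

Because the token is an exact hash, any difference that survives normalization (for example, a misspelled name) yields a different token, which is one reason token definitions usually combine several field sets.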
Users should read the application's help system to learn more about the tool before configuring any linkages.
Refer to Getting Help for any additional questions.
Download Match*Pro
- Latest Release: Version 3.1 – October 16, 2025
Reference
Murat Kantarcioglu, Will Howe, Benmei Liu, Valentina Petkov, Esmeralda Casas-Silva, Diana Velasquez-Kolnik, Bradley A Malin, Lynne Penberthy, A novel analysis methodology for assessment of re-identification risks for the National Cancer Institute cancer registry privacy preserving record linkage technique, Journal of the American Medical Informatics Association, 2025; ocaf172, https://doi.org/10.1093/jamia/ocaf172
