fuzzylink: Probabilistic Record Linkage Using Pretrained Text Embeddings
Links datasets through fuzzy string matching using pretrained text embeddings. Produces more accurate record linkage when lexical string distance metrics are a poor guide to match quality (e.g., "Patricia" is more lexically similar to "Patrick" than it is to "Trish"). Capable of performing multilingual record linkage. Methods are described in Ornstein (2025) <doi:10.1017/pan.2025.10016>.
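The "Patricia" example can be checked with a plain edit-distance calculation. The sketch below is not part of fuzzylink and does not use its API. It is a self-contained Python illustration that assumes Levenshtein distance on lowercased names as the lexical metric, and the helper name `levenshtein` is hypothetical.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance (insertions, deletions, substitutions) between a and b."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


# Assumption: names are lowercased before comparison.
candidates = ["patrick", "trish"]
distances = {name: levenshtein("patricia", name) for name in candidates}
closest = min(distances, key=distances.get)

print(distances)  # {'patrick': 2, 'trish': 5}
print(closest)    # patrick
```

A purely lexical matcher would therefore pair "Patricia" with "Patrick" rather than with the nickname "Trish", which is the failure case the description says embedding-based matching is meant to address.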
| Version: | 0.2.5 |
| Depends: | R (≥ 4.1.0) |
| Imports: | stats, utils, dplyr, Rfast, reshape2, stringdist, stringr, httr, jsonlite, httr2, ranger |
| Published: | 2025-08-29 |
| DOI: | 10.32614/CRAN.package.fuzzylink |
| Author: | Joe Ornstein [aut, cre, cph] |
| Maintainer: | Joe Ornstein <jornstein at uga.edu> |
| BugReports: | https://github.com/joeornstein/fuzzylink/issues |
| License: | MIT + file LICENSE |
| URL: | https://github.com/joeornstein/fuzzylink |
| NeedsCompilation: | no |
| Materials: | README, NEWS |
| CRAN checks: | fuzzylink results |
Linking:
Please use the canonical form https://CRAN.R-project.org/package=fuzzylink to link to this page.