Home addresses in the blood test records are also prone to typographical errors. Roughly 20% match exactly with our address dataset. Another 75% match after cleaning using regular expressions. Another 1% are processed using a fuzzy geocoder, leaving 4% of test addresses unresolved. After cleaning, we collect and generate three kinds of features: