Standardizing U.S. county names

I have a table with a column listing Michigan counties and I’d like to check whether they are all spelled correctly. Is there a lookup table or function in Exploratory that would check the county names and correct any misspellings?

Hi Gustavo,

There is a ‘countycode’ function. You may want to try it and see whether it returns appropriate results for your data.

https://exploratory.io/reference/#countycode

Thanks Kan! I did use the ‘countycode’ function. I had to make some minor adjustments and add a column with the state name (MI) to make the function work. If all your counties are spelled correctly, it works flawlessly, but if there are misspellings or spelling variants, it won’t match those (e.g., Saint vs. St.). Fortunately, I only had a couple of misspellings. It is nice that the function adds the FIPS codes, which are much cleaner and easier to use for matching as long as FIPS codes are present in the other data sets.

Wish List: Is there a way to correct misspellings automatically? It wasn’t a big deal because I was only looking at counties in Michigan (83), but if I were to do cities or counties across the whole country, that would be much harder to do manually.

My point was that if there are any values that the countycode function didn’t match, they are most likely misspellings. Once you have identified them, it becomes an easier problem.
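To illustrate the idea of isolating the unmatched values first, here is a minimal sketch in Python. It is not how countycode works internally. The reference list is a small hand-picked sample of Michigan county names (a real check would load all 83), and `find_unmatched` is a hypothetical helper name used only for this example.

```python
# Partial reference list for illustration only; a real check would use all 83 counties.
REFERENCE = {"Wayne", "Oakland", "Macomb", "Kent", "St. Clair", "St. Joseph", "Washtenaw"}

def find_unmatched(names, reference=REFERENCE):
    """Return the distinct names that are not in the reference list, in first-seen order.

    The comparison ignores case and surrounding whitespace.
    """
    ref = {r.casefold() for r in reference}
    unmatched = []
    for name in names:
        key = name.strip().casefold()
        if key not in ref and name not in unmatched:
            unmatched.append(name)
    return unmatched

observed = ["Wayne", "Saint Clair", "Oakland", "Washtenau", "kent "]
print(find_unmatched(observed))
```

With this sample input, only "Saint Clair" and "Washtenau" are reported, so those are the only values that need a closer look.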

Automatically correcting the misspellings is a separate problem, because people misspell names in many different ways. There is an R package called ‘stringdist’ that provides this kind of approximate string matching, which you might want to look into.
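As a rough sketch of what approximate matching looks like, the Python example below uses the standard-library difflib module rather than stringdist, so treat it as an illustration of the technique and not as the stringdist API. The reference list is again a small sample, the 0.8 similarity cutoff is an arbitrary assumption that would need tuning, and `normalize` and `suggest` are hypothetical helper names.

```python
import difflib
import re

# Partial reference list for illustration only.
REFERENCE = ["Wayne", "Oakland", "Macomb", "Kent", "St. Clair", "St. Joseph", "Washtenaw"]

def normalize(name):
    """Collapse whitespace and rewrite a leading 'Saint' or 'St' as 'St. '."""
    name = " ".join(name.split())
    return re.sub(r"^(saint|st\.?)\s+", "St. ", name, flags=re.IGNORECASE)

def suggest(name, reference=REFERENCE, cutoff=0.8):
    """Return the closest reference name, or None if nothing is similar enough."""
    # Compare case-insensitively, but return the reference spelling.
    by_key = {r.casefold(): r for r in reference}
    matches = difflib.get_close_matches(
        normalize(name).casefold(), list(by_key), n=1, cutoff=cutoff
    )
    return by_key[matches[0]] if matches else None

for raw in ["Saint Clair", "Washtenau", "Gotham"]:
    print(raw, "->", suggest(raw))
```

Here "Saint Clair" is resolved by the explicit Saint/St. rule and "Washtenau" by similarity, while "Gotham" gets no suggestion. Suggestions like these are best reviewed by hand before being applied, since a loose cutoff can map a name to the wrong county.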