Cloudingo (a data analysis and de-duping tool) has recently made some important changes to the way it “cleans names”, and as a result it incorrectly matches and merges records that are not really duplicates. Here’s just one example of the “new” Name Cleaned matching: “American University” and “American International Group” are identified as the same company and merged.
In a Support email, Cloudingo recommends that “if you are running mass or auto merging on an account filter with Company Name Cleaned, that you stop doing so.” They offer two alternatives, neither of which makes sense in the real world where people are actually trying to use the tool:
1 – Do a manual review/merge of any group of ‘matched records’. That defeats the purpose of purchasing an automated tool; there are many free tools that will help me find possible duplicate records.
2 – Add more matching criteria. This is also not practical: many of us de-dupe records as early as possible, precisely so that we are not spending time collecting additional data just to merge out a record.
Cloudingo Support also suggested some other equally “interesting” solutions: use first word match; use first N letters match; set up a fuzzy match. I still can’t figure out how any of these comes even close to providing what I need, which is the ‘name clean’ I’ve been using all along. Not to mention that I would have to tell my clients and colleagues to review and rebuild their existing (and previously tested and deployed) de-duping jobs.
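To make the concern concrete, here is a rough Python sketch of what those three suggestions might look like. It is only an illustration: the function names are mine, the fuzzy score uses Python's standard `difflib` as a stand-in, and none of this is Cloudingo's actual implementation.

```python
from difflib import SequenceMatcher


def first_word_key(name):
    """Hypothetical 'first word match' key: the first word, lowercased."""
    parts = name.lower().split()
    return parts[0] if parts else ""


def first_n_letters_key(name, n):
    """Hypothetical 'first N letters match' key: first n letters/digits."""
    letters = "".join(ch for ch in name.lower() if ch.isalnum())
    return letters[:n]


def fuzzy_score(a, b):
    """Stand-in fuzzy score between 0.0 and 1.0 (difflib similarity ratio)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


a = "American University"
b = "American International Group"

# Both keys below are "american" for the two names, so they would match.
print(first_word_key(a) == first_word_key(b))
print(first_n_letters_key(a, 8) == first_n_letters_key(b, 8))

# The fuzzy score depends on the algorithm and on the threshold you choose.
print(round(fuzzy_score(a, b), 2))
```

Under these assumptions, first word and first N letters (with N = 8) treat the two names from the example above as the same company, so they would not avoid the bad merge. A fuzzy match pushes the problem into picking a threshold, which again means testing and rebuilding existing jobs.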
From experimentation, some of the words that seem to have been added to the clean terms are: “ college”, “ center”, “ associates”, “ group ”, “ services”, “ partners ”, “ foundation”, “ solutions”. There could be others, but I was so discouraged by my findings that I stopped looking.
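Here is a minimal Python sketch of why dropping words like these is risky. The cleaning function is a hypothetical stand-in (Cloudingo's real logic is not public to me), the word list is only the one found by experimentation above, and the company names are made up for illustration.

```python
import re

# Words that appear to be stripped, based on experimentation only.
# This is not an official Cloudingo list.
STRIPPED_WORDS = ["college", "center", "associates", "group",
                  "services", "partners", "foundation", "solutions"]


def clean_name(name, stripped_words=STRIPPED_WORDS):
    """Hypothetical cleaner: lowercase, drop listed words, rejoin the rest."""
    words = re.findall(r"[a-z0-9]+", name.lower())
    kept = [w for w in words if w not in stripped_words]
    return " ".join(kept)


def group_by_cleaned_name(names):
    """Group original names under their cleaned form, as a matcher might."""
    groups = {}
    for name in names:
        groups.setdefault(clean_name(name), []).append(name)
    return groups


# Made-up account names, for illustration only.
names = ["Summit Group", "Summit Foundation", "Summit College",
         "Harbor Services"]
groups = group_by_cleaned_name(names)
for key, members in groups.items():
    print(key, members)
```

In this sketch the three “Summit” accounts all clean down to the same key, so an auto-merge job keyed on the cleaned name would treat a company, a charity and a school as one record. The more words that get stripped, the more unrelated accounts collapse this way.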
Cloudingo is a tool that I have used in the past and have highly recommended to my colleagues and clients (MTI is even a Referral Partner for them; we’ll have to see if they let us continue after this post). I don’t have insight into their product roadmap, but until they either let users control the clean terms or roll back these aggressive changes, I have to advise anyone (whether you’re my client or not) to be very, very careful about your Cloudingo de-duping jobs. Who knows what other changes Cloudingo could make without thinking through the impact on their users? And, as all of you know, trying to un-merge records that have been incorrectly merged costs more time and effort than the value any tool brings.