Our AI-generated summary
In Customer Relationship Management (CRM), the reliability and quality of client registration data is critical to the success of targeted retention campaigns. The challenge grows when a decentralized registration process draws data from diverse sources, which makes the database more susceptible to duplicated records and data-entry errors.
Existing tools can detect duplicate entries, but they often rely on an exact match of a particular key built from unique personal data such as phone numbers and email addresses. This approach overlooks other important problems: invalid information or characters in certain fields, and near matches (not exact matches) that are almost certainly typing errors. Both leave inconsistencies in the database.
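To make the limitation concrete, here is a minimal Python sketch of an exact-match key next to a character-level similarity score. The records, field names, and the choice of `difflib.SequenceMatcher` are assumptions made for illustration; they are not the tools or data used in the project.

```python
from difflib import SequenceMatcher

def exact_key(record):
    """Exact-match key built from phone and email (hypothetical field names)."""
    return (record["phone"], record["email"])

def similarity(a, b):
    """Character-level similarity in [0, 1], case-insensitive."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

# Two made-up records that differ by a single mistyped letter.
a = {"name": "Maria Fernandes", "phone": "912345678",
     "email": "maria.fernandes@example.com"}
b = {"name": "Maria Fernades", "phone": "912345678",
     "email": "maria.fernades@example.com"}

# The exact key treats the two records as different contacts...
same_key = exact_key(a) == exact_key(b)

# ...while a similarity score shows that the emails are nearly identical.
email_score = similarity(a["email"], b["email"])
```

An exact-match check reports these two records as distinct, while the similarity score stays close to 1, which is the kind of near match the paragraph above describes.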
To address this issue, we partnered with a car retailing group on a project whose primary goal was to establish a unified client database while improving the overall accuracy and usability of the CRM. To streamline direct marketing and sales efforts, the project set out to implement a more efficient, automated approach to resolving duplicates, given the role data quality plays in targeted retention campaigns.
The strategy followed can be summarized in four steps:
- Selecting potential duplicate contact pairs: Identify pairs of contacts that may be duplicates based on shared personal information such as phone numbers, emails, or ownership of the same vehicle.
- Classifying duplicates using similarity scores: Assess the similarity between contact pairs through metrics like character and phonetic similarity. Use these scores to train a classification model, determining the probability of duplication for each pair.
- Creating duplicate groups: Group identified duplicate pairs. Each group should ultimately be represented by a single contact entry in the database.
- Consolidating client information: For the chosen contact in each group that will remain in the database, consolidate the client information by adopting the most reliable and up-to-date data available.
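The four steps above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the project's implementation: the sample records and field names are invented, a fixed similarity threshold stands in for the trained classification model, phonetic similarity is omitted, and "most reliable and up-to-date" is approximated by the most recently updated non-empty value.

```python
from collections import defaultdict
from difflib import SequenceMatcher
from itertools import combinations

# Hypothetical sample records; all field names are assumptions.
contacts = [
    {"id": 1, "name": "Joao Silva", "phone": "912345678",
     "email": "joao.silva@example.com", "updated": "2021-03-01"},
    {"id": 2, "name": "Joao Sylva", "phone": "912345678",
     "email": "", "updated": "2023-06-15"},
    {"id": 3, "name": "Ana Costa", "phone": "934567890",
     "email": "ana.costa@example.com", "updated": "2022-01-10"},
    {"id": 4, "name": "Rui Mendes", "phone": "934567890",
     "email": "rui.mendes@example.com", "updated": "2020-11-30"},
]

def candidate_pairs(records, keys=("phone", "email")):
    """Step 1: pair contacts that share a non-empty value on any key."""
    buckets = defaultdict(set)
    for r in records:
        for k in keys:
            if r[k]:
                buckets[(k, r[k])].add(r["id"])
    pairs = set()
    for ids in buckets.values():
        pairs.update(combinations(sorted(ids), 2))
    return sorted(pairs)

def name_similarity(a, b):
    """Character-level similarity in [0, 1], case-insensitive."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def is_duplicate(a, b, threshold=0.8):
    """Step 2 stand-in: a fixed threshold replaces the trained classifier."""
    return name_similarity(a["name"], b["name"]) >= threshold

def group_duplicates(ids, duplicate_pairs):
    """Step 3: merge duplicate pairs into groups with union-find."""
    parent = {i: i for i in ids}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in duplicate_pairs:
        parent[find(b)] = find(a)
    groups = defaultdict(list)
    for i in ids:
        groups[find(i)].append(i)
    return [sorted(g) for g in groups.values()]

def consolidate(group_records):
    """Step 4: per field, keep the newest non-empty value (ISO dates sort as text)."""
    newest_first = sorted(group_records, key=lambda r: r["updated"], reverse=True)
    merged = {"id": min(r["id"] for r in group_records)}
    for field in ("name", "phone", "email", "updated"):
        merged[field] = next((r[field] for r in newest_first if r[field]), "")
    return merged

by_id = {r["id"]: r for r in contacts}
pairs = candidate_pairs(contacts)
duplicates = [(a, b) for a, b in pairs if is_duplicate(by_id[a], by_id[b])]
groups = group_duplicates(sorted(by_id), duplicates)
unified = [consolidate([by_id[i] for i in g]) for g in groups]
```

In this toy data, contacts 3 and 4 share a phone number and so become a candidate pair, but their names are too different to be classified as duplicates. That is why candidate selection and classification are kept as separate steps. Grouping with union-find also handles chains, where A matches B and B matches C, by placing all three in one group.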
Manual, one-to-one correction of duplicated contacts is tedious and time-consuming, so our approach relied on machine learning models to detect duplicates automatically. Throughout the project, discussions with stakeholders helped us move from theoretical frameworks to real-world application and check our assumptions against practical constraints.
As a result of these efforts, we were able to detect and remove duplicate contacts amounting to 12% of the CRM database.
The remaining contacts in the database were consolidated to keep the most trustworthy data for each field. We also proposed a detailed process for converting invalid phone numbers and emails into a standardized format, strengthening the database against potential inconsistencies. Finally, we recommended adopting a more robust unique key system to minimize the risk of duplicates and protect long-term database integrity.
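As an illustration of what standardizing phone numbers and emails can look like, here is a minimal Python sketch. The rules it applies (keeping only digits, treating a leading "00" as an international prefix, accepting 7 to 15 digits, lowercasing emails, and a deliberately loose email pattern) are assumptions chosen for the example, not the process proposed in the project.

```python
import re

# Loose shape check: something@something.something, with no spaces.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

def normalize_phone(raw):
    """Return digits with an optional leading '+', or None if implausible."""
    text = raw.strip()
    international = text.startswith("+") or text.startswith("00")
    digits = re.sub(r"\D", "", text)
    if text.startswith("00"):
        digits = digits[2:]  # drop the "00" international prefix
    if not 7 <= len(digits) <= 15:  # assumed plausible length range
        return None
    return ("+" if international else "") + digits

def normalize_email(raw):
    """Remove whitespace and lowercase; return None if the shape is invalid."""
    text = re.sub(r"\s+", "", raw).lower()
    return text if EMAIL_PATTERN.match(text) else None

phones = [normalize_phone(p) for p in
          [" +351 912-345-678 ", "00351 912 345 678", "912.345.678", "n/a"]]
emails = [normalize_email(e) for e in
          [" Joao.Silva@Example.COM ", "joao.silva(at)example"]]
```

Values that cannot be repaired come back as `None` rather than being guessed, so they can be flagged for review. Normalizing before comparison also means that two spellings of the same phone number produce the same key.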