This week let’s delve into an issue that affects many data sets – duplication. A duplicate is essentially the same entity existing more than once within the same data set.

Duplicates arise for a number of reasons. The main one is that multiple data sets are combined to create a single repository of information. For example, data from inbound leads could be combined with web-based data and purchased lists. Combining lists from different sources can mean an entity is imported more than once, and so we end up with duplication within the database.

So, what makes us want to ensure there are no duplicates? There are three main reasons:

  1. Appearance,
  2. Perception, and
  3. Minimising wastage.

Not only does duplicated data cost us time, effort and money for no extra return, it also creates the perception of an unprofessional, uncaring business that doesn't take its data (and therefore its customers) seriously. Think about it yourself: if you were to receive the same piece of marketing more than once, how would you feel? Is that business looking after you, or treating you like a number?

Organisations that care about their data (and you all do, otherwise you wouldn't be using DataTools products or reading this blog!) can take a number of steps to remove duplicates from their systems. Some front-end applications allow a rudimentary form of de-duplication when importing data, for example excluding records that share a phone number or an address. Depending upon your application this can be sufficient, yet for most it is far too basic: you need something that will handle not only the easy-to-find duplicates, but also those you could never hope to find through normal channels.
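To make that limitation concrete, here is a minimal sketch of the exact-key style of import de-duplication described above. The field names and records are invented purely for illustration:

```python
# A minimal sketch of rudimentary, exact-key de-duplication: records are
# dropped only if the chosen key (here, a phone number) matches exactly.
# Field names and sample records are illustrative only.

def dedupe_by_key(records, key):
    """Keep the first record seen for each exact key value."""
    seen = set()
    unique = []
    for record in records:
        value = record.get(key)
        if value in seen:
            continue  # exact duplicate on this key: skip it
        seen.add(value)
        unique.append(record)
    return unique

records = [
    {"name": "Jane Citizen", "phone": "02 9687 4666"},
    {"name": "Jane Citizen", "phone": "(02) 9687 4666"},  # same person, different formatting
]

# Exact matching keeps both rows, because the phone numbers differ
# only in punctuation; this duplicate slips straight through.
print(len(dedupe_by_key(records, "phone")))  # prints 2
```

Because the two phone numbers differ only in formatting, an exact-key check keeps both records. Variations like this are precisely the duplicates that dedicated matching software is built to catch.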

This is why we recommend our Twins software for all your de-duplication requirements. Twins is available in an easy-to-use, stand-alone desktop version, and as an API component. Either approach will rid your database of duplicates, yet only the API version allows for a fully automated system to help you keep your database pristine.

So why should you care about duplicates? Here is an example nearly any business can relate to. Most businesses have a database of Customers and Prospects, normally kept in distinct tables or systems, or simply identified by a status field within the record. You may want to generate additional long-term revenue by sending out an acquisition campaign to increase your customer base. This campaign could be something basic, with enticements embedded in it purely to acquire additional customers from your prospect list. With each mail piece costing upwards of $0.90 – $1.00, not to mention the cost of the enticement itself and the labour involved in handling the response or the subsequent follow-up, the cost of sending this offer to existing customers quickly adds up, with no return on the investment. If your database were effectively cleaned and duplicate free, that wasted exposure to existing customers would be eliminated entirely.
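A quick back-of-the-envelope calculation shows how fast this adds up. The per-piece cost comes from the paragraph above; the list sizes are hypothetical figures chosen for illustration:

```python
# Estimating wasted mail spend from duplicated customers in a prospect list.
# Only the per-piece cost comes from the article; list sizes are invented.

cost_per_piece = 0.95        # midpoint of the $0.90 - $1.00 mail cost
prospects = 10_000           # hypothetical prospect list size
duplicated_customers = 800   # hypothetical existing customers hiding in that list

wasted_spend = duplicated_customers * cost_per_piece
print(f"Wasted mail spend: ${wasted_spend:,.2f}")  # $760.00
```

And that $760 is before you count the enticements redeemed by people who were already customers, or the labour spent processing their responses.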

There is a great deal to think about when you come to remove duplicates from your database, but with the DataTools Twins software the process itself isn't complex at all. Each business scenario is potentially different, which is why Twins allows for varying degrees of matching, from a looser match through to a tighter match, depending upon your requirements. But more on that next time.
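As a rough illustration of the looser-versus-tighter idea, matching can be thought of as a similarity score compared against a configurable threshold. The sketch below uses Python's standard-library SequenceMatcher purely to demonstrate the concept; it is not how Twins works internally:

```python
# Illustrating "looser" vs "tighter" matching as a similarity threshold.
# SequenceMatcher is used only as a stand-in scoring function.

from difflib import SequenceMatcher

def is_duplicate(a: str, b: str, threshold: float) -> bool:
    """Treat two values as duplicates when their similarity meets the threshold."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold

pair = ("Jon Smith", "John Smyth")
print(is_duplicate(*pair, threshold=0.95))  # tight match: False
print(is_duplicate(*pair, threshold=0.75))  # loose match: True
```

A tight threshold only flags near-identical records, while a looser one catches misspellings and variations at the cost of some false positives. Choosing where to sit on that spectrum is the business decision Twins puts in your hands.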

So just think about your database, and ask yourself: when was the last time you did a proper audit of the level of duplication within the system?

Want to know more about our Twins software and how it can help you with duplicates? Call our Sales team on (02) 9687 4666.