Cooking Up Success: A Recipe to Normalise Data
Remember your first kitchen and the thrill of having your own culinary space?
(Or if you don’t have one yet, then we totally understand – the economy is in shambles.)
Picture a fridge stocked with your favourite drinks, weekends reserved for gourmet experiments, and the most fragrant cleaning products to keep the grime at bay.
Yet, reality often falls short of our expectations. Organising turns into a nightmare, food spoils before it’s cooked, and no amount of cleaning products can hide the mundane reality of household chores.
It takes weeks, maybe months, to find a comfortable routine…and even then, is it the best you can do?
In real life, the answer is often elusive. Thankfully with data, it’s pretty straightforward.
Customer Information
What was missing from that picture? The hard work of just organising and planning things. Thinking of what to buy, where to put it, how to prep it for the week ahead, where you’re going to find the time and strength to do everything…
Now, one way to cut through that dreary workload is normalisation. In techspeak, this refers to creating standards for a particular dataset.
When it comes to normalising a dataset, we need to know what it is we want to normalise. The key fields that should come into play are:
- Company Name
- Address
- Suburb
- State
- Postcode
- Country
- First Name
- Last Name
- Phone (Phone2/Mobile)
- Email Address
Fields such as Company Name, First Name, Last Name and Email Address will each need to be handled a little differently.
For Company Name, proper case would be a great start, as would standardising PTY LTD across all the different iterations of it.
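To sketch what that might look like (the function and pattern names here are hypothetical, not a production-grade implementation), a Python pass could proper-case the name and collapse the usual spellings of PTY LTD in one go:

```python
import re

# Collapse the common spellings of "Pty Ltd" (PTY. LTD., Pty Limited, etc.)
# into one standard form.
PTY_LTD_PATTERN = re.compile(r"\bpty\.?\s+(ltd\.?|limited)\b", re.IGNORECASE)

def normalise_company(name: str) -> str:
    name = " ".join(name.split())  # collapse stray whitespace
    name = name.title()            # simple proper case
    return PTY_LTD_PATTERN.sub("Pty Ltd", name)

print(normalise_company("ACME WIDGETS PTY. LIMITED"))  # Acme Widgets Pty Ltd
```

Note that `str.title()` is a blunt instrument for company names (it will mangle acronyms like "BHP"), so treat this as a starting point to build exception rules on top of.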
First Name and Last Name would also be proper cased, but certain care needs to be taken when dealing with surnames such as McBride or O’Brien.
Simple proper casing of those two examples won’t work, so it’s crucial to implement specific rules that address unique capitalisation patterns.
One way is to develop an algorithm that recognises prefixes like “Mc” and “O’” and automatically capitalises the subsequent letter. Additionally, provide a manual review system to correct errors and a feedback mechanism to continuously refine the process.
This approach ensures that names are accurately and respectfully stored, enhancing data integrity and reducing errors.
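Such an algorithm might start out like the sketch below. The rule set is deliberately tiny (just the “Mc” case; Python’s `str.title()` already handles the apostrophe in O’Brien), and real data will need more rules for names like van der Berg or hyphenated surnames:

```python
import re

def proper_case_name(name: str) -> str:
    """Proper-case a surname, handling the "Mc" prefix.

    A sketch only: extend with rules for "van der", hyphens, etc.
    """
    name = name.strip().title()  # "mcbride" -> "Mcbride", "o'brien" -> "O'Brien"
    # str.title() already capitalises after an apostrophe, but not after "Mc",
    # so capitalise the letter following that prefix explicitly.
    name = re.sub(r"\bMc([a-z])", lambda m: "Mc" + m.group(1).upper(), name)
    return name

print(proper_case_name("mcbride"))  # McBride
print(proper_case_name("O'BRIEN"))  # O'Brien
```

Anything the rules can’t confidently handle is exactly what the manual review and feedback loop mentioned above is for.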
Email
Email is another troublesome field, but there are a couple of things that can be done. Firstly, you’ll want to ensure that it conforms to the email ‘standard’, that being [emailnamehere@provider.suffix].
Once this is done, you can then look at the domain of the email (the part after the @) to ensure that the domain name actually matches a real one.
There are of course simple ones to check, like gmail.com, which should NEVER have anything after the .com. If you start your focus on the top 100 domains, you’ll be able to build your logic as you go.
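A minimal sketch of those two checks might look like the following. The email pattern is a deliberate simplification (the real email grammar in RFC 5322 is far more permissive), and the domain list is a hand-seeded stand-in for the top-100 list you’d actually build:

```python
import re

# Loose conformance check for the [emailnamehere@provider.suffix] 'standard'.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

# Seed list only -- in practice, build this out to the top 100 domains.
KNOWN_DOMAINS = {"gmail.com", "outlook.com", "yahoo.com"}

def check_email(address: str) -> bool:
    address = address.strip().lower()
    if not EMAIL_PATTERN.match(address):
        return False
    domain = address.split("@", 1)[1]
    for known in KNOWN_DOMAINS:
        # e.g. gmail.com should NEVER have anything after the .com
        if domain != known and domain.startswith(known + "."):
            return False
    return True

print(check_email("jane@gmail.com"))     # True
print(check_email("jane@gmail.com.au"))  # False
```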
Addresses
Address, Suburb, State, Postcode and Country can all be taken care of in one simple process!
Or you could have one person slog through all that information. But if you’ve been there, you know that it’s a waste of money, resources, and morale. Nobody’s got the time for that!
You could opt for a tool that can quickly and effectively parse this data into its normalised elements. Whether you need this done in real time or in bulk is entirely up to you, and you’ll need to consider how best to implement it in your system (web API, point and click, or a scripted interface?).
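If you do roll your own rather than use a tool, a bare-bones sketch might assume one well-behaved layout (here, the common Australian “street, suburb STATE postcode” single-line form, an assumption on our part) and punt everything else to manual review:

```python
import re

# Assumes a single-line "street, suburb STATE postcode" layout;
# anything messier returns None for manual review.
ADDRESS_PATTERN = re.compile(
    r"^(?P<address>.+?),\s*(?P<suburb>.+?)\s+"
    r"(?P<state>NSW|VIC|QLD|SA|WA|TAS|NT|ACT)\s+(?P<postcode>\d{4})$",
    re.IGNORECASE,
)

def parse_address(line):
    match = ADDRESS_PATTERN.match(line.strip())
    if not match:
        return None  # couldn't parse -- flag for manual review
    parts = match.groupdict()
    parts["state"] = parts["state"].upper()
    return parts

print(parse_address("12 Example St, Richmond VIC 3121"))
```

The gap between this sketch and real-world address data (unit numbers, PO boxes, misspelt suburbs, missing commas) is precisely why a dedicated parsing tool earns its keep.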
Phone Numbers
Phone numbers are the most mismanaged data element in all the datasets we’ve ever had to deal with. We’ve seen it all. We’ve had clients specifically request that ( ) be included within the phone element!
But the most common issue is that the leading zero has been dropped from the data.
The root cause is usually that someone has opened a CSV/TXT file in Excel, made some changes and then saved the results. Excel drops the zero…never to be seen again.
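One way to undo the damage, assuming Australian-style numbers (10 digits starting with 0, which is an assumption; international data needs extra rules), is to strip out the decoration and re-attach the zero where it has clearly gone missing:

```python
def normalise_phone(raw: str) -> str:
    """Strip decoration and restore a leading zero lost to Excel.

    Sketch assumes Australian numbers: 10 digits starting with 0.
    """
    digits = "".join(ch for ch in raw if ch.isdigit())
    # A 9-digit number starting with an area-code or mobile digit
    # (2, 3, 4, 7, 8) has almost certainly lost its leading zero.
    if len(digits) == 9 and digits[0] in "23478":
        digits = "0" + digits
    return digits

print(normalise_phone("(02) 9876 5432"))  # 0298765432
print(normalise_phone("298765432"))       # 0298765432
```

Keeping only digits also quietly disposes of those ( ) characters, whether clients asked for them or not.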
Whatever your normalisation strategy, ensure that:
- it’s applied consistently across your entire database
- you have back-end systems in place to monitor and resolve data that your front-end can’t.
And that’s how you clean up your data!