Study Notes
Why are the different types of data an important aspect to consider? Particularly at the prepare phase, you need to understand, first, which types of data are important for your organization and, second, how you are going to capture and store that information.
Now that process seems a little bit elementary, but in reality, if you haven’t prepared for this at the outset, you can imagine that your dataset may only be able to hold numerical values when you actually need a lot of qualitative understanding. The reverse may also be true: you may have a whole host of qualitative data and nowhere suitable to hold it. That becomes an issue when you try to analyze the data. So being able to create a robust infrastructure based on the type of data that you know you’re going to get in is extremely important.
Structured data is information with a high degree of organization; it’s simple, generic fields. Think of the census forms we periodically have to fill out: name, gender, what you do, your income level. All those types of things are so standardized that they can easily be held in the data fields that you want to capture.
Semi-structured data is information which does not conform to the formal structure of data models; it may be the next layer down. So you may be thinking about things like different types of profession. How often have you been into a banking environment where they’ve asked for your profession, and it doesn’t fit neatly into one of the many options in the dropdown menu? You need to have the ability to also accept free text in that case, so that you still capture the richness of that insight and don’t lose it.
The key to semi-structured data is that you may want to see how it starts to fit into structured data later on. So if, in the banking example I gave, you’re seeing a lot of people mention a new type of career or job, you may want to include it in the dropdown field later on. A good example of that is data science, a relatively new profession which is unlikely to appear in existing dropdown menus of professions. If a lot of your customers are starting to be data scientists, you may want to add that option to the dataset as well.
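The idea of promoting frequent free-text answers into the dropdown can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the function name `suggest_new_options`, the sample answers and the threshold of 3 are all made up for the example, and a real threshold would depend on the size of your customer base.

```python
from collections import Counter

def suggest_new_options(free_text_answers, existing_options, min_count=3):
    """Return free-text answers common enough to consider adding to the dropdown.

    min_count is an arbitrary illustrative threshold, not a recommended value.
    """
    existing = {option.strip().lower() for option in existing_options}
    # Normalise case and whitespace so "Data Scientist" and "data scientist " match.
    counts = Counter(
        answer.strip().lower() for answer in free_text_answers if answer.strip()
    )
    return sorted(
        answer for answer, count in counts.items()
        if count >= min_count and answer not in existing
    )

# Hypothetical sample data, for illustration only.
dropdown = ["Teacher", "Nurse", "Accountant"]
answers = ["Data Scientist", "data scientist", "Data scientist ", "Beekeeper", "teacher"]
candidates = suggest_new_options(answers, dropdown)
print(candidates)
```

Here only the answer that appears at least three times and is not already an option is suggested; one-off answers stay as free text.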
Unstructured data is, effectively, information that either does not have a pre-defined data model or is not organized in a pre-defined manner. This can be problematic both from a capture perspective and from an analytics perspective, when you try to determine the key insights from it.
Social media data is a great example: you’re not quite sure what you’re going to get back until you actually get the information. How do you capture that? How do you store that? And more to the point, how do you then analyze it and link it back to the customers who are talking about you? Think about a Facebook profile; that could be anybody’s name, and you don’t even know if they are a customer or not. So it doesn’t necessarily fit into the formal data structures that organizations have captured before.
So you may want to use a variety of new tools. A great example is Crimson Hexagon, one of a lot of social listening tools out there that analyze your social media buzz in the ether. They are a great organization that can help you tie that back into your customer databases, so that you start to move from structured data into unstructured data, see how the two link together, and really understand what your customers are doing, saying, thinking and feeling about you.
Good data clearly has a few key characteristics: it is accurate, complete, consistent, unique and timely.
Accuracy and completeness are really fundamentally important. If we have big massive gaps in the data, it just doesn’t help anyone to draw those big insights.
You also need a certain consistency within your dataset. Make sure that you’ve captured all the various fields that you’re going for. Understanding exactly what those fields are and what you’ll do with that data is just as important at the outset.
To draw insights, you want the data to be unique. In other words, you want it to be something that nobody else knows about. So you need to think about how you go about doing that and capture things that you think only you have access to. And finally, clearly it needs to be relevant and time bound.
What does bad data look like?
It’s inaccurate and it’s conflicting, so when datasets tell you different things it becomes confusing as to what you should believe.
If it’s irrelevant, if you’re just capturing things for the sake of it, it not only becomes really tedious to analyze but also irritates customers. Insurance is a great example of this: when you ask for more than just what you need, people start to get irritated about giving it to you. Only ask for things that will be important.
It could be outdated, so if you’re asking for a couple of years’ worth of history of something, again people will start to question why you need that. And finally, if the dataset is incomplete, you can’t glean the right insights or achieve the reliability that you want.
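Two of these symptoms, incomplete records and datasets that tell you different things, are easy to check for mechanically. The sketch below is a hedged illustration rather than a standard method: the record layout, the field names (`id`, `name`, `email`) and the two example systems (`crm`, `billing`) are assumptions invented for the example.

```python
def find_incomplete(records, required_fields):
    """Return (record id, missing fields) for records with empty required fields."""
    problems = []
    for record in records:
        # A field counts as missing if it is absent, None, or an empty string.
        missing = [f for f in required_fields if record.get(f) in (None, "")]
        if missing:
            problems.append((record["id"], missing))
    return problems

def find_conflicts(dataset_a, dataset_b, field):
    """Return ids present in both datasets whose value for `field` differs."""
    b_by_id = {record["id"]: record for record in dataset_b}
    return [
        record["id"] for record in dataset_a
        if record["id"] in b_by_id
        and record.get(field) != b_by_id[record["id"]].get(field)
    ]

# Hypothetical records held in two separate systems.
crm = [
    {"id": 1, "name": "A. Jones", "email": "a@example.com"},
    {"id": 2, "name": "B. Smith", "email": ""},
]
billing = [
    {"id": 1, "name": "A. Jones", "email": "alice@example.com"},
    {"id": 2, "name": "B. Smith", "email": ""},
]
incomplete = find_incomplete(crm, ["name", "email"])
conflicts = find_conflicts(crm, billing, "email")
print(incomplete, conflicts)
```

A check like this only tells you that two systems disagree, not which one is right; deciding which value to believe is still a judgment call.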
What common sources of bad data do you need to think about?
What are the entry points for the data in the first place? What are the ways in which that data could start to turn bad? And clearly, you also need to think about some of the implications and considerations. At what point do you need to refresh that data, to ask your customers again, with their permission, for that data? Keeping it fresh is fundamental to having accurate data, which in turn means you can create great campaigns that are relevant to your consumers.
Now making key judgments about when you do that within the customer journey and life cycle is going to be critical, because if you ask too much or too many times your customers are clearly just going to be annoyed. Ask too little and suddenly you don’t have the accurate data you need to build robust campaigns.
So, for example, entry quality becomes extremely important: getting it right the first time around. When you’ve got new customers going through application processes, make sure that they are filling them in as completely and accurately as possible. A common bottleneck here is data being entered by call center agents. Make sure that they understand the importance of getting the data right the first time.
How do you input that data? Is it self-fulfilled? In other words, does the customer do it? Or is it done through an agent model?
You need to have a robust way of being able to process that data and identify when that data is going wrong. So if a field is not filled in properly or is not filled in at all, often you can’t move any further. It’s important to have those steps in place.
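As a rough sketch of that kind of gate, the Python below checks a submitted form and refuses to proceed while any required field is empty or malformed. The field names in `REQUIRED_FIELDS`, the function names and the deliberately loose email pattern are all assumptions for illustration; a real form would have its own rules.

```python
import re

REQUIRED_FIELDS = ("name", "email", "profession")  # hypothetical form fields
# Deliberately loose pattern: something@something.something, no spaces.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

def validate_entry(form):
    """Return a dict mapping each problem field to a short error message."""
    errors = {}
    for field in REQUIRED_FIELDS:
        if not str(form.get(field) or "").strip():
            errors[field] = "required"
    email = str(form.get("email") or "").strip()
    if email and not EMAIL_PATTERN.match(email):
        errors["email"] = "invalid format"
    return errors

def can_proceed(form):
    """The customer or agent may move on only when there are no errors."""
    return not validate_entry(form)

bad_form = {"name": "A. Jones", "email": "not-an-email", "profession": ""}
good_form = {"name": "A. Jones", "email": "a@example.com", "profession": "Nurse"}
print(validate_entry(bad_form))
print(can_proceed(good_form))
```

The same check can sit behind a self-service form or an agent's screen, so the data is held to the same standard whichever way it is entered.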
Think about where the data sits and what systems it sits on. You may capture a wealth of data in one system but it doesn’t translate into another. So the usefulness of that data is incredibly compromised.
When thinking about how data changes over time and when to refresh it, the age of the data is really important. A lot of the time, insurance companies hold data which is a number of years old. Someone may have done a quote with you two or three years ago, and you then try to use that data to identify whether they’re coming up to their renewal period. Now, clearly, so much could have changed in that time. And if you start to spam them or send communications to them based on that old data, they may well just be turned off more than anything else.
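One simple way to act on this is to flag records that are older than some cut-off before using them in a campaign. The sketch below assumes each record carries a capture date; the one-year cut-off, the field names and the sample quotes are illustrative assumptions, not figures from the course.

```python
from datetime import date

def is_stale(captured_on, today, max_age_days=365):
    """Return True if the data was captured more than max_age_days before today."""
    return (today - captured_on).days > max_age_days

# Hypothetical quote records; the dates are invented for the example.
quotes = [
    {"customer": "C1", "captured_on": date(2021, 3, 1)},
    {"customer": "C2", "captured_on": date(2023, 11, 15)},
]
today = date(2024, 1, 1)  # fixed date so the example is repeatable
needs_refresh = [q["customer"] for q in quotes if is_stale(q["captured_on"], today)]
print(needs_refresh)
```

Customers on the resulting list are candidates for a refresh request rather than a renewal campaign built on old assumptions.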
Ritchie Mehta has had an eight-year corporate career with a number of leading organizations such as HSBC, RBS, and Direct Line Group. He then went on to set up a number of businesses.
ABOUT THIS DIGITAL MARKETING MODULE
This module dives deep into data and analytics – two critical facets of digital marketing and digital strategy. It begins with topics on data cleansing and preparation, the different types of data, the differences between data, information, and knowledge, and data management systems. It covers best practices on collecting and processing data, big data, machine learning, open and private data, data uploads, and data storage. The module concludes with topics on data-driven decision making, artificial intelligence, data visualization and reporting, and the key topic of data protection.