Study Notes
Now, let’s think about where big data has come from, and some of the key technologies associated with it. More data is coming to the forefront than ever before. As a consequence, we have the ability to mesh different datasets together, creating richer insights and understanding about our customers and environment than we’ve ever had before.
Big data, then, is really the coming together of a number of different tools and techniques. One of the key technologies in this area is predictive analytics: being able to use the data to understand how things may unfold in the future.
And you can see there are so many key components that feed into big data analysis. It becomes quite a complicated beast to uncover, so it’s very important that you start small, and that you begin by asking the right type of questions, which is something we touched on in the first section. Then you want to think about how to collect that information. How do you bring the datasets together? How do you integrate these core elements of your data?
Stream analytics is software that can filter, aggregate, enrich, and analyze a high throughput of data from multiple disparate live data sources in any data format. This is a really dynamic way of doing analytics: it effectively lets you work in real time across multiple datasets, layering one analysis over another to determine the key insights you may have about your customers. Clearly, it’s then important to respond to those insights, which we’ll come on to.
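To make that filter–aggregate–enrich pattern concrete, here is a minimal sketch in Python. The event fields, the simulated feed, and the ten-event window are illustrative assumptions, not part of any particular streaming product.

```python
# A minimal sketch of the stream-analytics pattern: filter, enrich, and
# aggregate events as they arrive, rather than after they have been stored.
# All fields and thresholds below are invented for illustration.
import random
import time
from collections import deque

def live_events():
    """Simulate a live feed of raw events from a data source."""
    channels = ["web", "mobile", "store"]
    while True:
        yield {"channel": random.choice(channels),
               "spend": round(random.uniform(0, 120), 2),
               "ts": time.time()}

def stream_pipeline(events, window_size=10):
    """Filter, enrich, and aggregate events over a sliding window."""
    window = deque(maxlen=window_size)
    for event in events:
        if event["spend"] <= 0:                     # filter: drop empty transactions
            continue
        event["high_value"] = event["spend"] > 100  # enrich: add a derived flag
        window.append(event)
        if len(window) == window_size:              # aggregate: rolling average spend
            avg = sum(e["spend"] for e in window) / window_size
            yield {"avg_spend": round(avg, 2),
                   "high_value_count": sum(e["high_value"] for e in window)}

# Take a handful of aggregated insights from the (endless) stream.
for i, insight in enumerate(stream_pipeline(live_events())):
    print(insight)
    if i == 4:
        break
```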
Data virtualization is technology that delivers information from various data sources, such as big data sources, as well as distributed data stores, in real time and near real time. It’s about translating those raw inputs into something more meaningful. If you were doing this across one data source, it would be quite easy; it’s when you’re overlaying a number of them that data virtualization becomes quite a specialism in itself.
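The idea can be sketched in a few lines of Python: a single query layer that pulls from several underlying sources on demand, rather than copying everything into one warehouse first. The two in-memory “sources” and the CustomerView class are hypothetical stand-ins for real systems.

```python
# A toy sketch of the data virtualization idea: one unified view over
# disparate sources, resolved at query time. Sources are invented examples.
crm_source = {  # e.g. a CRM database
    101: {"name": "Asha", "segment": "retail"},
    102: {"name": "Ben", "segment": "business"},
}
orders_source = {  # e.g. a live transactions store
    101: [34.99, 120.00],
    102: [15.50],
}

class CustomerView:
    """Unified view over several sources, joined on demand."""
    def __init__(self, crm, orders):
        self.crm = crm
        self.orders = orders

    def get(self, customer_id):
        profile = self.crm.get(customer_id, {})
        spend = self.orders.get(customer_id, [])
        # Translate the raw inputs into something more meaningful.
        return {**profile,
                "order_count": len(spend),
                "total_spend": sum(spend)}

view = CustomerView(crm_source, orders_source)
print(view.get(101))
# {'name': 'Asha', 'segment': 'retail', 'order_count': 2, 'total_spend': 154.99}
```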
What’s often called an in-memory data fabric provides lower-latency access and processing of large quantities of data by distributing that data across the dynamic random access memory (DRAM), flash, or SSD of a distributed computer system. It’s basically about how you store that data, and how that data then flows from one system to the next.
All of this then starts to link into predictive analytics. Effectively, what this is all about is making predictions based on past behavior in order to evaluate what we think the likely outcome will be in the future.
Now, this is very important when thinking about buying behavior in a marketing context, for example, or about retention behavior, or even complaint or claims behavior in an insurance context. We need to be very cognizant of how we make these predictions. The one danger is holding on to the data, and what it says, too rigidly. We need to overlay an element of human judgment onto this whole arena, and if we don’t, we will miss out on one vital factor: the future. We need to be intuitive about what is actually going to happen, and about the variables that have changed since the past, in order to make sensible assumptions about what will be predictable in the future. If you rely solely on historical data, you effectively lose the richness of the intuition you could overlay to understand what might happen in 12 months’ time, or two years’ time.
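As a minimal sketch of this idea, the snippet below fits a model on historical behavior and scores the likelihood of a future outcome. The features, the synthetic data, and the use of scikit-learn are all illustrative assumptions, not the course’s prescribed toolchain.

```python
# Predictive analytics in miniature: learn from past behaviour, score the
# likelihood of a future outcome. Data here is synthetic and illustrative.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(42)

# Invented history: [visits_last_month, complaints] -> renewed (1) or churned (0)
X = rng.integers(0, 10, size=(200, 2))
y = (X[:, 0] - 2 * X[:, 1] + rng.normal(0, 1, 200) > 2).astype(int)

model = LogisticRegression().fit(X, y)

# Score a new customer: 8 visits, 1 complaint.
prob_renew = model.predict_proba([[8, 1]])[0, 1]
print(f"Predicted renewal probability: {prob_renew:.2f}")

# The model only knows the past. The human overlay described above means
# sense-checking this score against factors the history cannot show,
# e.g. a price change or a new competitor launching next quarter.
```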
Think of it as a jigsaw. Why use a jigsaw to illustrate big data technologies? Well, in reality, it’s all about meshing them together. There’s an element of overlap, as well as linkages: all these various pieces need to come together to create a robust big data strategy. And if you don’t have those linkages, you’re effectively missing a trick.
Now, when thinking about big data, there are a number of different ways you can approach it. One of the most useful frameworks is that of the four Vs: volume, variety, veracity, and velocity.
Let’s start with volume. To have big data in the first place, perhaps intuitively, you need a large number of data sources, as well as a high volume of insights on individual customers. It’s the aggregating that creates the richness within the dataset. And remember, it’s not about a single dataset; it’s about meshing a number of different data sources, some of which feature across the slide.
The second V is variety: meshing together a variety of different data types and sources. Here we come back to the definitions of structured, semi-structured, and unstructured data; that’s what sits in the variety section.
Veracity is effectively the need for robust data, data with integrity. Make it trusted, make it clean, make it deduplicated, because any anomalies within it will only hinder the results you get out of the other side, causing inaccuracies that you sometimes cannot pick up later on. Remember, robust inputs equal robust and valid outputs, and this is fundamentally important in big data.
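A small sketch of that cleaning step: deduplicate and validate records before analysis so anomalies don’t leak into the results. The sample records are invented for illustration, and pandas is an assumed tool choice.

```python
# Veracity in practice: dedupe, drop incomplete records, discard impossible
# values before any analysis runs. All records below are invented.
import pandas as pd

raw = pd.DataFrame({
    "customer_id": [1, 1, 2, 3, 3],
    "email": ["a@x.com", "a@x.com", "b@x.com", None, "c@x.com"],
    "spend": [100.0, 100.0, -50.0, 80.0, 80.0],
})

clean = (
    raw.drop_duplicates()          # remove exact duplicate rows
       .dropna(subset=["email"])   # drop records missing a key field
       .query("spend >= 0")        # discard impossible values
)

print(clean)
# Robust inputs in, robust outputs out: only valid, unique records remain.
```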
Finally, velocity is about speed and currency. If you’re putting too much historical data in, too much will have changed; we’ve already talked about the changing nature of things and the unpredictability of the marketplace. This is a core facet that, if we don’t get it right for big data, will lead to mistakes: we’ll start to question the reliability and validity of the data we have, and the insight we glean from that data. So velocity, feeding in the most current datasets dynamically and often in real time, is very, very important.
So the four Vs are a really useful framework for thinking about how to structure your big data approach, how to collect big data, and how to start analyzing it as well.
Statistics is all about creating generalizable results from particular datasets: being able to say, with a certain level of confidence, that something is going to occur. Statistics also gives you a way to understand correlation versus causation, and helps you to distinguish between the two, as well as, at a more general level, to make basic interpretations of what the data is telling you.
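The correlation-versus-causation point can be shown with a few lines of Python and some invented data: two quantities that correlate strongly without either causing the other, because both are driven by a hidden common factor.

```python
# Correlation is not causation: ice-cream sales and sunburn cases move
# together only because temperature drives both. Data is synthetic.
import numpy as np

rng = np.random.default_rng(7)
temperature = rng.uniform(10, 35, 100)               # the hidden common cause
ice_cream_sales = 3 * temperature + rng.normal(0, 5, 100)
sunburn_cases = 2 * temperature + rng.normal(0, 5, 100)

r = np.corrcoef(ice_cream_sales, sunburn_cases)[0, 1]
print(f"Correlation: {r:.2f}")  # typically around 0.9

# A high r says the two move together; it says nothing about which (if
# either) drives the other. That interpretive step is the analyst's job.
```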
Advanced analytics
Advanced analytics is the analysis of data using sophisticated quantitative methods to produce insights that traditional approaches to business intelligence are unlikely to uncover. Using analytics in this way to make key business decisions is absolutely critical for us.
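As one small example of what “sophisticated quantitative methods” can mean in practice, the sketch below clusters customers into segments that a standard BI report of sums and averages would not surface. The two features, the two segments, and the use of scikit-learn are illustrative assumptions.

```python
# Advanced analytics beyond BI dashboards: discover customer segments with
# clustering rather than pre-defined report filters. Data is synthetic.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Invented customers: [monthly_spend, visits_per_month]
bargain_hunters = rng.normal([20, 8], [5, 2], size=(50, 2))
big_spenders = rng.normal([150, 2], [20, 1], size=(50, 2))
customers = np.vstack([bargain_hunters, big_spenders])

# Let the algorithm find the two groups from the data alone.
segments = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(customers)
for label in (0, 1):
    centre = customers[segments == label].mean(axis=0)
    print(f"Segment {label}: avg spend {centre[0]:.0f}, avg visits {centre[1]:.1f}")
```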
Ritchie Mehta has had an eight-year corporate career with a number of leading organizations such as HSBC, RBS, and Direct Line Group. He then went on to set up a number of businesses.
Data protection regulations affect almost all aspects of digital marketing. Therefore, DMI has produced a short course on GDPR for all of our students. If you wish to learn more about GDPR, you can do so here:
If you are interested in learning more about Big Data and Analytics, the DMI has produced a short course on the subject for all of our students. You can access this content here:
The following pieces of content from the Digital Marketing Institute's Membership Library have been chosen to offer additional material that you might find interesting or insightful.
You can find more information and content like this in the Digital Marketing Institute’s Membership Library.
You will not be assessed on the content in these short courses in your final exam.
ABOUT THIS DIGITAL MARKETING MODULE
This module dives deep into data and analytics – two critical facets of digital marketing and digital strategy. It begins with topics on data cleansing and preparation, the different types of data, the differences between data, information, and knowledge, and data management systems. It covers best practices on collecting and processing data, big data, machine learning, open and private data, data uploads, and data storage. The module concludes with topics on data-driven decision making, artificial intelligence, data visualization and reporting, and the key topic of data protection.