The basics of CPG Data Sources for newbies!
I started my career as a data analyst serving one of the biggest CPG (Consumer Packaged Goods) manufacturers in the world, so I have had my fair share of experience with new data sets that were very intimidating at first. I can empathize with all you fresh engineering graduates, product designers, and coders who have just started your journey in analytics and feel overwhelmed looking at bulky Excel and CSV files for the first time! But trust me, interpreting and using them is not complex at all, and once you understand the basics of CPG data, you can easily draw parallels between CPG and other industry verticals such as Pharma, Technology, or Financial Services. So in this article, let us talk about the key data sources in the CPG industry. As a bonus, towards the end, I will also share some simple and useful analytical solutions that are possible with each of these data sets individually as well as in combination with other data sources.
You will see many different “types” of datasets: Sales, eCommerce, Omnichannel, Social Media Platforms, Ratings & Reviews, P&L, Transportation & Warehouse, and so on. They are very different from each other, and each explains some piece of your business. For example, P&L data will have financial information, TV data will have GRP (Gross Rating Point) and TRP (Target Rating Point) numbers, and so on. In this article, we will not discuss these datasets per se; instead, we will focus on the three types of data “sources”, and you can then classify each of these datasets into one of the three!
There are three broad types of data sources: Internal data, which is generated within the company; Syndicated data, which is sold by data providers and vendors; and Retailer Direct data, which is provided by big retail chains such as Walmart, Kroger, Carrefour, and others. While internal data is free, CPG manufacturers need to pay a hefty price for data from the other two sources.
Why do we need so many different types of data sets?
You might be wondering: can’t we just use our internal data for analysis? The answer is very simple: we cannot. In today's fast-evolving and competitive world, it is crucial to know the exact stores, retailers (e.g. Walmart, CVS), channels (e.g. supermarkets, drugstores, convenience stores), and markets (e.g. France, Italy, India) where the company is currently growing or declining. Secondly, it is important to understand shopper preferences and their demographic patterns. Thirdly, unless we link our financial data with actual sales, we will never learn how to best optimize our portfolio. To do all this and more, a CPG company will certainly need a lot more than a single data source!
1. INTERNAL DATA
On a broad level, this is the data generated internally by the manufacturer.
Sell-In data: This covers all the products shipped from your warehouse to the retailer, for example from Unilever to Walmart or Dmart. It can be very granular, with details such as product type, pack, and variant, along with the actual number of units shipped and the price at which they are sold to the retailer.
Financial data: Here we have KPIs such as Turnover, Profit, Gross Margins, Supply Chain Costs, and other financial metrics. This also includes “Spend”, which can be as diverse as your marketing spend on advertisements, billboards, celebrity endorsements, social media promotions, and AdWords, as well as consumer promotions such as coupons. Roughly speaking, subtracting all your spending from your turnover gives you your net profit.
Supply Chain and Warehousing data: Today, millions of dollars are spent on optimizing an organization's supply chain using a central analytics hub (Control Tower). There are large amounts of data associated with the procurement, processing, and distribution of goods. Key metrics include Fill Rate, OTIF (On Time In Full), Inventory Turnover Ratio, and Supply Chain Cycle Time.
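To make the "turnover minus spending" idea concrete, here is a minimal Python sketch. The spend categories and every figure in it are made up purely for illustration, and a real P&L has many more lines (taxes, depreciation, and so on), so treat this as rough arithmetic rather than an accounting definition.

```python
def net_profit(turnover, spends):
    """Rough net profit: turnover minus the sum of all spend buckets."""
    return turnover - sum(spends.values())


# Hypothetical annual figures in dollars (illustrative only).
turnover = 1_000_000
spends = {
    "cost_of_goods": 400_000,
    "supply_chain": 210_000,
    "advertising": 120_000,
    "consumer_promotions": 45_000,
}

profit = net_profit(turnover, spends)
net_margin = profit / turnover  # profit as a share of turnover

print(profit)      # 225000
print(net_margin)  # 0.225
```

Keeping the spends in a dictionary means a new bucket (say, celebrity endorsements) can be added without touching the function.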
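If these supply chain metrics are new to you, the sketch below shows one common way of computing three of them in Python. The `OrderLine` record, its field names, and the sample orders are my own assumptions for illustration; companies differ on details such as whether OTIF is counted per order or per order line.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class OrderLine:
    """One hypothetical retailer order line (field names are illustrative)."""
    units_ordered: int
    units_delivered: int
    promised_date: date
    delivered_date: date


def fill_rate(lines):
    """Units delivered as a share of units ordered."""
    ordered = sum(line.units_ordered for line in lines)
    # Cap deliveries at the ordered quantity so over-shipping cannot inflate the rate.
    delivered = sum(min(line.units_delivered, line.units_ordered) for line in lines)
    return delivered / ordered


def otif(lines):
    """Share of order lines delivered both on time and in full."""
    hits = sum(
        1
        for line in lines
        if line.delivered_date <= line.promised_date
        and line.units_delivered >= line.units_ordered
    )
    return hits / len(lines)


def inventory_turnover(cost_of_goods_sold, opening_inventory, closing_inventory):
    """Cost of goods sold divided by the average inventory value."""
    average_inventory = (opening_inventory + closing_inventory) / 2
    return cost_of_goods_sold / average_inventory


orders = [
    OrderLine(100, 100, date(2024, 1, 10), date(2024, 1, 9)),   # on time, in full
    OrderLine(200, 150, date(2024, 1, 12), date(2024, 1, 12)),  # on time, but short
    OrderLine(100, 100, date(2024, 1, 15), date(2024, 1, 17)),  # in full, but late
    OrderLine(100, 100, date(2024, 1, 20), date(2024, 1, 20)),  # on time, in full
]

print(fill_rate(orders))                   # 0.9
print(otif(orders))                        # 0.5
print(inventory_turnover(1200, 250, 350))  # 4.0
```

Notice that the fill rate looks healthy at 90% while OTIF is only 50%: a single short or late delivery fails the whole line, which is why the two metrics are usually tracked together.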
2. SYNDICATED OR VENDOR AGGREGATED DATA
This data is not specific to your company and the facts that you track. Instead, it is general market data that covers your retailers (e.g. Walmart, CVS, Kroger, Dmart) as well as traditional channels such as Kirana stores, Mom and Pop stores, pharmacies, and so on. The biggest advantage here is that you receive your competitors' data too! Nielsen, SPINS, and IRI are the most common data vendors for CPG companies. In terms of the coverage of data facts and metrics, this is a gold mine for all CPG companies.
POS (point of sale) or Sell-out data: The most important KPIs here are unit sales (the absolute number of items sold), value sales (in dollars), promotion price, base price, volume sales (mostly in kg or ml), and Weighted and Numeric Distribution. With this data, along with our own brand growth and sales, we can also see our position in the market and how our competitors are performing.
How is it measured? Nielsen collects electronic point of sale (POS) data from stores through checkout scanners. In emerging markets where POS information is unavailable, it uses field auditors to collect sales data through in-store inventory and price checks. Then, using advanced statistical models and harmonization techniques, it projects the “total” sales for that product across channels and markets.
Shopper, panel, or household data: I cannot stress the importance of this data enough. It gives a clear picture of consumer behavior and shifting trends. For example, it helps answer business questions such as: Are my products preferred more by middle-aged shoppers or by younger ones? Which flavor does the target shopper group prefer? Are my shoppers shifting towards premium products? What is the % household penetration of my brand in this country?
How is it measured? Nielsen takes a demographically balanced sample of more than 250,000 households in 25 countries who scan their UPC-coded purchases at home. It then extrapolates these figures to the wider population using advanced statistical techniques together with census and market research data.
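Numeric and Weighted Distribution from the KPI list above tend to confuse newcomers, so here is a small Python sketch using the textbook definitions: numeric distribution is the share of stores that stock your product, while weighted distribution weights each of those stores by its category sales. The store list is invented, and vendors may use a different weighting base (for example, all-commodity volume), so take this as an illustration rather than any vendor's exact method.

```python
# Hypothetical store universe: (store_id, category_sales_in_dollars, stocks_our_product)
stores = [
    ("S1", 50_000, True),
    ("S2", 30_000, False),
    ("S3", 15_000, True),
    ("S4", 5_000, False),
]


def numeric_distribution(stores):
    """Percentage of stores in the universe that stock the product."""
    stocking = sum(1 for _, _, stocks in stores if stocks)
    return 100 * stocking / len(stores)


def weighted_distribution(stores):
    """Percentage of category sales made in stores that stock the product."""
    total_sales = sum(sales for _, sales, _ in stores)
    stocking_sales = sum(sales for _, sales, stocks in stores if stocks)
    return 100 * stocking_sales / total_sales


print(numeric_distribution(stores))   # 50.0
print(weighted_distribution(stores))  # 65.0
```

In this example the product sits in only half of the stores, but those stores account for 65% of category sales, which suggests it is listed in the outlets that matter most.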
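To give a feel for how a panel sample turns into a market-level number like household penetration, here is a simplified Python sketch. It assumes each panel household carries a projection weight, that is, the number of real households it stands for. The weights and purchase flags are invented, and the vendor's actual extrapolation is far more sophisticated than this.

```python
# Hypothetical panel: (household_id, projection_weight, bought_brand_in_period)
panel = [
    ("H1", 1200, True),
    ("H2", 800, False),
    ("H3", 1000, True),
    ("H4", 1000, False),
]


def projected_buyers(panel):
    """Estimated number of real households that bought the brand."""
    return sum(weight for _, weight, bought in panel if bought)


def household_penetration(panel):
    """Percentage of (projected) households that bought the brand at least once."""
    total_households = sum(weight for _, weight, _ in panel)
    return 100 * projected_buyers(panel) / total_households


print(projected_buyers(panel))       # 2200
print(household_penetration(panel))  # 55.0
```

The same weighted approach could be applied within a demographic group (for example, only households in a given age band) to answer the "who prefers my product" type of question.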
3. RETAILER DIRECT DATA
Doing business with a particular retailer is a tricky art. It is important to keep the retailer happy while also increasing your top line and bottom line! So to build a stronger collaboration with retailers, you will want to get hold of this data, since it is truly a value add.
Here, the “POS (Sell-out)” and “Shopper” data is provided by the retail giants themselves. There are two key differences between this and vendor data: first, only one retailer is in scope; second, it is much closer to reality, because it is your actual consumer purchase data without any extrapolation! Great, no? However, in my personal experience, it takes immense time, effort, and money to massage this data and use it in analysis. Though retailers are working hard on better UIs for extracting data, setting up dashboards, and improving data quality, there is still a long way to go, and right now only giant CPG manufacturers can afford the luxury of this gold mine.
And now, as promised, I have tabulated some primary analyses that we can do if we have access to the key data metrics from each of these data sources. Remember that most of the time, through smart mapping and harmonization, we manage to use both internal and external data in our analysis!
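For the curious, here is a minimal Python sketch of what "mapping and harmonization" can look like in practice: internal sell-in rows are keyed by our own SKU codes, external sell-out rows are keyed by UPC, and a mapping table links the two. All codes, field names, and numbers are hypothetical, and real harmonization also has to align calendars, markets, and pack sizes.

```python
# Hypothetical mapping from internal SKU codes to the UPCs used in external data.
sku_to_upc = {"SKU-001": "0001111", "SKU-002": "0002222"}

# Internal sell-in: units shipped to the retailer.
sell_in = [
    {"sku": "SKU-001", "week": "2024-W01", "units_shipped": 1000},
    {"sku": "SKU-002", "week": "2024-W01", "units_shipped": 500},
    {"sku": "SKU-003", "week": "2024-W01", "units_shipped": 300},  # not mapped yet
]

# External sell-out: units bought by shoppers.
sell_out = [
    {"upc": "0001111", "week": "2024-W01", "units_sold": 800},
    {"upc": "0002222", "week": "2024-W01", "units_sold": 550},
]


def harmonize(sell_in, sell_out, mapping):
    """Join sell-in and sell-out on (UPC, week); also report SKUs with no mapping."""
    sold = {(row["upc"], row["week"]): row["units_sold"] for row in sell_out}
    matched, unmapped = [], []
    for row in sell_in:
        upc = mapping.get(row["sku"])
        if upc is None:
            unmapped.append(row["sku"])
            continue
        units_sold = sold.get((upc, row["week"]), 0)
        matched.append({
            "sku": row["sku"],
            "upc": upc,
            "week": row["week"],
            "units_shipped": row["units_shipped"],
            "units_sold": units_sold,
            # Ratio of shopper purchases to shipments for the same week.
            "sell_out_to_sell_in": units_sold / row["units_shipped"],
        })
    return matched, unmapped


matched, unmapped = harmonize(sell_in, sell_out, sku_to_upc)
print(unmapped)  # ['SKU-003']
```

Returning the unmapped SKUs separately is a deliberate choice: in my experience, gaps in the mapping table are the first thing you want to see before trusting any combined analysis.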
Please write to me if you have any suggestions or questions!
Thank you :)