The purpose of this article is to explain how to clean and classify data to ensure accurate clusters, ranges, planograms, and reports.
Cleaning and classifying your data is an important part of preparing it.
Clean Data Characteristics
Clean data must have:
- No missing data within a column;
- No replicated and/or duplicated data (for example barcodes/product codes);
- Correctly spelt words;
- Well-classified display hierarchy;
- No commas in your data; and
- No NULL values.
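Several of the characteristics above can be checked automatically before data is imported. The following is a minimal sketch rather than a DotActiv feature. It assumes each record is a Python dictionary; the function name `find_data_issues` and the sample field names and values are hypothetical. Spelling and hierarchy quality still need a human review.

```python
def find_data_issues(rows, key_field):
    """Return (row_number, field, problem) tuples for the given rows.

    rows is a list of dictionaries (one per record). key_field is the
    field that must be unique, for example a barcode or product code.
    """
    issues = []
    seen_keys = set()
    for row_number, row in enumerate(rows, start=1):
        for field, value in row.items():
            text = "" if value is None else str(value).strip()
            if text == "" or text.upper() == "NULL":
                issues.append((row_number, field, "missing or NULL"))
            elif "," in text:
                issues.append((row_number, field, "contains a comma"))
        key = row.get(key_field)
        if key:  # only test non-empty keys for duplication
            if key in seen_keys:
                issues.append((row_number, key_field, "duplicate"))
            seen_keys.add(key)
    return issues


# Made-up sample records, for illustration only.
sample_rows = [
    {"BARCODE": "6001", "BRAND": "KIT KAT"},
    {"BARCODE": "6001", "BRAND": "NULL"},
    {"BARCODE": "6002", "BRAND": "KIT, KAT"},
]
sample_issues = find_data_issues(sample_rows, "BARCODE")
for issue in sample_issues:
    print(issue)
```

Run against the sample records, the sketch reports a NULL brand and a duplicate barcode on row 2, and a comma in the brand on row 3.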
Working with clean data has several benefits. It improves your decision-making, minimises risk, improves your results, and raises the quality of your outputs. Any analysis performed on the data will also be more accurate and trustworthy. Clean data also ensures a clean reputation.
Within DotActiv, the data in both the market and product dimensions must be clean.
Market Dimension
The following fields must be clean in the market dimension:
- Store Name;
- Store Code;
- Store Format; and
- Region.
When using the clustering tool in the DotActiv software, you must ensure that the category format and the category number of drops are correct.
Product Dimension
Within the product dimension, ensure that the merchandise or display hierarchy and the product item details are clear.
Before beginning, create your hierarchy. Decide on the departments and categories therein. That ensures consistent naming conventions. It also prevents misspellings and ‘made up’ departments and categories.
Read this article to better understand product classifications.
It’s best to start classifying at department level, working vertically first: complete the department field for every product before moving on to the next field. Allocate the appropriate department to each product.
Next, within each department, allocate the appropriate categories. Once you have assigned all products, you can split the workload across category specialists or shelf planners.
You can then create the sub-categories as agreed upon.
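Once the hierarchy has been agreed, it can double as a lookup for catching misspelt or made-up departments and categories. The sketch below is illustrative only. It assumes the hierarchy is held as a dictionary mapping each department to its set of categories; the function name, field names, and sample values are hypothetical.

```python
def find_unknown_classifications(products, hierarchy):
    """Return the products whose department/category pair is not in the hierarchy."""
    unknown = []
    for product in products:
        department = product.get("DEPARTMENT")
        category = product.get("CATEGORY")
        # An unknown department yields an empty set, so the product is flagged.
        if category not in hierarchy.get(department, set()):
            unknown.append(product)
    return unknown


# Made-up hierarchy and products, for illustration only.
agreed_hierarchy = {
    "CONFECTIONERY": {"CHOCOLATE", "SWEETS"},
    "BEVERAGES": {"JUICE", "WATER"},
}
sample_products = [
    {"BARCODE": "6001", "DEPARTMENT": "CONFECTIONERY", "CATEGORY": "CHOCOLATE"},
    {"BARCODE": "6002", "DEPARTMENT": "CONFECTIONARY", "CATEGORY": "CHOCOLATE"},
    {"BARCODE": "6003", "DEPARTMENT": "BEVERAGES", "CATEGORY": "SODA"},
]
flagged = find_unknown_classifications(sample_products, agreed_hierarchy)
print([product["BARCODE"] for product in flagged])  # prints ['6002', '6003']
```

Here the misspelt department (CONFECTIONARY) and the category that was never agreed (SODA) are both flagged for review.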
Quick Tips For Cleaning And Classifying Data
Below are a few tips to help you clean and classify your data.
- All data must be captured in capital letters;
- Refrain from using apostrophes or other punctuation marks;
- Follow the same naming conventions; and
- Work vertically first in a field and then move horizontally.
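The first two tips lend themselves to a small helper. The following is a minimal sketch, assuming it is applied to free-text fields only (it would also strip the decimal point from a value such as 1.5, so numeric fields should be left alone). The function name `normalise_value` and the sample value are hypothetical.

```python
import string

# Characters to delete: ASCII punctuation plus curly quotes and apostrophes.
_PUNCTUATION_TABLE = str.maketrans("", "", string.punctuation + "‘’“”")


def normalise_value(value):
    """Remove punctuation, collapse repeated spaces, and convert to capitals."""
    without_punctuation = value.translate(_PUNCTUATION_TABLE)
    return " ".join(without_punctuation.split()).upper()


# Made-up value, for illustration only.
print(normalise_value("  baker's   choice biscuits. "))  # prints BAKERS CHOICE BISCUITS
```

Applying the same helper to every text field in a column, one column at a time, also matches the advice to work vertically before moving horizontally.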
Item Detail Fields
Next, move onto the item detail fields in the product dimension in the DotActiv software. Your item detail must have the following fields populated:
- Barcode: Another name for the barcode is EAN or UPC. Some environments may use Article Number or Product Code. This is the unique identifier that links the brand and product description.
- Brand: List the consumer brand description, that is, the name a shopper uses for the product. For example, Kit Kat in "I am going to buy a Kit Kat".
- Product Description: This must include the Brand Name, Advertising Description (what it is), Size, UOM, and Flavour, consistently in that sequence. Limit the description to 40 characters.
- Size: This is the numerical value used to show the size of an item, e.g. 250, 300, 500, etc.
- Unit of Measure/UOM: This is the unit in which the size is measured, such as GRAMS, KILOGRAMS, MILLILITRES, or LITRES. Remember: capital letters only.
- Size and UOM: This combines the size and UOM fields, e.g. 250 GRAMS, 300 MILLILITRES.
- Supplier: This refers to anyone who provides goods or services to a company. A vendor often manufactures and then sells those items to a retailer. For example, Nestle is the supplier for Kit Kat.
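The Product Description and Size and UOM rules above can be expressed as a short sketch. It is illustrative only: the function names and sample values are hypothetical, and raising an error when the 40-character limit is exceeded (instead of truncating silently) is an assumption, chosen so that a person decides how to abbreviate.

```python
MAX_DESCRIPTION_LENGTH = 40  # limit stated for the Product Description field


def build_size_and_uom(size, uom):
    """Combine the Size and UOM fields, in capitals."""
    return f"{size} {uom}".upper()


def build_product_description(brand, advertising_description, size, uom, flavour):
    """Join the parts in the required sequence and enforce the length limit."""
    parts = [brand, advertising_description, str(size), uom, flavour]
    description = " ".join(part.strip() for part in parts if part and part.strip()).upper()
    if len(description) > MAX_DESCRIPTION_LENGTH:
        raise ValueError(
            f"Description is {len(description)} characters; "
            f"the limit is {MAX_DESCRIPTION_LENGTH}: {description}"
        )
    return description


# Made-up product values, for illustration only.
print(build_size_and_uom(250, "Grams"))
# prints 250 GRAMS
print(build_product_description("Kit Kat", "Chocolate Bar", 250, "Grams", "Mint"))
# prints KIT KAT CHOCOLATE BAR 250 GRAMS MINT
```

A description that exceeds the limit raises a `ValueError`, which is a natural prompt to apply one of the agreed brand abbreviations.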
If you want to abbreviate brand names, it’s best to determine the brand names and appropriate abbreviations before classifying all the data.
We suggest creating a national product library for your environment. A national product library will store all the clean data and can be used as a unit standard and reference guide for cleaning and classifying data.
Take the time to correct the data to ensure correct assortments and accurate planograms.
Should you struggle with any of these areas or if you simply do not have the time to clean and classify your data, you can visit our website to read more about our Data Collection and Processing service.