# DATASET STRUCTURE

The tool expects a dataset in "tidy" form:

- Each variable must have its own column.
- Each observation must have its own row.
- Each value must have its own cell.

Download an example file from this link (right click => "open link in new tab", go to new tab, right click => "save as").

#### Required Columns

Your uploaded file needs to have:

- A few "dimension columns" (e.g. channel, product, country,..)
- A cost/revenue column. This column must be in plain number format (no thousand separators). In Excel they need to be in the "General" number format or in the "Number" format with no 1000 separator, no currency format, and zero written as 0 not as -. If you select "Run"> "Data Profiling" a data profiler report will show you how your data was read. If the number columns have not been read like numbers, you have a problem with your format.
- Either a date or a period column. A date column is a column in date format, while a period column already groups the data in the "before after" periods we need for the bridge analysis. If you have a date column, the values must be in the form YYYY-MM-DD."

'Mparanza does not need a Price column. It will re-calculate the unit price and ignore Price columns if present.

#### Cost and revenue column expected format

As mentioned, the cost and the revenue column must be in plain number format with no thousand separators. Since there are different standards of thousands separators (1'000 in USA and UK, 1.000 in Continental Europe) the app would not know how to parse the number.

#### Negative value format

Negative values are normally formatted with a leading minus sign ("-20"). Some ERP systems, such as SAP always store negative numbers with the minus sign at the end of the value. If the app finds a trailing minus sign, it will automatically try to correct it and will return a message.

#### Other Metric Columns

Your uploaded file can also have other metric columns:

- A
*units column*. Measures the number of items/packs of product sold. This column must be in plain number format (no thousand separators). If the dataset has a units column, the code will also calculate the "unit price" variance. If the dataset only has a cost/revenue column, all variance will be classified as "units" variance. - A
*volume column*. Measures the volume (kilos, liters, bottles,...) of product sold. This column must be in plain number format (no thousand separators). If the dataset has a volume column, the code will also calculate the "volume price" variance. If the dataset only has a cost/revenue column, all variance will be classified as "volume" variance. - A
*discount column*. If the dataset has a discount column, the tool will also return the "discount variance" (variance after discounts and commissions). The discount column must contain the discount in absolute value (not in %) and must be in positive sign. A discount of 100$ must appear as 100. - A
*COGS column*. If the dataset has a COGS column, the tool will also return the "COGS variance" (variance after COGS). The COGS column must contain the COGS in absolute value (not in %) and must be in positive sign. A cost of 100$ must appear as 100. *Promo and No Promo*quantity and amounts columns. If the dataset has Promo/No Promo columns, the tool will offer some promo-related visualizations.- One or more driver columns.