Data structuring thought process

We have Sales data going back 20 years. To save space, processing, etc., I only show sales data that has been ordered in the last 5 years in Dundas. There are two main date fields that users would use to aggregate data, Order Date and Invoiced Date, to compare across years to create historical trends.

If they compare using OrderDt across years, everything is fine. If they compare using InvoiceDt, the totals for the early years are not accurate because they do not include the sales ordered before 5 years ago, but still invoiced in the last 5 years.

If I change my source data to include sales either ordered in the last 5 years OR invoiced in the last 5 years, that problem is solved but now if a user groups by the OrderDt, they may see like 10 lines in 2013, those lines that were invoiced in the last 5 years, but ordered in 2013. Now the Order Date aggregations are not entirely correct for the early years.

One solution would be to have two sales data sets, one limiting to Order Date in the last 5 years and one limiting to Invoice Date in the last 5 years. This would just be a training issue in trying to explain which one users should use and it is an almost doubling of data stored.

Another would be to show the last 10 years of data and just note to users that data is less accurate in the earlier years.

I would be interested in anyone has dealt with similar situations. The real situation is more complex of course because I already have several sales cubes (one showing detail records only for a few years, one showing summary information further back in time, etc.), but that is the basic issue.

Hi Kelly,

Would it be reasonable to union both of these together? You could make it appear like everything is coming from a single data source/table? Even if you have two different data fields, you could do a sub select from one and the other then join them together into a single query for your users. I think something like this would help remove some of the confusion. Just a thought but maybe worth giving it a try.

Thanks, Jeff, for your response. Here is what I ended up doing: I included in my source data all sales either ordered in the last 5 years or invoiced in the last 5 years. Then in the Sales data cube that uses this data, I changed my Time Dimensions for both OrderDate and InvoicedDate to only go back 5 years. I was only able to do this because of the enhancement/fix that was put into Dundas version 7 where you can still see a date that falls out of the date range you set your time hierarchy with. When the user sees the calendar filter on a dashboard for either of these date options, they will only see the last 5 years as filter options (I know they can still get to previous years in the filter, but by default, just my range shows). By designing the dashboards to only show the pertinent years, it lessens the chance there will be confusion. I’m very happy with that Dundas time hierarchy change. In version 6 I had to have very large date ranges defined in our time hierarchies so that dates didn’t show up as blank in the detail, even though we didn’t need to give the user that much of a date range to filter by in the dashboards. Thanks again.

Great feedback Kelly, thank you for this detailed information. Our team will be delighted to hear that you liked the change.