How to bring BI and analytics to modern nested data structures

Over the past few years, there has been a subtle but significant shift in the way data is structured in databases. Whereas yesterday’s databases were typically limited to storing data in flat rows and columns, modern databases often make use of nested data structures.

In this article, we will take a deeper dive into the nature of nested data structures, how they are represented in different databases, and the benefits and challenges of using them. Finally, we’ll propose an approach to marrying the traditional world of business intelligence with the modern world of nested data.

What is nested data?

Let’s start with a brief introduction to dimensional modeling, using a website visit as an example. Some measures exist at the visit level, such as the number of visits and the length of each visit. Some attributes also exist only at the visit level, such as the user’s IP address, browser type, and OS. Each visit also contains page views, which have their own measures, such as the number of page views and the time on page, and their own attributes, such as page name, page category, and page URL.
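To make the two levels concrete, here is a minimal sketch of a single visit with its page views nested inside it. The class names, field names, and sample values are assumptions chosen for illustration; they do not come from any particular database or tool.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PageView:
    # Page-view-level attributes
    page_name: str
    page_category: str
    page_url: str
    # Page-view-level measure (seconds)
    time_on_page_s: float


@dataclass
class Visit:
    # Visit-level attributes
    ip_address: str
    browser_type: str
    os: str
    # Visit-level measure (seconds)
    visit_length_s: float
    # The nested part: each visit carries its own page views
    page_views: List[PageView] = field(default_factory=list)


# Hypothetical sample data
visit = Visit(
    ip_address="203.0.113.7",
    browser_type="Firefox",
    os="Linux",
    visit_length_s=95.0,
    page_views=[
        PageView("Home", "Landing", "/", 20.0),
        PageView("Pricing", "Product", "/pricing", 75.0),
    ],
)

# Page-view-level measures are computed by walking the nested list
page_view_count = len(visit.page_views)
total_time_on_page_s = sum(pv.time_on_page_s for pv in visit.page_views)
```

Visit-level fields are stored once per visit, while page-view-level fields repeat once per element of the nested list.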

In the traditional world of data mart or data warehouse design, a common way to support analysis of this web data would be to build something like the following (simplified) data model.
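As a rough sketch of what a traditional model of this kind tends to look like, assume two flat tables linked by a key. The table names, column names, and the `visit_id` key are illustrative assumptions, not taken from the article's own diagram.

```python
# Visit-level table: one row per visit (hypothetical columns)
visits = [
    {"visit_id": 1, "ip_address": "203.0.113.7", "browser_type": "Firefox",
     "os": "Linux", "visit_length_s": 95.0},
]

# Page-view-level table: one row per page view, pointing back to its visit
page_views = [
    {"page_view_id": 10, "visit_id": 1, "page_name": "Home",
     "page_category": "Landing", "page_url": "/", "time_on_page_s": 20.0},
    {"page_view_id": 11, "visit_id": 1, "page_name": "Pricing",
     "page_category": "Product", "page_url": "/pricing", "time_on_page_s": 75.0},
]


def join_page_views_to_visits(visits, page_views):
    """Attach visit-level columns to every page-view row (an inner join on visit_id)."""
    visits_by_id = {v["visit_id"]: v for v in visits}
    return [
        {**visits_by_id[pv["visit_id"]], **pv}
        for pv in page_views
        if pv["visit_id"] in visits_by_id
    ]


joined = join_page_views_to_visits(visits, page_views)

# After the join, visit-level values repeat once per page view, so summing
# visit_length_s over the joined rows counts the same visit twice.
naive_visit_length_total = sum(row["visit_length_s"] for row in joined)
```

In this flat layout, analyzing page views by a visit attribute such as browser type requires a join, and visit-level measures must be aggregated with care once the rows have been joined.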
