A common refrain in IT is to "begin with the end in mind." It is one of those refrains that has been repeated so many times that it quickly becomes ignored. For systems that collect data, "the end" may be very complicated. Some data collected can be presented in reports and that is sufficient. Reporting is such an easily understood term and used so frequently it often hides complexity. It’s in that complexity that frustration for system engineers and customers alike begins. Because of that, complexity reporting is often not a cost effective means for exploring data. Complex reports can take a significant amount of time, (translate that to cost), in order to complete. When users are looking at the data to see what trends or interesting anomalies appear, reports just aren’t efficient. We begin by recognizing that "the end" can be as complex a system as the front end collection system.
Architecting an analysis system can comprise many parts ans some of the most overlooked and underutilized are OLAP tools (Online Analytical Processing). As such, optimizing those tools and working with users to capture the areas the tool needs to work with are important. It’s that OLAP part of an analysis system that we begin to explore in a bit more detail.
So to begin working with OLAP, two terms will immediately need to be understood--dimension and measure. Working off of a graph diagram is the best explanation to start with. Looking at Figure 1, notice the X–axis of the graph. In OLAP, that axis will be called a "dimension". The calculation that occurs at each notch in the access is called a "measure". That in mind, figure 1 shows location as the dimension and debt amount as the measure.
Figure 1.

Using the same basic graph model, imagine that we add a new dimension debt type. So now the intersection that points between the dimensions debt type and location show our debt amount measure. The great thing about OLAP tools is that depending on the particular tool, they can look at many measures at the same time, can flatten a dimension out, and look at all of one and a particular point or notch on another dimension. So, why are they called cubes?
Figure 2 shows the addition of the Calendar dimension, essentially time limited to the granularity level of day. With that dimension our graphical representation now resembles a cube, hence the name. While graphically we can only represent 3 dimensions clearly and easily, an OLAP system can represent hundreds of dimensions.

Figure 2.
OLAP tools provide different types of interfaces that allow users to simply drag, drop and click to explore data. Some tools like SAS Enterprise Guide can even graph dynamically while the user is exploring. Robust tools like this provide a good user experience and enhances the analysis system. Many systems like SQL Analysis Services connects directly to SQL Reporting services, allowing for reports to be built quickly and easily in addition to the normal canned reports. So, what does an architect need to consider with a basic understanding of what OLAP does for users?
An OLAP system basically represents hundreds or thousands, depending on the number of dimensions and measures, of queries working against the data simultaneously. In processing, systems like SQL Analysis server offer different processing options to do as many pre-calculations as possible. These processing options are HOLAP, MOLAP, and ROLAP, but let’s not get into the weeds. What we need to know is that the system is going to perform some level of pre-processing and store that pre-processing. To do that pre-processing, and for that matter, processing, there are data structure designs that work more efficiently with OLAP. Some will refer to these models as Star and Snowflake patterns. What architects need to look at is that the data model for the collection may be differently optimized. Don’t get into discussions of normalization here, as it doesn’t apply. Both the data models can be normalized but still not be optimized for their working intent. Normalization does equal optimization for all cases.
Looking at figure 3, the data model shows a table labeled with the prefix "fact." The fact table in an OLAP model is where the values to calculate measures are usually stored. Keep in mind that a measure is an individual calculation and the facts are the values that enable those calculations. To help the OLAP system work, a group of foreign keys relate out to the tables with the prefix "dim." The dim tables may be single tables as represented with the debt type. Or they may be hierarchal as represented with the Calendar dimension.
Figure 3.
What creating a database model like this allows for is the OLAP engine to process quicker. Creating this model may result in ETL routines being created, which is a benefit, as dimension data can be transformed from being collected one way to another in the OLAP data model. This provides even more flexibility towards what we can represent, as well as interjecting new considerations. How often do we move the data? Do we copy the data or do we move it? An analysis system has to consider all of these options.
OLAP can also be a data source for the front end interface and provide new dashboards and functionality to the collection interface as well. Visual Studio.net for example, has quite a few capabilities in this area that will be in another blog coming soon. Getting back, this allows analysis systems the capability of being an integrated system in new ways with the collection system.
If you see the value of OLAP, look at it as another tool for the analysis system. Yes, it can be a replacement for some reports. However, a solution will still include canned reports, as we are always going to need those. A full analysis system is going to have more components to it and be fully thought out. With this end in mind, if you were to design the exploration, reporting, queries, and analysis you want a system to provide, how different would the collection system be?
Many systems are designed from the point of view of the collection system. After all, that’s what most users will see and work with. So we focus there. Only later to realize we want to analyze data and we have the wrong level of granularity or we are missing a variable. Many know that the cost and time increase significantly when changing a constructed system versus at design time. Have you considered what the cost is of building the systems in the wrong order? Is the tail wagging the dog?