Archive

Posts Tagged ‘Semantic Layer’

Sets Integrated into the Semantic Layer

August 18th, 2009 1 comment

On August 5, 2009, the SAP BusinessObjects Innovation Center announced a prototype for moving Sets Technology directly into the Semantic Layer.

Potential Future of Sets

Since the EOL announcement of BusinessObjects Performance Manager and two of the three analytic engines, Process Analysis and Predictive Analysis, there has been an ongoing question on the future of Set Analysis.  I have long been an advocate of the incredible power and importants of sets to help customers get a better understand of how data is changing under the covers.  If you have 100 more customers, that’s great… but how many did you lose and how many did you gain?  You may have lost 50 and actually added 150.  What about my “Gold” Customers?  How many new Gold Customers do I have?  Where did they go?  Did they become Silver members?

We have many customers who have successfully implemented Set Analysis; however the solution had not had significant R&D investment for some time and was tied to the legacy metrics engine of Performance Manager.  How could this technology be integrated into the new XI 3.X platform?

Sets and the Semantic Layer – I’m in Heaven

The Innovation Center has delivered a prototype which show Sets integrated directly into the Semantic Layer.  This prototype is a plug-in to the universe that allows you to create these “sets” , static or dynamic – all in the query panel.  Semantic layer sets lets customers leverage the complexity of  sets natively within that environment.  This is a huge step forward.

Check out the 15 minute video on the SDN website

You can see that this implementation can support all the classic calendar sets which will support of joiners, leavers, stayers, which are at the core of Set Analysis.  In this labs implementation there is support for both Visual Data and Visual Sets; however I hope this is expanded in the future to support Freehand SQL and an imported list.

I think this is a fantastic new innovation and needs to be added to the core product ASAP.  If you think so too, let your voice be heard:  http://sets.uservoice.com

I’ll keep you up-to-date with additional innovations around sets and may provide some additional posts on this topic if there’s interest.  Let me know.

«Good BI»

Universe Measures – Divide and Conquer

October 24th, 2008 No comments

I was recently working with a universe and attempting to calculate a ratio between two numbers and coming up with some strange results.  I wanted to share this with you so you didn’t waste as much time as I did trying to solve the problem.

Always Returning Zero

In my example, I had two measures and i wanted to calculate the ratio between this.  I was working on a call center universe and I needed to calculate First Call Resolution Ratio.  That would be First Calls/Total Calls.  The problem was that my calculation was always returning 0.  Here is what the SQL looked like:

select count(dbo.Current_Facts.first_call) /count(dbo.Current_Facts.call_id)
from dbo.Current_Facts

There is nothing wrong with the SQL.  The problem was with the data type.  In my case, both the numerator and denominator are integers, therefore the resulting value is also an integer.  Since there are always less first calls than total calls, the values would be a fraction and therefore were truncated to 0.

An Easy Fix

Once you realize the problem, the fix is simple.  Simply modify the numerator to be a float value.  This will cause the calculate to result in a float value.  In my case I changed the first_call value as casted it as a float so my SQL came out like this:

select cast(count(dbo.Current_Facts.first_call) as float) /count(dbo.Current_Facts.call_id)
from dbo.Current_Facts

Although this should fix the problem, there are probably databases that may require you to be more specific in which case you might modify the SQL to look like this:

select cast(cast(count(dbo.Current_Facts.first_call) as float)/cast(count(distinct dbo.Current_Facts.call_id) as float) as float)
from dbo.Current_Facts

Hopefully this little tip will safe you tons of time spent scratching your head.

«Good BI»

Multipass SQL 2.0 (Updated 2/23/2009)

October 2nd, 2008 7 comments

So what is Multipass SQL?

Multipass SQL. To calculate comparisons or to correctly calculate non-additive measures in report break rows, the query tool must break the report down into a number of simple queries that are processed separately by the DBMS. The query tool then automatically combines the results of the separate queries in an intelligent way. Breaking a single complex request into several small requests is called multipass SQL. Multipass SQL also allows drilling across several conformed data marts in different databases, in which the processing of a single galactic SQL statement would otherwise be impossible. Finally, multipass SQL gives the aggregate navigator a chance to speed up the report, because each atomic SQL request is simple and easily analyzed by the aggregate navigator. (from Ralph Kimball)

Welcome Multipass 2.0 with BusinessObjects!

What You Are About To Read May Blow Your Mind

Okay – it’s not new, but I dubbed it 2.0 because in my opinion BusinessObjects takes a superior approach to the Multipass SQL problem.

BusinessObjects employs a number of techniques to solve traditional multipass problems in a unique way.  Users of BusinessObjects can get the power of Multipass SQL without the traditional contraints and overheads of inserting records into temporary tables and requiring the report creator to engage in a number of complex steps to get the desired results.  (Anyone ever run out of temp space?)

Highlighted here are some of the most common scenarios in which BusinessObjects employs Multipass 2.0 techniques.

Single dimension across multiple fact tables – The IT organization can configure the Universe (BusinessObjects Semantic Layer) to generate multiple SQL statements when measures exist in from multiple fact tables.

Using the grains of measurement in the same query -  Sometimes you want to be able to do create a query that shows Previous YTD, YTD, QTD, MTD for the same measure, e.g. Revenue.

Requiring the end results to be used in a calculation – Displaying values such as % contribution, ratios (revenue per employee), etc.

Semi-additive measures – My personal favorite.  These measures are traditionally measures that aggregate over all dimensions EXCEPT time.  Some examples are stock level, account balance.  The stock level in the database display the number that are on-hand as of the end of the day.  The account balance displays how much money is in my account and the end of the day.  If I look at these figures across geographies, then they need to be summed up; however if I want to look at the values for the current month, I should only use the value as of today.

Analyzing a subset of data – Sometimes I want to be able to analyze a subset of data to another subset of data.  For example, get all the latest customer transactions and compare them to the historical transactions.

In every single one of these cases, BusinessObjects delivers the results that the user needs without creating temporary tables and inserting thousands of records.  Instead BusinessObjects uses a number of techniques including:

  • Derived Tables
  • Multi-SQL Universe Options
  • Universe Contexts
  • Query-Drill
  • Report Variables

How To Whitepaper Available

If I’ve whet your appetite for more information, then you will want to download the whitepaper from the SAP SDN site.  This whitepaper explains in detail exactly how to accomplish each of these scenarios using BusinessObjects.  I’ve had some problems with these documents from time to time, so I’ve also made them available from my blog directly.  You can download the whitepaper here and the powerpoint here.

The bottom line is that modern high performance databases are experts at handling extremely complex SQL.  Their optimization engines mean that it is no longer necessary to break queries down into managable pieces and perform multiple passes in the traditional sense.

What Do You Think?

If  you have a complex query scenario that can’t be solved by one of the techniques used in this whitepaper, I’d love to hear from you.

Enjoy.

«Good BI»

Set Analysis Part 5b: Custom Groups in WebIntelligence

June 26th, 2008 2 comments

So how we do do this exactly? How can I use BusinessObjects to create custom groups on-the-fly? How can I get beyond my requirements for multi-pass SQL and speed up my reports? The answer is Sets.

Set DataOn Target with Set Analysis

All the information regarding sets is located is the tables called sets_*. The most important tables are:

  • set_set – contains the set name, description, etc.
  • set_set_detail – contains all the set member information including member key, join date, leave date.

Here is a universe I created that you can download and use with the sample database provided in my earlier blog post. We need to add the set_set and set_set_detail tables to our WebIntelligence Universe so that the set member information they contain can be used within our query.

Custom Groups in WebIntelligence

Here you can see that we created a join to the DW_CRM_CONTACT table. We also created a self-join on the set_set_detail table. The self join make sure that we reference the edate value of 12/31/2999, which is the default edate for current members of any set. In the case of a dynamic set, If a member had left the set then the edate would not be equal to 12/31/2999.

In addition to this, we also joined the set_set table to the set_set_detail table using the setversion_id key. (It may appear that the setversion_id is always the same as the set_id, but it is not so only join the tables via the setversion_id)

Here is a look at the Custom Product Groups

Custom Groups in WebIntelligence

BEST PRACTICE: Whenever you may need to alias a table. If you alias a table, then all references to that table should be aliased. You can see that because I needed to reference the set_set_detail and set_set tables twice, once for products and once for contacts, I aliased both.

User Prompts

After joining the tables, we needed to create the filters for our custom groupings to make it easy for users to select which groups they would like to use on the report.

Creating a Group Filter in the Universe

I specifically defined the filters a multi, which will allow the user to select more than one filter. This is important because it provides even more power for your end-users. Now they can do direct comparisons of one group to another, even if a member (e.g. contact) exists in more than one group. The only downside is that it means that a member can be counted twice, this is why I recommended that when you define your sets you should use naming conventions that identify the sets as being mutually exclusive.

Here is what the prompt will look like in action:

BusinessObjects Universe Prompt

In my universe I did not use cascading prompts to first limit the sets by folder, but that is certainly an option since folder information is located in the set_project table.

Getting Results

After configurating the universe and setting up the prompts this is an example of the results you will see. I have run a report and selected multiple custom groups which are mutually exclusive. These groups on based on the current periods sales amounts:

Here you can see the report is broken down by my custom groups which were defined and processed by the Sets Engine.

Summary

I’ve really enjoyed taking you on a quick tour of Set Analysis and showing you some of the ways that you might get more value out of your existing data assets. Sets offers so many solutions around data analysis:

  • Custom Groupings – which allows the organization to more nimble as changes such as mgmt structures, mergers and acquisitions take place.
  • Classic Segmentation – Joiners, Leavers, Stayers (which in turn are perfect for analysis by a predictive tool so that you can see why are people joining or leaving? What are the influencers?)
  • Advanced Analysis – “Give me all quad-pack customers who have never purchased phone accessories from a company owned retail store.”

The ability to divide customers up into categories of: highly profitable down through :unprofitable, and using that to drive the business because the bottom line of any business is the customer and your organization’s ability to make a profit.

Although dashboards are currently in vogue and everyone wants one, I often wonder how organizations are analyzing the information beyond the dashboards. I have nothing against dashboards, but I would encourage you to ask yourself this. Once we see that sales are falling, what am we going to do? How am we going to do root cause analysis? Were all customers affected? What about products? Segmentation might well hold the answer to these yet unsolved mysteries.

Why use a Semantic Layer?

December 30th, 2007 2 comments

Recently I had a customer ask me about using a semantic layer.

They said:

I am unaware of anyone using a metalayer (Universe) for Business Objects Enterprise XI. Most developers will build the report connecting directly to Teradata (or other sources; Oracle, SQL Server) and then schedule the report on XIR2. I will normally build the SQL first and then plug it into a Command Statement in Crystal Reports using dynamic date ranges to eliminate user selectable parameters for dates. Others do use parameters, but it’s discouraged due to both the response time from Teradata and processing overhead can not be managed as effectively.

I thought this was a great question. Sometimes you are get so comfortable with the technology that you think everyone recognizes the elements of a good Business Intelligence system and understands the advantages and disadvantages of different infrastructure elements. The semantic layer is definitely secret sauce of BusinessObjects and it is the primary reason why it was able to grow to become the largest provider of Business Intelligence software.

What is the Semantic Layer?

The semantic layer is a business translation layer that sits between the database and the end user. This means that user can interact with their data using familiar business terminology instead of having to understand where the data resides and what business rules to apply to the data.

Semantic layer

What are the Advantages of a Semantic Layer?

Having a semantic layer on your database allows IT organizations to:

  • Guarantees Correct Results – by applying rules to define database complexity and ambiguity. These rules drive the generation of the SQL and guarantee that if two users ask for the same information, they will get the same results.
  • Guarantee Database Performance – by always generating the best SQL possible
  • Guarantee User Understanding and Acceptance – by allowing users to understand how modifying their query will result in different results, while at the same time giving them independence from IT. The #1 complaint from most business organizations is the amount of time it takes IT to build reports for them. They want the independence to be able to build their own reports and know that the results will be correct.

A semantic layer will be able to create sophisticated SQL and in many instances may need to generate multiple SQL statements in order to return the correct results (chasm trap/fan trap). The semantic layer must understand how to deal with database loops, complex objects, complex sets (union, intersect, minus), aggregate table navigation and shortcut joins.

For more information about some of these concepts, there is an excellent presentation by Integra Solutions here

Most importantly, the semantic layer allows an organization to define a single version of the truth. That means that regardless of what application or user pulls information from the database, as long as they use the semantic layer, they will always get the same answer. Now that we are in a post-Enron/WorldCom age of corporate governance and Sarbanes-Oxley, it is more important than ever to make sure that there are clear definitions for revenue, margin %, churn rate, turnover, etc.

Implementing a semantic layer will put an end to conference pickering over who’s numbers are correct because now everyone can make sure they are using the same numbers.

What are the Disadvantages of a Semantic Layer?

Now, as much as I love all the business value that comes with the semantic layer, there is a cost. Prior to working for Business Objects, I was with Crystal Decisions and we didn’t have a semantic layer, so we had to figure out all the reasons why not the use it. (We did eventually release a semantic layer called Business Views in Crystal Enterprise 10). Therefore, I thought it would be good to share with you what we felt were the problems with a semantic layer.

The main contention was that the semantic layer was an “extra layer”. It’s not free. It has to be created, maintained and managed. It must be kept in sync with any database changes that occur. The other point was that you can use database constructs such as database views or stored procedures to accomplish much of what the semantic layer provides.

The final disadvantage to the semantic layer or at least the semantic layer of BusinessObjects was that it could only connect to one database at a time. In other words, if you had data in MS SQL Server and Oracle, you needed a semantic layer for both.

Semantic Layer Rebuttal

Although all these points were true, in fact the arguments were quite week. Although the semantic layer does have to be created, maintained and managed, it’s a lot easier to maintain 2 or 3 semantic layer definitions than it is to maintain 1,000’s of reports.

It also means that you don’t have to engage the database administrator directly to create views and stored procedures. Most large organizations would not be nimble enough to make changes quickly to constructs that live within the database. Typically these would not be able to be added or modified in a production database unless the next “major” database application release. In addition, you still have the issue of chasm traps and the need to generate multiple SQL statements if data exists in multiple fact tables. This type of complexity could not be handled by a database view alone and stored procedures are very inflexible. They must be programmed. The bottom line is that the SQL generated by a stored procedure or view is much more likely to be less efficient than the semantic layer.

It is true that BusinessObjects can only connect to one data source per semantic layer, but our report tools provide two alternatives you can either:

  1. C reate a report pulling data from two universes and merge it in the report
  2. Implement Data Federator which will pull information from two different data sources in real time using a single semantic layer definition.

Finally, a good semantic layer allows you to add query governors onto your queries so you can limit:

  • how much data is pulled back
  • how long a query will run
  • who can connect to which dataset
  • what row and column level security might be applied (e.g. I can only see “NY” data because I’m a manager in NY, or I cannot see the salary column in the HR database because I’m not a supervisor.)

Moving Beyond Relational Data

What about non-relational data sources like cubes? Well, cubes have their own built in metadata. Hyperion Essbase calls it an “Outline”. Its a definition of how the cube is constructed. The person who designs a cube must define what the dimensions and facts of the cube are going to be. Because the predescribed definition is required for a cube, the semantic layer in essence already exists in a cube.

In the case of BusinessObjects, we read in the metadata from the cube and build the equivalent structure within our semantic layer. In addition, we will change the type of data querying we perform. Instead of generating SQL for example, we will generate MDX for MS SQL Server and SAP BW.

Conclusion

A semantic layer is a key component is any truly successful Business Intelligence implementation. It will help you to deliver a single version of the truth to your business users, while providing the safety and security to your IT department to guarantee that users will only access data they are allowed to access, while now allowing them to accidently run a table scan against a 100 million row fact table. One of the hardest things about creating reports is knowing where to get the data and validating that the data is correct. The semantic layer delivers both of these in spades. Report developers can let the semantic layer ‘find’ the data for them and because it’s coming from the semantic layer, it must be correct because it’s already been validated by IT.

Remember: The semantic layer as the semantic center between your two halves of your BI cookie.

Semantic Layer Cookie

Hmmm. Good.