Tuesday, May 24, 2011

Modelling a STAR schema from ER model

An ER model is a two dimensional data model used in transactional databases. These models does not provide features required in a data warehouse. For this purpose star schema model is used. We can design a star schema model from a ER data model.

Few points that needs to be considered for converting an ER model to a star schema are as fallows:


  • In an ER model, generally a FACT table is considered a table in ER model where there is maximum cardinality for all the relationships viz many in one to many or many to many relations.

  • Dimensions are the tables with cardinality of 0 or 1.

A star schema can also be modeled from scratch using top down approach. Following are the steps to be followed for modeling a STAR schema:



  • Identify the business process for analysis.

  • Identify measures or facts.

  • Identify dimensions for facts.

  • Determine lowest grain i,e lowest level of cardinality for fact table.

In the next post we will see the concept of surrogate keys.



Sunday, May 22, 2011

Business Intelligence and reporting:

Business intelligence is used for reporting purpose, to show the performance of any business using different dimensions like time, region, product etc.
Reporting requires data to be presented in a manner so that decision making can be done on that. For this purpose concept of data warehousing has been introduced.

The data warehousing concept was intended to provide an architectural model for the flow of data from operational systems to decision support environments. The process of gathering, cleaning and integrating data from various sources, usually from long-term existing operational systems to a target data warehouse is called data warehousing.

Design model for Data warehouse:
Generally two models are followed for storing data in data warehouse.
• Normalized approach – In this approach of storing data, the dimensional tables are normalized usually in 3NF. This approach is called snowflaking.
The advantage of this schema :
• Loading data from OLTP(source database) is easy .
Disadvantages of this schema :
• join data from different sources into meaningful information and then
• access the information without a precise understanding of the sources of data and of the data structure of the data warehouse.

• De normalized or Dimensional approach : In this approach the data are divided into dimensions and facts. The fact table is used to store measures to be analyzed while dimension tables are used to store the context on which facts in fact table are analyzed.

Advantages of dimensional approach :
• Complex query processing can be done efficiently.
• The schema can closely depict the business model.
• It incorporates the concept of slowly changing dimensions.

Difference between dimensional model and relational model:
• An E-R diagram (used in OLTP or transactional system) has highly normalized model (Even at a logical level), whereas dimensional model aggregates most of the attributes and hierarchies of a dimension into a single entity.
• An E-R diagram is a complex maze of hundreds of entities linked with each other, whereas the Dimensional model has logical grouped set of star-schemas.
• The E-R diagram is split as per the entities. A dimension model is split as per the dimensions and facts.
• In an E-R diagram all attributes for an entity including textual as well as numeric, belong to the entity table. Whereas a 'dimension' entity in dimension model has mostly the textual attributes, and the 'fact' entity has mostly numeric attributes.
Dimensional approach of data modeling leads to a design approach called OLAP (Online Analytical Processing). The core of OLAP system is an OLAP cube. It consists of numeric facts called measures which are categorized by dimensions. The cube metadata is typically created from a star schema or snowflake schema of tables in a relational database.
Data warehousing uses the OLAP design approach for data modeling. An OLAP uses a technique of OLAP cube for modeling and building a decision support system.

To understand an OLAP cube, lets take an example.
A company has got a project to warehouse all the sales data of different soft drinks company. The data will be used to create report of sales according to region, time and product of sale.
Here the data from analytical system will be extracted and stored in a data warehouse. The data warehouse will be designed using an OLAP cube as shown below.





Here, there are three dimensions on which our data warehouse reporting will be done. Time, Geography and Product.
If we have to find out the sales of a product class in a particular year for a particular region, we will move to that product class on product axis. Then move to that particular year on time axis and then for a particular geography, move in direction of geography axis. This will lead us to a cube that will give the value of sales we require. This multidimensional approach of warehouse design will help us drill to sales related data for any product class, for any year, month, week based in any geography.
This OLAP cube now has to be implemented on any data storage system. An OLAP designed on a relational database system is called as a ROLAP (Relational OLAP).
For a ROLAP system the data modeling schema followed to implement this multidimensional structure is STAR schema.

In te next post we will see how to model a star schema from a ER model schema..

Business Intelligence:

Business Intelligence refers to computer based technique, used in identifying and analyzing business data to get an insight into the trends of business health. The sales summary, market summary etc. of any business by region, by time period, by product, by product class etc.
Business intelligence aims to support business decision making by creating processes using transformation of business data in a form that helps in decision making.
Often transforming of business data is done using the concept of data warehousing.
PS: Remember that all Data warehouse is used in business intelligence, but all business intelligence does not require data warehousing.
In simple terms “Business Intelligence is a set of methodologies, processes, architectures, and technologies that transform raw data into meaningful and useful information used to enable more effective strategic, tactical, and operational insights and decision-making”.
Common uses of business intelligence are reporting, data mining, process mining, business performance intelligence, predictive analysis.

In the next post I will describe about business intelligence and its use in reporting.

Oracle analytics - Different pricing schems

OBIEE on premise Licensing: Component based licensing: The pricing is based on the user base or volume(COGS/revenue). It is a buy as ...