Thursday, November 8, 2018

Oracle analytics - Different pricing schemes

OBIEE on premise Licensing:

Component based licensing: The pricing is based on the user base or volume (COGS/revenue). It is a buy-as-you-go model priced according to the Oracle price list.

Enterprise license agreement: Oracle sets the price list based on the customer's total employee base and revenue. It does not consider whether only a part of the employee strength actually uses the product. Once the license metric is exceeded, additional fees are charged in increments of 10%.

Unlimited license agreement: An unlimited license allows use of the listed products by an unlimited number of users, though the agreement term is typically limited to 3 years. There is a risk of overbuying in this case.

BICS : Licensed on a per user per month basis ($150 as per the Oracle price list) with a minimum of 10 users (DVD included free)

DVCS : Licensed on a per user per month basis ($75 as per the Oracle price list) with a minimum of 5 users (DVD included free)

DVD  : Licensed per user at $1,250 as per the Oracle price list (applies if only a DVD license is bought; otherwise it comes free with a specified user base of the services above)


OAC Licensing:

The licensing is on an OCPU per hour basis. The cost differs depending on whether you bring your own license (BYOL) or buy a new license. OAC comes in three editions:

Standard
Data Lake
Enterprise

OAC also requires a cloud database and storage license for setting up the service. OAC upgrades are controlled by the customer and not Oracle, while BICS upgrades are Oracle managed.




Saturday, August 18, 2018

Connecting OAC or OBIEE 12c with Oracle autonomous data warehouse cloud

Oracle is now picking up pace and providing new and enhanced cloud services. The new Oracle Autonomous Data Warehouse Cloud (ADWC) service gives a cloud-based data store that is highly performant, highly available, and able to automatically tune itself to give the best performance for BI and analytics applications. Inbuilt machine learning capabilities also let customers explore the AI and data science angle on top of the data already stored in the data warehouse.

As many Oracle customers are already using OBIEE or OAC, they would like to connect OBIEE with Oracle ADWC. However, since ADWC supports Oracle client 12c and above, connecting through the OCI client becomes difficult because the normal OBIEE Administration Tool supports only OCI 10g/11g clients.

Below are the steps one can carry out to connect OBIEE or OAC with Oracle ADWC.

  1. Download the client credentials from the ADWC console. This is a zip file with the wallet credentials for the ADWC database.
  2. Download the Oracle 12c client. ADWC supports Oracle client 12c and above.
  3. Copy tnsnames.ora and sqlnet.ora from the zip file downloaded from the ADWC console into the Oracle client network/admin folder. This overrides sqlnet.ora so that it can state the wallet location on the local system where the Oracle client is installed.
  4. Create a wallet folder inside the Oracle client folder and copy all the files from the client credentials zip into this new wallet folder.
  5. Edit the sqlnet.ora copied in step 3 and point its wallet location to the folder created in step 4 (see the sample sqlnet.ora entry after this list).
  6. Download the OAC Admin Tool. The normal OBIEE Admin Tool is not supported.
  7. Open the RPD and point the connection pool to the TNS alias from tnsnames.ora.
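
For reference, after step 5 the edited sqlnet.ora typically looks something like the lines below; the directory path is only an illustrative example and should point to the wallet folder created in step 4.

WALLET_LOCATION = (SOURCE = (METHOD = file) (METHOD_DATA = (DIRECTORY = "C:\oracle\client\network\admin\wallet")))
SSL_SERVER_DN_MATCH = yes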



That's it for now. Have a good day.

Thursday, June 28, 2018

Amazon Redshift as a data warehouse

Redshift is a columnar database provided by Amazon AWS as a cloud database service. Redshift uses a cluster architecture with a leader node responsible for cluster and node management and compute nodes for data storage and computation. Redshift is based on the PostgreSQL database, however with some relaxed constraints and rules.

A Redshift database cluster can be created by specifying the number of compute nodes required in the cluster, and customers are charged only for the compute nodes; leader node usage is completely free. The Redshift database architecture is given below.

[Figure: Redshift physical architecture]

[Figure: Redshift logical architecture]
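
Cluster creation can also be scripted. Below is a minimal sketch using the AWS boto3 SDK; the cluster identifier, node type, credentials and region are placeholder values, not from this post.

import boto3

# Minimal sketch: create a 2-node Redshift cluster (placeholder values).
redshift = boto3.client("redshift", region_name="us-east-1")

redshift.create_cluster(
    ClusterIdentifier="demo-cluster",       # example cluster name
    NodeType="dc2.large",                   # example compute node type
    NumberOfNodes=2,                        # number of compute nodes (only these are billed)
    MasterUsername="awsadmin",
    MasterUserPassword="ChangeMe1234",      # example only; use a strong secret
    DBName="dev",
)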


To resolve which schema is used when a user logs in and queries a database, the search_path is used. To add a schema to the search_path, use the set command.

set search_path to '$USER',pg_temp,pg_catalog,...;
------------------
pg_temp and pg_catalog are searched first by default, in that order, and then the schemas specified in the search_path are scanned.

To set the timeout of queries in a session, use statement_timeout.

set statement_timeout to 60000;
----------------------
This specifies the timeout in milliseconds for any statement executed on the cluster (60000 above is just an example value); setting it to 0 turns off the timeout.

To set the timezone of a session, use timezone.

SET timezone TO time_zone
-----------------------
This sets the timezone of the current session, for example: set timezone to 'America/New_York';. Run "select pg_timezone_names();" to view the list of timezone names. To set the timezone for a database user, use
"ALTER USER username SET timezone to 'America/New_York';"


WLM: Workload management
You can use workload management (WLM) to define multiple query queues and to route queries to the appropriate queues at runtime.

When you have multiple sessions or users running queries at the same time, some queries might consume cluster resources for long periods of time and affect the performance of other queries. For example, suppose one group of users submits occasional complex, long-running queries that select and sort rows from several large tables. Another group frequently submits short queries that select only a few rows from one or two tables and run in a few seconds. In this situation, the short-running queries might have to wait in a queue for a long-running query to complete.

You can improve system performance and your users’ experience by modifying your WLM configuration to create separate queues for the long-running queries and the short-running queries. At run time, you can route queries to these queues according to user groups or query groups.

You can configure up to eight query queues and set the number of queries that can run in each of those queues concurrently. You can set up rules to route queries to particular queues based on the user running the query or labels that you specify. You can also configure the amount of memory allocated to each queue, so that large queries run in queues with more memory than other queues. You can also configure the WLM timeout property to limit long-running queries.

SET query_group TO group_label   
------------------------------
This sets the query group for the session; WLM can then route queries to a queue based on this group label. Queue assignment can also be driven by user groups, which are created as follows:
create group admin_group with user admin246, admin135, sec555;










Wednesday, June 20, 2018

Pandas: Read encrypted zip file to data frames

This is easy....


#!/usr/bin/env python
import pandas as pd
import zipfile


# Open the password-protected zip and list the files inside it
zf = zipfile.ZipFile("./kivaData_augmented.zip")
txtfiles = zf.infolist()
for txtfile in txtfiles:
    print(txtfile.filename)
    # Read each CSV member straight into a DataFrame, supplying the zip password as bytes
    df = pd.read_csv(zf.open(txtfile, "r", pwd="1234".encode('cp850', 'replace')))
    print(df['term_in_months'])

Pandas: Python for data analysis

Pandas: The pandas library in Python is used to do all kinds of data analysis. Pandas is highly efficient where label-oriented analysis is required. A major advantage of using Python and pandas is that one does not need a relational database to first load the data and then do the analysis using SQL. Also, with the multitude of data formats like csv, xlsx, zip, pdf, xml, json etc., one needs a tool that supports reading data from all these formats; as of now there is no single tool supporting everything. Python, being open source, can be used freely by any organisation to create a data reading framework which can read data from files (structured and unstructured), do some data cleansing on the go, and carry out data analysis using pandas with labelled data columns.


Pandas support for two main data types, i.e. Series and DataFrames, makes it very flexible and efficient in data analysis.


Series: It is similar to an ndarray with the added advantage of having labels for the items in the array. Any compute operation between Series aligns the computation based on labels, which is not available with an ndarray.

             sr = pd.Series([1,2,3,4],index=['col1','col2','col3','col4'])
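
For instance, a minimal sketch of label-based alignment when adding two Series (the labels and values here are illustrative only):

             import pandas as pd

             s1 = pd.Series([1, 2, 3], index=['a', 'b', 'c'])
             s2 = pd.Series([10, 20, 30], index=['b', 'c', 'd'])

             # Addition aligns on index labels, not positions; labels present in only
             # one Series ('a' and 'd') produce NaN in the result.
             print(s1 + s2)
             # a     NaN
             # b    12.0
             # c    23.0
             # d     NaN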

DataFrames: A DataFrame in pandas acts like a SQL table with columns and index labels. An added advantage of DataFrames is that one can pass index labels as well as column labels. This helps in getting specific data values for analysis based on the row index label and the column names (column labels).

df = pd.DataFrame({'one': pd.Series([1,2,3], index=['a','b','c']),
                   'two': pd.Series([4,5,6,7], index=['a','b','c','d'])})


An advantage of data analysis using DataFrames and Series is that one can easily carry out an element-wise operation on each item of the DataFrame or Series in one go, by applying the operator directly to the object itself.

For example (with numpy imported as np),

df1 = pd.DataFrame([[1, 2, 3, np.nan], [4, 5, 6, 7]])
df2 = pd.DataFrame([[8, 9, 10, np.nan], [11, 12, 13, 14]])

df3 = df1 + df2

which will give

df3 = [[9, 11, 13, nan], [15, 17, 19, 21]]

This is similar to operator overloading in C++.

Also some of the basic operations in dataframes are as follows:

Operation                         Syntax          Result
Select column                     df[col]         Series
Select row by label               df.loc[label]   Series
Select row by integer location    df.iloc[loc]    Series
Slice rows                        df[5:10]        DataFrame
Select rows by boolean vector     df[bool_vec]    DataFrame
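
As a quick sketch, these operations applied to the df defined earlier (output omitted):

import pandas as pd

df = pd.DataFrame({'one': pd.Series([1, 2, 3], index=['a', 'b', 'c']),
                   'two': pd.Series([4, 5, 6, 7], index=['a', 'b', 'c', 'd'])})

print(df['one'])          # select a column -> Series
print(df.loc['b'])        # select a row by its label -> Series
print(df.iloc[1])         # select a row by integer location -> Series
print(df[0:2])            # slice rows -> DataFrame
print(df[df['two'] > 4])  # select rows by a boolean vector -> DataFrame
print(df.T)               # transposed view of the same data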

Every DataFrame object also exposes a transposed version of itself, which can be accessed using the df.T attribute (demonstrated in the sketch above).

See you soon!!

Sunday, February 18, 2018

Download OAC Admin tool (Cloud BI)

Oracle has now made available a cloud version of the Admin Tool (Oracle BI Client) which can be installed locally and used to open the RPD in Oracle Analytics Cloud (OAC).

Below is the link to download OAC Client tool (Cloud Client)

http://www.oracle.com/technetwork/middleware/bicloud/downloads/bi-client-tools-3959521.html

Oracle Analytics Cloud (OAC) - Open RPD in cloud using OBIEE Admintool

The new cloud version of the OBIEE client (aka Admin Tool) has an option to open an RPD deployed in Oracle Analytics Cloud directly, using the "In the Cloud" option.


User: Oracle cloud admin user (Used to login to Oracle cloud)
Password: Cloud admin user password
Cloud: bootstrap (default)
Host Name: Public IP address of OAC service deployed. You can obtain it from OAC service console.
Port Number: 9704
SSL: Uncheck


Once the RPD is open, work on it as usual.


Good Luck!

Sunday, February 11, 2018

Oracle Data Visualization (ODV)

Oracle Data visualization:


Oracle has some time back released data visualization capabilities to compete with the myriad of visualization tools in the market, like Tableau. Within its coveted business intelligence and analytics tool (OBIEE) the feature comes under the name Visual Analyzer (VA, to be licensed separately), and it is also available as Data Visualization Desktop (DVD) and as Data Visualization Cloud Service, a cloud service in Oracle Cloud offered as part of Oracle Analytics Cloud.

All of these are licensed products, details of which can be found here.

https://blogs.oracle.com/emeapartnerbiepm/oracle-data-visualisation-license-pricing

The desktop version of Oracle DV does not require OBIEE or a database to be installed to get started with data visualisation and exploration activities. Once installed, it works like any other desktop application.

Here we will go through the step-by-step process of creating a project and getting going with creating visualizations.

Creating a sample project in Oracle DV:

1. The first step is to create a project in Oracle Data Visualization. To do this, go to the home screen in DVD and click on Create > Project.

2. If this is the first time, create a data set; otherwise pick a data set already imported.

3. To create a data set from scratch you have to create a connection. A connection can be either directly through a file or a connection to any data source. Here I have created a connection to Oracle Database Cloud Service (DBaaS).

4. After creating the connection, you can import the data set with the columns you need from it.


One can also add a data set from within the project by clicking on the + sign at the top left corner of the project data elements.


5. Once the data set is added, and in case multiple data sets (tables) are added, one can check whether the joins are reflected properly by clicking on the hamburger icon at the top left corner.



6. Once done, click on Visualize at the top right, and then drag and drop the attributes and measures that need to be visualized. Oracle DV will automatically provide the visualization which best suits the data added to the palette.

7. The visualization can, however, be changed to whatever is needed from the first option in the left-side toolbar.

8. Finally, one can click on Narrate, just beside the Visualize option, and add the canvas created in the project with the visualization (in our case just the one shown above). Users can add commentary on the canvas to make it clearer, and then save the project.

9. The final project can be exported as a file (e.g. a PPT), shared over email as a PPT, or uploaded to cloud storage.


And that's it. 




