Big Data and the Goldilocks Principle

I was inspired to write this post (I can hear all of you sighing ‘Yet Another Post on Big Data’) for another ‘Big’ reason. I listened to a TED (www.ted.com) talk by David Christian titled ‘The History of the World in 18 Minutes’, in which he narrates a complete history of the universe, from the Big Bang to the Internet, in a riveting 18 minutes. This is “Big History”: an enlightening, wide-angle look at complexity, life and humanity, set against our slim share of the cosmic timeline. Check out his website, www.bighistoryproject.com, and I promise you that this ‘Big’ has nothing to do with Big Data as we know it. What got me interested in his talk was his reference to the ‘Goldilocks moment’ – a moment so precisely right that certain thresholds can be reached, enabling higher forms of complexity (life) in the universe.

That got me thinking – is Big Data the ‘Goldilocks moment’ for organizations, with analytics helping them achieve better business outcomes?

I think the answer is ‘Yes’, and this stems from the following hypothesis – an organization can utilize analytics for better business outcomes if:

a) it has more data points to analyze (volume)

b) it has the ability to perform sophisticated analysis on large and diverse datasets (variety)

c) and it can do so at a much faster rate than before (velocity)

In that context, I really liked a picture from one of the IBM articles, which illustrates how Big Data, when synthesized properly with standard transactional data, can support better business decision making (in that case, fraud detection).

Source: IBM – Understanding Big Data by Paul Zikopoulos

Meanwhile, the exponential increase in CPU processing power, the steep fall in memory prices and the availability of high bandwidth have made the practical use of Big Data techniques possible. From the human angle, people are creating digital data (social media chatter, video sharing, blogs, mobility and so on) at such a rapid pace that organizations (with the help of Big Data techniques, of course) can potentially solve the ‘Innovator’s Dilemma’ by providing new products and services that consumers did not ask for, simply because they could not figure out what they actually wanted.

All in all, I think we are at a precise moment in history (the Goldilocks moment) where organizations can greatly increase their ability to provide better products and services to their consumers using Big Data techniques.

Source: http://blogs.hexaware.com/business-intelligence/big-data-and-the-goldilocks-principle/

Business Focused Analytics – The Starting Point

Having been a Business Intelligence practitioner for the last 13 years, I can say there has never been a more exciting time to practice this art, as organizations increasingly realize that a well-implemented BI & Analytics system can provide great competitive advantage. This leads us to the question: ‘What is a well implemented BI system?’ Let us follow the Q&A below.

Q: What is a well implemented BI system?

A: A well implemented BI system is one that is completely business focused.

Q: Well, that doesn’t make it any easier. How can we have BI that is completely business focused?

A: BI & Analytics become completely business focused when ‘business decisions’ are the cornerstone of the implementation. The starting point for building or re-engineering a BI system is to identify the business decisions taken by business stakeholders in their spheres of operation. Business decisions can be operational in nature (taken on a daily basis) and/or strategic (taken less frequently, but with a longer-term impact). To reiterate, the starting point for BI is to catalog the business decisions taken by business stakeholders and collect the artifacts currently used to take those decisions.

Q: The starting point is fine – What are the other pieces?

A: The next step is to identify the metrics and key performance indicators that support decision making. In other words, every metric identified should be unambiguously correlated to the decision it supports and to the person who takes that decision. Next, we need to identify the core datasets in the organization. Please refer to my earlier blog post titled ‘Thinking by Datasets’ on this subject.

Q: What about the operational systems in the landscape? Aren’t they important?

A: Once we have documented the relationships from business decisions to metrics to datasets, we need to focus on the transactional applications. The key focus items are listed below, followed by a small sketch of the resulting catalog:

  • Take an inventory of all transactional applications
  • Identify the business processes catered to by these applications
  • Identify the datasets generated as part of each business process
  • Drill down into the individual entities that make up each of the datasets
  • Once the facts and dimensions are identified from the entities, sketch out the classic ‘Bus Matrix’, which forms the basis for dimensional data modeling
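As an illustration, here is a minimal Python sketch of such a catalog ending in a Kimball-style bus matrix. Every process and dimension name below (order_to_cash, dim_customer and so on) is a hypothetical example, not a prescribed schema:

```python
# A toy catalog of business processes mapped to the dimensions their
# facts conform to; this is the raw material for the classic bus matrix.
bus_matrix = {
    "order_to_cash":   {"dim_customer", "dim_product", "dim_date"},
    "procurement":     {"dim_supplier", "dim_product", "dim_date"},
    "inventory_moves": {"dim_product", "dim_warehouse", "dim_date"},
}

# Conformed dimensions are those shared across business processes;
# they form the columns of the bus matrix.
all_dims = sorted(set.union(*bus_matrix.values()))

# Print the bus matrix as a simple grid: X marks a process/dimension pair.
print("process".ljust(16) + " ".join(d.ljust(14) for d in all_dims))
for process, dims in bus_matrix.items():
    cells = " ".join(("X" if d in dims else "-").ljust(14) for d in all_dims)
    print(process.ljust(16) + cells)
```

The grid makes shared (conformed) dimensions visible at a glance, which is exactly what the dimensional modeling exercise needs as input.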

 

Q: All this is good if we are building a BI system from scratch – How about existing BI systems?

A: For existing BI applications, the process mentioned above can be carried out as a health check on the BI landscape. The bottom line is that every single report, dashboard or other analytical component should be traceable to the metrics it shows, which should in turn link to the decisions taken by business users. BI & Analytics exist to help organizations take better business decisions, and that defines their purpose and role in an enterprise IT landscape.

The answers above provide a high-level view of Hexaware’s approach to Business Intelligence projects. We have worked with many organizations across industries, and a business-focused analytical approach has provided good value for our customers.

Thanks for reading. Please do share your thoughts.

Business Focused Analytics – The Starting Point – Part 2

Business decisions are the cornerstone of a successful BI implementation. Cataloging the decisions taken by the key stakeholders in an organization is the first step in understanding the information requirements for a data warehouse. Capturing business decisions, both strategic and operational, is not a simple task, as most business decisions tend to be complex and require diverse data points. Further, once all the decisions are collected, how do we know which ones have the most impact on the business?

Here’s a simple framework based on the six primitive interrogatives that Hexaware has effectively used while assessing information requirements. This framework helps systematically uncover important dimensions of information and organize them in a format that is easy to comprehend.

| Question | Description | Comments | Example |
| --- | --- | --- | --- |
| Who? | The decision-maker | Stakeholder | Service Delivery Manager (SDM) |
| What? | The decision | A decision requiring supporting data / information | Resourcing for a project |
| Why? | The motivation for the decision | The significance of the decision to the business | Getting the right project team is critical to the success of a services project |
| When? | When the decision is made | Frequency (or) a point in time | Made during the planning stage and reviewed at periodic intervals |
| How? | The basis for making the decision – KPIs / metrics / logic | The metrics and datasets required for making the decision | By comparing the skill-set requirement and project schedule in the project plan to the availability of resources (HR and PMO databases) with the right skills (skills database) and a good track record (appraisal database) |
| Where? | The place where the decision is made | Specifies mobility / additional access requirements | This information needs to be accessed through the extranet |

Depending (mostly) on the stakeholder, decisions can be strategic or operational. A manager responsible for carrying out a business process will have an operational view of information and will make operational decisions relating to his/her sphere of operations. Decisions taken by top management personnel with longer-term business responsibilities tend to be more strategic. The above framework helps capture both strategic and operational decisions, along with the datasets required to make them.
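To make the framework concrete, here is a minimal sketch, assuming Python as the capture medium, of how a decision record based on the six interrogatives might be stored; the field values simply restate the resourcing example from the table above:

```python
from dataclasses import dataclass

# One record per business decision, with a field for each interrogative.
@dataclass
class DecisionRecord:
    who: str    # the decision-maker (stakeholder)
    what: str   # the decision requiring supporting data / information
    why: str    # the significance of the decision to the business
    when: str   # frequency, or a point in time
    how: str    # metrics and datasets used to make the decision
    where: str  # mobility / additional access requirements

# The resourcing example from the table above, captured as a record.
resourcing = DecisionRecord(
    who="Service Delivery Manager (SDM)",
    what="Resourcing for a project",
    why="The right project team is critical to project success",
    when="During the planning stage, reviewed at periodic intervals",
    how="Compare project plan needs to HR/PMO, skills and appraisal data",
    where="Accessed through the extranet",
)
```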

Successfully capturing the decisions and the relevant metrics and datasets is only half the story in the assessment for a data warehouse. Let’s reserve the other half for a subsequent blog.

Hope this information was useful. Please do share your comments/suggestions.

Earnings Conference Call on Tuesday, July 31st 2012 at 4.00 pm IST

The Board of Directors of Hexaware Technologies is scheduled to meet on July 31st, 2012 to consider the audited results for the quarter ended June 30th, 2012 (Q2 2012).

The company is hosting an earnings conference call for investors and analysts on the same day to discuss the results.

Details of the conference call are as follows:

Date: July 31st, 2012
Day: Tuesday
Time: 4:00 PM IST

Dial-in Access Numbers (Mumbai):
Primary Access: +91 22 3065 0539
Secondary Access: +91 22 6629 0539

Hexaware Participants:
Atul Nishar – Chairman
P. R. Chandrasekar – Vice Chairman & CEO
R. V. Ramanan – Executive Director & Head – Global Delivery
Prateek Aggarwal – Chief Financial Officer
Deependra Chumble – Chief People Officer
Sreenivas V – Chief Strategy Officer

Replay Facility*

Playback number: +91 22 3065 1212

When prompted, please enter the Conference ID 32580 followed by #

*The replay facility of the conference call will be available from 60 minutes after the call ends, and for three days from July 31st to August 2nd, 2012.

Participants are requested to dial in 10 minutes in advance.

Thank you!

For further information: please contact – Sreenivas V
[e-mail: sreenivasv@hexaware.com; phone: +91 99401 90091]

More: http://hexaware.com/news/earnings-conference-call-on-tuesday-july-31st-2012-at-4-00-pm-ist-2/



Quality Assurance and Testing Services: Strategy of Oracle BI Testing

This document gives an overview of OBIEE testing and of the BI components commonly exercised during validation.

1. Testing in Levels: In a typical OBIEE project, it is advisable to test in multiple levels rather than attempting to test everything at once.

a) The first set of tests verifies the accuracy of the column-to-column transport of data between the source and the target. This verification is typically done by running comparable SQL statements on the source and target databases, as in the sketch below.
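For illustration, here is a minimal Python sketch of such a reconciliation; the table and column names (src_orders, wh_fact_orders, order_amount) and the DB-API style connections are assumptions, not the actual schema:

```python
# A toy source-to-target check: compare row counts and a column aggregate
# between a source table and its warehouse target. The names below are
# illustrative placeholders for whatever the project's schema defines.
SOURCE_SQL = "SELECT COUNT(*), SUM(order_amount) FROM src_orders"
TARGET_SQL = "SELECT COUNT(*), SUM(order_amount) FROM wh_fact_orders"

def reconcile(src_conn, tgt_conn):
    # Both connections are assumed to follow the Python DB-API
    # (e.g. sqlite3-style connections with execute/fetchone).
    src_count, src_sum = src_conn.execute(SOURCE_SQL).fetchone()
    tgt_count, tgt_sum = tgt_conn.execute(TARGET_SQL).fetchone()
    assert src_count == tgt_count, f"row counts differ: {src_count} vs {tgt_count}"
    assert src_sum == tgt_sum, f"sums differ: {src_sum} vs {tgt_sum}"
    print("source and target agree on row count and total amount")
```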

b) The next step is to verify the accuracy of the repository (the .RPD file). These tests include testing the metrics with appropriate dimensional filters and verifying the formulas used to compute those metrics. Testers can build two sets of comparable queries within the repository interface and compare their results.

c) The next step in testing is to verify the dashboards and reports against comparable queries on the repository metrics; that is, testers verify dashboard charts and reports against the corresponding results of queries they execute on the repository metrics.

d) Finally, the functional interface tests cover lookups, performance, ease of use, look and feel, and so on.

The first three types of tests are performed by testers who can create simple SQL statements.

Structure and organization of test cases:

The choice of test case naming convention and structure helps organize the test artifacts and aids a great deal in implementing the overall testing strategy.

For example, if the test cases are grouped by the nature of the tests (source-to-target verification, RPD metrics tests, functional, security, performance and usability), it becomes easier to pick and choose tests based on the testing context and tester capabilities.

2. User acceptance criteria:
Users typically have an existing legacy mechanism to verify whether what is displayed in the new solution makes sense. Testers should dig into this and understand how the end users built the project acceptance criteria, and should challenge the assumptions made by the business community in deriving those criteria. This activity helps build an end-user perspective into the testing effort from early on.

3. Validating Master Detail Reports:
Master-detail linking of views establishes a relationship between two or more views such that one view, called the master view, drives data changes in one or more other views, called the detail views.

4. Time series functions validation: Time series functions provide the ability to compare business performance with previous time periods, allowing you to analyze data that spans multiple time periods.

For example, they enable comparisons between current sales and sales a year ago, a month ago, and so on. OBIEE provides three such functions (a validation sketch follows the list):

a. Ago: compares a measure in the current period with the same measure a specified number of periods earlier.
b. To date: aggregates a measure from the beginning of a time period up to the current point, for example year-to-date sales.
c. Period rolling: the PERIODROLLING function does not have a fixed time series grain; instead, you specify the start and end periods in the function itself.
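As a sketch of how these might be validated, the snippet below pairs an OBIEE logical SQL expression using AGO with a manual warehouse query a tester could compare it against; the subject area, measure and table names are assumptions for illustration:

```python
# Logical SQL using OBIEE's AGO time series function (names illustrative).
AGO_LOGICAL_SQL = '''
SELECT "Time"."Year", "Sales"."Revenue",
       AGO("Sales"."Revenue", "Time"."Year", 1) AS revenue_year_ago
FROM "Sales Subject Area"
'''

# An independent check against the warehouse: compute "a year ago"
# with a plain self-join instead of the time series function.
MANUAL_CHECK_SQL = '''
SELECT cur.year, cur.revenue, prev.revenue AS revenue_year_ago
FROM yearly_revenue cur
LEFT JOIN yearly_revenue prev ON prev.year = cur.year - 1
'''

def compare(rows_from_obiee, rows_from_warehouse):
    # Both inputs are lists of (year, revenue, revenue_year_ago) tuples;
    # the validation passes only if they match row for row.
    assert sorted(rows_from_obiee) == sorted(rows_from_warehouse)
```

TODATE and PERIODROLLING can be validated in the same way, using a running-total query and a windowed-sum query respectively as the manual checks.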

5. Oracle BI Publisher validation: Oracle BI Publisher, formerly known as XML Publisher, offers an efficient, scalable reporting solution for complex, distributed environments. It provides a central architecture for generating and delivering information to employees, customers and business partners, both securely and in the right format.



Quality Assurance And Testing Services: OBIEE Validations

Hexaware has developed a variety of accelerators that combine Oracle’s technology with Hexaware’s functional and technical expertise to meet your business needs effectively. The Oracle E-Business Suite range consists of Financials, Supply Chain, Manufacturing, HRMS and CRM.

Source to target validation: here the source is the metadata repository, and the targets are the OBIEE reports, dashboards and BI Publisher outputs.

1. Presentation layer object validation:
This is the layer of the logical business model that clients access through structured query language, better known as logical SQL. The presentation layer is the appropriate place to set user permissions and to validate users’ permissions to reports.
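As an illustration, a tester might issue a logical SQL query directly against a presentation layer subject area (for example via OBIEE’s nqcmd command-line utility) and confirm that a restricted user receives a permission error rather than data. The subject area and column names below, and the run_as helper, are assumptions:

```python
# Logical SQL against a hypothetical presentation layer subject area.
LOGICAL_SQL = '''
SELECT "Sales"."Region", "Sales"."Revenue"
FROM "Sales Subject Area"
'''

def check_denied(run_as):
    # run_as(user, sql) is an assumed helper that executes logical SQL
    # under the given user's session and raises PermissionError when the
    # server denies access to the subject area.
    try:
        run_as("restricted_user", LOGICAL_SQL)
    except PermissionError:
        print("restricted user correctly denied")
    else:
        raise AssertionError("restricted user could read the subject area")
```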

2. Categorizing the metrics:
It is important to classify the metrics from multiple perspectives, such as their frequency of use, their potential performance impact and the complexity of the calculations involved. Such a classification helps drive the priority of testing.

3. Dashboard charts and filter criteria:
User interface testing should encompass tests with multiple options in the available filter criteria. OBIEE provides enough drill-down features to verify the underlying data behind the clickable components of the charts. The test cases written should be detailed enough to verify the data aggregated at the various layers.

4. Filter validation:
Validate all the filters available on a report. For example, for a Performance Measure filter, validate the filter contents against the report requirement and the database.

Filter types:
Local filters: filter the records at the report level.
Global filters: filter the records based on the user’s selections on the dashboard.

Dashboard validation:
When a user selects a particular request, the dashboard needs to display exactly the corresponding results.

5. Data level security:
Data level security validation means verifying that a user can see only the particular data permitted to them.
Example: the Eastern and Western region Sales Managers see the same reports, but the data visible to each is the Eastern and Western region sales data respectively.
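A minimal sketch of such a check, assuming a hypothetical run_report_as(user) helper that executes the report under a given user’s session and returns its rows:

```python
# Run the same report as each regional manager and assert that only
# that manager's region appears in the result. The user names, regions
# and run_report_as helper are illustrative assumptions.
EXPECTED = {"east_sales_mgr": "East", "west_sales_mgr": "West"}

def check_data_level_security(run_report_as):
    for user, region in EXPECTED.items():
        rows = run_report_as(user)
        leaked = {row["region"] for row in rows} - {region}
        assert not leaked, f"{user} can see data for {leaked}"
    print("data level security holds for all tested users")
```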

Object level security: validate whether a particular user is able to access a particular dashboard, folder and so on. For example, users in a particular department can view only the subject areas that belong to their department.

6. Bursting the reports:
Bursting a report means splitting and distributing it based on some attribute, such as region.
Example: if there are four regional reports, validate that the report bursts correctly into the East, West, South and North regions; a sketch of what such a validation asserts follows.
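A minimal sketch: the regional bursts together must cover the consolidated report exactly, with no overlap or loss. The row structure (dicts with a "region" key) is an illustrative assumption:

```python
# rows: the consolidated report; bursts: region -> rows delivered to
# that region's recipients. Both structures are illustrative.
def check_bursting(rows, bursts):
    # Every burst must contain only its own region's rows...
    for region, burst_rows in bursts.items():
        assert all(r["region"] == region for r in burst_rows), region
    # ...and together the bursts must reproduce the consolidated report.
    total = sum(len(b) for b in bursts.values())
    assert total == len(rows), "bursts lost or duplicated rows"
```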

7. Buzz matrix (alert) validation:
Validate the alerts on the dashboard.

Example: consider a stock trading business whose CEO wants to know whether today’s business has reached the level he expects compared with last week. If that level is reached, the dashboard buzz (alert) should be raised saying so; the validation confirms that the alert fires exactly when the threshold is crossed.


