Posts

EU DataViz 2019

I am happy to announce that I have been selected as one of the speakers at EU DataViz 2019. See the official page at https://publications.europa.eu/en/web/eudataviz/home for details. EU DataViz 2019 is an international conference organised by the Publications Office of the European Union. It addresses for the first time the specific needs of the community engaged in data visualisation for the public sector in Europe, bringing together experts, practitioners and solution seekers.

In-house BI Teams - Part 1 : Why you should consider joining an In-house BI team?

Are you tired of working as an external and looking for reasons to switch to an in-house BI team? Or do you have to choose between a position at an IT consulting/services company and a position in an in-house BI team for the next step in your career? Or are you just curious to know what it is like to work in an in-house BI team? Whatever the reason, I hope that this part 1 of the multi-part article on in-house BI teams gives you a well-rounded perspective on the topic and perhaps even triggers your interest in being part of one. As there are various definitions of BI and some confusion around it, it is important to state clearly what we mean by BI, so that we are all on the same page before we compare working in an in-house BI team with working as an external.

Business Intelligence

In simple words, BI (Business Intelligence) is all about deriving information and insight from data efficiently at scale to enable fact-based decision making in...

2 years of IBI

It has been over 2 years since I started IBI. A small subset of the results is shared below (charts) with the intention of creating awareness about IBI. One of my personal objectives for this year is to create awareness about IBI among high school and university students. So if any of you have connections in the academic world and would like me to present the IBI concept, the tools, my experience and what I have learned from it, I would be glad to do so. Almost everyone today, especially data professionals, preaches that organizations should be data-driven. However, how many data professionals are themselves data-driven individuals? How many of us put any effort into capturing and accumulating our own data in a way that can be used for self-discovery? Shouldn't we practice what we preach? We all agree that if we don't measure something, it can't be improved. Why don't we measure ourselves? Don't we want to improve? On 19th Feb 2019 I completed 2 years of capturi...

Data, Information and Insight

What exactly do data, information and insight mean in the context of BI?

Data: The raw form from which information and insight can be derived. Usually any recorded values, numbers, text, audio or video, stored in any form, any size and any location, generated by an event or transaction or simply reflecting the current state or status. It could be internal data like employee-related data (name, address, phone number, gender, etc.), data about products, services or customers, or server logs, web clicks, call center data, product reviews, ratings, etc. Anything and everything that can be used to derive information is data. The much-hyped big data is also data. It could be stored as files, in a database or just as logs.

Burger Chain Example

For example, the event of a customer buying a burger from a fast food restaurant generates a lot of data. Time of purchase, terminal used for payment, employee who served the customer, amount and curre...

What is data profiling?

I wanted to profile data: CSV files with 60+ columns and more than 1 million rows. I started searching for an easy-to-use tool for profiling these files. That's when I noticed that data profiling is not clearly explained anywhere. So here is my attempt to cover this important topic. I will also introduce the tool that I found during my search, which helped me a lot in getting to know the data.

What is data profiling?

In simple words, data profiling is a process in which we try to understand the characteristics of the data without associating it with a business process. So basically anyone can carry out data profiling on any data. You don't have to know who generates the data, where and how it is generated, or what its context is.

What are the answers we are looking for?

Some are listed below to give you an idea. How many columns are actually there in the file? Does it match specification/documentati...
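To make the idea concrete, here is a minimal sketch of how a few such profiling questions could be answered in Python using only the standard library. It is not the tool mentioned in the post. The function name `profile_csv`, the chosen statistics and the sample data are all made up for illustration, and the sketch assumes the file has a header row with unique column names.

```python
import csv
import io


def profile_csv(handle):
    """Return row count and simple per-column statistics for a CSV stream.

    Illustrative helper (hypothetical name); assumes a header row with
    unique column names.
    """
    reader = csv.reader(handle)
    header = next(reader)
    stats = {name: {"empty": 0, "distinct": set(), "max_len": 0} for name in header}
    row_count = 0
    ragged_rows = 0  # rows whose column count differs from the header
    for row in reader:
        row_count += 1
        if len(row) != len(header):
            ragged_rows += 1
        for name, value in zip(header, row):
            col = stats[name]
            if value.strip() == "":
                col["empty"] += 1
            else:
                col["distinct"].add(value)
                col["max_len"] = max(col["max_len"], len(value))
    columns = {
        name: {
            "empty": col["empty"],
            "distinct": len(col["distinct"]),
            "max_len": col["max_len"],
        }
        for name, col in stats.items()
    }
    return {
        "column_count": len(header),
        "row_count": row_count,
        "ragged_rows": ragged_rows,
        "columns": columns,
    }


# Tiny made-up sample; a real file would be opened with open(path, newline="").
sample = io.StringIO("id,city,amount\n1,Luxembourg,10.5\n2,,7\n3,Brussels,10.5\n")
report = profile_csv(sample)
print(report)
```

Note that nothing in the sketch needs to know what the columns mean, which is the point of profiling. It keeps every distinct value in memory, so for very wide or very large files a sampling or approximate-count approach may be preferable.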

Don't use front-end where it is not required

Regularly downloading data manually from a portal (frontend) and loading it into a backend system is like exiting an airport and then re-entering the same airport to catch a connecting flight from the same terminal when you have no business outside the airport. I have seen people doing this: downloading data from portals and then loading it into other systems manually. If no one is looking at the data in the UI, no decision is to be made there, and only regular data feeds are required, then don't do it via the frontend (GUI). Instead, get a data extract job created that loads the backend system automatically. This is partly related to one of the previous posts, BI can take you to places.

Should business users spend their time in creating reports?

Should a marketing manager, an HR manager, a sales manager or an account manager spend time creating reports, or using reports to make decisions? On the one hand, total dependence on the BI team for all information needs can slow business users down. On the other hand, if business users have to create their own reports or work their way through dashboards or self-service BI to get to the numbers they are looking for, it can eat up their time and so reduce the time left for their real work, part of which is to take decisions based on information and insights. That's why there needs to be a balance: basic first-level information can be self-served, while for complex requirements the BI team spends the time delivering the information.

Prediction vs Forecast

In the context of data analysis/BI, when is something called a prediction and when is it called a forecast? Quite a lot of people use these terms interchangeably. The dictionary can't help here either, see below.

Source: https://en.oxforddictionaries.com/definition/prediction

So in the context of data analysis/BI I would say a forecast is based on past trends. A time series is involved. Based on previous behavior, future behavior is forecast for a specific time period. A prediction, on the other hand, may or may not be based on past trends. So not all predictions are forecasts, but all forecasts are predictions. In this way, forecast is like a subset of prediction. Example of a prediction which is a forecast: the number of books that will be sold each month in the next 6 months. Example of a prediction which is not a forecast: Country X will win the world cup because they are a good team and in the best form compared to other teams. What do ...

New Year Resolution - Starting with IBI?

A few people who attended my presentations this year and became aware of the concept of IBI (Individual Business Intelligence), and a few people with whom I have been in contact this year, have told me they are interested in starting with IBI from 2019 or have already started. It feels nice to get feedback like the one below. "I attended your presentation at the publication office. I just read your article on IBI and I love your idea." - Message on LinkedIn from a senior professional who attended my presentation of the PublicBI BI solution for EU Public Procurement at the Publications Office of the European Union, Luxembourg. It is really great that people have started or plan to start. I wish you all the best. For those who are still not sure what IBI is and how to use it, the links below should be useful. Basically, in this post I am placing all the important IBI links in sequence so that they are easy for people to find and understand.  ...

BI can take you to places

Using a BI team only as a data extract team is similar to using a car headlamp to light a room. You are in a dark room on the ground floor. Somebody starts a car outside, and the light from its headlamps enters your room through the glass windows. You can now see some of the things in your room. You now order the driver (unfortunately, you have the authority) to keep the car running with the headlamps on. The driver tries his best to convince you to get a bulb for your room soon so that he can take the car places, but you don't understand, because you have never seen a car and don't know that it can move. This is how some uninformed business users and uninformed non-BI technical people view BI teams. They think the BI team has expertise in moving data, so let us use them for moving data. No, BI teams move data to consolidate, combine, integrate and harmonize it, etc., so that users can get the full picture of the business based on the information a...

Open data is the low hanging fruit within Public data

Public data is all the data that is publicly available for anyone to use for any purpose. Open data is a subset of public data. Open data is well-defined, maintained and generally more reliable, and there is some assurance of continued availability, because the data and the related documentation, APIs, access points, portals, etc. are made available by the generator (source institution) or by an authorized data aggregator organization. In this sense, from my point of view, open data is the low-hanging fruit within public data.

Tool to auto populate data based on a dimensional model

Are there any tools that can auto populate data into tables based on a dimensional model? If none exist, maybe this is something some company can build and offer. Various tools, including ETL tools, provide a feature/component that can generate dummy data based on a schema definition. What I am looking for is a tool to which we can provide a dimensional model (Facts and Dimensions) and a target DB connection, and which then auto populates dummy data into the tables (Dimensions first and then Facts) while keeping all the relationships intact. The tool should be able to create data for all SCD types and all types of Fact tables. A tool like this would speed up prototyping, testing, visualizations, etc., and so speed up development and delivery.
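As a rough illustration of the core idea (dimensions first, then facts that only reference existing dimension keys), here is a minimal Python sketch using the standard library and an in-memory SQLite database as the "target DB connection". Everything in it is an assumption made for the example: the `MODEL` dictionary format, the table and column names and the `populate` function are hypothetical, and it deliberately ignores SCD types and the different kinds of fact tables.

```python
import random
import sqlite3

# Hypothetical model description; table and column names are illustrative only.
MODEL = {
    "dimensions": {
        "dim_product": ["product_name", "category"],
        "dim_store": ["store_name", "city"],
    },
    "facts": {
        "fact_sales": {
            "foreign_keys": {"product_id": "dim_product", "store_id": "dim_store"},
            "measures": ["quantity", "amount"],
        },
    },
}


def populate(conn, model, dim_rows=5, fact_rows=20, seed=0):
    """Create the tables and fill them with dummy rows, dimensions first."""
    rng = random.Random(seed)
    cur = conn.cursor()
    dim_keys = {}

    # 1. Dimensions: generate rows and remember their surrogate keys.
    for table, columns in model["dimensions"].items():
        col_defs = ", ".join(f"{c} TEXT" for c in columns)
        cur.execute(f"CREATE TABLE {table} (id INTEGER PRIMARY KEY, {col_defs})")
        placeholders = ", ".join("?" for _ in columns)
        insert_sql = f"INSERT INTO {table} ({', '.join(columns)}) VALUES ({placeholders})"
        for i in range(1, dim_rows + 1):
            cur.execute(insert_sql, [f"{c}_{i}" for c in columns])
        dim_keys[table] = [row[0] for row in cur.execute(f"SELECT id FROM {table}")]

    # 2. Facts: every foreign key is drawn from the keys generated above.
    for table, spec in model["facts"].items():
        fks = spec["foreign_keys"]
        measures = spec["measures"]
        col_defs = [f"{fk} INTEGER REFERENCES {dim}(id)" for fk, dim in fks.items()]
        col_defs += [f"{m} REAL" for m in measures]
        cur.execute(f"CREATE TABLE {table} ({', '.join(col_defs)})")
        columns = list(fks) + measures
        placeholders = ", ".join("?" for _ in columns)
        insert_sql = f"INSERT INTO {table} ({', '.join(columns)}) VALUES ({placeholders})"
        for _ in range(fact_rows):
            values = [rng.choice(dim_keys[dim]) for dim in fks.values()]
            values += [round(rng.uniform(1, 100), 2) for _ in measures]
            cur.execute(insert_sql, values)
    conn.commit()


conn = sqlite3.connect(":memory:")
populate(conn, MODEL)
print(conn.execute("SELECT COUNT(*) FROM fact_sales").fetchone()[0], "fact rows created")
```

Because the fact rows pick their foreign keys from the dimension keys that were just inserted, referential integrity holds by construction. A real tool would additionally need realistic value generators, date dimensions and support for slowly changing dimensions.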

Information from data is like bread from wheat

When people are hungry, in a hurry and lack bread-making skills, you can't give them wheat and ask them to make bread and then eat. You need bakers who know how to transform wheat into bread in a scalable way and keep the bread ready for hungry people. In exactly the same way, BI professionals are needed to transform data into information, derive insights and keep them ready for knowledge-hungry users to consume.

Google Data Studio - Excel on steroids for free

I have seen some people refer to one of the popular data visualization tools as Excel on steroids. Based on that, I think Google Data Studio is soon going to become the free Excel on steroids. Thanks, Google, for bringing out this amazing product. For those who are new to this, Google Data Studio is the data analysis and visualization platform from Google, and it's free to use.

European Commission awards prizes for the innovative use of public procurement data - European Commission


KABI Building Blocks and Workflow

Based on a request from one of the readers, the images that capture the building blocks of KABI and the workflow are attached below.

KABI Building Blocks

KABI Workflow

PublicBI BI Solution for EU Public Procurement is now online

After months of an unbelievable amount of hard work (no one except my wife and my 4-year-old son knows what I have gone through), it feels really great to have brought the Public Procurement BI Solution project to a shape in which we could present it at the EU Datathon challenge in Brussels on 2nd October, 2018 (a special day, Gandhi Jayanthi). On top of that came an invitation from the European Commission to receive the award from the Portuguese Minister of the Presidency and of Administrative Modernization and Ms. Irmfried Schwimann, Deputy Director-General DG GROW, European Commission, and to present the solution briefly at the Digital Transformation in Public Procurement Conference in Lisbon on 18th October, 2018, in the presence of several distinguished guests, including members of the European Parliament. I am happy to announce that the solution is online and can be accessed from this link ( https://www.publicbi.com/euproc ). I welcome all feedback on the solution. Our 7 m...

Data Analysis of Data Scientists Survey Results Report

The Data Analysis of Data Scientists Survey Results Report has been published on Reportpedia.com. The report is available at this link: https://www.reportpedia.com/2018/07/data-analysis-of-data-scientists-survey.html

EU Datathon 2018 Finalist

I feel very happy and motivated that the PublicBI team has been shortlisted as one of the finalists in the EU Datathon 2018 competition organized by the Publications Office of the European Union. For more details about EU Datathon, see the official website https://publications.europa.eu/en/web/eudatathon/speakers . Congratulations to all the finalists and all the best! This also means that I have to put some of my planned articles (for example, on GDPR) on hold so that I can work on the Datathon topic.

AWS Innovate 2018 - Interesting and Useful

Today I received my certificate for attending the AWS Innovate 2018 Online Conference on 19th July 2018. I managed to attend all of the sessions in the Big Data and Analytics track and a couple of sessions in the AI and Machine Learning track (see agenda below). All of the sessions that I attended were quite interesting and useful. If you missed them, don't worry: you can still watch the sessions on demand for free. See https://aws.amazon.com/events/aws-innovate/ . If you attended other tracks and can recommend a must-watch session, that would be great. I would like to sincerely thank all those who were involved in making this event successful. Based on my experience of organizing the first ever BI Online Conference ( PublicBI BIKON May 2018 ), I can imagine the hard work that must have gone into making this happen. Looking forward to more such events from the AWS team.

AWS Innovate 2018 Agenda Screenshot

500 Days of IBI

For all those who follow my blog, especially on the IBI topic, I am happy to announce that I have now completed 500 days of data capture. For all posts related to IBI see IBI posts. Just from being disciplined enough to capture the data for 500 days, and from the fact that I managed to do it and still continue doing it, there is some sense of achievement even though nothing is achieved. So today I spent some time on data analysis and used the data to reflect on the days that went by. I went through the details of the mistakes I have made, the days I was not happy, etc., to understand whether there is a recurring pattern and to tell myself that I shouldn't repeat them. I also made some changes to the data capture template and included new charts too. I have now added a "Year" column to the data capture template because I now have data for more than a year. With this addition it is possible to compare the same months across years and also to compare year...

Popular posts from this blog

ETL developer vs Data engineer

KABI - The new Agile Methodology for BI Projects - Implement BI projects quicker happily

Context of data should be clear before data analysis