Showing posts from May, 2020

Businesses should classify data not IT

Currently, tool and technology limitations are used as the basis for classifying data, and, even worse, the classification itself is incorrect. So-called big data is wrongly named; I have already explained why in another article, "Why big data is actually small?". Wouldn't it be simpler, more meaningful and more standardized to classify data into primary and secondary data instead of using misnamed, meaningless and non-standard terms such as small data and big data? Primary data is the essential or core data without which the business cannot function. For example, a purchase transaction in a retail store has to be captured and stored for billing, payment, compliance, etc. These are mandatory requirements. Business Intelligence (BI) on top of that data is very useful but not mandatory, and it is not the main purpose of storing that core data in the first place. Secondary data is all of that data which e

ETL developer vs Data engineer

Rephrased question: ETL developer vs Data engineer. Answer: Unfortunately, there are no strict industry standards for these job titles. That is one part of it. Before ETL tools like DataStage, Informatica, Ab Initio, etc. became popular, developers hand-coded every ETL flow. These tools greatly shortened ETL development time and allowed ETL developers to focus on the business rules, logic and requirements (what to implement) rather than on how to code or optimize the flow. There are many other benefits of using a tool, but I won’t go into them in this answer. So an ETL developer with experience in these tools, but without any programming (coding) experience, was and still is able to design and develop end-to-end data flows. Whenever a new type of source/target data format comes up, these tools catch up, but it takes time, i.e. the ETL tool provider (e.g. Microsoft, IBM, etc.) adds new components/connectors within the ETL tool to be able to work with the new data format. For example, let’s say xml f

Future data in data warehouse?

Rephrased question: Does a data warehouse store future data (meaning forecasts or predictions)? Answer: A data warehouse stores whatever the company/business decides to store. There is nothing that stops anyone from storing forecasts. So if forecasts are created and they need to be stored, then yes, they are stored. For a gas group in the UK we did a project in which we had to store the forecasts for the next 25 years. This was a yearly exercise, which means that every year the forecasts for the next 25 years were stored and versioned, so we were able to fetch the differences between the forecast versions too.
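The versioning idea can be sketched in a few lines of Python. This is only a minimal illustration, not the design used in that project: the names `ForecastRow`, `build_version` and `diff_versions`, the in-memory list standing in for a warehouse table, and the sample numbers are all assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ForecastRow:
    """One forecast value (hypothetical layout, not the project's schema)."""
    version_year: int  # year in which the forecast was produced
    target_year: int   # future year the value refers to
    value: float       # forecast quantity (unit left unspecified)


def build_version(version_year, values, horizon=25):
    """Create one forecast version covering the next `horizon` years."""
    if len(values) != horizon:
        raise ValueError(f"expected {horizon} values, got {len(values)}")
    return [
        ForecastRow(version_year, version_year + offset + 1, value)
        for offset, value in enumerate(values)
    ]


def diff_versions(rows, old_version, new_version):
    """Return {target_year: new - old} for years present in both versions."""
    old = {r.target_year: r.value for r in rows if r.version_year == old_version}
    new = {r.target_year: r.value for r in rows if r.version_year == new_version}
    return {year: new[year] - old[year] for year in sorted(old.keys() & new.keys())}


# Two yearly forecast runs with made-up numbers; nothing is overwritten,
# each run is appended as a new version.
store = []
store += build_version(2019, [100.0 + i for i in range(25)])
store += build_version(2020, [102.0 + i for i in range(25)])

changes = diff_versions(store, 2019, 2020)
```

Because every row carries both the year the forecast was made and the year it refers to, older versions stay available and any two versions can be compared on the target years they share.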

Importance of data privacy compared to data governance?

Rephrased question: How important is data privacy compared to data governance? Answer: Data privacy is one of the aspects within data governance. Simply put, data governance has two sides: on one side, there is a need to ensure that data is protected and does not fall into the wrong hands; on the other side, there is a need to enable potential users of the data and an opportunity to monetize it. Data governance should come up with policies, frameworks, principles, etc. that satisfy both sides.

Data Analyst or Business Analyst experience to become a Data Engineer?

Rephrased question: Which job would lead to a data engineer job? Is it data analyst or business analyst? Answer: Sorry, from my point of view there are two things wrong in the question. There is a wrong assumption that to become a data engineer you first need to get into some other entry-level job. Business analyst and data analyst are jobs in their own right and not entry-level jobs that lead to a data engineer job. From a BA you become a senior BA, from a data analyst you become a senior data analyst, and so on. Now that there is some clarity: from any of these roles, if the person is interested and ready to learn/unlearn, and there is an opportunity in the organization, the person can switch between these roles. As a data analyst is in general more technical than a business analyst (in the sense of requirements analyst), it would be easier for a data analyst to switch to a data engineer role than for a business analyst. However, the difference is, these are horizontal

Best affordable business intelligence software in 2020?

Best affordable business intelligence software in 2020? This was one of the questions asked. Answer: You may be referring to BI software in the narrow sense, limited to front-end tools like MicroStrategy, Tableau, Power BI, etc., rather than to BI in the correct broader sense, as an umbrella term for the end-to-end process of deriving information and insights from data. In any case, the answer depends on the specific requirements and current situation (current set of tools/software, strategy, etc.) of the business. In my view it is not right to state that one tool is best for all purposes and situations. We should pick tools based on the needs and the situation. For example, if the company has invested a lot in Oracle products and Oracle provides additional software for BI for free, then why wouldn't you want to consider that as an option? The same goes for Microsoft: let's say some department has already purchased licenses for MS SQL, SSIS, SSAS, SSRS, MS Excel, Power BI, S

Data Science job without experience

Data Science job without experience. This was one of the questions asked. Simple answer : One of the best ways in my view is to start as a junior/trainee/intern in a team and learn on the job with guidance from experienced people.
