Posts

Showing posts from May, 2020

Businesses should classify data not IT

Currently, tool and technology limitations are used as the basis for classifying data and, even worse, the classification itself is incorrect. So-called big data is wrongly named; I have already explained why in another article, "Why big data is actually small?". Wouldn't it be simpler, more meaningful and more standardized to classify data into primary and secondary data instead of using misnamed, meaningless and non-standard terms such as small data and big data? Primary data is the essential, core data without which the business cannot function. For example, a purchase transaction in a retail store has to be captured and stored for billing, payment, compliance, etc. These are mandatory requirements. Business Intelligence (BI) on top of that data is very useful but not mandatory, and it is not the main purpose of storing that core data in the first place. Secondary data is all of that data which e

ETL developer vs Data engineer

Rephrased question: ETL developer vs Data engineer. Answer: Unfortunately, there are no strict industry standards for these job titles. That is just one part of it. Before ETL tools such as DataStage, Informatica, Ab Initio, etc. became popular, developers hand-coded every ETL flow. These tools greatly shortened ETL flow development time and allowed ETL developers to focus on the business rules, logic and requirements (what to implement) rather than on how to code it or optimize the code. There are many other benefits of using a tool, but I won’t go into them in this answer. So an ETL developer with experience in these tools but without any programming (coding) experience was, and still is, able to design and develop end-to-end data flows. Whenever a new type of source/target data format comes up, these tools catch up, but it takes time: the ETL tool provider (e.g. Microsoft, IBM, etc.) has to add new components/connectors to the ETL tool so that it can work with the new data format. For example, let’

Future data in data warehouse?

Rephrased question: Does a data warehouse store future data (meaning forecasts or predictions)? Answer: A data warehouse stores whatever the company/business has decided to store. There is nothing that stops anyone from storing forecasts. So if forecasts are created and need to be stored, then yes, they are stored. For example, for a Gas group in the UK we did a project in which we had to store the forecasts for the next 25 years. This was a yearly exercise, which means that every year the forecasts for the next 25 years were stored and versioned, so that we could also fetch the differences between the forecasts.
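To make the versioning idea concrete, here is a minimal Python sketch of how yearly forecast runs could be kept side by side and compared. This is not the design used in that project; the class name `ForecastStore`, its `store` and `diff` methods, and all the numbers are hypothetical and only illustrate the principle that each yearly run is kept as its own version.

```python
class ForecastStore:
    """Hypothetical in-memory store: one forecast version per year it was made."""

    def __init__(self):
        # made_in (year the forecast was produced) -> {target_year: forecast value}
        self._versions = {}

    def store(self, made_in, forecasts):
        # Versions are never overwritten, so older forecasts stay comparable.
        if made_in in self._versions:
            raise ValueError(f"forecast version {made_in} already stored")
        self._versions[made_in] = dict(forecasts)

    def diff(self, old_version, new_version):
        # Compare only the target years that both versions cover.
        old = self._versions[old_version]
        new = self._versions[new_version]
        common_years = sorted(old.keys() & new.keys())
        return {year: new[year] - old[year] for year in common_years}


# Made-up example data: each run covers the following 25 years.
store = ForecastStore()
store.store(2019, {y: 1000 + 10 * (y - 2020) for y in range(2020, 2045)})
store.store(2020, {y: 1005 + 10 * (y - 2021) for y in range(2021, 2046)})

# How did the 2020 run change the outlook compared with the 2019 run?
changes = store.diff(2019, 2020)
```

In a real data warehouse the same idea would more likely be a forecast table keyed by the year the forecast was made and the year it refers to, with the difference computed by joining two versions on the target year.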

Importance of data privacy compared to data governance?

Rephrased question: What is the level of importance of data privacy compared to the level of importance of data governance? Answer: Data privacy is one of the aspects within data governance. Simply put, in data governance there is on one side a need to ensure that data is secure and protected and that it doesn’t fall into the wrong hands, and on the other side a need to ensure that value is derived from the data and that the data is monetized. Data governance should come up with policies, frameworks, principles, etc. that satisfy and balance both sides.

Data Analyst or Business Analyst experience to become a Data Engineer?

Rephrased question: Which job would lead to a data engineer job? Is it data analyst or business analyst? Answer: Sorry, but from my point of view there are two things wrong in the question. First, it wrongly assumes that to become a data engineer you need to get into some other job first. Second, business analyst and data analyst are jobs in their own right, not entry-level jobs that lead to a data engineer job. From a BA the natural progression is to become a senior BA, and similarly from a data analyst to become a senior data analyst, and so on. Now that there is some clarity: from any of these roles, if the person is interested and ready to learn/unlearn, and there is an opportunity in the organization, the person can switch between these roles. As a data analyst is in general more technical than a business analyst (in the sense of requirements analyst), it would be easier for a data analyst to switch to a data engineer role than for a business analyst. However, the d

Best affordable business intelligence software in 2020?

Best affordable business intelligence software in 2020? This was one of the questions asked. Answer: You are actually referring to BI software in the narrow sense, i.e., limited to frontend tools such as MicroStrategy, Tableau, Power BI, etc., instead of referring to BI in the correct, broader sense as an umbrella term for the end-to-end process of deriving information and insights from data. In any case, the answer depends on the specific requirements and current situation (current set of tools, software, strategy, etc.) of the business. In my view it is not right to state that one tool is best for all purposes and situations. We should select tools based on the needs and the situation. For example, if a company has invested a lot in Oracle products and Oracle provides additional software for BI for free, then why wouldn't you want to consider that as an option? The same goes for Microsoft: let's say some department has already purchased licenses for MS SQL, SSIS, SSAS, SSRS, MS Exce

Data Science job without experience

How to get a data science job without experience? This was one of the questions asked. Simple answer: One of the best ways, in my view, is to start as a junior/trainee/intern in a team and learn on the job with guidance from experienced people.
