Rephrased question: ETL developer vs. data engineer. Answer: Unfortunately, there are no strict industry standards for these job titles. That is just one part of it. Before ETL tools such as DataStage, Informatica, and Ab Initio became popular, developers hand-coded every ETL flow. These tools greatly shortened ETL development time and allowed ETL developers to focus on business rules, logic, and requirements (what to implement) rather than on how to code or optimize it. There are many other benefits of using a tool, but I won’t go into them in this answer. So an ETL developer with experience in these tools, but without any programming (coding) experience, was and still is able to design and develop end-to-end data flows. Whenever a new type of source or target data format comes up, these tools catch up, but it takes time: the ETL tool provider (e.g., Microsoft or IBM) has to add new components or connectors to the tool so that it can work with the new data format. For example, let’
Currently, tool and technology limitations are used as the basis for classifying data, and even worse, the classification is itself incorrect. So-called big data is badly misnamed. I have already explained why the name is incorrect in another article - Why big data is actually small? Wouldn't it be simpler, more meaningful, and more standardized to classify data into primary and secondary data instead of using misnamed, meaningless, and non-standard terms such as small data and big data? Primary data is the essential or core data without which the business cannot function. For example, a purchase transaction in a retail store has to be captured and stored for billing, payment, compliance, etc. These are mandatory requirements. Business Intelligence (BI) on top of that data is very useful but not mandatory, and it is not the main purpose of storing that core data in the first place. Secondary data is all of that data which e
It's satisfying to know that the book Business Intelligence Demystified is gradually gaining recognition, and that more people are purchasing it and, more importantly, reading it and sharing their feedback. I am happy to let you all know that I have finished reading the paperback edition of the book. It took me around 12 to 14 hours, which was longer than I had expected, because I was also noting all the mistakes that I should have avoided in the first place and all the improvements that can be made in the next edition (if any). I would like to keep this transparent, so the list of mistakes is available at the link below. If you happen to notice any mistakes that are not listed, please do let me know. Also, if you would like to share how this book has helped you, please do. https://docs.google.com/document/d/1Sdc68dgzZxcedEPaaUoeAW5Zi8a215ScXBhunzXgrPs/edit?usp=sharing