What is a Data Lake?
You may have heard the term "data lake" in discussions of how data is captured and stored.
We’ll look at what a data lake contains, why you would use one, who uses it, and how quickly it can adapt to change.
Data lakes are set up to accept all kinds of data. Think of a data lake as the broadest layer of data storage, with more specialized stores as offshoots beneath it.
A data lake is a centralized repository that allows you to store a vast amount of raw data in its native format until it is needed. Data-driven businesses often use this storage architecture to get more business value from their data assets.
The data lake is an emerging technology that has redefined the way we extract, store, and analyze data. A data lake differs from a data warehouse in many respects, but the main difference is one of philosophy. The data lake philosophy is LOAD FIRST, THINK LATER: data is stored raw, and decisions about its structure are deferred until it is read. The data warehouse philosophy is THINK FIRST, LOAD LATER: the structure is designed up front, and data must fit it before it is loaded.
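To make the two philosophies concrete, here is a minimal Python sketch. Everything in it is an illustrative assumption rather than part of any particular product: the sample records, the `events.jsonl` file name, the required fields, and the function names `load_into_lake`, `read_amounts`, and `load_into_warehouse` are all hypothetical.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical raw events with inconsistent shapes, as they might arrive from source systems.
RAW_EVENTS = [
    {"user": "a1", "amount": "19.99", "currency": "USD"},
    {"user": "b2", "amount": 5},           # no currency
    {"user": "c3", "note": "free trial"},  # no amount at all
]

def load_into_lake(events, lake_dir):
    """Load first: append every record as-is, with no schema check."""
    path = Path(lake_dir) / "events.jsonl"
    with path.open("a", encoding="utf-8") as f:
        for event in events:
            f.write(json.dumps(event) + "\n")
    return path

def read_amounts(path):
    """Think later: impose a schema (user, amount as float) only at read time."""
    rows = []
    with path.open(encoding="utf-8") as f:
        for line in f:
            record = json.loads(line)
            if "amount" not in record:
                continue  # this particular analysis skips records without an amount
            rows.append((record["user"], float(record["amount"])))
    return rows

def load_into_warehouse(events):
    """Think first: reject the load unless every record fits the predefined schema."""
    required = {"user", "amount", "currency"}  # assumed schema for illustration
    for event in events:
        missing = required - set(event)
        if missing:
            raise ValueError(f"record rejected, missing fields: {sorted(missing)}")
    return list(events)

# Usage: the lake accepts all three records; structure is applied only when reading.
with tempfile.TemporaryDirectory() as lake_dir:
    lake_path = load_into_lake(RAW_EVENTS, lake_dir)
    amounts = read_amounts(lake_path)
```

In this sketch the lake-style load never fails, while `load_into_warehouse(RAW_EVENTS)` raises a `ValueError` on the second record because it lacks a currency. The trade-off is that the cleanup work does not disappear in the lake; it moves to whoever reads the data.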
A data lake is like a real lake, which collects water from many different sources such as rain, rivers, tributaries, and even sewers. Similarly, a data lake takes in all kinds of data from all kinds of source systems. It can contain structured, semi-structured, and unstructured data, including images and logs. A data warehouse, on the other hand, is designed to store structured data modeled to defined business requirements.
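As a rough illustration of storing data in its native format, the following Python sketch writes four different kinds of files into one folder tree without converting any of them. The source names, file names, payloads, and the helpers `ingest` and `list_lake` are hypothetical; a real data lake would normally sit on distributed or cloud object storage rather than a local temporary directory.

```python
import tempfile
from pathlib import Path

def ingest(lake_root, source, filename, payload):
    """Store the payload byte-for-byte under a folder named after its source system."""
    target_dir = Path(lake_root) / source
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / filename
    target.write_bytes(payload)
    return target

def list_lake(lake_root):
    """Return the relative paths of everything stored, whatever its format."""
    root = Path(lake_root)
    return sorted(p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file())

with tempfile.TemporaryDirectory() as lake:
    ingest(lake, "crm", "customers.csv", b"id,name\n1,Ada\n")            # structured
    ingest(lake, "webshop", "order.json", b'{"id": 7, "items": ["x"]}')  # semi-structured
    ingest(lake, "app", "server.log", b"2024-01-01 ERROR timeout\n")     # unstructured log
    ingest(lake, "camera", "photo.png", b"\x89PNG\r\n\x1a\n")            # binary (PNG signature bytes only)
    contents = list_lake(lake)
    csv_bytes = (Path(lake) / "crm" / "customers.csv").read_bytes()
```

Nothing is parsed or validated on the way in, so the CSV, the JSON document, the log, and the image all land side by side exactly as they were received.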
***** Data Lake Features *****
1) Load First, Think Later
2) Stores all types of data
3) Low-cost storage
4) Agile and Flexible
5) Suitable for advanced data analytics by data scientists
6) Security is still maturing
Written by admin