Data Lake:

A data lake is a storage repository that holds a vast amount of raw data in its native format until it is needed. A data lake is a shared data environment that comprises multiple repositories and capitalizes on big data technologies. It provides data to an organization for a variety of analytics processes.

Characteristics of a Data Lake:

1. All data is loaded from source systems, and no data is turned away.

2. Data is stored at the leaf level in an untransformed or nearly untransformed state.

3. Data is transformed and a schema is applied only when needed, to fulfil the needs of a particular analysis.
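
The three characteristics above can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions, not a description of any particular data lake product: the in-memory list `raw_zone`, the helper names `ingest` and `read_orders`, and the record fields (`id`, `amount`, `ts`) are all hypothetical. Records are stored exactly as they arrive, nothing is rejected at load time, and a schema is applied only when an analysis reads the data.

```python
import json
from datetime import date

# Hypothetical raw zone: every record is kept exactly as it arrived.
raw_zone = []

def ingest(raw_line: str) -> None:
    """Load a record without validating or transforming it (nothing is turned away)."""
    raw_zone.append(raw_line)

def read_orders(lines):
    """Apply a schema at read time; skip records that do not fit this analysis."""
    for line in lines:
        try:
            rec = json.loads(line)
            yield {
                "order_id": int(rec["id"]),
                "amount": float(rec["amount"]),
                "day": date.fromisoformat(rec["ts"][:10]),
            }
        except (ValueError, KeyError, TypeError):
            # The record stays in the raw zone; it is just not usable here.
            continue

# Three records of differing quality are all accepted at load time.
ingest('{"id": "1", "amount": "19.90", "ts": "2024-01-05T10:00:00"}')
ingest('{"id": 2, "amount": 5, "ts": "2024-01-06T08:30:00", "coupon": "X"}')
ingest('not json at all')

# The schema is imposed only now, when an analysis needs typed order data.
orders = list(read_orders(raw_zone))
```

Because the raw lines are never modified, a different analysis could later read the same `raw_zone` with a different schema, for example one that also uses the `coupon` field.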