ctrlvfailed

Data Warehouse / Data Lake

Jan 17th, 2020
  1. A data lake is a centralized repository for all data, both structured and unstructured. A data warehouse uses a predefined schema optimized for analytics. In a data lake, the schema is not defined up front, which enables additional types of analytics such as big data analytics, full-text search, real-time analytics, and machine learning.
  2. __________________________________________________________________________________________________________________________________________
  3. Data Warehouse
  4. Data: relational data from transactional systems, operational databases, and line-of-business applications.
  5. Schema: designed prior to the data warehouse implementation (schema-on-write).
  6. Price/performance: fastest query results, using higher-cost storage.
  7. Data quality: highly curated data that serves as the central version of the truth.
  8. Users: business analysts, data scientists, and data developers.
  9. Analytics: batch reporting, BI, and visualizations.
  10.  
  11. Data Lake
  12. Data: non-relational and relational data from IoT devices, websites, mobile apps, social media, and corporate applications.
  13. Schema: written at the time of analysis (schema-on-read).
  14. Price/performance: query results are getting faster, using low-cost storage.
  15. Data quality: any data, which may or may not be curated (i.e., raw data).
  16. Users: data scientists, data developers, and business analysts (using curated data).
  17. Analytics: machine learning, predictive analytics, data discovery, and profiling.
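
The schema-on-write versus schema-on-read distinction in the lists above can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions, not how any particular warehouse or lake product works: the `Warehouse` and `Lake` classes, the `conform` helper, and the `SALES_SCHEMA` fields are all hypothetical names invented for this example, and raw lake payloads are assumed to be JSON strings.

```python
import json

# Hypothetical schema for illustration: field name -> required Python type.
SALES_SCHEMA = {"order_id": int, "amount": float}


def conform(record, schema):
    """Return the record restricted to the schema's fields, or raise ValueError."""
    out = {}
    for field, expected in schema.items():
        if field not in record:
            raise ValueError(f"missing field: {field}")
        value = record[field]
        if not isinstance(value, expected):
            raise ValueError(f"{field} should be {expected.__name__}")
        out[field] = value
    return out


class Warehouse:
    """Schema-on-write: rows are validated before they are stored."""

    def __init__(self, schema):
        self.schema = schema
        self.rows = []

    def write(self, record):
        # A record that does not fit the schema is rejected at write time.
        self.rows.append(conform(record, self.schema))


class Lake:
    """Schema-on-read: raw payloads are stored untouched; a schema is applied per query."""

    def __init__(self):
        self.blobs = []

    def write(self, raw):
        # Anything is accepted; no schema is consulted here.
        self.blobs.append(raw)

    def read(self, schema):
        rows = []
        for raw in self.blobs:
            try:
                rows.append(conform(json.loads(raw), schema))
            except (ValueError, TypeError):
                continue  # skip payloads that do not fit this query's schema
        return rows


lake = Lake()
lake.write('{"order_id": 1, "amount": 9.5, "note": "gift"}')
lake.write("free-form log line")  # stored as-is, ignored by the sales query
sales_rows = lake.read(SALES_SCHEMA)
```

The trade-off mirrors the comparison: the warehouse pays the validation cost once, at write time, so every stored row is already curated, while the lake accepts anything and defers that cost to each read, where different queries are free to apply different schemas to the same raw data.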