This Lake House approach consists of following key elements:įollowing diagram illustrates this Lake House approach in terms of customer data in the real world and data movement required between all of the data analytics services and data stores, inside-out, outside-in, and around the perimeter.Ī layered and componentized data analytics architecture enables you to use the right tool for the right job, and provides the agility to iteratively and incrementally build out the architecture. The data lake allows you to have a single place you can run analytics across most of your data while the purpose-built analytics services provide the speed you need for specific use cases like real-time dashboards and log analytics. Lake House approachĪs a modern data architecture, the Lake House approach is not just about integrating your data lake and your data warehouse, but it’s about connecting your data lake, your data warehouse, and all your other purpose-built services into a coherent whole. This Lake House approach provides capabilities that you need to embrace data gravity by using both a central data lake, a ring of purpose-built data services around that data lake, and the ability to easily move the data you need between these data stores. In this post, we present how to build this Lake House approach on AWS that enables you to get insights from exponentially growing data volumes and help you make decisions with speed and agility. To overcome this data gravity issue and easily move their data around to get the most from all of their data, a Lake House approach on AWS was introduced. As data in these systems continues to grow it becomes harder to move all of this data around. To get the best insights from all of their data, these organizations need to move data between their data lakes and these purpose-built stores easily. At other times, they are storing other data in purpose-built data stores, like a data warehouse to get quick results for complex queries on structured data, or in a search service to quickly search and analyze log data to monitor the health of production systems. In order to analyze these vast amounts of data, they are taking all their data from various silos and aggregating all of that data in one location, what many call a data lake, to do analytics and ML directly on top of that data. Organizations can gain deeper and richer insights when they bring together all their relevant data of all structures and types and from all sources to analyze. October 2022: This post was reviewed for accuracy.
0 Comments
Leave a Reply. |
AuthorWrite something about yourself. No need to be fancy, just an overview. ArchivesCategories |