Data Lake Architecture: Designing the Data Lake and Avoiding the Garbage Dump

Data Lake Architecture

Organizations invest incredible amounts of time and money obtaining and then storing big data in data stores called data lakes. But how many of these organizations can actually get the data back out in a usable form? Very few can turn their data lake into an information gold mine. Most wind up with garbage dumps.

This book explains how to build a useful data lake, where data scientists and data analysts can solve business challenges and identify new business opportunities. Learn how to structure the data lake as well as analog, application, and text-based data ponds to provide maximum business value. Understand the role of the raw data pond and when to use an archival data pond. Leverage the four key ingredients for success: metadata, integration mapping, context, and metaprocess.

Forest Rim Founder and CEO Bill Inmon opened our eyes to the architecture and benefits of a data warehouse, and now he takes us to the next level of data lake architecture. Bill currently leads Forest Rim Technology's efforts to develop Textual ETL, a groundbreaking tool that allows organizations to harness their unstructured text data and make informed decisions based on it.

Read more about this subject in this article also authored by Bill Inmon.

Buy this book on Amazon here.

Watch a recent fireside chat on the data lake between Bill Inmon and Databricks CEO Ali Ghodsi on our YouTube channel here.

Bill Inmon is a