As the volume of data in the enterprise continues to explode, with ever larger amounts stored in data warehouses and data lakes, the problem of data discovery has become an increasingly painful one. How do data analysts, data scientists and business people find not just data, but the right data for the problem they need to solve? How do they know how it was produced, how recently it was updated and whether it’s the right dataset for them to use? In addition, from an organization’s perspective, there’s a question of data governance: how to manage access in a way that preserves data security and privacy, and ensures compliance with data protection regulations (GDPR, CCPA, etc.).
Data catalogs have been a powerful response to those problems, and that category has seen renewed activity in the last couple of years with a whole new group of startup entrants.
At our most recent Data Driven NYC, we got a chance to chat with Mark Grover, co-founder and CEO of Stemma and the co-creator of Amundsen, the leading open source data discovery and metadata engine. Mark built Amundsen while he was a product manager at Lyft and started Stemma to offer a fully managed Amundsen.
It was a fun conversation about the space. Below is the video and below that, the transcript.