If you follow the various talks at Data Driven NYC, and the data ecosystem in general, it’s plenty apparent that the overall tooling for data, data science and machine learning is still in its infancy, particularly compared to the software stack.
While this may feel ironic (yes, really) given the billions in venture capital that have been poured into the space, it’s worth remembering that the data stack (at least in its “big data” phase) is relatively recent (10-15 years old), while the software stack has had several decades to evolve.
In many organizations, the data science and machine learning stack looks like a collection of various tools, some open source, some proprietary, glued together with one-off scripts. Teams start experimenting with one tool, then another, then create ad hoc pathways to make it all work together, and before you know it, they end up with complex environments that are painful to manage.
In response to this situation, various machine learning frameworks have emerged to abstract away the complexity. Several of those frameworks were developed internally at large tech companies to solve their own problems, and then open sourced.
Kedro is one such example. It was developed and maintained by QuantumBlack, an analytics consultancy acquired by McKinsey in 2015. It’s McKinsey’s first open-source product.
Kedro is somewhat hard to categorize. If it had its own category, it might be considered a Machine Learning Engineering Framework. Kedro does for machine learning code what React did for front-end engineering code: it allows you to build “design systems” of reusable machine learning code.
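To make that idea concrete, here is a minimal sketch of what a Kedro pipeline looks like. The function and dataset names (preprocess, summarize, raw_data, etc.) are illustrative, not taken from the talk; what matters is the shape: plain Python functions wrapped as nodes, with named inputs and outputs that Kedro wires together.

```python
from kedro.pipeline import Pipeline, node

# Ordinary Python functions; Kedro wraps them as pipeline nodes.
# All names below are made up for illustration.
def preprocess(raw_data: list[float]) -> list[float]:
    """Scale raw values into the [0, 1] range."""
    top = max(raw_data)
    return [x / top for x in raw_data]

def summarize(features: list[float]) -> float:
    """Reduce the features to a single summary statistic."""
    return sum(features) / len(features)

# Each node declares the named datasets it consumes and produces;
# Kedro resolves them through its Data Catalog and infers the
# execution order from these names.
data_pipeline = Pipeline(
    [
        node(preprocess, inputs="raw_data", outputs="features"),
        node(summarize, inputs="features", outputs="summary"),
    ]
)
```

Because each node is just a pure function with declared inputs and outputs, it can be tested in isolation and reused across projects, which is what makes the “design system” analogy apt.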
At our most recent Data Driven NYC, we had the great pleasure of hosting Yetunde Dada, a Principal Product Manager at QuantumBlack, who has been the key driving force behind Kedro.
Below is the video and below that, the transcript.