Every year, as part of our MAD project, we do a presentation at Data Driven NYC about the top trends we see across data and ML/AI (here’s the 2022 version for reference).
The presentation, done this year with my FirstMark colleague Kevin Zhang, is a whirlwind tour of top trends, as opposed to anything particularly in-depth, as we tried to keep it short. But hopefully it should provide a good overview of what’s been happening in those spaces, for anyone interested in a recap.
Software Daily (aka Software Engineering Daily) has been on my podcast rotation for a while, so it was fun to get a chance to be a part of it – thanks to Jocelyn Houle, who moonlights as podcast host on top of her day job at Securiti. While this was done in connection with the publication of the MAD 2023, we ended up talking a lot about venture capital and entrepreneurship in general, including some personal stories.
The video is below, and here’s the audio-only podcast: Apple, Spotify.
One of the cool parts of publishing the MAD landscape every year is the conversations that come with it. Here’s a fun chat I did recently with Joe Reis and Matthew Housley, co-founders of data consulting company Ternary Data and co-authors of the O’Reilly book, Fundamentals of Data Engineering (see their recent talk at Data Driven NYC). We covered a lot of things, check it out!
It has been less than 18 months since we published our last MAD landscape, and it has been full of drama.
When we left, the data world was booming in the wake of the gigantic Snowflake IPO, with a whole ecosystem of startups organizing around it.
Since then, of course, public markets crashed, a recessionary economy appeared and VC funding dried up. A whole generation of data/AI startups has had to adapt to a new reality.
Meanwhile, the last few months saw the unmistakable, exponential acceleration of Generative AI, with arguably the formation of a new mini-bubble. Beyond technological progress, it feels that AI has gone mainstream, with a broad group of non-technical people around the world now getting to experience its power firsthand.
The rise of data, ML and AI is one of the most fundamental trends in our generation. Its importance goes well beyond the purely technical, with a deep impact on society, politics, geopolitics and ethics.
“It’s been crazy out there. Venture capital has been deployed at unprecedented pace, surging 157% year-on-year globally […]. Ever higher valuations led to the creation of 136 newly-minted unicorns […] and the IPO window has been wide open, with public financings up +687%”
Well, that was…last year. Or more precisely, 15 months ago, in the MAD 2021 post, written pretty much at the top of the market, in September 2021.
Since then, of course, the long-anticipated market downturn did occur, driven by geopolitical shocks and rising inflation. Central banks started increasing interest rates, which sucked the air out of an entire world of over-inflated assets, from speculative crypto to tech stocks. Public markets tanked, the IPO window shut down, and bit by bit, the malaise trickled down to private markets – first at the growth stage, then progressively to the venture and seed markets.
We’ll talk about this new 2023 reality in the following order:
In the hyper-frothy environment of 2019-2021, the world of data infrastructure (née Big Data) was one of the hottest areas for both founders and VCs.
It was dizzying and fun at the same time, and perhaps a little weird to see so much market enthusiasm for products and companies that are ultimately very technical in nature.
Regardless, as the market has cooled down, that moment is over. While good companies will continue to be created in any market cycle, and “hot” market segments will continue to pop up, the bar has certainly risen dramatically in terms of differentiation and quality for any new data infrastructure startup to get real interest from potential customers and investors.
Here is our take on some of the key trends in the data infra market in 2023.
Everybody is talking breathlessly about AI all of a sudden. OpenAI gets a $10B investment. Google is in Code Red. Sergey is coding again. Bill Gates says what’s been happening in AI in the last 12 months is “every bit as important as the PC or the internet” (here). Brand new startups are popping up (20 Generative AI companies just in the Winter ’23 YC batch). VCs are back to chasing pre-revenue startups at valuations in the billions.
So what does it all mean? Is this one of those breakthrough moments that only happen every few decades? Or just the logical continuation of work that has been happening for many years? Are we in the early days of a true exponential acceleration? Or in the early days of a hype cycle and mini financing bubble, as many in tech are desperate for the next big platform shift, after social and mobile, and the crypto headfake?
1) Update public profiles: remove any reference to ever having liked crypto/web3, deny any rumors that I was claiming to be “down the rabbit hole” less than a year ago, say that I would have “definitely done deep due diligence” on FTX
2) Show thought leadership: tweet incessantly about Generative AI, change my PFP to a Lensa avatar, talk about how GPT-4 is “insanely mind-blowing” (reminder: cold email OpenAI to actually get early access to GPT-4)
3) Add value: advise CEOs to (a) grow at least as fast as before, but also (b) drastically cut all costs. Use terms like “EBITDA” and “FCF” which I just learned about in 2022. It’s all about “responsible growth” now (hint that I always advocated for it, deep down)
Cockroach Labs, the ambitious database company with a funny name, has gone from strength to strength over the last few years. Started by three ex-Googlers in 2014, in its early years it successfully navigated the perilous waters of being an early database company that customers need to trust for mission-critical applications. Over time, it’s gained tremendous momentum with a now long list of marquee customers, and was most recently valued at $5B.
In part because we at FirstMark are proud investors in the company, we’ve featured Cockroach Labs several times at Data Driven NYC over the years: in 2014 (video), 2018 (video) and 2020 (video), and it’s been really fun to see their tremendous progress.
It was great to host CEO Spencer Kimball once again and check in on the latest, as well as lessons learned building a successful open source enterprise software company.
We covered a bunch of really interesting things, including:
The origins of the company
The evolution of the database market from SQL to NoSQL to NewSQL to cloud
The current opportunity around serverless
Open source license questions
Go to market: community led, bottoms up, top down?
Who’s the perfect first sales hire for an enterprise software company
In addition to his role as co-founder and Chief Analytics Officer of Mode, a leading collaborative data platform, Benn Stancil is a prolific and thought-provoking writer about the broad data space. Over the last couple of years in particular, he’s produced a series of insightful and entertaining posts on his newsletter: https://benn.substack.com/
We had welcomed Benn at Data Driven NYC back in 2019 to talk about Mode (see the video, “The case for hiring more data analysts”), and it was great to have him back for a wide-ranging conversation where he addressed some of the “sacred cows” of the data world.
One of the most interesting conversations on the space we’ve had recently, highly recommended watch!
The only thing better than VC pontification in general is VC pontification *in French*
Thanks Sabrina Quagliozzi and BFM Business for giving me the opportunity to chat about the current state of VC markets on Tech & Co. at the NASDAQ on Times Square.
Certainly a big pullback in venture and growth investing so far this year – the header above says “Markets: after the party, investors have a hangover” (!)… but many reasons to be hopeful, as the long term trends around digital transformation, AI and automation will only accelerate from here.
Also, the NYC tech ecosystem is on 🔥🔥 and a perfect home in the US for European startups
In a world where everything moves ever faster, it seems inevitable that data infrastructure will need to move sooner or later to a predominantly real-time paradigm. Yet the infrastructure for real-time data is still trailing far behind its batch processing cousin.
Enter Estuary, a real-time data ops platform, in which my firm FirstMark led a large seed round last year. Estuary enables you to synchronize your data products across all your systems (whether databases, SaaS, pub/sub, etc) in real-time, and also to aggregate, join, or otherwise take action on your data while in motion. Estuary is not a database – instead it makes your databases real time. It abstracts away the complexity of building real-time, data-intensive applications at scale.
It was a lot of fun to host Estuary’s co-founder and CTO Johnny Graettinger at Data Driven NYC for an approachable and educational talk about the company, its product and the real-time data world.
Since its creation in 2014, Ledger (in which FirstMark is a very proud investor) has rapidly evolved to become one of the key global players in the entire crypto and web3 ecosystem.
Ledger is mostly known as the world’s top provider of hardware wallets. Over 15% of the world’s crypto assets are secured through Ledger Nanos, with more than 4 million units already sold in 180 countries.
But Ledger goes well beyond hardware, providing apps and services through Ledger Live, enterprise solutions, and more.
It was great to get a chance to chat with Ledger’s CEO, Pascal Gauthier, in the context of Crypto Driven.
We had a wide ranging conversation, covering in particular:
the fundamental benefits of hardware to secure digital assets
what core technology exists within a Ledger device
Ledger Live, the company’s app and software platform for buying, selling, swapping and staking crypto
Ledger Enterprise, the company’s B2B offering
Some of the new products announced at Ledger’s bi-annual flagship event, Ledger Op3n, including Ledger Market, a new secure NFT Platform, and Ledger Enterprise Create, a secure platform for brands to scale their Web3 operations with a key focus on NFTs, giving them the treasury management, NFT creation and ownership capabilities they need.
In the world of data infrastructure, dbt Labs has undoubtedly been one of the most exciting startups to watch. The company is the creator and maintainer of dbt, a data transformation tool that enables data analysts and engineers to transform, test and document data in the cloud data warehouse. Beyond this, the company is empowering a new generation of data analysts and enabling them to create and disseminate organizational knowledge.
dbt’s CEO, Tristan Handy, is also one of the most thoughtful and interesting CEOs in the space, having played a pivotal role in the emergence of what’s often referred to as the “Modern Data Stack”, a suite of tools and processes that leverage the power of cloud data warehouses to bring data processing to the modern era.
We had the pleasure of hosting Tristan once during the pandemic in 2021 for a great online chat with Jeremiah Lowin, CEO of Prefect. It was a particular treat to welcome back Tristan, this time for our first in-person event since 2020!