Red Hot: The 2021 Machine Learning, AI and Data (MAD) Landscape

Full resolution version of the landscape image here

It’s been a hot, hot year in the world of data, machine learning and AI. 

Just when you thought it couldn’t grow any more explosively, the data/AI landscape just did: rapid pace of company creation, exciting new product and project launches, a deluge of VC financings, unicorn creation, IPOs, etc.  

It has also been a year of multiple threads and stories intertwining.

One story has been the maturation of the ecosystem, with market leaders reaching large scale and ramping up their ambitions for global market domination, in particular through increasingly broad product offerings.  Some of those companies, such as Snowflake, have been thriving in public markets (see our MAD Public Company Index), and a number of others (Databricks, Dataiku, DataRobot, etc.) have raised very large (or in the case of Databricks, gigantic) rounds at multi-billion-dollar valuations and are knocking on the IPO door (see our Emerging MAD Company Index – both indexes will be updated soon).

But at the other end of the spectrum, this year has also seen the rapid emergence of a whole new generation of data and ML startups.  Whether they were founded a few years or a few months ago, many experienced a growth spurt in the last year or so.  As we will discuss, part of it is due to a rabid VC funding environment and part of it, more fundamentally, is due to inflection points in the market.

In the last year, there’s been less headline-grabbing discussion of futuristic applications of AI (self-driving vehicles, etc.), and a bit less AI hype as a result.  Regardless, data and ML/AI-driven application companies have continued to thrive, particularly those focused on enterprise use cases.  Meanwhile, a lot of the action has been happening behind the scenes on the data and ML infrastructure side, with entire new categories (data observability, reverse ETL, metrics stores, etc.) appearing and/or drastically accelerating.

To keep track of this evolution, this is our eighth annual landscape and “state of the union” of the data and AI ecosystem – co-authored this year with my FirstMark colleague John Wu.  (For anyone interested, here are the prior versions: 2012, 2014, 2016, 2017, 2018, 2019 (Part I and Part II) and 2020.)

For those who have remarked over the years how insanely busy the chart is, you’ll love our new acronym – Machine learning, Artificial intelligence and Data (MAD) – this is now officially the MAD landscape!

Dataiku’s Series E: Ushering the Era of Everyday AI

Today, Dataiku is announcing a major new financing – a total of $400m at a $4.6B valuation, led by Tiger Global (which had also invested in the company’s Series D), alongside a great group of existing and new investors.

While financings are ultimately just milestones, this is certainly a testament to the remarkable progress the company has been making towards becoming a major global software player, as it has scaled to hundreds of customers around the world and some 750 employees (and yes, hiring a lot more).

Beyond the headlines and high-fives, what is the story? Here’s a quick industry backgrounder and reminder for anyone new to the company.

A huge part of the data world has been historically focused on business intelligence, with both historical players (Tableau, Microsoft’s Power BI, Google Looker) and newer players (SiSense, Mode, etc.). Business intelligence tools enable you to analyze the past and the present of your business: “which region performed best last quarter?”, “who are our best salespeople?” etc. This is sometimes referred to as descriptive analytics.

Dataiku is a leader in another part of the data world, which different people call different names: data science, enterprise AI (for artificial intelligence), enterprise machine learning. Beyond the semantics, the core idea is to make it possible to answer questions about the future of your business, based on the analysis of historical data: “which customers are most likely to buy this product?”, “which customers are most likely to churn?”, “which transaction is most likely to be fraudulent?”, “which region is most likely to show strong demand this month?”. This area is sometimes referred to as predictive analytics.

New Investment: Synthesia or the Rise of “Video as Code”

We all have an insatiable appetite for video, both in our personal and professional lives. Time and again, video is shown to capture our attention better than any other medium. This is increasingly how we learn, explore, collaborate and get entertained.

However, especially in an enterprise context, creating professional-quality video remains a complex and costly endeavor. For all the capabilities of smartphones, most companies still need studio-level equipment to produce enterprise-grade videos: cameras, sound equipment, actors, post-production editing. The process is time-consuming, and not very scalable. Shooting a video in multiple languages, for example, requires multiple actors or dubbing. Any update requires everyone to go back to the studio.

But what if video could be just… code? What if it could be infinitely flexible and customizable at scale, as simple as an API call?

Today we’re excited to announce that FirstMark led a $12.5M Series A investment in Synthesia – a fast-growing startup that offers exactly that.

Synthesia makes creating a business video as simple as writing an email or putting together a PowerPoint presentation – a compelling “text to video” experience.

Introducing the MAD (ML, AI, Data) Public Company Index

Today, we are previewing a new public market index – the MAD (for machine learning, AI and data) index.

Readers of this blog know that we have been tracking the data ecosystem since 2012, through annual landscapes (see the 2020 Data & AI Landscape).

Over the last few years, a funny thing happened – some of the small startups we had started tracking grew up, did an IPO and became large public companies.

Not so long ago, public market investors used to say there was no good way of “playing” the Big Data and AI trends, due to the lack of public companies in the space. This is less true today.

However, there isn’t much out there in terms of looking at those public companies as a group. For example, see this Seeking Alpha piece, Top 3 Artificial Intelligence ETFs To Consider, where none of the companies listed are actually AI companies.

Hence the idea of the MAD Index. It’s still a small group of companies, but my colleague John Wu and I were curious to see how they fared in public markets, now and going forward.

This is just a start. We anticipate that a number of companies will join this group in the next year or two, and we’re excited to see how this index matures.

Quick S-1 Teardown: C3.ai

For anyone following the software industry, there’s been a little bit of snark about C3.ai (“C3”) over the years.  Here’s a company that was founded by Silicon Valley royalty (Tom Siebel, who sold Siebel Systems to Oracle in 2006 for just shy of $6B), with seemingly limitless access to capital, that somehow seemed to be pivoting every few years to something new – from energy at first, to the Internet of Things, to Artificial Intelligence. 

C3 also largely eschewed the startup echo chamber – funded personally by its founder at first, it didn’t raise money from the usual VC suspects, target well-known startups as its first customers, or open source any AI frameworks, working instead with a small group of Fortune 1000 and government customers. As a result, it didn’t build the kind of buzz that often precedes the most notable startups on their way to becoming public.

Lo and behold, what emerges in this IPO is a solid company by enterprise software IPO standards, with $157m in revenue, growing 71% yoy, a 75% gross margin and a $69m loss. 

It will be interesting to see how the market reacts to this IPO.

On the one hand, C3 is not growing anywhere near as explosively as a Snowflake, and in fact seems to have just had a bad quarter of decelerating growth. There are also other concerns, including account concentration and a substantial loss (not as pronounced as a Snowflake or Palantir, but still on the higher range of the software market).

On the other hand, the tailwinds around the deployment of ML/AI in the enterprise are very strong, and C3 is clearly positioning itself as one of the very first enterprise AI companies to go public: its ticker symbol on the NYSE will be “AI”, and the term “machine learning” is mentioned 56 times in the S-1.

This IPO will be an interesting test for the continued appetite of financial markets for all things AI.

Here’s a quick analysis of the S-1 and main characteristics of the business, put together by my FirstMark colleague John Wu and me.

Resilience and Vibrancy: The 2020 Data & AI Landscape

In a year like no other in recent memory, the data ecosystem is showing not just remarkable resilience but exciting vibrancy.

When COVID hit the world a few months ago, an extended period of gloom seemed all but inevitable.  Yet, as per Satya Nadella, “two years of digital transformation [occurred] in two months”.  Cloud and data technologies (data infrastructure, machine learning / artificial intelligence, data-driven applications) are at the heart of digital transformation.  As a result, many companies in the data ecosystem have not just survived, but in fact thrived, in an otherwise overall challenging political and economic context.

Perhaps most emblematic of this is the blockbuster IPO of Snowflake, a data warehouse provider, which took place a couple of weeks ago and catapulted Snowflake to a $69B market cap company, at the time of writing – the biggest software IPO ever (see our S-1 teardown).  And Palantir, an often controversial data analytics platform focused on the financial and government sector, became a public company via direct listing, reaching a market cap of $22B, at the time of writing (see our S-1 teardown).

When is AI not AI?

Earlier this week, Forbes published a piece on ScaleFactor, a startup using AI to automate accounting, which shut down after raising $100m.

Here’s the heart of the issue covered in the story: “Instead of [AI] producing financial statements, dozens of accountants did most of it manually from ScaleFactor’s Austin headquarters or from an outsourcing office in the Philippines, according to former employees. Some customers say they received books filled with errors, and were forced to re-hire accountants, or clean up the mess themselves.”

Facebook as an AI company: In conversation with Jerome Pesenti, VP of AI, Facebook

While AI may seem like a futuristic goal for most companies around the world, Facebook has already been there for a while. “There’s pretty much a deep learning system in every single Facebook product and they are very much at the core of them,” says our guest Jerome Pesenti, VP of AI at Facebook.

Jerome leads the development of artificial intelligence at Facebook, and oversees hundreds of scientists and engineers whose work shapes the company’s direction and impacts our world.

We had the pleasure of welcoming Jerome at Data Driven NYC in October 2017, in his prior role as CEO of BenevolentAI, and we chatted about using AI for drug discovery.

It was wonderful to welcome him back in his new capacity at our first **online** Data Driven NYC, courtesy of the coronavirus. It was a fascinating, in-depth conversation.

Below are: a) the video, b) some highlights and c) the full transcript.

Building a $12B Public Company: In Conversation with Olivier Pomel, CEO, Datadog

By any measure, Datadog is an incredible entrepreneurial success story. The company went from a tiny startup in 2010 that had trouble raising money, to a public company that, at the time of writing, has a market capitalization of $12.5B. It was a pioneer in the category of DevOps and observability, and it’s now a clear leader. With revenues hovering around $350M, it has 1,300 employees across 31 locations around the world.

Perhaps improbably, the founders built the company out of New York, which many people over the years have thought of as a hub for adtech, media and commerce startups only. Along the way, they faced a lot of skepticism: “Whenever we pitched West Coast investors it was sort of seen as a form of mental deficiency to be based in New York and doing infrastructure”, says Olivier. I wrote a few months ago about the significance of the Datadog IPO for the ecosystem and beyond. Ironically, out of the three top public tech companies in New York today, two are infrastructure software companies (Datadog and MongoDB).

Not one for gratuitous self-aggrandizing, Olivier has given surprisingly few interviews over the years, and it was a real treat to sit down with him for a fireside chat in front of a packed house of 350 attendees at our most recent Data Driven NYC.

We had an in-depth conversation and covered a lot of topics.

The first half of our conversation was focused on Datadog itself, starting with a high level overview of the observability and DevOps space to make the discussion approachable by people who don’t know the space.

The second half of the conversation was focused on all sorts of lessons learned along the way of building a major company: sales, marketing, fundraising, etc.

Below is the video. We have also provided a full written transcript to make the content easy to scan through (many thanks to Karissa Domondon for her help with this).

The Power of Open Source: In conversation with Mike Volpi, General Partner, Index Ventures

Our most recent VC guest at Data Driven NYC, Mike Volpi of Index, has had a pretty amazing last couple of years, with three of his venture investments going public:  Zuora, Sonos and Elastic. 

Before becoming a VC, Mike ran Cisco’s routing business where he managed a P&L in excess of $10 billion in revenues, and acquired over 70 companies (note: probably a pretty good way to make a lot of friends in Silicon Valley).

A partner at Index Ventures in San Francisco, Mike invests primarily in infrastructure, open-source and artificial intelligence companies, so he was a perfect guest to have at the event.  In particular, he invested in two prior presenting companies: Confluent and Cockroach Labs (in which FirstMark is also an investor). 

We had a really interesting conversation about open source, AI and venture capital.  Here’s the video below, and I have jotted down a few notes as well, below the fold.

Notes from the chat:

The Killer App for Machine Learning: In Conversation with Pedro Domingos, Head of Machine Learning, D.E. Shaw

Best-selling author, Professor of Computer Science at the University of Washington, recent recipient of the prestigious IJCAI John McCarthy Award for excellence in artificial intelligence research (among other awards) and Head of the Machine Learning Research group at D.E. Shaw:  Pedro Domingos has one of the most incredible resumes in the world of AI, and we were thrilled to host him for a fireside chat at our most recent Data Driven NYC. 

We covered a bunch of things, including why finance is a killer app for machine learning, his much-lauded book, ‘The Master Algorithm’ and what’s truly scary about AI (hint: not the Terminator).

AI’s Trust Problem: In Conversation with Gary Marcus (Video + Book Notes)

Should we be worried about the prospect of AI superintelligence taking over the world?

“In the real world, current-day robots struggle to turn doorknobs, and Teslas driven in ‘Autopilot’ mode keep rear-ending parked emergency vehicles […].   It’s as if people in the fourteenth century were worrying about traffic accidents, where good hygiene might have been a whole lot more helpful”.

This is one of my favorite quotes from “Rebooting AI: Building Artificial Intelligence We Can Trust,” a new book by Gary Marcus – scientist, NYU professor, New York Times bestselling author, entrepreneur – and his co-author Ernest Davis, Professor of Computer Science at the Courant Institute, NYU.

Gary did us a big honor recently: he chose to speak at Data Driven NYC on the evening of the publication of the book.  He also signed a few copies. Our first book launch party!

Particularly if you’re trying to make sense of the still-ongoing hype around AI, including predictions of global gloom, Gary’s book is a fantastic read: a lucid, no-nonsense and occasionally provocative take on the current state of AI, that distills complex concepts into simple ideas, and includes plenty of interesting and often funny anecdotes.

The book builds on Gary’s earlier assessment of deep learning (see Deep Learning: A Critical Appraisal), and advocates for a hybrid approach to AI.

Below is the video of his talk at the event, plus a few notes I derived from both the talk and the book.  I’ll keep those brief as the book is worth reading in its entirety.

Decoding the Human Nervous System: In conversation with Thomas Reardon, CEO, CTRL-labs

In its largest acquisition since Oculus in 2014, Facebook announced last night that it has acquired CTRL-labs, a four-year-old startup based in New York, for a reported $500M-$1B.

Coincidentally, CTRL-labs CEO, Thomas Reardon (who goes by Reardon) was our guest at Data Driven NYC just a couple of weeks ago. Reardon is a particularly compelling entrepreneur, and this was a fascinating fireside chat, where we dove into machine learning, neuroscience, VR and all sorts of cool topics.

CTRL-labs builds what it calls “neural interface technology”: algorithms that decode the activity of individual motor neurons and turn that into control over machines, thereby completely redefining the interaction between humans and machines. Because the technology captures your intentions without requiring any physical movement, you can do things that you could never do by moving, and you can start “imagining experiences where you would have 20 fingers… or 8 arms or legs”.

The video (below) is well worth a watch in its entirety, including the audience Q&A at the end, and I’ve jotted down a few notes as well, for a quick review.

Part II: Major Trends in the 2019 Data & AI Landscape

Part I of the 2019 Data & AI Landscape covered issues around the societal impact of data and AI, and included the landscape chart itself. In this Part II, we’re going to dive into some of the main industry trends in data and AI. 

The data and AI ecosystem continues to be one of the most exciting areas of technology. Not only does it have its own explosive momentum, but it also powers and accelerates innovation in many other areas (consumer applications, gaming, transportation, etc).  As such, its overall impact is immense, and goes much beyond the technical discussions below.

Of course, no meaningful trend unfolds over the course of just one year, and many of the following have been years in the making. We’ll focus the discussion on trends that we have seen particularly accelerating in 2019, or gaining rapid prominence in industry conversations.

We will loosely follow the order of the landscape, from left to right: infrastructure, analytics and applications.

A Turbulent Year: The 2019 Data & AI Landscape

It has been another intense year in the world of data, full of excitement but also complexity. 

As more of the world gets online, the “datafication” of everything continues to accelerate.  This mega-trend keeps gathering steam, powered by the intersection of separate advances in infrastructure, cloud computing, artificial intelligence, open source and the overall digitalization of our economies and lives. 

A few years ago, the discussion around “Big Data” was mostly a technical one, centered around the emergence of a new generation of tools to collect, process and analyze massive amounts of data. Many of those technologies are now well understood, and deployed at scale. In addition, over the last couple of years in particular, we’ve started adding layers of intelligence through data science, machine learning and AI into many applications, which are now increasingly running in production in all sorts of consumer and B2B products.  

As those technologies continue to both improve and spread beyond the initial group of early adopters (FAANG and startups) into the broader economy and world, the discussion is shifting from the purely technical into a necessary conversation around impact on our economies, societies and lives.

We’re just starting to truly get a sense of the nature of the disruption ahead. In a world where data-driven automation becomes the rule (automated products, automated cars, automated enterprises), what is the new nature of work? How do we handle the social impact? How do we think about privacy, security, freedom? 

Meanwhile, the underlying technologies continue to evolve at a rapid pace, with an ever vibrant ecosystem of startups, products and projects, heralding perhaps even more profound changes ahead. In that ecosystem, the year was characterized by the early innings of a long-expected consolidation, and perhaps a changing of the guard from one era to another as early technologies are starting to give way to the next generation.
