In Conversation with Spencer Kimball, CEO, Cockroach Labs

Cockroach Labs, the ambitious database company with a funny name, has gone from strength to strength over the last few years. Started by three ex-Googlers in 2014, it successfully navigated the perilous waters of being an early-stage database company that customers need to trust for mission-critical applications. Over time, it’s gained tremendous momentum with a now long list of marquee customers, and was most recently valued at $5B.

In part because we at FirstMark are proud investors in the company, we’ve featured Cockroach Labs several times at Data Driven NYC over the years: in 2014 (video), 2018 (video) and 2020 (video), and it’s been really fun to see their tremendous progress.

It was great to host CEO Spencer Kimball once again and check in on the latest, as well as lessons learned building a successful open source enterprise software company.

We covered a bunch of really interesting things, including:

  • The origins of the company
  • The evolution of the database market from SQL to NoSQL to NewSQL to cloud
  • The current opportunity around serverless
  • Open source license questions
  • Go to market: community led, bottoms up, top down?
  • Who’s the perfect first sales hire for an enterprise software company

Video and transcript below!

As always, Data Driven NYC is a team effort – many thanks to Katie Mills, Drew Simmons, Dan Kozikowski and Diego Guiterrez for all the work and support.


Matt Turck (00:02):

All right, Spencer, welcome back. So this is actually the fourth time that we feature Cockroach. So the first time was actually in January of 2014, which feels crazy.

Spencer Kimball (00:15):

Could it have really been January? I don’t think the company got started until February.

Matt Turck (00:21):


Spencer Kimball (00:22):

Maybe it was before the company started.

Matt Turck (00:24):

It was super early. I’m a proud investor. FirstMark is a proud investor in CockroachDB and the story, that’s how I crawled my way into the deal, is like, “Hey, there’s actually a data community in New York and it’s for real, people come and want to learn about technology.” And I think you took pity on me. It’s like, “Oh, this guy’s trying hard so we can let him in.” So I think that’s…

Spencer Kimball (00:51):

No, it was actually an honor to be part of Data Driven at that time. I think we got some really interesting leads and it was early for our product. But just interest from folks in the community. So it was well worth doing it and if anyone here is doing a startup, and you get a chance to participate with Data Driven, I recommend it.

Matt Turck (01:11):

Okay, well great answer. Thank you. So that’s 2014, and then you were back in 2018, then we had Nate Stewart, your chief product officer and board member, who was great during the pandemic online, so this is the fourth time, this is great. Maybe, as a quick refresher on Cockroach Labs and CockroachDB: how you started it, why you started it, what the product does, all those good things.

Spencer Kimball (01:34):

CockroachDB is a relational database. For those of you that don’t know what that is, think Oracle’s flagship database, that’s probably the most famous; IBM Db2, which you’d run on mainframes; Microsoft SQL Server, Postgres, MySQL. So these are all relational databases and the key, I think, if you take it one step back is these are operational databases, so they’re the ones that hold all the metadata for your use case. So the items that you have in your inventory, if you’re doing inventory management, the balances and the debits and things in accounts, if you’re doing some financial services use case, that’s what you’d put in the operational database. You contrast that to an analytical database like Snowflake, that’s probably one of the most common ones, but there are plenty of them.


The reason there’s so many databases out there is because everything needs a database. Every single use case in the world has one of these things powering it and that market’s growing very quickly because people are building new use cases. So there’s a lot of competition and there’s also a lot of room in the solution space to find the perfect combination of capabilities to push the envelope. And as things are changing very rapidly in the ecosystem, there’s a lot of room to improve how operational databases work, in particular, to use the cloud to really leverage it to make things, like with all products or all infrastructure, we want to make things better in terms of capabilities, faster in terms of how they perform and cheaper. We do strive to do all three of those things, some better than others, but it’s always a work in progress.

Matt Turck (03:12):

Do you want to double click on that history from the relational database of yesteryear, Oracle, not to pick on them, to document databases to NewSQL? What was the evolution?

Spencer Kimball (03:27):

Ooh. Yeah, there’s lots of different threads that you could weave through the evolution of these systems. I think Oracle maybe is where we start. They certainly weren’t the first of these relational operational databases, but they really did become ascendant through the nineties and the early aughts. Then I think where Cockroach’s story really starts is when the World Wide Web came to prominence and all of a sudden there were use cases that were actually bigger than what you might call an enterprise use case, where you had a certain number of customers for a big company and all of a sudden you could reach most of the people in the world that had a desktop computer and then a mobile app of some sort a little bit later.


That actually opened up a gap between what the capabilities of the existing operational databases like Oracle or MySQL could provide for and what the use case demanded. I was at Google in 2002 and we ran head first into this with their AdWords system, which very quickly grew beyond the capacity of a single MySQL instance. So they started adding more MySQL instances, they divided the customers between the MySQL instances they had and then they had to double that and double it again and double it again, and that actually started to create all kinds of other problems. And so Google started to innovate with databases as a result of that. So they built Bigtable.


Bigtable is really, I think, one of the examples you could point to, and nothing’s new in computer science, but it was definitely a prominent example that introduced the idea of NoSQL. So Google actually was very intent on making a very scalable operational database and Bigtable was their answer to that, their first answer. The interesting thing about Bigtable is that they went for scale and they dropped all of the things that a relational database had evolved in terms of its capabilities that weren’t directly related to just making the thing really, really big. So it didn’t have transactions, it didn’t have a relational language in which to query and it didn’t have a lot of these schema management tools that help you manage complexity.


But that was okay because Google just needed something that could get very, very large. But even two years later they said, “You know what? We can’t build application use cases without transactions.” So they built something called Megastore and then they decided that Megastore was only a half measure and they wanted to redesign from the ground up and they built something called Spanner.


Spanner is still what Google is building things on internally and also providing on GCP. And it’s Spanner that really inspired Cockroach. And to just give you a little bit of the genesis of Cockroach, after 10 years at Google, both my co-founders and I left to build a private photo sharing company. We went from doing mostly infrastructure at Google to thinking, “You know what? We want to build something for people to use and particularly for us to use,” because we didn’t like sharing things publicly, but we wanted to share all of our photos when we were on a weekend trip with friends, and that was called Viewfinder. We didn’t get product market fit on it. Snapchat I think was the alternative at the time and I think it was probably more in tune with the pulse of the general public than our more sophisticated, I think, but probably overly complicated solution. But we definitely wanted to build the backend for Viewfinder such that it would scale the way Google’s infrastructure scaled.


That’s where the idea of Cockroach was born because we realized coming out of Google, that those kinds of capabilities that Google had pioneered internally were not available in open source, at least not yet. So the idea of Cockroach was born and we said, “You know what? The Spanner-like capability should be brought to market for everyone else and certainly as an open source product.” We didn’t build it at Viewfinder because we were trying to build a private photo sharing application and platform, but we were acquired by Square two years into that journey and at Square that’s where we saw, “You know what? This problem is way bigger than our startup as some ex-Google engineers.”


Square was struggling with databases as well. If you looked at all the problems Square was having, and they had something like 70 externally facing use cases when we joined them, most of the problems that they were struggling with could have been solved with the use of something like Spanner. So that really brought the idea of Cockroach back into our minds and we stayed at Square for about 14 months. Then, based on the signals we were seeing, and you talked to folks that were working at Dropbox and Pinterest and Yelp and everyone had these kinds of problems, we said, “You know what? We do need to follow this dream and this ambition and build a company around it.” And that’s pretty much when I met you.

Matt Turck (08:08):

Yeah. So the fundamental premise of CockroachDB is to be the best of all worlds of scalability and transactional reliability. What does the product do today, I guess?

Spencer Kimball (08:23):

Yeah, so that actually brings up a question that some of you might have bugging you in the back of your mind. Why would we call anything CockroachDB? It’s not exactly a popular insect. It’s really around survivability and that was one of the key things that we sought to build into the product from the start, and it was one of the things that motivated Megastore at Google and then Spanner at Google. This was an idea of like, “Hey, in the public cloud things are just different. You have data centers just on the east coast, you’ve got many data centers to choose from and they’re very close together and if you can balance data across them, you could lose an entire data center and actually not miss a beat. No postmortems, no running around trying to get the things back online, potentially losing data. The thing can just continue maybe with a couple seconds of latency.” So that was really cool. We built that in.


The other big challenge we set out to solve was scale. So we really wanted to be able to support huge use cases, but you don’t necessarily know whether your use case is going to be huge. A great example of that is if you’re trying to build a game. If you build that with the wrong backend infrastructure that doesn’t scale properly, then you’re going to run into a success disaster if your game’s popular, and so your thing’s just going to fall flat on its face. Re-architecting something like that is not something that you do overnight. So you could really lose the momentum that a game might have in the early stages that you really would like to capitalize on.


I think that’s true for any startup as well, any SaaS use case, anything you’re building, if you have success in aggregate, your data needs are going to be big. So Cockroach is really built to scale, can start small and can get very, very, very large, much bigger than one of those traditional relational databases I mentioned, like Oracle for example. Those do have upper limits on how big they can get.


The interesting thing is that those capabilities were really the starting point. As we’ve been on this now eight-year journey, we’ve realized that the architecture supports other really interesting capabilities. When I say the architecture, the way to think about Cockroach is it’s distributed, there’s lots of nodes that participate, that’s part of how it gets so big and it’s also part of how it can survive. You lose a data center? Well, there are other nodes of Cockroach that are running that have some of the redundancy that are running in other data centers and those can pick up the slack. We also realized that the companies we were talking to, increasingly, were multinational companies or they were even startups, but they very much wanted to entertain customers that might join them or use their massively multiplayer gaming platform from Brazil or from Turkey or from Japan. You really would like to try to support those more global use cases. So we realized, “Hey, we’ve got a distributed architecture, we should be able to introduce new capabilities into the operational database to support that.”


So if you think about something like Twitter or Quora, if someone posts something, you want that to be visible everywhere and ideally you’d like that to be consistent around the world. At the same time, you might have data that you absolutely do not want replicated all over the world. You’re building a private wealth management system, you definitely want to keep all the data replicated in the user’s legal jurisdiction. And balancing those things, having those concerns and having a database fundamentally support them is quite important, and we’ll talk… I know that you’re planning to ask me about some other even more recent capabilities of Cockroach, but I think the larger lesson here is just that the work’s never done.


The world’s changing very rapidly. Infrastructure has to change as well. We’ve just seen over, well, the 25 years I’ve been trying to solve problems with databases, you improve the state of the art in the database and the application use cases quickly use those capabilities and then you design the next version of the database and then the applications use that and want more and it just goes on and on and it’s an arms race.

Matt Turck (12:26):

Let’s get into that. We started talking about this evolution of SQL to NoSQL to NewSQL, the category into which Cockroach arguably falls. You seem to be going towards this concept of a data cloud. Where does the cloud fit into this? And then the next step after that, which is serverless, but let’s talk about the data cloud.

Spencer Kimball (12:51):

You hear a lot about the data cloud these days. I’m not exactly sure what it is. One observation I’ll have about the idea of convergence and data infrastructure is that it’s very, very difficult to build a piece of infrastructure that serves as an operational database. Just like it’s very difficult to build a piece of infrastructure that serves as a data warehouse or a data lake or an analytic system of some sort. In order to be the very best in that you have to, I think, have a somewhat single threaded focus in the product category that you’re trying to compete in. Otherwise you become a jack of all trades and a master of none, I think is the way people put it. So I see consolidation in some products, but in general the industry leaders in each product category will continue to have a more narrow focus.


I mentioned before the cloud is fundamentally changing things and offering incredible opportunities to do things again, faster, better, cheaper. The realization that we had is the cloud allows you to get resources almost anywhere programmatically and in seconds or minutes even. That’s a fundamental change from the way the world used to work. In fact, companies that still do have their own data centers struggle with this problem continuously, which is it can take months to get a new piece of hardware or to find the floor space in your data center to put it in. For anyone that uses the public cloud, which I assume is almost everyone in this room, those concerns seem fairly ancient, but the reality is that’s a relatively recent improvement in terms of what the cloud can bring and how you can build on it.


I mentioned before, well, the public cloud has data centers, multiple data centers in single regions and regions all over the planet and over every continent. That’s also a fundamentally big change. But also the public cloud has many other services that you can start to build on. So if everyone here is aware of Snowflake, I mean, they’re building on the cloud data storage primitives like S3 or Google Cloud Storage and that’s a huge benefit. Having that primitive has allowed them to do things much more efficiently than earlier systems that had to essentially build those kinds of capabilities into their product. So I think that’s the future of things. How can you leverage the cloud and continue to leverage it every time someone else in the ecosystem builds something that could be useful? It’s an opportunity.

Matt Turck (15:27):

Yeah, so everything as a serverless, we talked about distributed, do you want to talk about serverless and maybe start with a definition because not everybody may know what that means.

Spencer Kimball (15:39):

Yeah, serverless is an overloaded term at this point. It was introduced with… Like I said, nothing’s new in computer science. I don’t know what the very first usage was, but the one that I became aware of was AWS’s Lambda. The right way to think about that is it’s a serverless execution layer so that you could actually run your application code in a little snippet, a function basically that could be called and you don’t have to run a server that has your application logic permanently resident on it, ready to serve queries. Instead, what happens is a query comes in and it might just be one every week, it might be a hundred a second, might be 10,000 a second, whatever it is, the execution layer that’s serverless, it uses some server capacity somewhere to execute your logic on demand and it charges you only for what you used.


That was the initial introduction for most people to the concept of serverless and that’s at the execution layer. But every execution layer has to deal with data, otherwise it’s not a very interesting application use case. Like a mortgage calculator, it doesn’t store any data. You put in the little things and it spits something out. That’s a very simple application, but virtually every application needs to go hit a database somewhere.


And databases are very much seen as being resident somewhere, and that’s very true. There needs to be at least something that is holding the data and making it available. However, a lot of the principles of serverless are applicable to data storage. In particular, you want to be able to start very small and get very large without having to worry about how many nodes you have, where they are, how big the nodes are, how they have to be upgraded in terms of their operating system and so forth. In other words, the idea of serverless abstracts you above the concerns of dealing with actual servers and everything that’s associated with them.


Also, you really want to be able to pay for exactly what you use and pay as you go, so that’s another really amazing feature of serverless, and that of course, applies to the database or at least it can and that’s something that Cockroach brought to market. So this idea is just that, if you want to store just a tiny bit of data when you start, way less, for example, than you would have the capacity to store if you had just the smallest node possible running your database. The smallest node that’s available on AWS is actually still a potentially way more powerful database, with much more capacity than you might need for your use case that doesn’t have any users on it yet. Let’s say you’re a startup and you’re trying to work to product market fit and you release your very first version and you haven’t done very much advertising yet or anything, it’s just friends and family that are on it. It’d be nice not to pay for a resident VM that’s running your database permanently, but that’s the non-serverless version of things.


With serverless, if you use literally a single byte of data, that’s all you get charged for and that’s an interesting way to start, but then you have a very smooth way to scale up so that you’re elastically using exactly what you need. And when we started looking at the problem of doing multiple regions, so you’ve got users in Western Europe, users in the United States, maybe users east coast and west coast are separate because the latency’s important, you start to realize that to service all those customers, if you’ve got a use case like a game as I mentioned before, that doesn’t have many users yet and you don’t really know where they’re going to show up, then serverless really becomes obvious as being something that’s critically useful. Because if Australia is not where you have users yet and there’s only 10 users there, you’d like to not be charged for a bunch of resources that are sitting in Australia and not being used. Right?


With serverless, you have an ability to have a very large physical Cockroach cluster which Cockroach Labs would run that’s available in the cloud and all of the customers can use that physical infrastructure but only use a fractional virtual cluster that slices through the physical infrastructure. So if there’s just a tiny bit of usage in Australia, you pay for a tiny bit of usage. If most of your usage is in North America, you can scale as big as you need to there. But again, across the entire global footprint, you’re using only the resources that you need and you’re only paying for the resources you use.

Matt Turck (20:07):

When did you launch the serverless product?

Spencer Kimball (20:09):

Serverless came out in beta in, I don’t know the exact month, but it’s been more than a year now. We released a general availability version of it in July of this year.

Matt Turck (20:23):

So one of your key customers, at least publicly is Netflix. I think it’d be really interesting to use this as an example. How does a company like Netflix use Cockroach?

Spencer Kimball (20:36):

Well, actually that gets to another interesting point. We have a number of different flavors of Cockroach because that’s actually been necessary in our evolution as a company. We started off and Cockroach was something that you ran yourself, we call that self-hosted, because when we got started, that’s how most of the bigger customers we had were insisting that they wanted to use databases. These are our operational databases, this is the thinking and this is what… We’re used to running these ourselves. This is storing our most valuable crown jewels, the data for our operational use cases, and if we’re going to use a new technology, we’re going to run it in our information security envelope with the people and the processes that we trust. So we had that self-hosted product.


We quickly started realizing that there was the new wave and certainly the future even for those existing self-hosted customers was going to use a cloud product that was a service that was managed. In other words, the way that AWS offers their databases to all of their customers, so we started building that cloud product. And then what we started realizing is that serverless was going to be an improvement on that and we started building the serverless product. So we actually have those at least three broad categories of how our product is offered to customers.


You mentioned Netflix, Netflix is one of these self-hosted customers, that’s how they still want to run their databases, but they are moving in the direction of using cloud. So there’s going to be a hybrid reality for some time and I think, if you look at the horizon, everything will be cloud. We do support a very flexible way of deploying Cockroach. Netflix, as you all might imagine, has probably thousands of use cases. I’m not exactly sure how many, I think that’s probably accurate, but a lot of things that they offer and some of those things are massive and some of those things are very small and Cockroach is solving a number of different problems for them.


I think the most difficult problem, obviously, scale is one and survivability or business continuity is clearly another. So those are the bread and butter of Cockroach, but the multi-region is also a major concern and that’s an area where Cockroach is quite differentiated in the market. I think they gave a recent talk which is on YouTube, so this is not any private information, but they have hundreds of Cockroach clusters already, so you can just see how quickly the usage of this can increase within an organization that has a lot of use cases that need these capabilities.

Matt Turck (23:15):

Yeah, and building on this point of self-hosted to cloud to serverless, if you were going to start a database company today, would you go directly to the cloud as the market evolved that way?

Spencer Kimball (23:27):

That’s a really good question. I think maybe not, but my God, if you thought about having to build all the things that we’ve built over eight years, I don’t know if I’d want to start the company. The reason I say it would be hard to imagine just going straight to serverless, although that would be the only way that you could think about doing it for the reason I just mentioned, the reason that would be difficult is, there’s a lot of competition if that’s the only way that you run. If you want to win, at least in 2022, the Global 2000 as customers, you really do have to have a product that runs in a variety of different configurations because people are, I think, reasonably hesitant to adopt a solution that only works in a single fashion.


I’m not saying it’s not possible, I agree with you, we’d probably go directly to serverless if we were starting today, but I’m glad we don’t have to make that choice because the fact that we run in as many different configurations as we do is extremely appealing to the high end of the market, which is, I think, where also the differentiators I mentioned, scale, resilience, multi-region, those are incredibly important differentiators to the high end of the market. A little less to the low end, although you do see it in the emerging companies that are going to become part of the Fortune 500 in the next five years, five to 10 years. Many of them do have those kinds of use cases, so we have a nice distribution of companies across those two segments, but the world’s biggest companies are prime candidates for our software.

Matt Turck (25:13):

At the very beginning, and I guess still today, you were a very successful open source company. Do you think the world has evolved as well? There was a time everybody hated the open source business model and then it switched to everybody loved open source and open source was the only way. Do you think that has evolved?

Spencer Kimball (25:33):

Yeah, unquestionably it’s evolved. When we started, we adopted what’s called the open core business model, so the idea here is that you have an open source product that drives really broad adoption. So you get some level of ubiquity. Many, many, many people are using it because hey, it’s open source, it’s very, very easy to download and to work with. You’re not paying up front for the software, you may eventually pay for support. That was the Red Hat model for open source. But the idea with the open core model is that the open source product would just be the core. What you do when you start getting that ubiquitous adoption is you start to introduce enterprise features, which would be under a different license. Most people would adopt with the core and then you’d upsell them to the constellation of enterprise features that essentially form the basis of your enterprise offering, let’s say.


That business model, I think, lasted about four or five years. When we started the company, it was I think a good bet that was the right way for us to do it and we operated under that until it started to become clear that open source business models were under threat, in particular, from some of AWS’s actions, so they really went after Elasticsearch. That was one that I think changed the nature of the open core business model and made it, I think, less likely that you’d succeed. What Amazon did is they said we can repackage the open core, put our own enterprise things around it, and most of the work’s already been done for us to create this piece of software and we’re going to repackage it. And with that, in addition to the incumbent cloud platform, means that we’re going to be able to get huge numbers of customers just because everything’s integrated, it’s all part of the same billing system, all the identity access management works together. So you have all the advantage of the cloud platform combined with the quality of the open source offering. So as soon as people started to wrestle with that, the open core model became less tenable.


Interestingly, at the exact same time, the idea of really offering things as a service in the cloud, first and foremost, and worrying a little less about open source was also quite ascendant. Again because of Amazon I think more than any company. So they offered both the twilight of one business model and really ushered in the future there. And I do think that if you think about the progression here, you had closed source software, open source software, and then let’s say cloud services, they make sense because they’re moving along a gradient of essentially less cost. The cost isn’t always measured easily in dollars and cents, it could even be measured in time, for example. Time to value and the resources required to run something in production. You went from closed source software, which was incredibly expensive to actually buy and to use because you actually had to go through procurement. So you’d talk to some salesperson that might have a relatively long process, then you have to go through legal wrangling, go through procurement and eventually they send you a bunch of printed manuals. And there wasn’t really a community necessarily that was online, but this was just the dominant mode of how software was purchased.


You can see why that was so ripe for disruption. And when open source came along, it was very easy to both get that community to very rapidly try out the software to run with it. You didn’t actually pay for the software up front, of course, you paid for the hardware and so forth. The idea of services actually takes that a step further, not because the ideas are free, that was some of the nice things about open source, but because the process of actually running the software is no longer something you had to learn how to do. The time to value and the day one plus operations is something that was respectively decreased and on both dimensions, right?


I think what we see with serverless, and our serverless offering for example is free, so it’s a perpetually free relational database cluster up to a certain threshold of utilization. So it’s like what we think is available in this next generation of value proposition for infrastructure, is that you can both acquire the software very rapidly because it’s just a service, you don’t have to learn how to run it. There’s even a free tier, which is at least as free as open source was in the sense that you always had to pay for the hardware with open source and the support. I think that same idea, you have the pass through costs of the cloud and you also have the support. It’s like what you’re moving along there is just less resources required to successfully implement a use case using infrastructure that’s available. It’s like open source ate the software world. Now I think cloud services are very much cannibalizing the open source business model. That’s not to say that open source is going away, I don’t believe that’s true at all.

Matt Turck (31:12):

So you’d still recommend open source as a strategy for most enterprise software?

Spencer Kimball (31:17):

That’s a good question because people ask me that all the time when they’re doing startups: “Should we open source this or not?” I think the answer is, are the other core benefits of open source really important to that community? Because sometimes they are. I’d say it’d be hard to imagine a relational database at this point that isn’t open source, but that might be the case. I do think that you really just need to look at what’s the best way to deliver value to the customer, and I think that can be done quite easily without open sourcing code. So the mandate to open source is not nearly as strong as it was when we started Cockroach.

Matt Turck (32:02):

Maybe last question or theme from me because then I want to open up for questions. What are some lessons learned on the go to market side, particularly in light of the fact that the three of you founders were super deeply technical people who had to learn a lot of the go to market, and in the context of a shifting environment from open source to cloud and all the things? So how did you start? How did you get the first customers? What worked? What didn’t? And then as you evolve towards more of a sales organization, when did you do it? Why did you do it? How did you do it?

Spencer Kimball (32:42):

That’s a good question. When we started Cockroach Labs, I realized that we would probably be an enterprise software company and that made me very nervous because I’d never really dealt with that problem before. I’d built software at Google for example, for Google engineers, and that was more the mental model I was comfortable with, and the idea of having potentially hundreds or thousands of customers that needed to be supported was something I had to get my head wrapped around. I will say that it’s very easy when you’re the chief technical evangelist to go and talk to customers, and it’s something you should do very early and often and try to find those design partners. It’s very hard though to sell, especially to a larger organization. I quickly realized that the gulf between being able to get somebody very interested in your software and actually getting an MSA and a signed contract and money in the door was not something that I was going to cross on my own, so we hired our first account executive and SE pair and I watched how these two went after some of the customers that were interested in Cockroach-

Matt Turck (33:58):

Actually, can we double-click just on that piece because that’s a question that comes up all the time. You’re a young startup, you are very technical founders, who’s your first AE? Are they young with high slope? Are they experienced? Who are they?

Spencer Kimball (34:13):

That’s an interesting profile. You definitely don’t want somebody that has been working at a scaled organization and really understands how to manage sales folks, scale the team, expects marketing to have a certain amount of leads, inbound and so forth. In those early stages, you need somebody that specializes really in an exploratory sales motion, because you don’t know how much you can charge for your software yet, and you certainly don’t know what messaging is going to work or who your ideal customer looks like. You’re trying to figure these things out, so you need somebody that can go into any customer and really just listen.


I mean, to be fair, that’s always what you should be doing in a sales motion, but I think some people are really geared towards listening, with their ears perking up when someone mentions something that just might have something to do with your product, because you just don’t know exactly what that motion looks like yet and you have to figure it out and there’s a lot of things it could be. So there is a certain early sales leader that specializes in that, but as soon as that person starts to figure out what that motion looks like, you’re probably going to need to replace them, because the person that can figure that out is not usually the person that can mentor other sales people and start to scale an organization and really codify that motion into something that can be taught through enablement to a larger sales organization.

Matt Turck (35:42):

And then fast forward to today, you have more of a top down sales led motion or do you still get juice from the community and some bottoms up inbound? What does it look like at scale?

Spencer Kimball (36:00):

Yeah, it’s a combination of a lot of different things. We definitely still get open source lift, which is interesting. We get it through an increasingly product led growth motion with our serverless platform, and we’re extending some of the principles of product led growth even upmarket in terms of, for example, how is the product experience when, let’s say, a really big Fortune 10 bank is betting on your product strategically and it’s being rolled out within the bank. You want all the individual teams in that organization to experience the benefits of a product led growth motion. So those principles apply. If it’s all top down and sales led, it’s very hard to scale or very expensive to scale, so you do want to balance those. But it depends on your use case. With CockroachDB and probably any database that’s operational, it’s a solution sale, it’s very involved, there’re multiple stakeholders, it’s a double edged sword.


It can be very difficult to get past all the hurdles and all the technical evaluations and just even the contracts and things because this is a very important part of the stack. If it goes down, everything goes down, so the contracts become more fraught as a result. So you do need to have the right kind of sales organization to accomplish that sale. I’ll just say that in the go to market, maybe the most counterintuitive learning that I’ve had, and it should give people that are on this journey maybe a little bit of an optimistic perspective, but you’d think that when something does go wrong with your operational database, that customer is not going to be happy at all. In fact, they might churn on you because you’ve failed them in a very critical thing and Cockroach is not supposed to go down.


I think at first blush, a failure with your operational database means you’re going to churn a customer. In fact, it’s not true. You’re actually more likely to churn a customer if they never have a problem with your software, because they look at it and say, “Why don’t we just use the open source version of this? There’s nothing that’s wrong with this, we don’t need support. What are we paying all this money for? This is a very expensive line item.” What we found is that when, not that we encourage things to go down by any means, no, we take every customer’s problem as our failure and work around the clock to fix them.


But when you do have a problem, the right way to look at it is that it’s an opportunity. It’s an opportunity to build substantial trust with the customer. If they see that you are partnering with them at the level that their issue is… That your top concern is their top concern, then that actually sets you up for a very long relationship with trust and also a huge opportunity for expansion because you’re now seen as a partner that they can rely on for the long term. They say that all of these crises are opportunities and I think with infrastructure at the very least, which is what I’ve had my head in for the last eight years, this is absolutely true. It’s not that you ever welcome a failure, but you want to put all your energy behind solving it.

Matt Turck (39:26):

That’s such an interesting insight. A last question from me because I just think it’s so interesting and so relevant to what a lot of people are trying to do in terms of building companies. To support a customer in that scenario, what did you do and what do you do? You take your engineers and you assign them, or do you have a customer success team that’s deeply technical? Who does this and how does it work and who do they report to in the organization?

Spencer Kimball (39:54):

Well, obviously, like all things, this evolves. Just like I mentioned with the exploratory sales leader, which then evolves into somebody that can scale the organization and run the enablement, the customer success side of the story also evolves. At the beginning, it’s literally the database engineers, at least in our case, that are working on these things, because we didn’t have a customer success team. But wow, that’s pretty interesting customer success. I mean, certainly if something goes wrong with Oracle, you don’t have the chief Oracle database engineer working day and night on your problem. If you did, it would probably get fixed more definitively. That’s something that you can actually bring into the early sales conversations and say, “You are extremely important to us as a customer. You’re a partner, you’re going to influence our roadmap and we care more about your problems than any other vendor’s ever going to.” You can actually sell that.


So in the early days it really was, including me, everyone would be on these problems and would be working to solve them. But when you hire your first customer success, your first… Actually it was technical support that we hired first, then customer success, in terms of escalations, you have to be careful as you get bigger how you do that. You want, I think, to continue to have your engineers that know the product better than anyone available when an escalation demands it. You also want to create a little bit of a wall so that they don’t get distracted to the point where they can’t do their work on the roadmap, which is also incredibly important. So there’s a balance there and ultimately what you want to do is to increasingly push solutions into knowledge bases and into the product itself.


In terms of observability in particular, you want to see that there’re classes of errors that you start to recognize or problems that customers have where first, you can get your technical support and customer success folks to do what before you needed engineers to do, because now they have tools internally where they can actually see some of the things much more clearly than they previously were able to. Because you’re actually saying, “You know what? This is a class of problems that we can surface very transparently if we built this new thing into the dashboard.” So that’s great.


And eventually you want to push that so that the customer can easily diagnose their problems and has ways to fix it that they understand. And eventually you want to make it so that you eliminate classes of problems and maybe you’re trying to do all of those at once to some extent, but you get better and better at that cycle. It’s one of the really chief inputs in any product development cycle. It’s not just the new capabilities, but it’s how do you make the product more and more bulletproof and observable.

Matt Turck (42:33):

Super. Great, thanks for sharing. All right, questions? One here.

Speaker 3 (42:41):

I just saw that Google introduced what they’re calling Blockchain Node Engine for Web3, like a database that can be used for Web3 for large application use. I was just wondering if that’s the market you’re looking at because it does scale?

Spencer Kimball (42:59):

That’s an interesting question, which we’ve been getting since the advent of the Blockchain I’d say. Right now, the answer is no. I think there’re ways probably that Cockroach absolutely would be used in a Web3 context, and we actually have a number of companies that are trying to build Web3 type solutions, which is supposed to be completely decentralized. But the companies that are building that often need their own metadata for their customers, that’s where Cockroach will be used.


As relational operational databases go, Cockroach is pretty decentralized. So you would actually have the ability, even in that case where you’re trying to create centralized metadata for your larger decentralized system, you might still want to, for example, geo partition so that you’re really keeping the data close to the customer and within their legal jurisdiction and so forth. But I’d say that maybe the… I’m trying to think of the right way to say it. The underbelly of the promise of Web3 is just that typically, even for these decentralized use cases, you want some centralization. And I think that’s really where Cockroach is focused at the moment and less around trying to store things on the Blockchain. It’s a little bit different in terms of how they’d be used.

Matt Turck (44:20):

You’re welcome. All right, one more question.

Speaker 4 (44:23):

Hello? Spencer, thank you for the talk today. So vendor lock in is one of the things that enterprises try to avoid during the sales cycle. How do you think about it and talk about it with the prospects? And also once you have a customer, how do you talk to them about not getting locked in? At the same time, you do want lock in. So how do you balance the two during the sales cycle and also when it comes to retention?

Spencer Kimball (44:51):

Yeah, that’s a really good question. There’s a whole bunch of facets to it. One is that, well, we’re open source, so you don’t have to keep using our service, you don’t even have to keep using our support. There is an off ramp, and at the same time, of course, we do have some enterprise features and that’s probably the answer for you about how you actually maintain some degree of lock in that’s useful to your business. You’ve got to keep innovating, right? You do need to reserve some of your value proposition so that it’s only available if they remain a customer. So there’re different ways to do that.


We also, Cockroach looks like Postgres and Postgres, of course, is a very widely adopted database and many databases that aren’t Postgres look like Postgres. We’re not the only one. I mean, Google has them and there’re other startups and so that’s another answer. I think the largest answer, especially when you’re talking to big companies is they’re not worried about the vendor lock in for Cockroach Labs. I mean maybe mildly.


What they’re worried about is the vendor lock in from the hyper scale cloud vendors. They’re very worried about that. The right way to assuage their concerns isn’t so much to convince them that they’re not going to be locked into your system; it’s to convince them that if they use your system, they’re not going to be locked into any particular cloud vendor, that they even have the opportunity to repatriate off the cloud vendors and run their own data centers if they get big enough where that actually becomes economically advantageous. We have a number of those customers. So that’s the elephant in the room, you want to speak to that as opposed to your own vendor lock in and I think you get a lot more benefit.

Matt Turck (46:35):

All right, one last question.

Speaker 5 (46:38):

Thanks. I just had two questions. Talk about the… Oh, sorry about that. Can you talk about what in the MySQL architecture actually limits its ability to scale? I’m just curious to hear your take on that. Is it something like sharding just not being natively supported or something else that didn’t allow you guys to scale at Google?

Spencer Kimball (47:05):

Yeah, so MySQL is an example of what’s called a monolithic architecture. So really it’s addressing the resources that are available in a single integrated machine. So you can scale those machines up. Is that 128 cores? I don’t know what the limit is today. With Oracle and Db2, you’re actually potentially running on hardware that far exceeds what the capacity of the maximum commodity rack hardware would be in a cloud vendor. You’re using an IBM mainframe or you’re using a Cray supercomputer or something like that. Even those have a super linear cost curve and they have a definitive ceiling on how big they can get.


When you’re using a monolithic architecture, you’re really limited to how big you can scale either one machine or a very tightly coupled set of machines if you start to distribute. MySQL is, I think, best described as not really paying as much attention, for example, as Oracle or Db2 has to that scaling problem. When we were building AdWords at Google, that was in the year 2002, I guess 2003, when we were doing that, and I don’t think all that much has changed in terms of the internal architecture of MySQL.


Google was solving that problem with MySQL by doing the sharding outside of the database. The realization at Google at that time was that’s a fraught architecture. If you don’t solve the problem of scale inside the database, you lose the database guarantees and you’re spending a huge amount of time at the application level and at the operational level managing many independent MySQL shards without the benefit of things like transactions. Just to give you an example, with AdWords, they had eBay as a customer and eBay didn’t fit into a single shard, so you had this other weird problem, not just how do you put lots of customers onto one shard and you have many shards, but you actually have customers that are so big they don’t even fit into one shard, so you’re breaking the customer up between shards. So you can see the complications of not solving that problem at the database level actually result in tremendous costs in the software engineering and the SREs and things to run the system that you’ve had homegrown, which then you keep running. Google didn’t replace that system with Spanner I think for almost 10 years, and by that time it had gotten to a thousand plus shards of MySQL.
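
To make the point about sharding outside the database concrete, here is a minimal sketch in Python. It is not how AdWords was built. It is an illustration under stated assumptions: in-memory SQLite databases stand in for independent MySQL shards, and the table layout, the `shard_for` hash routing, and the `split_large` flag for customers too big for one shard are all hypothetical.

```python
import sqlite3
import zlib


def make_shards(n):
    """Create n independent databases; each one knows nothing about the others."""
    shards = []
    for _ in range(n):
        conn = sqlite3.connect(":memory:")
        conn.execute(
            "CREATE TABLE campaigns (customer_id TEXT, campaign_id INTEGER, budget INTEGER)"
        )
        shards.append(conn)
    return shards


def shard_for(key, n):
    """Application-level routing: a stable hash of the key picks the shard."""
    return zlib.crc32(key.encode("utf-8")) % n


def insert_campaign(shards, customer_id, campaign_id, budget, split_large=False):
    # Normally all rows for a customer live on one shard. A customer that is
    # too big for one shard (hypothetical flag) is spread by campaign instead.
    key = f"{customer_id}:{campaign_id}" if split_large else customer_id
    idx = shard_for(key, len(shards))
    with shards[idx]:  # a transaction, but only within this single shard
        shards[idx].execute(
            "INSERT INTO campaigns VALUES (?, ?, ?)",
            (customer_id, campaign_id, budget),
        )
    return idx


def total_budget(shards, customer_id):
    # Scatter-gather in application code. Each shard answers separately, so
    # there is no single transaction covering the whole result.
    total = 0
    for conn in shards:
        row = conn.execute(
            "SELECT COALESCE(SUM(budget), 0) FROM campaigns WHERE customer_id = ?",
            (customer_id,),
        ).fetchone()
        total += row[0]
    return total
```

The routing, the special case for oversized customers, and the cross-shard aggregation all live in application code here, and nothing makes a change that touches two shards atomic. That is the kind of cost the answer above attributes to solving scale outside the database.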


In contrast though, just to give you an idea that I’m not… I am actually a pretty big fan of MySQL, it’s an amazing system in its own right. Facebook has hundreds of thousands, maybe millions of shards of MySQL now and they’ve gone and implemented a meta database using MySQL as the per node constituent and it blends all those together into truly massive systems that have certain properties that make sense for Facebook’s use case. Facebook’s so large, Cockroach has never been shown to work at that kind of scale. I mean, it’s literally millions of nodes. So that’s an interesting problem and they have a purpose built solution for it. So MySQL is still very much useful, but I’d say that where Cockroach shines is if you’re not Facebook serving 3 billion active users, and I think that is a more common company. There’s only one of those in the world.

Matt Turck (50:42):

All right, on that note, that’s a wrap for today. Thank you so much for sharing all of this from a tech perspective, market perspective, go to market perspective, super great. I hope you come back soon for a fifth time and I think you need to run now. Unless that’s changed, I think you need to go to dinner. So thank you so much for your time, really appreciate it.

Spencer Kimball (51:06):

It’s my pleasure, Matt. Thank you.
