Our January NYC Data Business Meetup was focused on data analytics.
Here are the slide decks:
Here are the videos:
The December NYC Data Business Meetup was focused on big data infrastructure companies, with the co-founders of Sqrrl, Infochimps and MemSQL presenting to a full house. We started the evening with a presentation by prominent data scientist Joseph Turian.
Here are the videos:
Joseph Turian, “How to do AI in 2013”
Oren A. Falkowitz, Co-Founder & CEO, Sqrrl
Dhruv Bansal, Co-Founder & Chief Science Officer, Infochimps
Eric Frenkiel, Co-Founder & CEO, MemSQL
And here are a few pics (photo credit: Shivon Zilis):
The November NYC Data Business Meetup was focused on “vertical-specific” applications of big data – startups leveraging the big data stack to offer new solutions to specific industries, such as finance and government (Recorded Future), the legal industry (Lex Machina), energy (DataMarket, although it offers data sets for other industries as well) and sports (numberFire).
Here are the videos:
Christopher Ahlberg, CEO, Recorded Future:
Josh Becker, CEO, Lex Machina:
Hjálmar Gíslason, CEO, DataMarket:
Nik Bonaddio, CEO, numberFire:
Here are the videos from the NYC Data Business Meetup that was held on October 23, 2012, in order of appearance:
Jeff Carr, COO, Precog
Max Yankelevich, co-founder, CrowdComputing Systems
Roger Ehrenberg, Founder and Managing Partner, IA Ventures; Ping Li, General Partner, Accel Partners; Matt Ocko, Co-Founder and Partner, Data Collective (from left to right):
So here we are again. My colleague Shivon and I made a first attempt at making sense of the rapidly evolving big data ecosystem back in June. Based on some very helpful feedback from readers of this blog and others, a number of additional meetings with interesting startups and more in-depth research, we’ve come up with this second version.
Our goal is to continue updating this chart from time to time, and perhaps make it evolve visually, as we’ve probably reached the limits of what we can reasonably fit on one slide. It was suggested that we try to visually distinguish on-premise offerings vs. cloud-based solutions, which we may try to do.
To enlarge, click on the arrows at the bottom right of the chart.
Comments, thoughts, questions? Please add to the comments section.
Here are the videos and some pictures (scroll down) of the NYC Data Business Meetup held on September 25, 2012.
In order of appearance:
1) Rick Smolan told us about his fascinating new project, the “Human Face of Big Data” – see the NY Times coverage here: http://nyti.ms/TO5MDd.
2) Mortar (presenter: K Young, CEO). Mortar (www.mortardata.com) provides a platform-as-a-service for Hadoop. They take care of all of the necessary infrastructure (via AWS) and allow any software engineer to run jobs on Hadoop using Apache Pig and Python without special training.
3) Datadog (presenter: Alexis Lê-Quôc, co-founder). Datadog (www.datadoghq.com) is a service for IT, Operations and Development teams who write and run applications at scale, and want to turn the massive amounts of data produced by their apps, tools and services into actionable insight. Datadog helps software developers and web ops understand their IT data by putting it all in context.
4) We finished with a fireside chat with Dwight Merriman, CEO and co-founder, 10Gen. 10Gen (www.10gen.com) develops MongoDB, and offers production support, training, and consulting for the open source database. Dwight is one of the original authors of MongoDB. In 1995, Dwight co-founded DoubleClick (acquired by Google for $3.1 billion) and served as its CTO for ten years. Dwight was the architect of the DoubleClick ad serving infrastructure, DART, which serves tens of billions of ads per day. Dwight is co-founder, Chairman, and the original architect of Panther Express (now part of CDNetworks), a content distribution network (CDN) technology that serves hundreds of thousands of objects per second. Dwight is also a co-founder and investor in BusinessInsider.com and Gilt Groupe.
Mark Birch has a good summary of a recent panel organized by the NYC Enterprise Tech Meetup (which also has a video of the panel on its site, unfortunately with poor audio quality). In addition to Mark, the panel featured David Aronoff (General Partner, Flybridge Capital Partners), Jeanne Sullivan (General Partner, StarVest Partners), Raju Rishi (Venture Partner, Sigma Partners) and myself. Many thanks to Jonathan Lehr, the organizer of the event, for putting it together. A couple of pics below and also one here (I know! Panel pics are just so exciting!).
One key takeaway for me is that the NYC area used to have a pretty vibrant enterprise tech scene (with Computer Associates, etc.) in the eighties and up until the mid-nineties (before my time), which makes the relative dearth of enterprise tech startups in NYC over the last dozen years somewhat odd. I’m excited to see a whole new wave of NYC startups rising to prominence, including 10Gen, Opera Solutions, Enterproid, Nodejitsu, AppFirst, Datadog, Mortar, etc.
Here are some videos, slides and pics from the most recent NYC Data Business Meetup. The videos are unfortunately not of the greatest quality, but are good enough to watch.
Also, note to self: make sure that our audience of 200+ sits closer to the stage, so that the room doesn’t look tragically empty on camera (rookie mistake)!
In order of appearance:
1) Todd Papaioannou, CEO, Continuuity, a stealth big data startup based in Palo Alto, CA and backed by Andreessen Horowitz, Battery Ventures, Data Collective and a number of high-profile angels. Todd was previously Chief Cloud Architect for Yahoo.
2) Neil Capel, CEO, and Daniel Krasner, Chief Data Scientist, Sailthru, a New York based startup backed by RRE, AOL Ventures, Lerer Ventures, DFJ Gotham, Thrive Capital, Metamorphic, etc. Sailthru provides fully automated, 1:1 email and onsite recommendations using a unique behavioral targeting platform. Sailthru helps brands cut through the clutter and build trust with their customers by recognizing and acting upon their individual interests. Sailthru’s technology creates individual user profiles associated with each person’s email address and online behavior. Sailthru’s algorithms gauge each individual user’s intent and match appropriate content and frequency of email communications such that every email is tailored to the unique user. That means they send as many permutations of an email as there are recipients. All simultaneously, all automated and all in real time.
3) Dennis R. Mortensen, CEO, and Jeroen Janssens, Data Scientist, Visual Revenue, a New York based startup backed by Lerer Ventures, SV Angel, IA Ventures and Softbank. Visual Revenue increases front page performance for online media organizations. Their platform provides editors with actionable, real-time recommendations on what content to place in what position right now and for how long. Visual Revenue’s predictive analytics technology allows media organizations to proactively manage the cost of exposing a piece of content on a front page, whilst maximizing the return they expect from promoting it.
4) Panel discussion and Q&A with the audience
I have been very intrigued by the recent emergence of “data driven” firms, aiming to use data to reinvent venture capital.
While they certainly review various data points and metrics before deciding to invest in a startup, as of today venture capital investors largely operate based on “pattern recognition” – the general idea being that, once you’ve heard thousands of pitches, sat on many boards and carefully studied industries for years, you become better than most at predicting who will make a strong founder/CEO, what business model will work and eventually, which startup will end up being a home run. The trouble is, the model doesn’t always work, far from it, and many VCs end up making the wrong bets, resulting in disappointing overall industry results. Could VCs be just like the baseball scouts described in Moneyball, who think they can spot future superstars because they’ve seen so many of them before, but end up being beaten by a cold, objective, statistics-based approach?
Enter several firms trying to do things differently:
Since I’m a big fan of anything data-driven (decisions, product, companies), the concept resonates strongly with me. Predictive analytics have been successfully used in various industries, from retail to insurance to consumer finance. Other asset classes are highly data driven – fundamental and technical analysis drive billions of dollars of trades; hedge fund quants spend their lives building complex models to price and trade securities; high-frequency trading bypasses human decision making altogether and invests gigantic amounts of money based solely on data. In this world where everything gets quantified, why should venture capital be an exception?
However, as much as I like the idea, I believe venture capital doesn’t lend itself very well to a model-heavy, quasi “black box” approach. The creation of a reliable, systematic predictive model is a particularly challenging task when you consider the following obstacles:
In addition, it would be interesting to see how startups react in the long run to investors who are interested in them mostly because they scored well on a model, as opposed to spending extended time getting to know them. Unlike public stock markets, venture capital fundraising is a two-way dance, and startups often pick their investors as much as their investors pick them.
However, while I have my doubts about using data models as valid predictors of the overall success of an early stage startup, my guess is that there are still plenty of interesting insights to be gleaned from the data, and that forward-thinking VC firms could gain a competitive advantage by actively crunching it – my sense is that very few firms have done so at this stage.
Interestingly, there are some good data sources and emerging technologies out there that could be leveraged as a first step, without engaging in a massive data gathering or technology development effort:
If anyone is aware of other efforts around crunching data relevant to VCs, or other ways VCs have used a heavily data-driven approach, I’d love to hear about it in the comments.
Three days in, Brewster, the new personalized address book, has become an instant classic for me. Perhaps I lucked out, but I didn’t experience much of the delay in processing my contacts that many others reported – I had to wait about 90 minutes which, while not ideal, was fine. Everything since then seems to have been working like a charm – the de-duplication and reconciliation of contacts across social networks, in particular, was beautifully done, and that’s not a trivial data problem.
I have always liked the concept of a personalized, always current address book. In a way, it is sort of like the old Plaxo idea, which was probably before its time. There were various startups that tried to fix the address book, including Sensobi (which was eventually acquired by GroupMe). The next iteration of the social concept that I’m aware of is Everyme – at least in the initial vision the founders had for it when they were at Y Combinator in the summer of 2011. I was a bit bummed when it pivoted (or evolved) to become a private social network.
I really like that Brewster came out of the gate very “feature-rich”. While I’m all for MVPs and generally agree that “if you are not embarrassed by the first version of your product, you’ve launched too late”, for something like this, I think the founder(s) made the right call to wait until the product was ready before launching. At this stage of the game, anything that sounds like yet another hyped up app, and asks me to connect all my social networks when I first log in, etc. had better deliver some real value quickly for me to give it a real chance, and that was the case here. As the founder Steve Greenwood has apparently been mulling over this concept for many years, the temptation to release early must have been strong, particularly as it sounds like several startups are working on related concepts, including for example FullContact, but from my user’s standpoint, it was well worth it.
A few other aspects of the product (and its launch) that I like:
– I like that Brewster was clearly thought through as a data product – while the “Favorites” tab has an emotional and aesthetically pleasing aspect to it (depending on how attractive one’s friends are, at least…), the rest of the app is very data-centric: the “Lists” tab has some interesting automatic categorization (I have 171 friends who are “Managing Director”, apparently; does that mean I’m old?), while the “Search” tab is awesome, with good suggested searches and the ability to uncover all sorts of interesting common interests across my contact list.
– While everything is automated, I like the fact that the product made me work manually to create my list of “favorites”. That actually increased my personal investment into the product, and makes me less likely to discard it.
– I really like that Brewster did not use any of the tired “virality” tricks that have become so commonplace. No automatic posting on my Facebook newsfeed; no “Sent using my Brewster address book” tag line in emails, etc.
– I was impressed with the email I got to announce that my account was ready, personalized with pictures of some of my key contacts – great way of delivering a unique experience before I even started using the product in earnest.
The data privacy issue (and the fairly dramatic reactions to it) is of course a concern. I’m actually surprised that I don’t care more about it, personally – I guess I have gone pretty far down the path of accepting some privacy risk (as long as it’s not banking information), in return for getting a lot of value from the product, which I feel is the case here. But obviously many people will feel differently, and this could sink the company entirely, if not properly addressed.
One functionality that I don’t find as impressive, at least as of now, is the “Updates” section — what it has surfaced so far (birthdays essentially) is not particularly interesting. What would be really cool, eventually, would be an integration with Newsle, to get news about your friends. Oh wait, add to this an integration with Cue, as well. All built in natively into my iPhone address book and calendar. Ok, so, maybe that’s a bit much to ask. In the meantime, Brewster is already one of the most interesting apps I have seen in a long time.
My colleague Shivon Zilis has been obsessed with the Terry Kawaja chart of the advertising ecosystem for a while, and a few weeks ago she came up with the great idea of creating a similar one for the big data ecosystem. Initially, we were going to do this as an internal exercise to make sure we understood every part of the ecosystem, but we figured it would be fun to “open source” the project and get people’s thoughts and input.
So here is our first attempt.
A few things became apparent very quickly:
1) Many companies don’t fall neatly into a specific category
2) There are only so many companies we can fit on the chart – subcategories such as NoSQL or advertising applications, for example, would almost deserve their own chart.
3) The ecosystem is evolving so quickly that we’re going to need to update the chart often – companies evolve (e.g., Infochimps), large vendors make aggressive moves in the space (VMware with Serengeti and the Cetas acquisition)
What do you think? (click on the bottom right to expand)
It’s been a few days now since their acquisition was formally announced, and I continue to be fascinated by the Buddy Media story. But what fascinates me is less the company itself and all the things that make it great than the fact that its success tests the conventional wisdom of what makes a venture successful. Rightly or wrongly, investors, prospective employees, the press, and anyone who tries to predict the highly unpredictable fate of startups, tend to default to some common assumptions about what’s going to work and what isn’t. The Buddy Media story challenges that conventional wisdom in some interesting ways:
1. NYC is not a good place to start an enterprise software company
It’s a bit ironic that, for all the talk about NYC being a media and eCommerce hub, the largest acquisition in five years would be an enterprise software company.
2. It takes forever to build a successful enterprise software company.
It took Buddy Media less than 5 years from start to success, including an initial pivot.
3. To build a successful enterprise software company, you need technical co-founders, or at least a technical CEO
Buddy Media’s CEO is a serial entrepreneur with two degrees in journalism. Buddy Media’s COO is a serial entrepreneur with a background in business development and marketing and a degree in economics. The other co-founder and Chief Strategy Officer is a digital branding and marketing expert with a degree in Broadcasting and Mass Media.
4. Selling to marketers and advertisers is a really tough business.
Fortune 500 marketers and advertising agencies are indeed a tough audience – long sales cycles, often low budgets, a preference for homegrown solutions, a reluctance to buy what others in the industry purchase: not easy. But the Buddy Media success shows that it can be done, with the right execution: build the best product in your category, focus on sales, make friends in the right places, hire some key people from agencies, and work really hard.
5. Be really careful with strategic money
Buddy Media took a strategic investment from advertising leader WPP, which ended up substantially accelerating their business.
6. Service companies can’t become product companies
After an initial pivot, Buddy Media had to turn themselves into a service company to survive the 2008 economic recession. James Altucher has a really interesting post on TechCrunch that describes this phase. Somehow, they were able to gradually build a product offering.
7. The best founders are young and single
Two of the co-founders of Buddy Media are married. On top of that, they have three children. While there are famous examples of home runs started by married founders (Cisco, VMware, etc.), in my experience, behind closed doors most investors think it’s a terribly risky idea. The Buddy Media story shows that where there is a will, there is a way: founders with family obligations can still endure the rollercoaster lifestyle of the startup world.
Here are the presentations from the NYC Data Business Meetup on May 21:
VoltDB – Presenter: Scott Jarr, co-founder
Datastax – Presenter: Matt Pfeil, co-founder
RJMetrics – Presenter: Robert J Moore, co-founder and CEO
Custora (presentation coming soon) – Presenters: Corey Pierson, co-founder, and Aaron Goodman, data scientist.
And here are a few pics!
Hope to add videos soon.
Thoughts, feedback, questions? Topics you’d like to discuss at the next NYDBM? (or data-related stuff you’d like to discuss, regardless of whether you attend the NYDBM or not?). Feel free to opine in the comments section.