A few days ago, I was invited to speak at a Yale Entrepreneurship Breakfast about one of my favorite areas of interest, Artificial Intelligence. Here are the slides from the talk: a primer on how AI rose from the ashes to become a fascinating category for startup founders and venture capitalists. Very much a companion to my earlier post about our investment in x.ai. Many thanks to my colleague Jim Hao, who worked with me on this presentation.
Category: Big Data
x.ai and the emergence of the AI-powered application
The State Of Big Data in 2014: a Chart
Note: This post appeared on VentureBeat, here.
It’s been almost two years since I took a first stab at charting the booming Big Data ecosystem, and it’s been a period of incredible activity in the space. An updated chart was long overdue, and here it is:
A few thoughts on this revised chart, and the Big Data market in general, largely from a VC perspective:
Can the Bloomberg Terminal be “Toppled”?
The field of bioinformatics is having its “big bang” moment. Of course, bioinformatics is not a new discipline and it has seen various waves of innovations since the 1970s and 1980s, with its fair share of both exciting moments and disappointments (particularly in terms of linking DNA analysis to clinical outcomes). But there is something special happening to the industry right now, accelerated by several factors:
Thomson Reuters CTO Series (Podcast)
Thomson Reuters CTO James Powell runs a great series of podcasts where he interviews people in the technology world about topics of relevance to his organization. I was fortunate to be invited to speak with James about the Internet of Things and Big Data, and it was a lot of fun. Below is the podcast, uploaded on SoundCloud. Thanks to James Powell and Dan Cost for the opportunity.
Launching New Sites for Data Driven NYC and Hardwired NYC
Some updates on the event/community front:
1) A little while ago, I changed the name of the data event I’ve been organizing from “NYC Data Business Meetup” to “Data Driven NYC”. I originally started the event mostly as an experiment, and didn’t give much thought to branding (so yeah, that was a terrible name). The event has now grown quite a bit (over 3,700 members as I write this), so it was time for a better name; also, at this stage, it feels more like a community than “just” a meetup, so I wanted a name that reflected this reality.
2) Back in June, I launched a new community called “Hardwired NYC”. It covers startups, technologies and products at the intersection of the physical and digital worlds, including topics like 3D printing, Internet of Things, wearable computing, etc. I developed a strong interest in those areas through my involvement in the Big Data world – the Internet of Things, in particular, is deeply intertwined with Big Data (the proliferation of sensors has been contributing to the Big Data “problem”; equally the Internet of Things will be highly dependent on Big Data technologies if it is to deliver on its promise).
3) As Hardwired NYC is taking off fast (more than 700 members after just two events), I figured that both events/communities should have their own website with full video libraries, including for people who don’t live in New York and are interested in the content. So, with the great help of my FirstMark colleague Dan Kozikowski, this week I’m launching www.datadrivennyc.com and www.hardwirednyc.com. Both sites have a “Watch” section where, from now on, I will post pictures and videos of events (instead of posting them on this blog).
Big Data 101 Presentation
A few weeks ago, I was invited to do a couple of guest lectures at NYU (as part of the excellent “Ready, Fire, Aim” entrepreneurship class that Lawrence Lenihan, now my partner at FirstMark, has been doing for a while there) and at The New School (as part of a Big Data course organized by Debra Anderson and Greta Knutzen). Thought I’d share the slide deck I had prepared for those classes. Very much a Big Data 101 class for a college-level audience that had had little or no exposure to the key concepts prior to the class.
Quantopian, Plaid and ZestFinance
Our February NYC Data Business Meetup was focused on the intersection of data and finance (both market and consumer finance). Quantopian, Plaid and ZestFinance presented.
We also had a great panel presenting the customer perspective on Big Data (hype vs. reality), from the viewpoint of financial institutions, with the following speakers: Mike Simone (Global Head of CitiData Platform Engineering), Emile Werr (Head of Enterprise Data Architecture, NYSE Euronext) and Raj Patil (up until recently Data innovation CTO at UBS, now an entrepreneur). Unfortunately, due to standard policy at some of those institutions, we can’t publicly post the video of the panel.
Here are the videos, in order of appearance:
Bloomberg App Portal:
SumAll, SimpleReach, Hadapt and ClearStory
Our January NYC Data Business Meetup was focused on data analytics.
Here are the slide decks:
Here are the videos:
Joseph Turian, Sqrrl, Infochimps and MemSQL
The December NYC Data Business Meetup was focused on big data infrastructure companies, with the co-founders of Sqrrl, Infochimps and MemSQL presenting to a full house. We started the evening with a presentation by prominent data scientist Joseph Turian.
The slides are here: Joseph Turian, Sqrrl, Infochimps and MemSQL.
Here are the videos:
Joseph Turian, “How to do AI in 2013”
Oren A. Falkowitz, Co-Founder & CEO, Sqrrl
Dhruv Bansal, Co-Founder & Chief Science Officer, Infochimps
Eric Frenkiel, Co-Founder & CEO, MemSQL
And here are a few pics (photo credit: Shivon Zilis):
Recorded Future, Lex Machina, DataMarket and numberFire
The November NYC Data Business Meetup was focused on “vertical-specific” applications of big data – startups leveraging the big data stack to offer new solutions to specific industries, such as finance and government (Recorded Future), the legal industry (Lex Machina), energy (DataMarket, although it offers data sets for other industries as well) and sports (numberFire).
The slides are here: Recorded Future, Lex Machina, DataMarket and numberFire.
Here are the videos:
Christopher Ahlberg, CEO, Recorded Future:
Josh Becker, CEO, Lex Machina:
Hjálmar Gíslason, CEO, DataMarket:
Nik Bonaddio, CEO, numberFire:
IA Ventures, Accel, Data Collective, Precog and CCS at the NYC Data Business Meetup
Here are the videos from the NYC Data Business Meetup that was held on October 23, 2012, in order of appearance:
Jeff Carr, COO, Precog
Max Yankelevich, co-founder, CrowdComputing Systems
Roger Ehrenberg, Founder and Managing Partner, IA Ventures; Ping Li, General Partner, Accel Partners; Matt Ocko, Co-Founder and Partner, Data Collective (from left to right):
A chart of the big data ecosystem, take 2
So here we are again. My colleague Shivon and I had made a first attempt at making sense of the rapidly evolving big data ecosystem back in June. Based on some very helpful feedback from readers of this blog and others, a number of additional meetings with interesting startups and more in-depth research, we’ve come up with this second version.
- It’s still a work in progress (and will presumably always be, that’s the nature of the beast)
- It’s even more crowded than the first time around, which reflects the incredible vitality of the big data space
- We’ve created some new subcategories such as NoSQL/NewSQL and analytics services (reflecting the reality that, for the time being, the last mile of data analysis is very much performed by humans)
- We have the occasional company that appears in different categories (Infochimps or Autonomy for example)
- We have learned more about companies that were already on the first version of the chart, and have positioned them differently. For example, Metamarkets now falls in the “Cross Infrastructure/Analytics” category as they offer a stack that includes a data store (Druid), predictive analytics and visualization. Another example is Collective[i] – they have built an entire proprietary big data stack from the ground up, that includes infrastructure, analytics and applications – making the company a rare example of an “Application Service Provider”.
Our goal is to continue updating this chart from time to time, and perhaps make it evolve visually, as we’ve probably reached the limits of what we can reasonably fit on one slide. It was suggested that we try to visually distinguish on-premise offerings vs. cloud-based solutions, which we may try to do.
Comments, thoughts, questions? Please add to the comments section.
10Gen, Mortar, Datadog & Rick Smolan at the NYC Data Meetup
Here are the videos and some pictures (scroll down) of the NYC Data Business Meetup held on September 25, 2012.
In order of appearance:
1) Rick Smolan told us about his fascinating new project, the “Human Face of Big Data” – see the NY Times coverage here: http://nyti.ms/TO5MDd.
2) Mortar (presenter: K Young, CEO). Mortar (www.mortardata.com) provides a platform-as-a-service for Hadoop. They take care of all of the necessary infrastructure (via AWS) and allow any software engineer to run jobs on Hadoop using Apache Pig and Python without special training.
3) Datadog (presenter: Alexis Le Quoc, co-founder). Datadog (www.datadoghq.com) is a service for IT, Operations and Development teams who write and run applications at scale, and want to turn the massive amounts of data produced by their apps, tools and services into actionable insight. Datadog helps software developers and web ops understand their IT Data by putting it all in context.
4) We finished with a fireside chat with Dwight Merriman, CEO and co-founder, 10Gen. 10Gen (www.10gen.com) develops MongoDB, and offers production support, training, and consulting for the open source database. Dwight is one of the original authors of MongoDB. In 1995, Dwight co-founded DoubleClick (acquired by Google for $3.1 billion) and served as its CTO for ten years. Dwight was the architect of the DoubleClick ad serving infrastructure, DART, which serves tens of billions of ads per day. Dwight is co-founder, Chairman, and the original architect of Panther Express (now part of CDNetworks), a content distribution network (CDN) technology that serves hundreds of thousands of objects per second. Dwight is also a co-founder and investor in BusinessInsider.com and Gilt Groupe.