The future of entrepreneurship data – getting to know CrunchBase

Arnobio Morelix explores CrunchBase data and reports on a co-hosted CrunchBase-Kauffman session in San Francisco.

Written by Arnobio Morelix, February 11, 2016

Here at Kauffman we are actively thinking about ways in which we can better measure entrepreneurship activity and ecosystems. In many ways, entrepreneurship data beyond the traditional public and private data sources is still in its relative infancy, and researchers are still learning how to use things like social media, crowdsourced, and news-based data. As one of our forays into exploring the future of entrepreneurship data, we partnered with CrunchBase for a session in San Francisco last month.

Getting to know CrunchBase

Last month in San Francisco we held a session with our friends at CrunchBase. The session covered insights from the CrunchBase team on how the data is assembled and how it can be used; Kauffman’s perspective on emerging datasets like CrunchBase and thoughts on further exploration and funding; a discussion of the advantages and constraints of the data; and presentations by academic and industry users of the data.

About CrunchBase

CrunchBase is a leading platform to discover innovative companies and the people behind them. The CrunchBase Dataset is constantly expanding through contributions from its community of users, investment firms, and network of global partners. It now covers millions of users and businesses around the world. CrunchBase also has an open-access data license available to academic users.

About the session

Below is the agenda for the session, with slide decks and/or working papers shared when possible.
Welcome and Introduction
Arnobio Morelix, senior research analyst & program officer, Kauffman Foundation
EJ Reedy, director and program officer, Kauffman Foundation

An Inside Look at CrunchBase Dataset
Gené Teare, director of content, CrunchBase

Comparing CrunchBase Fundraising Data Against Another Source
Yas Motoyama, director of research and policy, Kauffman Foundation

Very Early Venture Finance: Pitch Competitions and their Judges
Sabrina Howell, assistant professor of finance, New York University Stern School of Business

Organizational Decision-Making and Information: Angel Investments by Venture Capital Partners
Andy Wu, Ph.D. candidate in applied economics, University of Pennsylvania, and founding director and investor, Identified Technologies

Ecosystem Attraction Metrics
J-F Gauthier, CFO & head of BizDev, Startup Compass Inc.

How Do Accelerators Impact High-Tech Ventures?
Sandy Yu, postdoctoral fellow, UC Berkeley | Coleman Fung Institute

Thoughts on the data

One of the main strengths of CrunchBase is, in my opinion, the fact that it has data on both the people (e.g., founders, employees, investors) and the companies. This lets data users get at information that is otherwise not easily accessible, such as the connections among different ecosystem players.

Like any dataset, it has limitations. One of the main limitations for academic research is that the information being reported is typically private, and we do not fully understand potential reporting biases. Usually, when that type of private information is made public, it is for strategic reasons of the parties involved (e.g., a startup wants to show traction).

Andy Wu, a Ph.D. student at Wharton, sums it up well: “The primary challenge for using Crunchbase is that we don’t fully understand the extent of missing data and more broadly the limitations for crowdsourced data.
I suspect that we are missing a huge amount of data on the smallest investment events that go undisclosed without press releases or without SEC Form D filings; to be fair, this is a huge problem with all datasets in entrepreneurial finance. Furthermore, since the data is continually being backfilled, there is an implicit selection bias towards the inclusion of the most successful firms that are easiest to find historical information about. Regardless, Crunchbase is definitely something for all entrepreneurship researchers to keep an eye on.”

How to access CrunchBase data

Just yesterday CrunchBase launched a new way of accessing its data, and you can learn more about it here.

Summing it up

CrunchBase is a really exciting dataset for entrepreneurship researchers, even though we are still learning what its main strengths and constraints are. If you are a researcher using the data and would like to share your thoughts on it or propose ways in which we can better understand and augment the data, I’d love it if you would let me know here.

Written by Arnobio Morelix, Director, Research, Startup Genome