It intersects with just about every piece of software, from systems architecture to APIs, and enterprises are adopting it more than ever. Open source software, judging by just about every estimation in recent years, is eating the world.
But sifting through the vast array of open source projects out there, sorting the wheat from the chaff, can be a challenge, which is partly why early-stage VC firm Two Sigma Ventures has launched a new index designed to surface “high-level trends” in the open source sphere.
It’s worth noting that there are already all manner of indexes and charts out there that deliver useful insights for the open source world, such as the Open Source Contributor Index, which ranks commercial organizations by their employees’ open source contributions (Google’s in the lead). And GitHub itself charts things like trending repositories. It’s possible to slice, dice, and present GitHub data in any way you see fit through its publicly available API, which is exactly what Two Sigma Ventures has done with the Open Source Index. But rather than relying on “stars,” it uses “watchers,” which it argues provides a more accurate reflection of a project’s true popularity.
Star watchers
GitHub, for the uninitiated, allows logged-in users to either “star” or “watch” a project — the former can perhaps best be likened to bookmarking, as the user saves the project to their profile so they can easily check in on it without having to search. It can also be used as a show of respect, similar to how someone might “like” a Facebook post or tweet — “I dig what you’re doing for open source, keep up the good work.” When someone chooses to “watch” a project, however, they are likely taking a more active interest, as they essentially sign up to receive project notifications. As such, the Open Source Index is based on the top GitHub projects as per the number of people that are “watching” a project.
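For readers who want to see the star/watch distinction in GitHub's own data, here is a minimal Python sketch. It assumes the JSON that GitHub's public REST API returns for a repository (`GET /repos/{owner}/{repo}`), in which `stargazers_count` holds stars and `subscribers_count` holds the users watching a project; the similarly named `watchers_count` field mirrors the star count. The helper names and the sample numbers are invented for illustration and are not taken from the Open Source Index.

```python
import json
import urllib.request


def fetch_repo(owner: str, repo: str) -> dict:
    """Fetch repository metadata from GitHub's public REST API.

    Unauthenticated requests are rate-limited; this helper is defined
    for completeness and is not called in the example below.
    """
    url = f"https://api.github.com/repos/{owner}/{repo}"
    request = urllib.request.Request(
        url, headers={"Accept": "application/vnd.github+json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def popularity_signals(repo_payload: dict) -> dict:
    """Pull stars and watchers out of a repository payload."""
    stars = repo_payload["stargazers_count"]
    # "subscribers_count" is the number of users watching the repository;
    # "watchers_count" is a legacy field that mirrors the star count.
    watchers = repo_payload["subscribers_count"]
    ratio = round(1000 * watchers / stars, 1) if stars else None
    return {
        "stars": stars,
        "watchers": watchers,
        "watchers_per_1000_stars": ratio,
    }


# Hypothetical payload with made-up numbers, trimmed to the fields used here.
sample = {
    "stargazers_count": 20000,
    "watchers_count": 20000,
    "subscribers_count": 1500,
}
signals = popularity_signals(sample)
print(signals)
```

Swapping `sample` for `fetch_repo("owner", "repo")` would run the same comparison against live data, subject to GitHub's rate limits.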
While there are of course broad correlations between “stars” and “watchers,” i.e. top projects will likely have a high number of both, they aren’t always totally aligned. Moreover, Two Sigma Ventures wanted to showcase what’s popular today, rather than what has built a high “vanity metric” by virtue of having launched 10 years ago.
“A stars-based ranking tends to prioritize older projects that have been around for a while, since stars are more cumulative in nature,” Two Sigma Ventures VC Vinay Iyengar told VentureBeat. “With watchers, we believe we have a better sense of the projects that are ‘hot’ right now, as opposed to those that have been around for a while.”
And so the Open Source Index, which is continuously updated, showcases the 100 “most popular and fastest-growing” open source projects, allowing users to sort and filter by various criteria. (Two Sigma Ventures filtered out all non-technical projects, such as books and educational content, from the index.)
For the index, Two Sigma Ventures has produced its own TSV (Two Sigma Ventures) ranking, which is a weighted average of five variables: watchers (40%); watcher growth (25%), which considers the variance in watchers over the past quarter; contributors (15%); release cadence (10%), which is the number of commits over a project’s lifetime; and community health score (10%), which is based on GitHub’s own metric for how well-maintained a repository is.
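The weighting can be sketched as a plain weighted average. The article does not say how Two Sigma Ventures scales each variable before combining them, so this Python sketch assumes every input has already been normalized to a 0-to-1 range. The function name `tsv_score`, the dictionary keys, and the sample values are hypothetical; only the five weights come from the description above.

```python
# Weights as described for the TSV ranking; they sum to 100%.
WEIGHTS = {
    "watchers": 0.40,
    "watcher_growth": 0.25,
    "contributors": 0.15,
    "release_cadence": 0.10,
    "community_health": 0.10,
}


def tsv_score(normalized: dict) -> float:
    """Combine five pre-normalized variables (assumed 0..1) into one score."""
    missing = WEIGHTS.keys() - normalized.keys()
    if missing:
        raise ValueError(f"missing variables: {sorted(missing)}")
    return sum(WEIGHTS[name] * normalized[name] for name in WEIGHTS)


# Made-up, already-normalized values for a hypothetical project.
example = {
    "watchers": 0.9,
    "watcher_growth": 0.2,
    "contributors": 0.5,
    "release_cadence": 0.7,
    "community_health": 1.0,
}
score = tsv_score(example)
print(round(score, 3))
```

Because watchers and watcher growth together carry 65% of the weight, a project with modest contributor numbers can still score well under this scheme if its watcher count is high or rising quickly.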
“We think our TSV Score metric is somewhat of a ‘super’ metric, in that it takes into account several factors that we believe lead to building a great open source project/community,” Iyengar said.
None of this is an exact science, of course, and Iyengar acknowledges that these weightings are somewhat “arbitrary,” reflecting “just one perspective on what’s important in building a great open source community.”
The index defaults to the TSV score ranking (highest to lowest), and doesn’t reveal too many surprises — TensorFlow, React, Vue, Angular, and Kubernetes all rank highly, and they all have high numbers of stars and watchers.
But playing around with the various filters is where things start to get a little more interesting. Chinese tech titan Baidu’s open source autonomous driving project Apollo, for example, ranks 41st when using the TSV ranking and 72nd by number of watchers. And in terms of stars, Apollo comes in last at 100th.
However, if you filter the index by the quarterly watcher growth metric, Apollo is in pole position.
There could be a number of reasons for this surge in interest. Two months ago, Baidu became the sixth company to get approval to test fully driverless cars on public roads in California, and the company has launched all manner of autonomous vehicle programs and projects in its home market of China too.
Whatever the reason behind this surge, it serves as an interesting data point for any developer, company, or entrepreneur wanting to keep their finger on the open source pulse. “It [watcher growth metric] gives us an important signal on which projects have momentum in the developer ecosystem,” Iyengar noted.
Other interesting observations include Bitcoin, which is ranked 40th in the index by number of stars (48,000 stars) and 33rd by TSV ranking. However, it’s in seventh place by number of watchers, ahead of jQuery, Kubernetes, and Visual Studio Code, among other arguably “more relevant” projects.
The Two Sigma factor
So why has Two Sigma Ventures taken the time to create this list, and what relevance does it hold? Well, as an investor, the firm has backed several startups that commercialize open source projects, such as GitLab, Timescale, Radar Labs, NS1, and Replicated. Playing around with the various menus and filters on the index reveals some interesting insights related to this. For example, seven of the top 100 projects were either created by private VC-backed startups or are maintained by commercial companies created by the original project creators: Redis, Grafana, Vercel, HashiCorp, Confluent, Databricks, and Preset.
But the VC entity is affiliated with a separate business simply called Two Sigma, an investment management company that applies “cutting-edge technology to the data-rich world of finance,” according to Iyengar. It counts 1,700 employees, more than half of whom are software developers who use open source software on a daily basis. They are also creators of a number of open source projects, such as Flint and BeakerX.
“We have seen firsthand how software created by developers, for developers, leveraging community-based development, leads to incredible innovation,” Iyengar said in a separate blog post announcing the index. “Moreover, we are excited about how enterprise software is moving toward bottoms-up adoption, and how an open core business can lead to remarkably efficient customer acquisition and growth.”
This new index also constitutes part of a growing trend in the technology realm that strives to make sense of the open source world. Just last week, OpenLogic launched an upgraded tool it calls Stack Builder, which helps enterprises choose the right open source software. And earlier this year, Openbase emerged out of the ether to serve as a sort of Yelp for open source software packages.
If nothing else, the Open Source Index serves as a useful accompaniment to these other efforts, helping companies and developers dig down into the best — or most popular — open source projects on GitHub right now. There are plans to add more data to the mix in the future, according to Iyengar, such as downloads, community engagement in external channels such as Slack or Discord, and even mentions in job advertisements.
The Open Source Index is available now and free for anyone to use.