Why analysis of GitHub’s open-source software developer teams will continue to grow.

1) GitHub is *the thing*. It has a modern UI that follows current trends. It is easy to use and has only one version control mechanism, which is of course Git. It has its own culture and fans (e.g. Octocat, gadgets, stickers, etc.). Although it is sometimes blocked (e.g. in China) and has brief outages, it is highly reliable and refreshes the data on its web pages immediately after a single change made through the git client or protocol (yes, Git is also a protocol).

2) GitHub has the largest number of users and projects, more than SourceForge.

3) GitHub doesn’t have advertisements on its website, and never will, as there is no need for them. While SourceForge is currently packed with wide blocks of assorted advertisements (probably to keep its funding going), the GitHub website is clean and feature-oriented.

4) Probably most important: a rich API for developers and researchers. It made solutions like GHTorrent (http://ghtorrent.org/) possible. It allowed Google BigQuery to use GitHub timeline data. It’s possible to create your own local MongoDB or MySQL database holding all events from the GitHub timeline. Thanks to fast and secure OAuth for web apps, applications like Open Source Report Card (https://osrc.dfm.io/) could be created.
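To give a feel for how approachable the timeline data is, here is a minimal sketch in Python. It assumes the public events endpoint of the GitHub REST API (https://api.github.com/events) and counts the returned events by type. The function names and the "timeline-sketch" client name are my own illustrative choices, and unauthenticated requests are rate limited, so treat it as a starting point rather than a proper mining setup.

```python
import json
import urllib.request
from collections import Counter

# Public events endpoint of the GitHub REST API.
EVENTS_URL = "https://api.github.com/events"


def fetch_public_events(url=EVENTS_URL):
    """Download one page of recent public events as a list of dicts."""
    request = urllib.request.Request(
        url,
        headers={
            "Accept": "application/vnd.github+json",
            "User-Agent": "timeline-sketch",  # hypothetical client name
        },
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


def count_event_types(events):
    """Count events by their "type" field (e.g. PushEvent, ForkEvent)."""
    return Counter(event.get("type", "Unknown") for event in events)


if __name__ == "__main__":
    # Performs a live, unauthenticated request when run as a script.
    counts = count_event_types(fetch_public_events())
    for event_type, count in counts.most_common():
        print(f"{event_type}: {count}")
```

The counting step is kept separate from the download, so the same function can be applied to events loaded from a local dump instead of the live API.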

5) Trend analysis on Google Scholar supports my point. The number of papers involving GitHub is increasing, while the number of articles on SourceForge is decreasing. There is a small number of people in the world who do high-quality FLOSS* research using only GitHub data, and their work is cited quite often, even though it is recent (papers from 2014 and 2015).

Source: self-made in Feb 2015

6) There are many external apps that support continuous integration and management of OSS teams. An example of an automatic build system is drone.io. There is academic research on possible task-assignment strategies in OSS teams, as well as on creating central planners for work distribution. Most important of all, there are papers on possible quality models for FLOSS teams and results from analyzing teams on GitHub.

7) GitHub employees are present at many important conferences on FLOSS and/or web technologies. Ivan Žužak will be a speaker at one of the workshops at the 11th Intl. Conf. on Open Source Systems (Florence, 2015). They are very keen on making an impact and helping the open-source community.

There are 257 people from all over the globe working at GitHub. Meet them here.

8) There is a high-quality guide to mining GitHub, as well as know-how on avoiding perils in OSS analysis (e.g. forks vs. mother repositories, the push model vs. the fork-and-pull model). Check out:

Eirini Kalliamvakou et al., “The promises and perils of mining GitHub” – http://dl.acm.org/citation.cfm?id=2597074
“Analyzing Millions of GitHub Commits – Ilya Grigorik” – https://www.igvita.com/slides/2012/bigquery-github-strata.pdf
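As a small illustration of the forks vs. mother repositories peril mentioned in point 8, here is a hedged sketch in Python. It assumes repository records shaped like those returned by the GitHub REST API, where each repository carries a boolean "fork" field. The sample records and the function name are hypothetical, made up only for this example.

```python
def split_forks(repos):
    """Separate repository records into (originals, forks) using the "fork" flag."""
    originals = [repo for repo in repos if not repo.get("fork", False)]
    forks = [repo for repo in repos if repo.get("fork", False)]
    return originals, forks


# Hypothetical sample records; real ones would come from the API or a dump.
sample_repos = [
    {"full_name": "alice/project", "fork": False},
    {"full_name": "bob/project", "fork": True},
    {"full_name": "carol/other-project", "fork": False},
]

originals, forks = split_forks(sample_repos)
print([repo["full_name"] for repo in originals])
print([repo["full_name"] for repo in forks])
```

Counting forks as independent projects inflates project and team numbers, so it is usually safer to make this split explicitly before computing any statistics.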

You can read more in my other post from the previous year (https://oskarj.wordpress.com/2014/05/07/playing-with-github-data-and-researching-oss-current-state-of-art/). Feel free to leave a comment below.

*FLOSS – Free/Libre Open Source Software