Scalability to large numbers of files and large numbers of changesets.

For the large-numbers-of-files case, https://code.facebook.com/posts/218678814984400/scaling-merc... is one source. There's some earlier discussion of the same issue at https://news.ycombinator.com/item?id=3549679 that goes into some technical details.

There has been some work on git since then to address some of those issues (e.g. see https://blogs.msdn.microsoft.com/bharry/2017/02/03/scaling-g... ) but it's not clear to me that it helped enough to catch up to where Mercurial is for large repos.

For large numbers of changesets, just try running "log" or "annotate" on any file with a long history in git. I just did this simple experiment:

1) hg clone https://hg.mozilla.org/mozilla-central/

2) git clone https://github.com/mozilla/gecko-dev.git

3) (cd mozilla-central && time hg log dom/base/nsDocument.cpp)

4) (cd gecko-dev && time git log dom/base/nsDocument.cpp)

It's not quite apples to apples, because the git repo there has some pre-Mercurial CVS history in it. But note that I'm not even using --follow for git, and the file _has_ been renamed since the point where the Mercurial history starts, so git is actually finding fewer commits than Mercurial is here.

Anyway, if I run the above log calls a few times to make sure the caches are warm, I see times in the 8s range for git and the 0.8s range (yes, 10x faster) for Mercurial.
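If you want to sanity-check the measurement method itself before sitting through the multi-gigabyte clones, here's a toy sketch: it builds a throwaway git repo with a short history and times "git log" on one file, the same way as steps 3 and 4 above. The repo name and file name are made up for illustration; substitute the real mozilla-central/gecko-dev clones to reproduce the actual numbers.

```shell
# Toy version of the timing experiment (hypothetical tiny repo, not gecko-dev).
set -e
tmp=$(mktemp -d)
cd "$tmp"
git init -q demo
cd demo
git config user.email "you@example.com"
git config user.name "you"

# Give one file a small history so "git log <path>" has something to walk.
for i in 1 2 3; do
  echo "change $i" >> file.txt
  git add file.txt
  git commit -qm "commit $i"
done

# Warm the caches with one untimed run, then measure.
git log --oneline -- file.txt > /dev/null
time git log --oneline -- file.txt > /dev/null

count=$(git log --oneline -- file.txt | wc -l)
echo "$count"
```

On the real repos, run the timed command several times and take the best result, since the first cold-cache run is dominated by disk I/O rather than history traversal.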

That all said, most repos do not have millions (or even hundreds of thousands) of changesets or files. So the scalability problems are not problems for most users of either VCS.


