Git is pretty good at handling large files right now.

We use git to push configuration data to (near) real-time systems that keep this data in local memory-mapped key-value stores. It's essentially a fully replicated, eventually consistent key-value store, and git plays a starring role.
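
The comment doesn't spell out the consuming side, so this is only a minimal sketch of what each replica might run: fast-forward the local clone on a timer and reload the store. The repo path and the rebuild-mmap-store command are made-up names, not anything from the actual system.

    # Hedged sketch: each replica periodically fast-forwards its clone and
    # reloads the data into its local memory-mapped store.
    # /var/lib/config-repo and rebuild-mmap-store are hypothetical.
    while true; do
        git -C /var/lib/config-repo pull --ff-only --quiet
        rebuild-mmap-store /var/lib/config-repo/data.kv
        sleep 5
    done

Because every node converges on whatever commit it last pulled, readers end up eventually consistent without any coordination beyond git itself.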

We regularly push multi-gigabyte files through this system with ease, with only a few tweaks to git's configuration. Git has some major advantages: it's fast, it handles large files, and you can piggyback on the versioning system to ensure that multiple writers aren't competing. It's also highly configurable and has a lot of out-of-the-box options for easily setting up git servers.
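
The comment doesn't say which settings were changed; as a hedged sketch, these are the standard git config knobs usually meant when tuning stock git for very large files (the keys are real git options, the values are illustrative guesses, not the commenter's actual settings):

    # Illustrative .gitconfig tuning for repos carrying multi-gigabyte files.
    # Keys are standard git options; values are assumptions.
    [core]
        # Store files above this size whole instead of searching for deltas,
        # which keeps memory use and repack times sane.
        bigFileThreshold = 64m
        # Lighter zlib compression for payloads that don't compress well.
        compression = 1
    [pack]
        # Cap the memory the delta window may consume during repack/push.
        windowMemory = 256m
        # Split enormous packfiles into bounded chunks on disk.
        packSizeLimit = 2g
    [http]
        # Bigger POST buffer before git falls back to chunked transfer
        # when pushing large packs over smart HTTP.
        postBuffer = 524288000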



Git, in common with every other version control system I've ever used, considers adding a column to a CSV file to be "every line was altered", because lines are all-important.
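
A minimal illustration (file name and contents made up): append one column and a line-oriented diff reports the entire file as rewritten.

    $ printf 'id,name\n1,alice\n2,bob\n' > data.csv
    $ git add data.csv && git commit -q -m 'add data.csv'
    $ printf 'id,name,age\n1,alice,30\n2,bob,25\n' > data.csv
    $ git diff --stat
     data.csv | 6 +++---
     1 file changed, 3 insertions(+), 3 deletions(-)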


How do you deal with old data, like multi-gigabyte files that were deleted last year but that everyone still has to download because it's git?



