Git is pretty good at handling large files right now.

We use git to push configuration data to (near) real-time systems that keep this data in local memory-mapped key-value stores. It's essentially a fully replicated, eventually consistent key-value store, and git plays a starring role.
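
The comment doesn't spell out the consuming side, so this is only a minimal sketch of what each replica might run: fast-forward the local clone on a timer and reload the store. The repo path and the rebuild-mmap-store command are made-up names, not anything from the actual system.

    # Hedged sketch: each replica periodically fast-forwards its clone and
    # reloads the data into its local memory-mapped store.
    # /var/lib/config-repo and rebuild-mmap-store are hypothetical.
    while true; do
        git -C /var/lib/config-repo pull --ff-only --quiet
        rebuild-mmap-store /var/lib/config-repo/data.kv
        sleep 5
    done

Because every node converges on whatever commit it last pulled, readers end up eventually consistent without any coordination beyond git itself.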

We regularly push multi-gigabyte files through this system with ease, with only a few tweaks to git's configuration. Git has some major advantages: it's fast, it handles large files, and you can piggyback on the versioning system to ensure that multiple writers aren't competing. It's also highly configurable and has a lot of out-of-the-box options for easily setting up git servers.
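
The comment doesn't say which settings were changed; as a hedged sketch, these are the standard git config knobs usually meant when tuning stock git for very large files (the keys are real git options, the values are illustrative guesses, not the commenter's actual settings):

    # Illustrative .gitconfig tuning for repos carrying multi-gigabyte files.
    # Keys are standard git options; values are assumptions.
    [core]
        # Store files above this size whole instead of searching for deltas,
        # which keeps memory use and repack times sane.
        bigFileThreshold = 64m
        # Lighter zlib compression for payloads that don't compress well.
        compression = 1
    [pack]
        # Cap the memory the delta window may consume during repack/push.
        windowMemory = 256m
        # Split enormous packfiles into bounded chunks on disk.
        packSizeLimit = 2g
    [http]
        # Bigger POST buffer before git falls back to chunked transfer
        # when pushing large packs over smart HTTP.
        postBuffer = 524288000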



Git, in common with every other version control system I've ever used, considers adding a column to a CSV file to be "every line was altered", because lines are all-important.
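
A minimal illustration (file name and contents made up): append one column and a line-oriented diff reports the entire file as rewritten.

    $ printf 'id,name\n1,alice\n2,bob\n' > data.csv
    $ git add data.csv && git commit -q -m 'add data.csv'
    $ printf 'id,name,age\n1,alice,30\n2,bob,25\n' > data.csv
    $ git diff --stat
     data.csv | 6 +++---
     1 file changed, 3 insertions(+), 3 deletions(-)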


How do you deal with old data, like multi-gigabyte files that were deleted last year but that everyone still has to download because it's git?



