The simplest storage is a bunch of dumb servers (well, beefy dumb servers) with some application-aware scripts to move/copy the dataset.
A place I worked at had a wrapper around rsync that would split up the directory and spawn multiple rsyncs to do a parallel copy.
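For anyone wanting to try the same thing, here is a rough sketch of that kind of wrapper. It is not the original script: the function names, the split at top-level entries, and the worker count are all my own assumptions, and it expects `rsync` on the PATH and the destination directory to already exist.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def plan_rsync_commands(src, dest):
    """Build one rsync command per top-level entry of src (hypothetical split)."""
    src, dest = Path(src), Path(dest)
    commands = []
    for entry in sorted(src.iterdir()):
        if entry.is_dir():
            # Trailing slashes: sync the directory's contents into dest/<name>/
            commands.append(["rsync", "-a", f"{entry}/", f"{dest / entry.name}/"])
        else:
            # Loose top-level files go straight into dest/
            commands.append(["rsync", "-a", str(entry), f"{dest}/"])
    return commands


def parallel_copy(src, dest, workers=4):
    """Run the planned rsync commands concurrently; return their exit codes."""
    commands = plan_rsync_commands(src, dest)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Threads are enough here: each one just waits on an rsync process.
        return list(pool.map(lambda cmd: subprocess.run(cmd).returncode, commands))
```

Splitting at the top level only helps if the data is spread fairly evenly across those directories; a lopsided tree would need a deeper or size-aware split.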
The directory structure was effectively copy-on-write, so a backup to the nearline took under 15 minutes.
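One way to read "effectively copy-on-write" is that existing directories are never modified in place and every change lands in a new directory, so a backup only has to find and ship what the nearline doesn't have yet. A minimal sketch under that assumption (the layout and the `pending_versions` name are hypothetical, not the actual setup):

```python
from pathlib import Path


def pending_versions(primary, nearline):
    """Return names of directories under primary that nearline lacks.

    Assumes directories are immutable once written, so anything already
    present on the nearline never needs to be re-checked or re-copied.
    """
    have = {p.name for p in Path(nearline).iterdir() if p.is_dir()}
    return sorted(
        p.name for p in Path(primary).iterdir()
        if p.is_dir() and p.name not in have
    )
```

The backup time then scales with the amount of new data rather than the size of the whole dataset.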