
Exactly. It works fine for most tasks, of course, but if you ever want to process the contents of the S3 bucket in bulk, nothing will ever be able to parallelize that one list request to /path/to/big-dir. The listing is paginated, and each page can only be requested once the previous one has returned.

If you don't use the evenly-distributed-prefix trick, your only chance of speeding it up is knowing the file names beforehand, so you can skip the listing entirely. If they're all sequentially numbered, you can generate the names yourself, of course.
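To make the "known file names" case concrete, here is a minimal sketch. It assumes zero-padded, eight-digit names with a .doc extension (borrowed from the example path in this thread); the `big-dir` prefix and both helper names are made up for illustration.

```python
from typing import Iterator, List


def sequential_keys(prefix: str, start: int, stop: int,
                    width: int = 8, ext: str = ".doc") -> Iterator[str]:
    """Yield keys like '<prefix>/00000001.doc' for start <= n < stop."""
    for n in range(start, stop):
        yield f"{prefix}/{n:0{width}d}{ext}"


def split_ranges(start: int, stop: int, workers: int) -> List[range]:
    """Split [start, stop) into at most `workers` contiguous, near-equal ranges."""
    base, extra = divmod(stop - start, workers)
    ranges: List[range] = []
    lo = start
    for i in range(workers):
        # The first `extra` workers take one additional item each.
        size = base + (1 if i < extra else 0)
        if size == 0:
            break
        ranges.append(range(lo, lo + size))
        lo += size
    return ranges
```

Each worker could then take one range from `split_ranges` and fetch the keys from `sequential_keys` directly, without ever issuing a list request.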

The shardable prefix doesn't need to be at the top level. So you could also organize it like so, for example:

    /secret/documents/2016-01-01/00000001.doc
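A rough sketch of why the shardable prefix helps: the walk under any one prefix is still sequential, but separate prefixes can be walked concurrently. Everything named here is an assumption for illustration. `list_page` is a hypothetical callable standing in for one paginated list call (with boto3 it could wrap `list_objects_v2` using `Prefix` and `ContinuationToken`), and `make_fake_bucket` is an in-memory stand-in so the sketch runs without touching S3.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable, List, Optional, Tuple

# One page of results: (keys, next_token); next_token is None on the last page.
Page = Tuple[List[str], Optional[str]]
ListPage = Callable[[str, Optional[str]], Page]


def list_one_prefix(list_page: ListPage, prefix: str) -> List[str]:
    """Walk every page under one prefix. This part is inherently sequential."""
    keys: List[str] = []
    token: Optional[str] = None
    while True:
        page_keys, token = list_page(prefix, token)
        keys.extend(page_keys)
        if token is None:
            return keys


def list_sharded(list_page: ListPage, prefixes: Iterable[str],
                 workers: int = 16) -> Dict[str, List[str]]:
    """List many prefixes concurrently, one sequential walk per prefix."""
    prefixes = list(prefixes)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(lambda p: list_one_prefix(list_page, p), prefixes)
        return dict(zip(prefixes, results))


def make_fake_bucket(all_keys: List[str], page_size: int = 2) -> ListPage:
    """In-memory stand-in for a bucket (hypothetical, for local testing only)."""
    ordered = sorted(all_keys)

    def list_page(prefix: str, token: Optional[str]) -> Page:
        matching = [k for k in ordered if k.startswith(prefix)]
        start = int(token) if token is not None else 0
        end = start + page_size
        next_token = str(end) if end < len(matching) else None
        return matching[start:end], next_token

    return list_page
```

With a single big prefix, `list_sharded` degenerates to one sequential walk, which is the problem described above. The speedup only appears when the keys are spread over many prefixes that can be enumerated up front.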


Thanks! I've read the docs and blog posts, but it was interesting to see a real live antipattern.



