IPFS, The Interplanetary File System, Simply Explained (achainofblocks.com)
261 points by acob on Dec 10, 2018 | 52 comments


IPFS "solves" the problem of distributing static files - basically, a CDN built on content addressable storage. But if you don't have nodes pinning your content, or you're going to loose that data.

The article bills it as an alternative to HTTP, but it covers only a subset of HTTP's functionality. You can't do dynamic pages with IPFS. There's no concept of cookies, so stateful sessions aren't a thing.

Why do articles insist on comparing IPFS to "the web", whatever that means? It's a distributed filesystem. Not at all representative of most content on "the web".


The original "web" was built to serve files, with a special kind of markup called Hyper Text that could reference another file with a Hyper Link. Browsers today will still treat servers like a filesystem if it's old enough.

It took a decade to get the JavaScript cruft and layers of browser features that basically make the modern browser a native gaming-like app with a gigantic scripting engine and a fat-pipe internet connection.

It's possible iterations on IPFS could parallel the current web. But I am not a fortune teller. I just wanted to be sure we all understand where the web came from: serving files as if from one gigantic connected filesystem.


Browsers don't treat any server like a filesystem. If you see a directory index, it is generated purely on the server.


In the process of researching and writing a one-line quip about FTP and Gopher, I found that Google Chrome will tell you not to enter credit card details or sensitive information over the insecure connection... to an FTP server.

That being said, FTP though.


No dynamic pages? Great. No cookies? Excellent! Back to the origins of the web!


The web also started out as a distributed filesystem; it's just not replicated (unless you use a CDN). In the beginning the web had no cookies, JavaScript, animated GIFs, etc. So IPFS adds resilience and scaling to the web.

Using IPFS to serve static content instead of a traditional HTTP server can make a lot of sense. Long term, browsers might learn to interface with IPFS directly and simply specify hashes of what to fetch in their dynamic content.

Eventually that too might be content addressed. If you think about a modern Docker-based system, basically all you tell your server is an image name and a version tag that points to some content hash of the image; it then goes and figures out what to run. Right now you mostly still get that stuff running on dedicated hardware provisioned ahead of time. But with e.g. lambdas that is less of a thing, and the code gets provisioned to hardware just in time.
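A toy sketch of that split between mutable names and immutable content (all names here are made up):

    import hashlib

    # Immutable, content-addressed store: the key IS the hash of the bytes.
    store = {}

    def put(data: bytes) -> str:
        digest = hashlib.sha256(data).hexdigest()
        store[digest] = data
        return digest

    # Mutable layer: a tag is just a pointer to a content hash, the way
    # a Docker tag resolves to an image digest.
    tags = {}
    tags["myapp:v1"] = put(b"...image contents...")

    # "Deploying" is then: resolve tag -> digest, fetch by digest, run it.
    blob = store[tags["myapp:v1"]]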

Building a Docker registry against IPFS probably isn't rocket science, and it wouldn't surprise me if it has been done already. Scripts in a browser are just files, so they could come from IPFS. Sessions are often IDs or hashes identifying something in a DB. That could be a file in IPFS instead, or in a distributed DB on top of IPFS. Not saying all of that is a good idea, but IPFS can be a bit more than just a file distribution mechanism.


The dynamic part can be done client side, and state can be stored as new permanent data (so long as it is mirrored).


For interested Bay Area folks, we're hosting our monthly decentralized web meetup at the Internet Archive tomorrow night, and Jeremy Johnson from IPFS will be discussing some of their recent stuff. Details and tickets here: https://www.eventbrite.com/e/decentralized-web-meet-up-ticke...

(we ask for a small donation to cover food & facilities, but if this is a hardship for you, e-mail me and I can help out)


> For interested Bay Area folks, we're hosting our monthly decentralized web meetup

For some reason this reminded me of the scene from Stranger Than Fiction where she's teasing Harold, "Anarchists have a group? They assemble? Wouldn't that defeat the whole purpose?"

(I don't suppose I know how you would handle a distributed meetup, unless IPFS has some facilities I'm unfamiliar with...)


By using a decentralized communication protocol like matrix?

matrix: https://matrix.org/blog/home/


Client: www.riot.im

Not affiliated, just a fan plugging.


> Jeremy Johnson from IPFS will be discussing some of their recent stuff.

Heh, emphasis on "some," since he's only talking for 15min. ;)


Still nice, but 15 mins will barely scratch the surface I'm sure.


Wish I could make that. If someone gets a video, can you please post the link back on this thread?


Previous HN discussion on the IPFS whitepaper and protocol: https://news.ycombinator.com/item?id=16430742

Based on the review at https://muratbuffalo.blogspot.com/2018/02/paper-review-ipfs-...


Can someone explain to me how IPFS is superior to BitTorrent from a usability perspective?

I get all the additional functionality IPFS comes with, but it just feels too complex to get adopted. I can go out right now and start interacting with BitTorrent networks, but I don't even know where to get started with an IPFS system


BitTorrent started out without a DHT, and so requires metadata and a torrent file to start downloading.

IPFS has a single, global DHT as a matter of principle, and so only requires the hash of the file to start downloading. Everyone gets the same bootstrap nodes and then joins the same DHT, with a common lookup mechanism based on Kademlia.
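The core of a Kademlia lookup is small enough to sketch: peers and content get IDs in the same space, and "closeness" is XOR distance (toy illustration, not the real wire protocol):

    import hashlib

    def make_id(name: str) -> int:
        # Peers and content both get IDs in the same 256-bit space.
        return int.from_bytes(hashlib.sha256(name.encode()).digest(), "big")

    peers = [make_id(f"peer-{i}") for i in range(8)]
    target = make_id("hash-of-the-file-you-want")

    # A real lookup iteratively asks the closest known peers for peers
    # even closer to the target, converging in O(log n) hops.
    closest = min(peers, key=lambda p: p ^ target)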

On the other hand, I feel like IPFS is not as advanced as dat in relevant ways. IPFS is not good for file editing, while dat has systems that simplify modification of even large files and merging of structured files.


BitTorrent's peer-to-peer transport layer is not encrypted, and neither was IPFS's last time I checked. Is it now? Has that changed?

There is nothing stopping BitTorrent clients from agreeing that "this DHT will be global", and similarly, what stops IPFS clients/peers from forming a different DHT? Nothing as far as I know, unlike Scuttlebutt, which has a transport-layer network-wide key.

What about trackers? BitTorrent can work without trackers, but not well, so in my view it's not really decentralized, since some peers are more equal than others.

IPFS is meh; datproject is where it's at.


IPFS has perhaps 2,000 times the resources ($) of dat. (Raised via the filecoin sale.) Surely that must count for a little.


Considering how many resources, GitHub stars, and fanboys IPFS has, the network layer is still shit and bandwidth is not utilized to the fullest.

Instead of just reusing BitTorrent's piece-per-peer exchange protocol (that greedy, not-quite-perfect algorithm; I forgot its name), which can saturate even high-end links, they invented Bitswap, full of bugs, then decided to make Filecoin instead, like a bartering engine for exchanging pieces.

datproject can also saturate my 250 Mbit connection.


Thanks for posting. Never heard of DAT before but it looks to be exactly the BitTorrent alternative I've been looking for


Check out hyperdb and hyperdrive (implementations of the core replicable data structures used in dat)!


One more note: dat's implementation is quite modular; the hyper* libraries are designed to replicate in a network-agnostic way.

(Theoretically you can do this with IPFS too, but I love how dat is implemented!)


I'm trying to get a foothold in this space (this space being the area of Camlistore/Perkeep, IPFS, dat, etc.).

I've also been excited that Dat seems to be investing in a Rust implementation, which would presumably make Dat accessible from any language that can interoperate with C/Rust.


Yeah, I'm keeping an eye on the rust implementation of hyperdb!


It might make it count for less, considering the liability it now has to make investors whole


It was an ICO, there are no such liabilities.


It should, but it might end up not.


BitTorrent has infohashes, which allow you to download a torrent merely by entering the hash (not to be confused with magnet links, which may also just point to the infohash).


True, but one big caveat with BitTorrent is that if you change 1 byte in the whole (often multi-file) thing, the info-hash will be different and you'll end up in a new DHT entry all by yourself. With IPFS you'd still share all the other blocks of content with the whole network; only the modified block will be “yours alone”.
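Easy to see with a toy chunker (illustrative only; real chunk sizes and hash formats differ):

    import hashlib

    def block_hashes(data: bytes, block_size: int = 256 * 1024):
        # Content-addressed blocks: each chunk is named by its own hash.
        return [hashlib.sha256(data[i:i + block_size]).hexdigest()
                for i in range(0, len(data), block_size)]

    original = bytes(4 * 1024 * 1024)   # 4 MB of zeros
    modified = bytearray(original)
    modified[0] ^= 0xFF                 # flip a single byte

    # BitTorrent-style identity: the hash over the whole thing changes.
    print(hashlib.sha256(original).hexdigest() ==
          hashlib.sha256(bytes(modified)).hexdigest())    # False

    # IPFS-style identity: only the block containing the change differs.
    a, b = block_hashes(original), block_hashes(bytes(modified))
    print(sum(x != y for x, y in zip(a, b)))              # 1 of 16 blocks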

Where this happens to be particularly annoying with BitTorrent is aggregate torrents: you have a bunch of separate files available for download (think TV series or books), each forming its own DHT entry. Then somebody decides to create an aggregate torrent (such as libgen's “1000 books” torrents) and offers that for convenience or efficiency reasons. Will the clients that previously downloaded the individual files share them with this new torrent? No! Because the new torrent “is different”, even if each of its files is byte-by-byte identical.

Could BitTorrent be upgraded to allow for this? Probably, but it'd be a radical paradigm shift requiring a redesign of everything (“BitTorrent 2.0”). So why not start from scratch with a new protocol entirely?


There are actually several protocol updates in the works (and some fundamental work, like pubsub, has been finished) to enable torrents that can be upgraded; it doesn't require a radical paradigm shift at all.

A protocol that does what torrents do, but better, would be dat://; IPFS is terrible at being "BitTorrent but better".


I've started using the `ipfs add` command to share files that are too big to email. The users can just download the file from a gateway like ipfs.io without installing anything, and the file is only hosted on my laptop.
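The whole flow is roughly this (assumes the ipfs CLI is installed and the daemon is running; -Q prints only the final hash):

    import subprocess

    # Add the file to my local node and capture its content hash.
    cid = subprocess.check_output(
        ["ipfs", "add", "-Q", "big-file.tar.gz"]
    ).decode().strip()

    # Anyone can fetch it through a public gateway without installing
    # anything - as long as my laptop (the only host) stays online.
    print(f"https://ipfs.io/ipfs/{cid}")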


This should get you started: https://docs.ipfs.io/introduction/usage/

It's quick and easy. I recommend the browser extensions.


Excellent resource, thank you for sharing.


https://d.tube is probably the best open example of IPFS that's fairly easy to use.

Otherwise, it's first and foremost for geeks; then other geeks implement apps for the normals using the technology.


We (qri) have built a front-end webapp (free & open source) for publishing, sharing, and versioning datasets on IPFS. When you download qri, you're working on IPFS (distributed web).

Still very geeky, but we're trying to make sure it's very simple to use for data scientists/analysts/researchers alike.

here's more on us: https://qri.io & https://github.com/qri-io


D.Tube actually runs on top of the Steem blockchain and uses IPFS as a filesystem.

This is similar to how Tron will work.


I know that, but you don't need Steem to view videos on the site, so it's a good, accessible example of an easy-to-use site built on IPFS.



IPFS is like if BitTorrent had a single swarm and the hash chose a set of files to download from that.


They're not mutually exclusive. There is a popular private tracker that uses IPFS to host screenshot previews, while serving the actual videos via torrent.


There is a single global swarm.


I'm much more familiar with Bittorrent than IPFS, but as I understand it IPFS treats each object as its own swarm. This is quite the opposite of having a single global swarm. In Bittorrent terms it's as if each piece and the torrent itself are all separate swarms.

What makes IPFS more global is that they have defined a set of standards for linking objects together to form potentially complex structures. The holy grail is to scale this up to create a whole new World Wide Web based on linked IPFS objects rather than links to resources on specific hosts.
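The linking model is essentially a Merkle DAG: an object's hash covers its data plus the hashes of everything it links to, so one root hash pins down a whole tree. A toy version (real IPFS objects use protobuf/IPLD encodings, not JSON):

    import hashlib, json

    store = {}

    def put_node(data: str, links: list) -> str:
        # Identity covers data AND links, so the root hash of a site
        # changes whenever anything anywhere beneath it changes.
        node = json.dumps({"data": data, "links": links})
        h = hashlib.sha256(node.encode()).hexdigest()
        store[h] = node
        return h

    page = put_node("<html>...</html>", [])
    image = put_node("...png bytes...", [])
    site = put_node("my site", [page, image])  # one hash names it all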


It's still not clear to me if it's true, but my impression from reading the whitepaper a couple of years back was that you could fetch hash blocks for your target file from any arbitrary node that happens to be serving them, regardless of what the original target was. E.g., a block composed purely of white pixels could be equally shared across thousands of GIFs, independent of the original file.

Which, if true, would make it more of a global swarm.

Oddly, it would also imply you could potentially generate entirely new files out of the network; in an infinitely large network, you wouldn't even have to specifically upload a new file — the hashtree would be enough, and the network would find the pieces to compose it.

Never got anyone to confirm or deny my understanding of it, and it's not clear to me why it wouldn't work that way.
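If blocks really are stored and requested by their own hashes, that de-duplication falls out for free. Toy block store (nothing like IPFS's actual on-disk format):

    import hashlib

    blocks = {}  # hash -> bytes, shared across every file

    def add_file(data: bytes, block_size: int = 5):
        # A "file" is just an ordered list of block hashes.
        hashes = []
        for i in range(0, len(data), block_size):
            chunk = data[i:i + block_size]
            h = hashlib.sha256(chunk).hexdigest()
            blocks[h] = chunk  # identical chunks collapse into one entry
            hashes.append(h)
        return hashes

    gif_a = add_file(b"AAAAA" * 2 + b"dogs!")  # 3 blocks
    gif_b = add_file(b"AAAAA" * 2 + b"cats!")  # 3 blocks

    # The repeated "all-A" block is stored once and served to both files.
    print(len(blocks))  # 3 unique blocks, not 6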


Given that there would never be an infinitely large network, how likely is it that this sort of de-duplication would offer benefits? We use hashes to ensure data integrity because if a single bit is changed in the data the hash will reliably change. GIF compression already reduces "256 white pixels in a row" to a few bytes (other compression algorithms aren't so straightforward). Even with all the white gifs people are sharing online, you'd need vanishingly small block sizes for this strategy to achieve anything measurable, approaching the point where the hash doesn't just represent the block, but is the block.


It's more that it's not clear to me why you wouldn't implement it in this fashion; my impression from the paper was that it was implemented in this fashion, and as a side effect of that implementation, this kind of de-duplication would be possible somewhat by accident.

More specifically, it's not clear to me what benefit exists to associating a block with its originating file; my understanding of BitTorrent requires you to receive/send the total list of blocks anyway (the .torrent file), as you're requesting the missing blocks (and at the start, that's all blocks), and others are responding with the blocks they have to offer (which may not be all blocks, since you can request from partially-finished peers, a.k.a. leechers).

In IPFS, you don't have an associated server (a tracker) to tell you who your peers are for a given file, so first you'll ask around for the list of blocks required for that file, and then you'll ask anyone who responded for your missing blocks, same as BitTorrent. But you'll have to keep asking for new peers, since you aren't being updated by a server, while sending out the list of missing blocks all the same.

It seems to me that there's not much benefit to requesting specifically the file, and then the blocks, rather than just requesting the blocks directly, when doing that updated request.

And if you're asking for the blocks directly... then that kind of de-duplication would be a natural outcome.

Also, a spot where it might actually be impactful today: memes.

And if it's truly an interplanetary mindset, there might come a point long into the future where such a feature would be an actual feature, rather than an accident. But regardless, I only mean it as an implication of how I believe IPFS is implemented, not an intended benefit/feature of it. I specifically mean it as an easy way to point out where my understanding failed, if I'm wrong; or an easy way to verify my understanding is correct ;)


I wrote the original data exchange (bitswap) code. Your first paragraph is true.


> Oddly, it would also imply you could potentially generate entirely new files out of the network; in an infinitely large network, you wouldn't even have to specifically upload a new file — the hashtree would be enough, and the network would find the pieces to compose it

It's vanishingly unlikely that this would yield useful results in practice.

In a moderately sized network, however, if only a few contiguous bytes in a GB file change, then only a single block would need to be uploaded (assuming the original file is sufficiently available already).



Are there any easy-to-use clients that interface with IPFS?

Is it possible to put a site on it that does any kind of computation that is not frontend JS or that stores state in some fashion?


We (qri) have built a front-end webapp (free & open source) for publishing, sharing, and versioning datasets on IPFS. When you download qri, you're working on IPFS (distributed web).

here's more on us: https://qri.io & https://github.com/qri-io


My excitement about IPFS is tempered by this issue:

https://github.com/ipfs/go-ipfs/issues/3429

I was running IPFS on a VPS last year serving no content, and it was consuming about half a gig a day in bandwidth.

Is this a fundamental issue with the underlying concept?



