1. If you're only interested in saving bandwidth and don't care about cache hits from overlapping with other sites, maybe you can host static content somewhere free (GitHub Pages?) or even just set a long cache header (ensure version numbers in filenames, cache for > 1 month) since presumably you're going to serve them the first time before the user has answered anyway?
2. I'm thinking of putting a "Click to load comments" box in place of Disqus on my blog so nothing gets loaded unless the user clicks. Seems better than bothering the user up-front.
3. I use Google Analytics - I figure it's common enough that if people don't like that, they'll already have it blocked, so there isn't really any additional tracking they won't want (unless the Twitter timeline widget is tracking, which it might be, but I suspect I'll remove it soon anyway).
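On point 1, the "version numbers in filenames plus a long cache header" idea could look roughly like the sketch below. It is only an illustration: the helper names (`versioned_name`, `cache_control`) are made up here, it uses a content hash in place of a hand-maintained version number, and the one-year lifetime is just one choice that satisfies "> 1 month".

```python
import hashlib
from pathlib import PurePosixPath

# Assumption: one year, comfortably above the "> 1 month" suggested above.
ONE_YEAR_SECONDS = 365 * 24 * 60 * 60


def versioned_name(filename: str, content: bytes, digest_len: int = 8) -> str:
    """Return the filename with a short content hash inserted before the extension.

    Any change to the file content changes the name, so the old copy can be
    cached for a long time without ever serving stale content.
    """
    digest = hashlib.sha256(content).hexdigest()[:digest_len]
    path = PurePosixPath(filename)
    return str(path.with_name(f"{path.stem}.{digest}{path.suffix}"))


def cache_control(max_age: int = ONE_YEAR_SECONDS) -> str:
    """Build a Cache-Control header value for assets whose names never get reused."""
    return f"public, max-age={max_age}, immutable"
```

The page then references the hashed name (something like `js/site.<hash>.js`), and the server or host sends the header value from `cache_control()` for those files only, not for the HTML itself.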
1. That's a possibility - though any time you're sending off to a third party for content, there's no way of knowing what they're doing around cookies and browser fingerprints across their properties. A step up from running scripts loaded from those sites though. And yeah, the default will be serving the content locally until explicit consent is received.
2. I like that idea. I also kind of like the idea of just not using comments - when I used Disqus years ago it was mostly spam - but I think I want to try again and see if it's worth it.
3. Also a good point, but that only accounts for those people who are aware of the tracking as a point of concern. Given that the blog will be technical with a personal bent and vice-versa, one or two of my ten readers may not be aware of tracking as a thing :)
On this front, though, I'm probably just going to start with log analysis tools. It's really the only way to get a fully accurate picture across visitors (server-side logging can't be blocked, but GA and even self-hosted data gathering can), and I don't really care too much about the additional info that analytics can provide.
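For a sense of how little is needed to get basic numbers out of server logs, here is a minimal sketch that counts successful page views per path. It assumes the server writes the common/combined access log format (as Apache and nginx do by default); the `page_views` function and the regex are illustrative, not taken from any particular log tool.

```python
import re
from collections import Counter
from typing import Iterable

# Assumption: common/combined log format, e.g.
# 203.0.113.5 - - [10/Oct/2016:13:55:36 +0000] "GET /posts/hello/ HTTP/1.1" 200 2326
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+'
)


def page_views(lines: Iterable[str]) -> Counter:
    """Count successful GET requests per path, skipping lines that don't parse."""
    counts: Counter = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and match["method"] == "GET" and match["status"] == "200":
            counts[match["path"]] += 1
    return counts
```

Usage would be something like `page_views(open("access.log"))` followed by `.most_common(10)`; filtering out bots and static assets is left out here and would matter for real numbers.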
2. I use Disqus and don't actually get much spam (maybe 1 spam post every 6 months, and it always gets flagged by Disqus) but there is often useful stuff in the comments. I think my blog would be much worse without the comments (and I wouldn't get the occasional "Thanks!" comments, which help me know that my posts aren't useless) :-)
Given the option, I would probably also just parse logs - I don't think Analytics is adding much on top of that; I just don't have that option using GH Pages. The reason I moved from AppEngine to GitHub was to stop messing with the code for my blog in an attempt to make me write more posts instead! =D
1. Serving from GitHub still shares the tracking information. It can be argued that GitHub is better than CloudFlare/Facebook; however, bear in mind GitHub has politically motivated staff. Long cache is a great idea. Alternatively, cut out unnecessary JS.
2. Nice idea, it does hamper the ease of use of your blog though - I would never click to view, though I did read some that were visible when I finished the article.
3. Do you find the information from this useful? In a way that isn't trivially parsable from server logs? I ask because we are reviewing the quality of our user analytics, and our GA seems rather pointless atm.
1. Good point; I'm not really sure where I was going with this now; GitHub and another CDN are basically the same. I must've been distracted while replying!
2. Yeah, it's not ideal. In this case, it looks like Disqus are gonna fix stuff though (they've commented on my post; there's a link right at the top of the article now).
3. I don't have access to the server logs as I'm running on GitHub Pages, so something like Analytics is all I have. I do find it useful (given no server logs), it's nice to see the traffic to my blog; there's no point posting if nobody is reading! :-)
3. That is very interesting. Now knowing your stack (Pages + Disqus + adverts), I see one side of the 'problem' is that bloggers don't have much choice in terms of revenue, so the infrastructure gets paid for with user data. The other side is likely the complexity, incompatibility, and time-wasting of home-rolled solutions.
The really nice part of a CDN deployed blog is handling the traffic spikes though.
They receive a large amount of internet traffic and have the potential ability to fingerprint users and subvert privacy protections. AFAIK they don't do anything malicious, but I don't know they don't.
In fact I would say CloudFlare are better than both GitHub and Facebook, and I am only wary of them because of their position of power and the potential they have (i.e. they are a victim of their own success). Both Facebook and GitHub have shown themselves to make political decisions at the expense of their users.
Depends on the definition of wrong! CloudFlare is a bit of an HN darling thanks to their employees' active contributions and submitting every technical post on their blog. Free distributed DNS and potential DDoS protection is also a tempting offer.
To privacy-conscious users: CloudFlare is the man-in-the-middle for more and more of the Internet, potentially tracking at Google-like levels.
CloudFlare may: ... Add script to your pages to, for example, add services, Apps, or perform additional performance tracking. (Unfortunately this is opt-out rather than opt-in.)
To Tor users: CloudFlare implements a captcha to protect servers from malicious traffic; the implementation has caused tremendous annoyance in the past and the company may have been slow to address this problem.
To CloudFlare customers: CloudFlare has a "target on its back" and has faltered against DDoS in the past, causing outages for all of its customers. AFAIK: It's been a while.
To CloudFlare freeloaders like me: CloudFlare doesn't have much incentive to protect its free-tier users from DDoS.
Ah, thank you for the detailed reply. I started using CF more extensively yesterday, due to their free CDN (which is working great), but I agree that their MITMing the internet is worrisome. Maybe I should switch to MaxMind, if it's cheaper than CloudFront.
Like Ghostery, it is important to be aware of the cons but I'm still using CloudFlare.
In my book CloudFront easily ranks ahead of had-been "do no evil" Google's irrevocably merging its entire history on me ex post facto. https://news.ycombinator.com/item?id=12760003
2. This sounds off to me. Imagine if a restaurant's menu said they don't know where their ingredients come from or what they may actually consist of - that's probably true a lot of the time, but it makes the customer wonder why the restaurant brings it up but doesn't do anything about it...