This work lacks grounding because the measure of interest, Alexa rank, is not situated in measures that are familiar.
The author, dmor, is not familiar with the semantics of the Alexa rank. Nor are most readers. We know loosely that low rank is good. But what does going from rank 1000 to 100 really mean? How hard is that? Is it a traffic increase of 10x? 100x? What exponential base does it follow?
So this chart shows the YC companies that have the largest Alexa rank delta. Frankly, the rankings look plain wrong to me, based upon what I know of these companies.
Other commentators have suggested that, given the power law distribution, we actually care about the delta log-rank. A priori, I would agree. But then we get a new ranked list, and we still don't know if we're actually learning something or our log-metric is messed up. That's actually quite pernicious: if the results look vaguely correct at a coarse level, we trust them at a fine-grained level and don't realize that the methodology and results are wrong.
We're all just blowing hot air because we don't really know what Alexa ranks mean. Only SEOs who work with Alexa ranks frequently, understand the undocumented warts, and have a gut sense of what they mean, can interpret this.
But right now, we really have no point of reference and the table leaves us without having gained any insight.
Perhaps I'm actually happy that the results are so clearly wrong and not helpful. At least that way, no one will trust them. It would be much more insidious if the results were commonly thought to be instructive, but in fact were misleading.
While I understand your point, I disagree with the conclusion. Just because a system is complicated doesn't mean that we should give up. You create a first-order approximation and then iterate (very much like what dmor seems to be trying to do). If you know something about how the Alexa ranking system works that we do not, then by all means share it. Simply saying that the author does not know and therefore all conclusions are useless is unhelpful and probably incorrect. To be honest, this reminds me of the elderly who don't understand computers and just give up without trying.
When you don't have deep (insider) access to key metrics, you have to rely on other indicators of success. Alexa provides one piece of data that is semi-meaningful: if nothing else, I'd wager it's roughly correlated (on average) with market cap and/or exit price. (dmor: this would be a cool graph!)
Plus, these companies are in drastically different markets. Their key metrics are probably quite different, or (at the very least) not directly comparable to growth in value / profitability.
> Alexa provides one piece of data that is semi-meaningful:
That's my point. This measure is semi-meaningful.
How meaningful is it? Very? Somewhat? Well, we don't know.
The way you determine how meaningful it is, is by connecting the dots with another measure. For example, we have this measure that is cheap to acquire but perhaps inaccurate (Alexa). Can we connect it with another measure that is harder to acquire but more accurate (e.g. exit value)?
> if nothing else, I'd wager it's roughly correlated (on average) with market cap and/or exit price. (dmor: this would be a cool graph!)
I would wager that too. But we don't know until you run the numbers.
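One way to "run the numbers" is a rank correlation between the cheap measure and the expensive one. The following is only a minimal sketch: the Alexa ranks and exit values are made-up placeholders, not real company data, and Spearman's rho is one reasonable choice of statistic among several.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks of values, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def spearman(xs, ys):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(ranks(xs), ranks(ys))


# HYPOTHETICAL data, for illustration only.
alexa_rank = [150, 2_000, 9_500, 40_000, 250_000]
exit_value_musd = [900.0, 120.0, 200.0, 15.0, 5.0]

rho = spearman(alexa_rank, exit_value_musd)
print(f"Spearman rho = {rho:.2f}")
```

A strongly negative rho would support the wager, since a lower Alexa rank number means more traffic. With only a handful of companies the estimate is noisy, so a real analysis would need many more exits.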
Grounding your measures is what separates cool hacks from data you can actually draw meaningful inferences from. I think dmor is trying to do something real here, which is why I think it useful to help her actually push the ball forward and really make something much more valuable.
I agree that exit value / market cap is a good auxiliary measure. I think I also suggested this in another comment.
It sounds like the value may be in who is on the list and growing. Maybe the order is not as important as the fact that these are all companies worth watching?
Let me say first of all that I have great respect for the way you have engaged with criticism on this post and on your previous ones. Even when I disagree with your approach or whatever, I appreciate your curiosity and eagerness to learn.
Some ways that you could make weaker claims but with more confidence in your results:
* Have an unordered list of companies that are growing. This was your suggestion. I don't know how interesting that is.
* Group companies into five or ten buckets, based upon Alexa ranks (e.g. 1 = low-traffic bucket, 5 = high-traffic bucket). Find the top three movers-and-shakers within each bucket. This makes the weaker assumption that Alexa rank deltas are comparable within each bucket, as opposed to your original assumption that they are comparable across the entire spectrum of Alexa ranks.
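The bucketing idea could be sketched roughly as follows. This is an illustration under assumptions: the site names and ranks are hypothetical, and bucketing by order of magnitude of the previous rank is just one possible way to choose the buckets.

```python
from collections import defaultdict


def magnitude_bucket(rank):
    """Bucket by order of magnitude: 1-9 -> 0, 10-99 -> 1, 100-999 -> 2, ..."""
    return len(str(rank)) - 1


def top_movers(sites, top_n=3):
    """Return the top_n biggest rank improvements within each bucket.

    sites is an iterable of (name, previous_rank, current_rank).
    Sites whose rank did not improve are skipped.
    """
    buckets = defaultdict(list)
    for name, prev_rank, curr_rank in sites:
        improvement = prev_rank - curr_rank  # positive means the rank got better
        if improvement > 0:
            buckets[magnitude_bucket(prev_rank)].append((improvement, name))
    return {
        bucket: [name for _, name in sorted(entries, reverse=True)[:top_n]]
        for bucket, entries in sorted(buckets.items())
    }


# HYPOTHETICAL sites and ranks, for illustration only.
sites = [
    ("site_a", 900, 600),
    ("site_b", 5_000, 3_000),
    ("site_c", 8_000, 7_500),
    ("site_d", 250_000, 170_000),
    ("site_e", 120_000, 118_000),
    ("site_f", 4_000, 4_200),  # rank got worse, so it is excluded
]

movers = top_movers(sites)
print(movers)
```

Deltas are only compared between sites that started in the same bucket, so a jump of 80,000 places in the 100,000s never competes with a jump of 300 places in the 100s.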
There are a handful of other things you can do that are more complicated. For example, you could correlate Alexa ranks with Compete unique visitor estimates or some other number (such as the company's exit value). Compete estimates are also biased, but at least people have better intuition of what unique visitor numbers mean.
Thank you, especially for hearing my earnestness through all the noise - I'm glad it comes through. Without revealing too much, I can tell you that I have a much varied and interesting data set to use to rank companies for the next full YC index. I expect there will be feedback on the method for that one too, but with every iteration it is getting massively better.
Crazy idea: JavaScript that startups can put on their site to report their actual data to me?
I basically agree with this, putting an order on this list might have irked some people, but a list of companies to follow in no particular order might have worked better.
Comparing absolute change in Alexa rank seems fundamentally broken to me. It weights massively to early companies.
Snipshot's rank rose by nearly 80,000 - that's undeniably good, but 37 of the companies on the list had a previous rank of less than 80,000...
I would suggest comparing the delta of the log of rank, this gives a more interesting (to me) metric...
For example this bumps Newsblur up from a "meh" 23 to 1! And AnyPerk from an exciting 6 down to a not so cool (unless you like catches) 22 (not picking on AnyPerk, it's just the most demoted of the original top 10)
Under this change of methodology the biggest winner is WorkFlowy, from 61 to 24; the biggest losers are Circle (39 to 67), FundersClub (40 to 68), and Cloudant (44 to 72).
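To make the difference between the two orderings concrete, here is a small sketch comparing absolute rank delta with delta of log rank. The three sites and their ranks are hypothetical placeholders, not rows from the spreadsheet.

```python
import math


def abs_delta(prev_rank, curr_rank):
    """Absolute improvement in rank (positive means better)."""
    return prev_rank - curr_rank


def log_delta(prev_rank, curr_rank):
    """Improvement in log10(rank), i.e. log10(prev_rank / curr_rank)."""
    return math.log10(prev_rank) - math.log10(curr_rank)


# HYPOTHETICAL (previous_rank, current_rank) pairs, for illustration only.
sites = {
    "site_a": (300_000, 220_000),  # big absolute move, modest ratio
    "site_b": (2_000, 1_000),      # small absolute move, rank halved
    "site_c": (50_000, 40_000),
}

by_abs = sorted(sites, key=lambda s: abs_delta(*sites[s]), reverse=True)
by_log = sorted(sites, key=lambda s: log_delta(*sites[s]), reverse=True)

print("by absolute delta:", by_abs)
print("by log delta:     ", by_log)
```

The absolute ordering puts the low-traffic site first, while the log ordering rewards the site that halved its rank. Neither is validated against real traffic here; the log version only encodes the assumption that ratios of ranks matter more than differences.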
Presumably, the sites on Alexa follow a power law (based on traffic), so deltas on log-rank would indeed make a lot of sense. Of course... the data itself is quite interesting. Mad props to Danielle for compiling all of it -- this makes running our own quick calcs that much easier.
I considered it but was on the fence so I wasn't going to use log this first time, but it sounds like that really might be the best approach. Give me a minute and I'll make a spreadsheet you guys can check out to see if it makes more sense.
App store rank to download volume follows a power curve almost exactly, I think it's safe to assume similar here. I was going to make the same recommendation as beambot.
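If one does assume a power law, traffic proportional to rank^(-alpha), then a change in rank translates into a traffic multiplier of (old_rank / new_rank)^alpha. The exponent alpha for Alexa is not known from anything in this thread, so the values below are assumptions chosen only to show how much the answer depends on it.

```python
def traffic_multiplier(old_rank, new_rank, alpha):
    """Traffic ratio implied by moving from old_rank to new_rank,
    ASSUMING traffic is proportional to rank ** (-alpha)."""
    return (old_rank / new_rank) ** alpha


# alpha values are assumptions, not measured properties of Alexa.
for alpha in (0.5, 1.0, 2.0):
    multiplier = traffic_multiplier(1000, 100, alpha)
    print(f"alpha={alpha}: rank 1000 -> 100 implies {multiplier:.1f}x traffic")
```

Under alpha = 1 (classic Zipf), going from rank 1000 to rank 100 means 10x the traffic; under alpha = 2 it means 100x. Without an estimate of alpha, the log-rank delta tells you the direction and rough size of a change but not the traffic multiple.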
I think that it's both. Current traffic is like position and the delta is velocity. To know who is bigger now, the velocity is relatively unimportant. To try and guess who is going to be bigger next year, current position is less important.
This is cool and all, but you probably want to compare April of this year to April of last year (and previous years). My company (Virtualmin, 76 on the list) has pretty big fluctuations throughout the year, with early months (specifically January) being the best, and December being the worst. So, if you were to compare December to January for us, you'd see a huge spike upward...but if you compared January to February, you'd see a small drop downward (and that trend might even continue throughout the year...though it seems to not be happening that way between March and April).
I'm just saying this is a pretty limited view of these sites, and it'd be difficult to draw useful conclusions from it. Our traffic probably isn't growing faster than more recent YC companies, even if we show up above them on the list. We're a pretty stable entity at this point. Not to say I didn't get a kick out of seeing our numbers getting better at a pretty rapid clip, even knowing more detail about what our traffic actually looks like.
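The seasonality point can be illustrated with a sketch that compares a month against the same month a year earlier instead of against the previous month. The monthly ranks below are hypothetical, not Virtualmin's actual numbers, and the rank ratio is just one simple way to express the change.

```python
def rank_ratio(ranks_by_month, earlier, later):
    """Return earlier_rank / later_rank (above 1.0 means the rank improved),
    or None if either snapshot is missing.

    ranks_by_month maps (year, month) -> Alexa rank.
    """
    prev = ranks_by_month.get(earlier)
    curr = ranks_by_month.get(later)
    if prev is None or curr is None:
        return None
    return prev / curr


# HYPOTHETICAL snapshots for one seasonal site: strong Januaries, weak Decembers.
ranks_by_month = {
    (2012, 1): 42_000,
    (2012, 12): 60_000,
    (2013, 1): 40_000,
}

month_over_month = rank_ratio(ranks_by_month, (2012, 12), (2013, 1))
year_over_year = rank_ratio(ranks_by_month, (2012, 1), (2013, 1))

print(f"Dec -> Jan: {month_over_month:.2f}")
print(f"Jan -> Jan: {year_over_year:.2f}")
```

In this made-up case the December-to-January comparison looks like a big jump, while January-to-January shows only a small improvement, which is the distortion described above.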
I absolutely agree with you regarding seasonality, I simply don't have a snapshot from a year ago. But a year from now I will be able to do this now that I am collecting the data each month.
Thanks for putting this together.
Also, if you can't get your hands on the data from 2012, maybe take each company's launch date and its current Alexa position and average the change to see its Alexa growth over its lifespan. That would be really nice to see.
I don't think growth in Alexa rank is a good metric, because it does not measure growth as typically defined (in %age increase in raw traffic). Unfortunately that makes the entire table mostly meaningless.
I don't see how traction relates to traffic. Traction relates to financials. Traffic relates to momentum, which is more of a measure of visitors, and not of growth. Not a bad data set per se, either. I'd love to see this type of approach done with each company's growth, but I am sure most will shy away, including the YC founders.
Anyone can pop up their Alexa rank in the short term, but long term you still need retention, which equals traction. Typically, the higher the rank, the better you're doing.
No, that is not a truism. It is a falsism if it is any kind of ism.
Instagram has yet to make a single dollar, for an extreme example, and yet nobody would dispute that it has gained insane amounts of traction in the mobile photo sharing space since launch. I would use Facebook as an example, but they had small revenue-generating efforts early on, so it doesn't make as clean an example.
Not necessarily; in fact, that was one of the fallacies of the dot-com bubble. Revenue is traction if it is gained at a reasonable cost. There are definitely unit economics that look like growth but come at an unsustainable cost.
I think traffic for lists like these is something of a red herring. Small fluctuations for low-traffic sites are almost entirely a result of randomness or sampling error.
A better metric would be to divide the companies by traffic into nontrivial traffic (Alexa rank < 10,000) and trivial traffic for the rest. It would require some more research, but it would also be very helpful to separate consumer vs. B2B companies and only compare consumer companies on traffic. Many of the companies on that list (including my company, MixRank) appear to be doing quite poorly in terms of traffic until one realizes that each visitor to a B2B product is worth orders of magnitude more than a visitor to a consumer product.
The problem is that the results are relative. One way would be to pick a keyword with a relatively high search volume which could be used as a baseline. Another issue that I can think of is that companies named after common English words, e.g. Pebble, will have their results artificially inflated. This process might be hard or tricky to automate.
Growth in followers/likes/tweets/+1 counts could be used as a measure of traction.
I like most of your lists and creativity, but this one isn't very helpful and it feels like you're reaching. After a site cracks the Alexa top 10,000, it becomes a lot harder to move up and should be worth a lot more on rankings.
Since everyone else seems to be finding a reason to be critical, I'll just say: I found this really interesting, and I'm glad you presented the data this way so that I could make my own analysis. It wasn't to say that #1 on the list had made the most progress (although who that was was surprising) but rather that these are the companies from YC that are growing in terms of traffic. Make of it what you will.
Quite interesting that Dropbox jumped an entire spot in a month for a non-content based site.
Can you sort by the delta in monthly pageviews? Absolute change in rank offers little meaningful information. (I would assume it is much easier to go from 10,000 to 5,000 than from 200 to 100.)
And going from 300,000+ to 100,000 is orders of magnitude easier than going from 10,000 to 5,000.
#1 on this list went from 2xx,xxx to 2xx,xxx - that could seriously be just noise in the Alexa ranking methodology. That site's traffic could have even gone down slightly and still produced that result.
Alexa rank is not accurate when it's higher than 20,000. If it's higher than 100,000, it is not accurate at all and can be gamed easily. Just get a few friends with the Alexa toolbar to visit your site and your Alexa rank will magically get a whole lot lower.
It would be nice to see a second tab on that spreadsheet, without any filtering. In particular, I'm curious to see where reddit would rank on that list. It seems arbitrary and incorrect to consider them less of a startup than, say, DropBox.
I feel pretty good about this given that our biggest source of traffic is the Google Chrome Web Store and use of our Chrome application doesn't show up in Alexa rankings. We're faring way, way better than Alexa would have you believe.
9gag's global traffic rank dropped from #328 to #329 in the sample period. The list does not include companies whose traffic went down, even if they are in the top 250,000 of all sites globally. I included an explanation of this in the post.
These ranks are so high that any metric based on them must surely have a lot of noise. I guess that doesn't prevent dmor's voting ring from surfacing this post, though.
These are great and I look forward to you iterating on this - but I fear your over-reliance on Alexa is producing semi-worthless results. I can go into all of the reasons why Alexa data is unreliable but I assume you already know (the big one is selection bias based on who has the toolbar installed).
If you combine Alexa data w/ data from Quantcast, Statcounter and others, it would probably be valuable. But even Alexa themselves would tell you their dataset has massive holes in it when trying to use it as the foundation for this kind of thing.
I hear you, and for the startup index we are now collecting 9 data points and v1 of the new one is a labor of love that I will hopefully have out in the next week or so. Iterating like a mofo :)