Is there any way to extract movie ratings vs time from your data-set? I've subjectively noticed a few patterns in movie ratings, and would love to see quantitative confirmation.
1. Rating quality tends to spike at or shortly after a film's release, but then drops off gradually as the glow of the PR campaign wears off, or perhaps because people who go to see movies in their opening week are more predisposed to like them (e.g. fanboys, etc.). I usually expect a film that has just been released to be similar in quality to a film that's been on home video for a while with a rating half a point or more lower. (I am ashamed to admit that I watch enough movies that I can usually peg a film's IMDb rating to within a quarter of a point, independently of how much I personally enjoyed it.)
2. Rating quality for older films tends to be held back by poor-quality home video releases and often improves significantly, but gradually, after quality transfers are released. For example, I distinctly remember seeing Zulu several years ago and thinking, "Gee, that film was way better than the IMDb rating!" Back then, it was in the high 6 range. Now it's up to 7.8!
I suspect IMDb does not keep track of the times associated with votes, or at least does not provide that data publicly, so the best you could do would be to crawl IMDb periodically. It shouldn't take more than a year's worth of data to see if point #1 holds up. #2 is a lot more difficult because the effect is more gradual, and you'd need to start bringing in other data sources, like audio/video quality ratings from home video review sites, which tend to be somewhat unreliable. I find #2 the more interesting question, though. If you found any other kind of correlation with titles whose ratings improve substantially after quality home video releases, you would potentially have found a way to identify under-appreciated films. If you pulled that off, in addition to discovering some good flicks to watch, companies like Kino and Criterion would probably start knocking at your door!
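To make point #1 concrete, here is a rough sketch of how the check could look once you had crawled data. Everything in it is an assumption for illustration: the function name `release_drift`, the 30-day and 180-day windows, and the input shape (a list of `(days_since_release, rating)` pairs for one title). It just compares the average rating shortly after release with the average much later.

```python
from statistics import mean

def release_drift(observations, early_days=30, late_days=180):
    """Return mean late rating minus mean early rating for one title.

    observations: list of (days_since_release, rating) pairs from a crawl.
    early_days / late_days: hypothetical window boundaries, not from any source.
    Returns None when either window has no observations.
    """
    early = [rating for days, rating in observations if 0 <= days <= early_days]
    late = [rating for days, rating in observations if days >= late_days]
    if not early or not late:
        return None
    return mean(late) - mean(early)

# Made-up observations for a single title, for illustration only.
observations = [(7, 8.0), (21, 7.8), (200, 7.4), (300, 7.2)]
drift = release_drift(observations)
print(drift)
```

A negative drift for most new releases would support the "opening-week glow" idea; averaging the drift over many titles would be the obvious next step.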
> Is there any way to extract movie ratings vs time from your data-set? I've subjectively noticed a few patterns in movie ratings, and would love to see quantitative confirmation.
I'm not the OP, but I'm familiar with the data IMDb provides. The short answer to your question is: not easily.
IMDb provides a plain text dump of a subset of their data here: ftp://ftp.fu-berlin.de/pub/misc/movies/database/ . It's (usually) updated once a week, so to get temporal data, I guess you'd have to track it yourself.
However, diffs are provided for each data update: ftp://ftp.fu-berlin.de/pub/misc/movies/database/diffs/
It looks like this could give you temporal data at the granularity of once per week.
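As a minimal sketch of what tracking it yourself might look like: the code below assumes you have already reconstructed one ratings snapshot per week, and that each line has the layout `<distribution> <votes> <rating> <title>`. That layout, the helper names `parse_snapshot` and `build_series`, and the sample lines are all my assumptions for illustration; check them against the real files before relying on this.

```python
import re
from collections import defaultdict

# Assumed line layout: 10-character vote distribution, vote count, rating, title.
LINE_RE = re.compile(r"^\s*([0-9.*]{10})\s+(\d+)\s+(\d+\.\d)\s+(.+?)\s*$")

def parse_snapshot(text):
    """Return {title: (votes, rating)} for the text of one snapshot."""
    rows = {}
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:  # skip headers and anything else that does not fit the layout
            _, votes, rating, title = match.groups()
            rows[title] = (int(votes), float(rating))
    return rows

def build_series(snapshots):
    """snapshots: {iso_date: snapshot_text}.

    Return {title: [(date, votes, rating), ...]} ordered by date.
    """
    series = defaultdict(list)
    for date in sorted(snapshots):  # ISO dates sort chronologically as strings
        for title, (votes, rating) in parse_snapshot(snapshots[date]).items():
            series[title].append((date, votes, rating))
    return dict(series)

# Made-up sample lines, for illustration only.
snapshots = {
    "2010-01-08": "      0000001222   12000   6.9  Zulu (1964)\n",
    "2010-01-01": "      0000001222   11900   6.8  Zulu (1964)\n",
}
series = build_series(snapshots)
print(series["Zulu (1964)"])
```

Keeping the vote count next to the rating is deliberate: a rating that moves while the vote count barely changes probably means something different from one that moves because a lot of new voters arrived.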
1. The timestamps are only years, so I can't use the information IMDb provides for this kind of analysis, at least. It would be quite interesting, though. Another interesting question is how the time of year a movie is released affects its overall success.
2. That is a good suggestion, but I think we would need more votes and, more importantly, a set of users whose votes could be taken as a reliable basis for judging a movie's quality.