Open Cricket is a new project by Devanshu Mehta.
- Short term: build better tools to collect and analyze data from openly available databases such as statsguru.
- Long term: build a free and open database of cricket statistics that is maintained by the community and owned by no one.
Everything generated by this project will be released under a free and open license, such as GPLv2 or Creative Commons, so that there are few restrictions on reuse.
For the moment, it's just Devanshu Mehta. I am a cricket blogger, hacker, and a believer in a simple fact: our culture should not be locked up in private silos. More on that in a moment.
In short order, I hope that Open Cricket will be more than just me. I hope to grow it into a self-sustaining community of people who have a head for statistics and some technical skill that can be applied to building tools that anyone can use.
We want to jump-start amateur cricket statistical analysis. At times, the statistical analysis of cricket seems to be a bigger pastime than cricket itself. Firing up statsguru to settle a debate, or to start one, is a practice older than statsguru itself. In another time, in another place, I used to pull out old copies of Wisden from the British Library in Ahmedabad.
While the internet has democratized statistical information and analysis, the data is still held in a few places, with tight controls on what questions you can ask. Don't get me wrong, I love statsguru. And if statsguru is all you ever need, and you count on ESPN maintaining it forever, then godspeed.
But there are so many questions that statsguru does not even let you ask. And even if we could ask the questions, there are so many answers that statsguru simply does not have.
Open Cricket will start by providing tools that make statsguru and other internet databases better. But ultimately we want the community to own the statistics. It's hard, perhaps impossible, but we'll try.
Some possibilities, in the near term:
- A Chrome extension that makes statsguru easier to use (e.g. shortcuts to save this table to Excel, filter this table to exclude Zimbabwe/Bangladesh, find this scorecard on Cricket Archive)
- Go take a look at cricsheet.org. Stephen has amassed a data set of ball-by-ball information from many, many cricket matches. But not all matches in history have recorded ball-by-ball data. One short-term goal is to come up with a Cricsheet-style data format for matches that don't have ball-by-ball data.
- Speaking of Cricsheet, another goal is to work with Stephen at Cricsheet to develop a set of tools for easy analysis of his data.
- A database that simply matches Cricinfo match/player IDs to Cricket Archive match/player IDs.
- An app for scoring matches that stores the data in an open, shareable format.
- Crowd-sourced databases of stadium details, match anecdotes, unusual statistics.
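To make the ID-matching idea from the list above a little more concrete, here is a minimal sketch in Python of what such a lookup table could look like. None of this is an existing Open Cricket tool: the names `PlayerIdLink` and `PlayerCrosswalk`, the field names, and the idea of storing IDs as integers are all assumptions made for illustration. A real version would probably live in a shared spreadsheet or CSV file that the community maintains.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class PlayerIdLink:
    """One row of a hypothetical crosswalk between the two sites."""
    cricinfo_id: int          # assumed to be an integer; illustrative only
    cricket_archive_id: int   # assumed to be an integer; illustrative only
    name: str                 # for human readers; the IDs do the matching


class PlayerCrosswalk:
    """Two-way lookup between Cricinfo and Cricket Archive player IDs."""

    def __init__(self) -> None:
        self._by_cricinfo: Dict[int, PlayerIdLink] = {}
        self._by_archive: Dict[int, PlayerIdLink] = {}

    def add(self, link: PlayerIdLink) -> None:
        # Reject a row that contradicts one we already hold, so that
        # crowd-sourced edits cannot silently overwrite each other.
        for existing in (
            self._by_cricinfo.get(link.cricinfo_id),
            self._by_archive.get(link.cricket_archive_id),
        ):
            if existing is not None and existing != link:
                raise ValueError(f"{link} conflicts with {existing}")
        self._by_cricinfo[link.cricinfo_id] = link
        self._by_archive[link.cricket_archive_id] = link

    def to_cricket_archive(self, cricinfo_id: int) -> Optional[int]:
        """Return the Cricket Archive ID, or None if we have no match yet."""
        link = self._by_cricinfo.get(cricinfo_id)
        return link.cricket_archive_id if link else None

    def to_cricinfo(self, cricket_archive_id: int) -> Optional[int]:
        """Return the Cricinfo ID, or None if we have no match yet."""
        link = self._by_archive.get(cricket_archive_id)
        return link.cricinfo_id if link else None
```

You would fill it with rows such as `PlayerIdLink(1001, 2001, "Example Player")` (placeholder numbers, not real IDs) and then call `to_cricket_archive(1001)` to jump from one site to the other. Refusing conflicting rows is a deliberate choice in this sketch: with many contributors, a loud error seems safer than a quiet overwrite.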
Glad you asked. Do you have code, spreadsheets or a database that would be useful to the community? Would you like to work on any of the projects listed above? Would you like to suggest a project of your own? Doesn't matter how big or small your contribution is, GET IN TOUCH!
- Mailing list
Programmers, spreadsheet gurus, and stats geeks, to start with. Web-based programming will obviously prove most useful, because it makes the results easy to share: it's best if our users can use the tools through their browsers. We don't want folks at home installing gcc on their machines. Folks could get hurt.
In the short term, any data we can generate will become stale without maintenance. But any tools we can build for the community will be useful forever.
- Pappu Bahry’s Cricket Statistics database
- Russ Degnan’s Cricket Statistics database