The Microsoft Research Outreach team worked with the community to enable the adoption of cloud-based research. As a result, they have launched Microsoft Research Open Data, a new data repository for the global research community. Microsoft wishes to bring processing to the data rather than rely on moving data across the internet. A useful feature allows data sets to be copied directly to Azure-based Data Science Virtual Machines. More details can be read here. The aim is to provide anonymized, curated, and meaningful datasets that are findable, accessible, interoperable, and reusable. This follows on from the data-intensive "fourth paradigm" of scientific discovery discussed by Jim Gray.
The open data set categories can be seen below.