The “free” image website Unsplash – a site that remains as controversial as it is popular – has released what it describes as “the most complete high-quality open image dataset ever”. The “dataset” in this instance is essentially the keywords and search metadata of a whole bunch of images that can be downloaded in one big lump.
There are two datasets available. The “Full” version contains information for over 2 million images from more than 200,000 photographers around the world and is available for non-commercial use only. It covers over 5 million keywords and 250 million+ searches. The “Lite” dataset is limited to 25,000 nature-themed images and keywords and 1 million searches.
Unsplash, if you’re one of the few who hasn’t heard of it before, is a website that allows users around the world to download images under a CC0 (Creative Commons Zero) license. Anybody can use these images for any purpose they wish, with or without attribution. Needless to say, this has caused quite a stir amongst photographers – probably most notably, Zack Arias – and it is not without its legal issues.
In 2016, Unsplash launched an API that allowed developers to talk directly with its database. This API has been incorporated into services by the likes of Adobe, Facebook and more. The new datasets are effectively an extension of that, allowing users to download the entire database locally – although the complete database is limited to non-commercial use.
Unsplash says that the data stored within is anonymised and private, except for the photographer attribution of the images, and that it is made up from “billions of searches across thousands of applications, uses, and contexts”.
The database currently sits at 2 million images, after hitting 1 million last May. Unsplash says that the database of images is now doubling in size each year, and the dataset will be updated to reflect the new keywords, searches and images.
You can read Unsplash’s complete announcement here and, if you want to check out the database for yourself, head here. Be warned, though: it’s not for the faint of heart. You’re not just downloading a zip file full of images, and you’ll want some coding experience to work with it. There’s also a GitHub repository where you can find out more. Only the Lite version is freely available; you must request access to the Full dataset.
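To give a rough idea of the kind of coding involved, here is a minimal Python sketch that counts how often each keyword appears in a keyword table. It assumes the data arrives as tab-separated text with a header row, and the column names `photo_id` and `keyword` are hypothetical stand-ins rather than the dataset's documented schema, so check the GitHub repository for the real layout before relying on it.

```python
import csv
import io
from collections import Counter

# Hypothetical sample standing in for a keywords file.
# The column names and values are assumptions made for illustration only.
SAMPLE_TSV = (
    "photo_id\tkeyword\n"
    "abc123\tforest\n"
    "abc123\tmist\n"
    "def456\tforest\n"
)


def count_keywords(tsv_file):
    """Count how many rows mention each keyword in a tab-separated file."""
    reader = csv.DictReader(tsv_file, delimiter="\t")
    return Counter(row["keyword"] for row in reader)


# Use an in-memory file here so the sketch runs without any download.
counts = count_keywords(io.StringIO(SAMPLE_TSV))
print(counts.most_common(1))
```

To try it on a downloaded file, you would pass an open file handle instead of the in-memory sample, for example `open(path, newline="", encoding="utf-8")`, where `path` points at whichever file holds the keywords.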