User:Fæ/Flickr

From NC Commons
President Cyril Ramaphosa receives J&J Coronavirus vaccination. GovernmentZA Flickrstream.

Overview

A generalized Flickr upload script is used to upload from any suitable Flickrstream of photographs.

Filenames are created using the Flickr title truncated to around 220 characters. Each filename has the Flickr "Owner" name and the Flickr unique photo ID in it, as titles from Flickr are not unique.
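The naming rule can be sketched as below. This is a minimal illustration and not the script's actual code: the function name, the character clean-up, the "Untitled" fallback and the exact placement of owner and photo ID in brackets are assumptions. Only the ~220 character truncation and the presence of owner and ID come from the description above.

```python
import re

MAX_TITLE_LENGTH = 220  # approximate limit described above


def build_filename(title, owner, photo_id, extension="jpg"):
    """Build an upload filename from Flickr metadata (hypothetical layout)."""
    # Replace characters that are not allowed in MediaWiki page titles.
    cleaned = re.sub(r'[#<>\[\]|{}:/\\]', "-", title)
    # Collapse runs of whitespace into single spaces.
    cleaned = re.sub(r"\s+", " ", cleaned).strip()
    # Truncate the title; fall back to a placeholder if nothing is left.
    cleaned = cleaned[:MAX_TITLE_LENGTH].rstrip() or "Untitled"
    # Owner and photo ID keep the name unique when Flickr titles repeat.
    return f"{cleaned} ({owner} {photo_id}).{extension}"
```

Two photos with the same Flickr title then still receive different filenames, because their photo IDs differ.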

The copyright license is taken directly from Flickr, as declared against the photograph at the time of upload. The license may later be changed on Flickr, but the release in force at the time of upload is irrevocable, so the file remains safe for reuse.
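A license check of this kind might look like the sketch below. The ID table reflects the values commonly returned by the Flickr API method flickr.photos.licenses.getInfo and should be verified against the live API, as Flickr may add entries. The function names are hypothetical, and the set of licenses accepted by NC Commons is not stated on this page, so it is passed in by the caller.

```python
# Flickr license IDs as commonly listed by flickr.photos.licenses.getInfo.
# Verify against the live API before relying on this table.
FLICKR_LICENSES = {
    "0": "All Rights Reserved",
    "1": "CC BY-NC-SA 2.0",
    "2": "CC BY-NC 2.0",
    "3": "CC BY-NC-ND 2.0",
    "4": "CC BY 2.0",
    "5": "CC BY-SA 2.0",
    "6": "CC BY-ND 2.0",
    "7": "No known copyright restrictions",
    "8": "United States Government Work",
    "9": "CC0 1.0",
    "10": "Public Domain Mark 1.0",
}


def license_name(license_id):
    """Return a readable name for a Flickr license ID."""
    return FLICKR_LICENSES.get(str(license_id), "Unknown license")


def is_uploadable(license_id, allowed_ids):
    """Check a license ID against a caller-supplied set of accepted IDs."""
    return str(license_id) in allowed_ids
```

Recording the license name in the file description at upload time preserves the release even if the Flickr owner changes it later.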

The upload process checks for albums (sets) in the Flickrstream, and uploads these sequentially. Due to Flickr's ordering, this means that the most recent photographs will upload first. The default is to upload all photographs with a copyright license suitable for NC Commons. Each album has an automatically generated category based on the album title and this itself is categorized in a parent category under the Flickr account name, e.g. Category:Photographs by GovernmentZA.
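The category scheme can be illustrated as follows. The parent category format is taken from the example above. The album category format ("owner - album title") is an assumption modelled on the 'No album' category name used elsewhere on this page, and both function names are hypothetical.

```python
def album_categories(owner, album_title):
    """Return (album category, parent category) for one Flickr album."""
    # Parent category format as given in the example above.
    parent = f"Category:Photographs by {owner}"
    # Assumed album category format: "<owner> - <album title>".
    album = f"Category:{owner} - {album_title.strip()}"
    return album, parent


def album_category_text(owner):
    """Wikitext for a new album category page, placing it in the parent."""
    return f"[[Category:Photographs by {owner}]]"
```

Each uploaded file would then carry the album category, and each album category page would contain only the link to its parent.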

Technical

The uploads run in Python 3 using Pywikibot and the flickrapi Python module.

Processing of the Flickrstream is linear, but the NC Commons upload uses multiprocessing so that files are not queued. The Flickr API throttle is 3,600 calls per hour. This should not be a practical constraint if only one batch upload project runs at a time: the observed upload rate is between 20 and 50 photos per minute, and API access is counted as 2 calls per photo (getInfo and getSizes) plus 1 per walk_set (i.e. album list). Exceeding the throttle is expected to result in the Flickr API key being automatically 'expired'. Consequently, the API access rate is monitored and the run is throttled to keep it below 3,300 calls per hour, which automatically holds the photo upload rate below ~1,500 per hour.
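One way to implement such a monitor is a sliding-window counter, sketched below. This is an assumption about the approach rather than the script's actual code: the class name, its methods and the injectable clock and sleep functions are all hypothetical. Only the 3,300 calls per hour ceiling and the count of 2 calls per photo come from the text above.

```python
import time
from collections import deque


class ApiRateMonitor:
    """Sliding-window limiter for Flickr API calls (illustrative sketch)."""

    def __init__(self, max_calls=3300, window=3600.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls  # ceiling kept below Flickr's 3,600
        self.window = window        # window length in seconds
        self.clock = clock
        self.sleep = sleep
        self.calls = deque()        # timestamps of calls inside the window

    def wait_time(self):
        """Seconds to wait before one more call fits under the ceiling."""
        now = self.clock()
        # Drop timestamps that have aged out of the window.
        while self.calls and now - self.calls[0] >= self.window:
            self.calls.popleft()
        if len(self.calls) < self.max_calls:
            return 0.0
        return self.calls[0] + self.window - now

    def record(self, count=1):
        """Pause if necessary, then log `count` API calls."""
        for _ in range(count):
            delay = self.wait_time()
            if delay > 0:
                self.sleep(delay)
            self.calls.append(self.clock())
```

Calling `monitor.record(2)` before each photo (getInfo and getSizes) and `monitor.record(1)` before each album listing keeps the hourly total under the ceiling. Because the uploads themselves run in separate processes, a monitor like this would live in the linear Flickr-reading process only.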

Source code.

Photos in no sets

Photographs in a Flickrstream may not be in any album (set), which makes them harder to categorize. They are also harder to find, as the Flickr API only provides a method to list photos not in any set for your own Flickrstream, not for other accounts. Consequently, these photos can only be discovered by running a complete photo search and testing each photo for the sets it belongs to. Photos of this type are added to a 'No album' subcategory such as Category:GovernmentZA - No album.
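The discovery step can be sketched as a simple filter. The function names are hypothetical, and the lookup of a photo's albums is passed in as a callable so that the sketch does not depend on network access. In a real run that callable could wrap the Flickr API method flickr.photos.getAllContexts, at the cost of one extra API call per photo.

```python
def photos_without_album(photo_ids, get_album_ids):
    """Yield the IDs of photos that belong to no album (set).

    photo_ids: iterable of photo IDs from a complete photo search.
    get_album_ids: callable returning the album IDs for one photo ID.
    """
    for photo_id in photo_ids:
        if not get_album_ids(photo_id):
            yield photo_id


def no_album_category(owner):
    """Category name for photos that are in no album, as described above."""
    return f"Category:{owner} - No album"
```

Since every photo in the stream must be tested, this pass is considerably more expensive in API calls than walking the albums.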