Aberdeenshire Settlements on Wikidata and Wikipedia

Introduction

This project was part of Code The City’s #CTC20 History and Culture hack weekend.

The Challenge

To identify (all of) the settlements – towns, hamlets, villages – in Aberdeenshire and ensure that these are well represented with high quality items on Wikidata and Wikipedia.

Aims

Identify one or more lists of settlements in Aberdeenshire
Use those lists to identify gaps in Wikipedia and WIkidata for Aberdeenshire settlements.
Create Wikidata items, update Wikipedia with a more comprehensive list of settlements and, time permitting, enhance existing Wikipedia articles with Infoboxes, and create new Wikipedia articles where these are missing.

Approach

We began by importing a list from Wikipedia  into Google Sheets using its function

=importHTML(url, item, position)

This gave a list of 183 settlements – with five having missing Wikipedia articles.

To compare, we then wrote an initial Wikidata query  which only returned 10 results. It turned out that there are two (or more) Aberdeenshires in Wikidata (each representing something subtly different) and we used the wrong one.

Amending our query and running the new one   gave us 283 settlements. On checking we saw that  they included the 10 above too. It also included whether the item had a Wikipedia article associated with it. We used this Wikidata list (with a quick python script) to update the original Wikipedia list page above.

Adding Images / WikiShootMe

We further updated the query adding whether there was a photo associated with the item giving firstly these results and, by changing the default view to map, we could see where the coordinates were placing each point. The vast majority (est. 90% ) of items had no photograph.

By following this tutorial that Ian had created recently,  we were able to create a custom clickable map in the WikiShootMe tool. This means that anyone can click on a red dot, and choose to take or upload a photo of the settlement and have that added to Wiki Commons, and associated automatically with the Wikidata item.

We published that on Twitter and asked for contributions. Not only could someone take and upload a photo, but it also meant that one could search Wiki Commons for a matching image (which hadn’t yet been associated with the Wikidata item) and tell it to use that. Where none existed it was possible to search on Geograph for a locality. The licensing on Geograph is compatible with Wiki Commons’s terms, so if a suitable image was available, we could use the Geograph2Commons tool and import it.

Over the next few days (i.e. beyond the weekend itself), we went from a starting point of about 10% of settlements in Wikidata having photos to about 90%. You can see this on an image grid, or table.

Red dots show missing photos; green, ones found
Red dots show missing photos; green, ones found

Updating Coordinates

Looking closer at the mapped Wikidata, a number of the items’ coordinates were well out (e.g. Rosehearty, Sandhaven New Aberdour etc). We started to fix these. We did this by finding the settlements in our WikiShootme map, right clicking on the correct position and selecting show coordinates, and pasting those back into the Wikidata item.

Where the original coordinates were imported from Wikipedia it raised a warning. We fixed each one in Wikipedia too, as we went. This needs much more error checking and fixing.

Fixing coordinates and uploading images
Fixing coordinates and uploading images

Missing Places

Our list of places started at 183 links on wikipedia, it grew to 283 with wikidata but still it was clear that many of the populous settlements are missing from Wikidata such as Fintry.

Fintray missing
Fintray missing

These can be added manually but we figured there must be a larger list available from another source like OpenStreetMap (OSM). Not knowing how to get this list we put out a tweet for help.

A tweet for help
A tweet for help

@MaxErickson was one those that came to our aid with a query search for overpass turbo (a web-based data filtering tool for OpenStreetMap) which listed all its identified places in Aberdeenshire with coordinates and place types (town, village, hamlet). This gave us over 780 results but many of these were farm steadings or small islands (islets) in the Ythan, with a bit of filter we got it down to 629 places. We plan to add these to Wikidata, but first it’s worth gathering more data on them.

MySociety

We wanted to add more information to these place such as which constituency each was in for Scottish and UK elections. The Boundary Commission for Scotland website has a tool which lets you enter a postcode and returns this information:

Querying the Boundary Commission for Scotland website
Querying the Boundary Commission for Scotland website

After digging around their website we found that they use mapit.mysociety api to do this. Mapit is open-source software but there is a charge for using their api, luckily CodeTheCity is a charity and eligible for free usage so Ian signed us up!  The API accepts a variety of inputs including lat/lon which we got from the turbo query of OSM.

With a bit more python scripting we now have a CSV with 629 places each listed with coordinates, Scottish Parliament region, Scottish Parliament constituency, UK parliament constituency, Health Board and Unitary Authority.

A spreadsheet of enhanced data for Aberdeenshire settlements
A spreadsheet of enhanced data for Aberdeenshire settlements

What Next?

We are going to get the csv uploaded to Wikidata via Quick Statements, to add the missing places, update existing places with Mysociety data and correct any wandering coordinates in wikidata/wikipedia.

  1. Check the Wikidata list with the OSM list for any missing places in the OSM list (ensuring that core data for each place is included).
  2. Add more information to our CSV to allow us to populate Wikipedia infoboxes for these places. This would include
    • Altitude
    • Distance from London (UK Capital)
    • Distance from Edinburgh (Scotland Capital)
    • Postcode district(s)
    • Dial Code(s)
    • Population (may be difficult for smaller settlements)
    • Area (may be difficult for smaller settlements)
  3. Update Wikidata with new places and any edits required to existing places
  4. Update Wikipedia List page as a table from this data.

Gavin Barnett and Ian Watt

06 August 2020