Aberdeen Plaques – Part One

On Saturday 14th December 2019 we ran a one-day mini hack event. The idea behind it was for people to come along for a day to work on their side projects and, if they needed support, attempt to persuade others to assist them.

That’s what I did with my Aberdeen Plaques project: something I’d had on the back burner for more than a year.

Why do it?

The commemorative plaques which are dotted around the city are a perfect candidate for open data. They have a subject, usually some dates, are located somewhere, and are of different types etc. Making that all available as open data would open up a whole range of possibilities.

Some Aberdeen plaques

If we captured all of that well then we could do analysis on the data (ratio of women to men, most represented professions), create walking routes (maybe one for the arts, one for the sciences and so on), and create timelines to see which periods are best represented.

Having recently trained as a WikiMedia UK trainer, and having experimented with some of the tools (Wiki Commons, Wiki Data, Wikipedia, Histropedia), I was convinced that these were the right way to go.

Pre-event prep

So, in advance of the hack day I’d done a bit of prep in the two weeks running up to the day itself.

I’d created a spreadsheet which recorded the following:
* subject (person or ‘thing’)
* Gender if known
* the link to the now-retired city council plaques system (hidden from public view)
* The location if known
* The geo coordinates (to be determined)
* Whether the subject had a Wikipedia page (tbd)
* Whether there was an image of the plaque on Wiki Commons (tbd)
* Whether the subject of the plaque was represented on Wiki Data (tbd)
* Any identifiers on Open Plaques (tbd)
* Any external links (eg to Flickr for photos)
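As a rough illustration, one row of that spreadsheet could be modelled as below. This is only a sketch: the class name `PlaqueRecord`, the field names and the `gaps` helper are hypothetical and are not the actual column headers of the spreadsheet.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class PlaqueRecord:
    """One spreadsheet row; field names are illustrative, not the real headers."""
    subject: str                                        # person or 'thing' commemorated
    gender: Optional[str] = None                        # None when not known
    council_link: Optional[str] = None                  # link to the retired council system
    location: Optional[str] = None
    coordinates: Optional[tuple[float, float]] = None   # (latitude, longitude), tbd
    has_wikipedia_page: Optional[bool] = None           # None means "tbd"
    has_commons_image: Optional[bool] = None
    on_wikidata: Optional[bool] = None
    open_plaques_id: Optional[str] = None
    external_links: list[str] = field(default_factory=list)

    def gaps(self) -> list[str]:
        """Names of the fields that are still blank for this plaque."""
        return [name for name, value in vars(self).items() if value in (None, [])]


# A made-up row where only the subject and the Commons image check are filled in.
row = PlaqueRecord(subject="Example subject", has_commons_image=True)
print(row.gaps())
```

Keeping "not yet checked" (`None`) distinct from "checked and absent" (`False`) is what lets a list of gaps like this double as a progress report.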

I’d then populated some of the data (eg whether there were images of the plaque on Wiki Commons) as well as some other bits. But most cells were blank.

Pre-event spreadsheet

As a keen walker and photographer I had also photographed and uploaded seventeen plaque images to Wiki Commons in the lead up, so that we would have some images to work with.

How to use our time most effectively on the day?

Our aim for the day was then to find out what data / info / images existed, fill in the gaps, and explore how to use WikiData to store and retrieve data, and how we could potentially create maps, timelines and similar new products.

What we did on the day

At the start of the event we pitched our project ideas, and I managed to persuade five others (Angela, Mike, Stephen, James and Steve) to join me in working on the plaques project.

Angela and Mike, and later Angela and Stephen, would go out and take photographs. Steve, James and I would work on the data capture, completing research on what existed, creating new entries for the data on Wiki Data, and testing queries on the Wiki Data query service.

How we did it

We used the spreadsheet that I had set up to capture all of the data we’d gathered – and as it evolved it would show progress as well as what was still lacking. We had no expectations that we would do it all on the day, but we could pick away at it in future weeks and months.

In the run-up to the event I’d discovered The Pingus’ album of plaques photographs on Flickr. Sadly these had not been published with a licence that would allow us to use them. I’d sent a request, a few days before CTC18, for them to change the licence for the Aberdeen plaques pictures to a CC-SA one. This would have allowed our republishing on Wiki Commons. Sadly it didn’t elicit a response. But the album did show that there were many more plaques than the old ACC system listed. And it was possible to get co-ordinates from them. So the number of plaques to deal with kept growing.

During the day James filled in loads of gaps in which subjects were on Wikipedia and which on Wikidata.

Steve and I experimented with capturing and querying the data. Structuring that in a way that aids recall through Wiki Data Query Service was an iterative process. Firstly I tried adding a statement ‘commemorative plaque image’ (P1801) into the wikidata record for the subject, as you can see in this first example: https://www.wikidata.org/wiki/Q2095630. But that limited what we could do.

So, we discovered that we could create a new object which was an instance of commemorative plaque. Our first attempt was https://www.wikidata.org/wiki/Q78438703 and we evolved what we captured there, adding statements, and Steve discovered the ‘OpenPlaques plaque ID’ (P1893). Incidentally we also tried ‘OpenPlaques subject ID’ (P1430) but adding that to the plaque object throws an error. The latter should be added to the person record, not the plaque.
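The two approaches can be inspected with queries along these lines. This is a sketch rather than the queries we ran on the day: the function names are made up, P31 ("instance of") is the standard Wikidata property for the class of an item, and the Q and P numbers are the ones mentioned above.

```python
SUBJECT_ITEM = "Q2095630"   # subject item carrying a plaque image statement
PLAQUE_ITEM = "Q78438703"   # our first item for a plaque itself


def plaque_image_query(subject_qid: str) -> str:
    """SPARQL: any 'commemorative plaque image' (P1801) on a subject item."""
    return f"""
SELECT ?plaqueImage WHERE {{
  wd:{subject_qid} wdt:P1801 ?plaqueImage .
}}""".strip()


def plaque_item_query(plaque_qid: str) -> str:
    """SPARQL: what a plaque item is an instance of (P31), plus its
    OpenPlaques plaque ID (P1893) if one has been added."""
    return f"""
SELECT ?class ?classLabel ?openPlaquesId WHERE {{
  wd:{plaque_qid} wdt:P31 ?class .
  OPTIONAL {{ wd:{plaque_qid} wdt:P1893 ?openPlaquesId . }}
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en" . }}
}}""".strip()


# Print both so they can be pasted into the Wikidata Query Service.
print(plaque_image_query(SUBJECT_ITEM))
print()
print(plaque_item_query(PLAQUE_ITEM))
```

The first query can only ever return an image hanging off the person, whereas the second treats the plaque as a thing in its own right, which is what leaves room for its own identifiers, location and so on.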

At the end of CTC18

We ended the day with

  • 138 plaques listed.
  • 57 sets of co-ordinates identified
  • 68 Wikipedia articles identified as matching plaque subjects (and eleven plaque subjects who had NO wikipedia page)
  • 36 Images in WikiCommons
  • 77 WikiData entries for the subject of the plaques (existing or created)
  • 11 new wikidata entries for the plaques themselves
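To put those figures in proportion, the small calculation below expresses some of them as a share of the 138 plaques listed. It assumes each count refers to distinct plaques, which may not hold exactly (for example, a plaque could have more than one image), so treat the percentages as indicative only.

```python
TOTAL_PLAQUES = 138

end_of_day_counts = {
    "coordinates identified": 57,
    "images on Wiki Commons": 36,
    "subjects with a Wikidata entry": 77,
    "plaques with their own Wikidata entry": 11,
}


def coverage(count: int, total: int = TOTAL_PLAQUES) -> float:
    """Share of the listed plaques, as a percentage to one decimal place."""
    return round(100 * count / total, 1)


for label, count in end_of_day_counts.items():
    print(f"{label}: {count}/{TOTAL_PLAQUES} = {coverage(count)}%")
```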

This was a great leap forward in one day and would pave the way for future work.

What next?

Since CTC18 ended, I’ve got firmly stuck into this project over the Xmas break. Over the last three weeks I have photographed over a hundred plaques (plenty of walking) and have created Wikidata entries for most plaques and also for their subjects.

I’ll cover all of that, and how we can now use the data, in part two, coming soon.

2019 – the year in review

Intro

The year just past has been a pivotal one for Code The City: we’ve moved into a new home, expanded our operations, engaged with new communities of people, and started to put in place solid planning which will be underpinned by expansion and better governance.

Here are some of the highlights from 2019.

Sponsors, volunteers and attendees

We couldn’t do what we do without the help of some amazing people. With just three trustees (Bruce, Steve and Andrew) and Ian our CEO, we couldn’t cover such a range of activities without serious help. Whether you come to our events, volunteer, or your company sponsors our work, you are making a difference in Aberdeen. 

Listing things is always dangerous as the potential to miss people out is huge. But here we go! 

The Data Lab, MBN Solutions, Scotland IS, InoApps and Forty-Two Studio all provided very generous financial support. H2O AI donated to our charity in lieu of sponsorship of a meet-up, and the James Hutton Institute and InoApps also donated laptops for us to re-use at our code clubs. Codify, IFB and Converged Comms provided specific funding for projects, including buying kit for code club and paying for new air quality devices, some of which we have still to build.

Our regular volunteers – Vanessa, Zoe, Attakrit, Charlotte, and Shibo –  plus the several parents who stay to help too, all help mentor the kids at Young City Coders club. 

Lee, Carlos, Scott, Rob who are on the steering group of the Python User Group meetup. 

Naomi, Ian N, David, and Gavin who are on the steering group for Air Aberdeen along with Kevin from 57 North who supervises the building of new sensor devices. 

The ONE Tech Hub, and ONE Codebase have created a great space not only for us to work in, but also in which to run our public-facing events. 

Everyone who stays behind to help us clear away plates, cups and uneaten food – or nips out to the shops when we run out of milk.

Apologies to anyone we have missed!


And finally YOU – everyone who has attended one of our sessions – you’ve helped make Aberdeen a little bit better as a place to live. Thank you!

Hack weekends

We ran four hack events this year. Here is a quick run-down. 

Air Quality 1

We kicked off 2019 with the CTC15 Air Quality hack in February. This saw us create fourteen new devices which people took home to install and start gathering data. We also had a number of teams looking at the data coming from the sensors, and some looking at how we could use LoRaWAN as a data transport network. We set some targets for sensor numbers which were, in retrospect, perhaps a little ambitious. We set up a website (https://airaberdeen.org

Air Quality 2

Unusually for us we had a second event on the same theme in quick succession: CTC16 in June. Attendees created another fourteen devices. We developed a better model for the data, improved on the website and governance of the project. We got great coverage on TV, on radio and in local newspapers. 

Make Aberdeen Better

CTC17 came along in November. The theme was a broad one – what would you do to make Aberdeen a better place to live, work or play? Attendees chose four projects to work on: public transport, improved methods of monitoring air quality, how we might match IT volunteers to charities needing IT help, and the open data around recycling.

Xmas mini-hack

CTC18, our final hack of the year was another themeless one, timed to fit into a single day. We asked participants to come and work on a pet side-project, or to help someone else with theirs. Despite a lower turnout in the run-up to Christmas, we still had eight projects being worked on during the day.

New home, service

In the late summer the ONE Tech Hub opened and we moved in as one of the first tenants. So far we rent a single desk in the co-working space but we aim to expand that next year. The building is great, which is why we run all of our events there now, and as numbers grow it looks set to fulfil its promise as the bustling centre of Aberdeen’s tech community.

Having started a new Data Meet-up in 2018 we moved that to ONE Tech Hub along with our hack events. We also kicked off a new Python User group in September this year, the same year as we started to deliver Young City Coders sessions to encourage youngsters to get into coding, using primarily Scratch and Python. 

We also ran our first WikiMedia Editathon in August, using Wikipedia, Wiki Commons and Wikidata to capture and share some of the history of Aberdeen’s cinemas. We are really supportive of making better use of all of the Wikimedia tools. Ian recently attended a three-day course to become a Wikimedia trainer. And at CTC18 there were two projects using Wikidata and Wiki Commons too. Expect much more of this next year!

Some recognition and some numbers

We’ve been monitoring our reach and impact this year.  

In March we were delighted to see that Code The City made it onto the Digital Social Innovation For Europe platform.  This project was to identify organisations and projects across the EU who are making an impact using tech and data for civic good. 

In July we appeared for the first time in an academic journal – in an article about using a hackathon to bring together health professionals, data scientists and others to address health challenges.

We will be launching our  dashboard in the New Year. Meantime, here are some numbers to chew on. 

Hack events

We ran four sessions, detailed above. We had 102 attendees and 15 facilitators who put in a total of 1,872 hours of effort on a total of 20 projects. All of this was for civic benefit. 

Young City Coders

We ran six sessions of our Young City Coders which started in September. The sessions had a total of 114 kids attending and 28 mentors giving up two hours or more. 

Data Meet-ups

In 2019 we had 12 data meet-ups with 28 speakers and 575 attendees! This is becoming a really strong local community of practitioners and researchers from academia and local industry. 

Python Meet-ups

Each of our four sessions from September to December had a speaker, and attracted a total of 112 attendees who were set small project tasks. 

The year ahead

2020 is going to see CTC accelerate its expansion. We’re recruiting two new board members, and we have drawn up a business plan which we will share soon. That should see us expand the team and strengthen our ability to drive positive societal change through tech, data and volunteering. We have two large companies considering providing sponsorship for new activities next year.  We’ll also be looking at improving our fundraising – widening the range of sources that we approach for funding, and allowing us to hire staff for the first time. 

Open Data

We’re long-term champions of open data, as many of you will have read in previous posts. We’ve identified the need to strengthen the Open Data community in Scotland and to contribute beyond our own activities. Ian has joined the Civic side of the Open Government Partnership, where he is leading on Commitment three to improve open data provision, and he has also joined the board of the Data Commons Scotland programme at Stirling University.

Scottish Open Data Unconference

Beyond that we have created, and we are going to run, the Scottish Open Data Unconference in March. This promises to be a great coming together of the data community including academia, government, developers, and publishers. If you haven’t yet signed up please do so now – there are only 11 tickets of 90 still available. We’ll also need volunteers to help run it: scribes for sessions, helping to orientate new visitors, covering reception, photography, blogging etc. Let us know how you could help. 

We look forward to working with you all in the New Year and wish you all a peaceful and relaxing time over the festive period. 

 

Ian, Steve, Bruce and Andrew

[Photo by Eric Rothermel on Unsplash]

A timeline of Female Aberdeen Uni Graduates

Background

Earlier this year Code The City held an Editathon with Wikimedia UK. The subject was the history of Aberdeen Cinemas. We ended up with 16 people all working together to create new articles, update existing ones, capture new images for Wiki Commons, and generate or enhance WikiData items. This was a follow up to previous sessions that Dr Sara Thomas of WikiMedia UK led for us in the city, mainly for information professionals.

This has led to significant interest from cultural bodies in the city in using the suite of WikiMedia platforms and tools to improve access to their collections in Aberdeen. We expect to do quite a bit more of this with them in 2020.

Two weeks ago I attended a Train the Trainer 3-day workshop in Glasgow for Wikimedia UK to become a trainer for them in Scotland.  That will see me training professionals and volunteers in how to use Wikipedia, Wiki Commons and Wikidata in particular.

In this blog post I explain why you might want to use some of the fancy features of the WikiData query service, show you how to do that using my adaptation of others’ shared examples, and encourage you to experiment for yourself.

Wikidata

Wikidata uses a Linked Open Data format to store data. While I have added quite a number of items to Wikidata, I’ve not had a chance to really study how to use SPARQL (the query language behind the scenes) to execute queries against the data. This is done in the Wikidata Query Service, and it is a key skill for using some of the more advanced features. Without the means to extract data there is little point in stuffing data into it. In fact WikiData allows us to do some very fancy things with the data which we retrieve.

So, I decided this week to start working on that. This post describes the first steps I have taken. It should also provide a simple introduction for anyone else wanting to dip their toe in the SPARQL waters.

Where to start?

This 16-minute tutorial on Youtube is a great place to begin; it is where I started. It describes how to create a simple query and build it up to something more powerful.  I copied what it did then adapted that to build a query that I wanted. I suggest that you watch it first to understand what each line of SPARQL is doing.

Here are the steps, mainly drawn and adapted from that tutorial.

Find all female graduates of Aberdeen University

In the query above we use the Educated at statement (P69) and the identifier for Aberdeen University (Q270532) in combination with the Sex or gender statement (P21) with the Female identifier (Q6581072).
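For readers who cannot see the screenshot, a query matching that description might look like the one below, held in a Python string so it can be printed and pasted into the Wikidata Query Service. It is a reconstruction from the description, not a copy of the original query; the variable names and the label service line are my own choices.

```python
FEMALE_ABERDEEN_GRADUATES = """
SELECT ?person ?personLabel WHERE {
  ?person wdt:P69 wd:Q270532 .    # educated at: University of Aberdeen
  ?person wdt:P21 wd:Q6581072 .   # sex or gender: female
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
""".strip()

print(FEMALE_ABERDEEN_GRADUATES)
```

Each line inside the braces is a pattern that a person must match, so adding a line narrows the results.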

You can run this for yourself here using the white-on-blue arrow. I’ve used one of the great things you can do with Wikidata which is to share this query  using the link symbol on the left of the page just above the arrow:

Save a Wikidata query

Changing the parameters of the query means that we can check males (Q6581097) against females (Q6581072). Or you can compare different universities. To do this go to the Wikidata homepage and search for the name of the institution. The search will return a page with the Q code in the title. Thus we can compare various universities by amending the Q code in the query above: University of Aberdeen (Q270532) with University of Glasgow (Q192775) or Edinburgh University (Q160302).

Running these queries we can see that the number of Aberdeen University graduates with entries on WikiData, both male and female, is significantly smaller than for either Glasgow or Edinburgh, and that the proportion of females among all graduates is smallest for Aberdeen.
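Swapping Q codes by hand soon gets tedious, so here is a sketch of how the same comparison could be generated. The function names are hypothetical, and the count query is my own variation rather than one from the tutorial. The link format (the query percent-encoded after `https://query.wikidata.org/#`) is the long form of what the query service's link button gives you.

```python
from urllib.parse import quote

UNIVERSITIES = {"Aberdeen": "Q270532", "Glasgow": "Q192775", "Edinburgh": "Q160302"}
GENDERS = {"female": "Q6581072", "male": "Q6581097"}


def graduate_count_query(university_qid: str, gender_qid: str) -> str:
    """SPARQL counting people educated at (P69) a university with a given gender (P21)."""
    return (
        "SELECT (COUNT(DISTINCT ?person) AS ?count) WHERE {\n"
        f"  ?person wdt:P69 wd:{university_qid} .\n"
        f"  ?person wdt:P21 wd:{gender_qid} .\n"
        "}"
    )


def query_service_link(sparql: str) -> str:
    """Link that opens the query in the Wikidata Query Service editor."""
    return "https://query.wikidata.org/#" + quote(sparql, safe="")


# One link per university and gender; open each one and press the run arrow.
for name, university_qid in UNIVERSITIES.items():
    for gender, gender_qid in GENDERS.items():
        print(name, gender, query_service_link(graduate_count_query(university_qid, gender_qid)))
```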

 

University | Male Grads | Female Grads | % Female
Aberdeen | 944 | 125 | 11.7
Edinburgh | 3804 | 571 | 13.1
Glasgow | 1562 | 291 | 15.7
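The percentage column can be reproduced from the two counts. The sketch below assumes it is simply female graduates divided by male plus female graduates, leaving out anyone with another or no recorded gender.

```python
# (male, female) counts as returned by the queries at the time of writing.
graduate_counts = {
    "Aberdeen": (944, 125),
    "Edinburgh": (3804, 571),
    "Glasgow": (1562, 291),
}


def percent_female(male: int, female: int) -> float:
    """Female share of the male + female total, to one decimal place."""
    return round(100 * female / (male + female), 1)


for university, (male, female) in graduate_counts.items():
    print(f"{university}: {percent_female(male, female)}% female")
```

Since Wikidata changes all the time, re-running the queries today will give different counts; the calculation stays the same.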

The results of these queries should themselves cause us to reflect on the relatively smaller number of results of either gender from Aberdeen compared to the other universities;  and also the smaller proportion of women. It suggests that there is some work to do to ensure that we get better representation of both genders in Wikidata.

Enhancing our query

Now that we have a basic query we can retrieve additional bits of data for the subjects of the query including place of birth, date of birth and images.

These are represented by P19 (place of birth), P569 (date of birth) and P18 (image). As we see in the example below, when we query these we follow each with a variable name of our choosing for the value returned (e.g. ?person wdt:P19 ?birthPlace ) and we add that name, in this case ?birthPlace, to the Select statement on the first line of the query, ensuring that it will feature in the table or other output format.

enhanced wikidata query

You will note that the above example now uses ?birthPlace in a further pattern to get the co-ordinates (P625) of that place, which we assign to ?coordinates:

> ?birthPlace wdt:P625 ?coordinates

and we include coordinates in the first line of things we will display.
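Putting those pieces together, the enhanced query would look something like the sketch below. Again this is reconstructed from the description rather than copied from the screenshot, so variable names such as `?birthDate` are my own; date of birth is property P569 on Wikidata. The small helper just lists the variables on the Select line, which become the columns of the results table.

```python
ENHANCED_QUERY = """
SELECT ?person ?personLabel ?birthPlace ?birthPlaceLabel ?coordinates ?birthDate ?image WHERE {
  ?person wdt:P69 wd:Q270532 .           # educated at: University of Aberdeen
  ?person wdt:P21 wd:Q6581072 .          # sex or gender: female
  ?person wdt:P19 ?birthPlace .          # place of birth
  ?person wdt:P569 ?birthDate .          # date of birth
  ?person wdt:P18 ?image .               # image
  ?birthPlace wdt:P625 ?coordinates .    # co-ordinates of the place of birth
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
""".strip()


def selected_variables(sparql: str) -> list[str]:
    """Variables named on the first (SELECT) line of a query."""
    return [token for token in sparql.splitlines()[0].split() if token.startswith("?")]


print(selected_variables(ENHANCED_QUERY))
print(ENHANCED_QUERY)
```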

Advantages of extra data elements

By having birthplace coordinates we can plot the results in a map which is easily done using the tools built into the wikidata query service.

Run the query (white arrow on blue on the left menu) and observe the table that was returned. You can see that the first line of the Select statement formed the columns of the table.

Table of wikidata query results

Note that instead of the 125 results we had from the simple query, we only get 20. My understanding is that we are now specifying records which must have a place of birth, an image etc. Where these do not exist, the records for that person are not returned. This in itself shows that there is a piece of work to do to identify which records in the batch of 125 lack these elements and fix them.
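The effect is easy to mimic away from Wikidata. The toy example below uses four made-up records (they are not real query results) to show how requiring every element shrinks the result set, while treating elements as optional keeps every record and leaves blanks.

```python
# Hypothetical toy records standing in for Wikidata items.
people = [
    {"name": "A", "birthPlace": "Aberdeen", "image": "a.jpg"},
    {"name": "B", "birthPlace": "Banff"},    # no image
    {"name": "C", "image": "c.jpg"},         # no place of birth
    {"name": "D"},                           # neither
]
WANTED = ("birthPlace", "image")


def match_required(records, keys=WANTED):
    """Keep only records that have every key, like plain query patterns."""
    return [record for record in records if all(key in record for key in keys)]


def match_optional(records, keys=WANTED):
    """Keep every record, filling missing keys with None, like OPTIONAL patterns."""
    return [{**record, **{key: record.get(key) for key in keys}} for record in records]


print(len(match_required(people)), "records when everything is required")
print(len(match_optional(people)), "records when the extras are optional")
```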

In fact you could say that there is a whole cycle of adding data, querying it, spotting anomalies, fixing those and re-querying which leads to substantial enrichment of the data.

Map results

Now click on the dropdown by the eye symbol, on the left immediately above the results, and choose the map option. The tool will generate a map with a pin in the location of each place of birth. You can pan and zoom to the UK and click on each pin. Try it. To get back to the query, click on the arrow, top-right.

Wikidata map view with clicked point
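You can also ask for a particular view inside the query itself: the Wikidata Query Service reads a `#defaultView:` comment on the first line, so a shared link opens straight on the map or timeline. The helper below is a hypothetical convenience for adding that line, and the query in it is a cut-down sketch rather than the one in the screenshots.

```python
SUPPORTED_VIEWS = {"Table", "Map", "Timeline"}   # the views used in this post

BIRTHPLACE_QUERY = """
SELECT ?person ?personLabel ?birthPlaceLabel ?coordinates ?birthDate WHERE {
  ?person wdt:P69 wd:Q270532 .
  ?person wdt:P21 wd:Q6581072 .
  ?person wdt:P19 ?birthPlace .
  ?person wdt:P569 ?birthDate .
  ?birthPlace wdt:P625 ?coordinates .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
""".strip()


def with_default_view(sparql: str, view: str) -> str:
    """Prefix a query with the comment that selects the result view."""
    if view not in SUPPORTED_VIEWS:
        raise ValueError(f"unsupported view: {view}")
    return f"#defaultView:{view}\n{sparql}"


print(with_default_view(BIRTHPLACE_QUERY, "Map"))
```

The map needs a co-ordinates column and the timeline needs a date column, which is why this sketch selects both.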

A timeline

Now click on the eye symbol to show other options, and choose Timeline.

As we can see below, the Wikidata query service will construct a rudimentary timeline with relatively little effort.  This is one of its great features. So far we have the same 20 complete records – and the cards or tiles are titled by the place of birth but we can change that.

Wikidata timeline

Enhancing the timeline on Histropedia

To improve on our timeline we can construct a better query using the Wikidata Query Service then paste it into the Histropedia service to run it.  Our first version which makes small improvements on our previous timeline produces the results below. This labels by the person’s name, and colour codes the individual records by place of birth label. To see the code, click the gear wheel at the top right of the screen. Note we still only retrieve 20 results.

A first query on Histropedia

We can substantially enhance this query, as we have done in the following version. This makes certain items optional, gets the country of birth and colour-codes by that, and ranks the records by prominence (with the most prominent at the front). If I understand it correctly, using optional elements is also why it retrieves 76 records, many more than previously.
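The shape of such a query is sketched below. It is not the query behind the screenshot: the choice of which items are optional is illustrative, P17 is the Wikidata property for country, and using the number of sitelinks as a stand-in for prominence is an assumption on my part. The Histropedia-specific colour-coding settings are left out.

```python
OPTIONAL_QUERY = """
SELECT ?person ?personLabel ?birthDate ?birthPlaceLabel ?countryLabel ?image ?sitelinks WHERE {
  ?person wdt:P69 wd:Q270532 .
  ?person wdt:P21 wd:Q6581072 .
  ?person wdt:P569 ?birthDate .              # kept required: a timeline needs a date
  ?person wikibase:sitelinks ?sitelinks .    # assumption: sitelink count as a proxy for prominence
  OPTIONAL { ?person wdt:P18 ?image . }
  OPTIONAL {
    ?person wdt:P19 ?birthPlace .
    OPTIONAL { ?birthPlace wdt:P17 ?country . }
  }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
ORDER BY DESC(?sitelinks)
""".strip()

print(OPTIONAL_QUERY)
```

Anything wrapped in OPTIONAL no longer excludes a person when it is missing, so only those without a date of birth drop out.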

enhanced Histropedia timeline

I would encourage you to watch the tutorial video at the start of this post, then try to hack some of the queries to which I provided links. For example how many female graduates of the Robert Gordon University would each query generate? How would you find the Q code of that institution?  Have fun with it!