The Open Government Licence, and the re-use of Public Sector Information

– why so poorly understood or adopted in Scotland?

Public Sector Information (PSI) is information that has been created by public bodies. In 2003 the EU published “Directive 2003/98/EC on the re-use of public sector information, known as the PSI Directive. This is an EU directive that stipulates minimum requirements for EU member states regarding making public sector information available for re-use. This directive provides a common legislative framework for this area. The Directive is an attempt to remove barriers that hinder the re-use of public sector information throughout the Union.” [1]

In 2005 the UK Government introduced its own regulations. [2] Ten years later it introduced the Re-use of Public Sector Information Regulations 2015 [3], which set out the framework for UK public sector bodies sharing information and data in line with those requirements. The National Archives produced guidance both for public bodies and for would-be re-users of the information and data. [4]

Re-use means using public sector information for a purpose different from the one for which it was originally produced, held or disseminated.

Why is PSI valuable?

“Public sector information constitutes a vast, diverse and valuable pool of resources. In Market Assessment of Public Sector Information (commissioned by the Department for Business Innovation and Skills in 2013), the value of public sector information to consumers, businesses and the public sector itself in 2011/12 was estimated to be approximately £1.8 billion (in 2011 prices). [5]

“Re-use of public sector information provides enormous opportunities for economic and social benefits, while also promoting transparency and accountability of the public sector.” [5]

Why PSI must be reusable

Re-use of public sector information stimulates the development of innovative new information products and services in the UK and across Europe, thus boosting the information industry.

“The Re-use of Public Sector Information Regulations 2015 [have been] in force from 18 July 2015. They build on the prior 2005 Regulations which removed obstacles to the re-use of public sector information. The 2015 Regulations harmonise and relax the conditions of re-use for public sector information, and bring the cultural sector into scope. The 2015 Regulations continue to improve transparency, fairness and consistency among public sector bodies and re-use of their information.” [6]

How is information and data licensed for re-use?

Up until 2010, public bodies were obliged to make information and data available under a PSI Click-Use Licence. [7] In 2010 this was superseded by version 1.0 of the Open Government Licence (OGL). The latter has been updated twice and is now at version 3.0. [8]

Where is information and data in Scotland licensed for re-use?

While several local authorities, The Scottish Government and a few government agencies have published open data either on dedicated portals or in sections of their websites explicitly licensed under Open Government Licence (OGL 3), almost no organisations make website content so available for re-use despite an obligation to do so for almost 20 years.

When I worked at Aberdeen City Council, trying to get senior managers in numerous services and departments to make information and data available under OGL or its predecessor was a non-starter. The legal department at the time didn’t support it. And any conversation pushed it further down the line as a ‘nice-to-do’. And we weren’t alone. No council that I can recall was doing this as standard for their websites.

The one notable and welcome exception that I am aware of was the Scottish Government. Since at least 2015 they have permitted re-use under OGL, albeit behind a Crown Copyright link at the foot of each website page, and since April 2022 they have had an explicit open licence statement on each page. Unfortunately this doesn’t carry through to other public body sites, at least not explicitly. With over 180 to check [8] it is difficult to be absolutely certain.

The OGL and Copyright statement on the current Scottish Government Website

I personally, and with various groups, have campaigned for the Scottish public sector to meet their obligations under RPSI 2015. In November 2018 I responded to the consultation on the Scotland Draft Action Plan on Open Government[9]:

“There is one simple thing that could be done with immediate impact, and minimal effort, to free up large amounts of data and information for public re-use: adopt an Open Government Licence (OGL) for all published website information and data on the Scottish Government’s website(s), and other public sector sites, the only exception being where this cannot legally be done, as would be the case when personal data is involved.”

I continued:

“At present, websites operated by Scottish Government, local authorities, health boards etc. all appear to have blanket copyright statements. I certainly could find no exception to that. With OGL-licensed content, where data is not yet available as Open Data (OD), a page published as HTML could be legitimately scraped and transformed to open data by third parties as the licence would permit that.”

And

“The Scottish Government should mandate this approach not just for the whole of the public sector but also for companies performing contracts on behalf of Government, or who are in receipt of public funding or subsidy.”

In work in 2019, revisited in 2020, I looked in more detail at how local authorities in Scotland conformed to their obligations under RPSI regulations. [10] At that point 25 of 32 did not license web content under OGL at all. Only one authority got it right, and the remaining six authorities had a stab at granting permission.

Why mention this now?

In March 2022 I was lead author of a report for The David Hume Institute: “What is Open Data and Why Does it Matter?”[11] In that report, which set out the economic, social and environmental drivers for opening government data, I also highlighted that most councils failed to make their website data and information available for re-use under a clear licence.

Today, in a tweet, Councillor Anthony Carroll stated “At today’s Glasgow Digital Board, a paper I’ve pushed for passed on @GlasgowCC ‘s website content to have an Open Government License. This means anyone can re-use website content (with attributation) instead of it being copyrighted”. [12] He then referenced the DHI paper which I authored.

So, well done to Councillor Carroll and to Glasgow City Council.

Now we need COSLA to push all local authorities to do the same. And, as I asked in 2018 and again in 2022, for Scottish Government to compel all public bodies to fulfil their legal obligations and do this properly.

Ian Watt

10 March 2023

References

  1. https://en.wikipedia.org/wiki/Directive_on_the_re-use_of_public_sector_information (CC-BY-SA V3)
  2. https://www.nationalarchives.gov.uk/information-management/re-using-public-sector-information/regulations/
  3. https://www.legislation.gov.uk/uksi/2015/1415/contents/made
  4. https://www.nationalarchives.gov.uk/information-management/re-using-public-sector-information/about-psi/psi-valuable/
  5. https://www.nationalarchives.gov.uk/information-management/re-using-public-sector-information/about-psi/psi-must-re-usable/
  6. https://www.nationalarchives.gov.uk/information-management/re-using-public-sector-information/uk-government-licensing-framework/open-government-licence/
  7. https://opendata.scot/organizations/
  8. https://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
  9. https://codethecity.org/2019/11/15/response-to-scotlands-draft-action-plan-on-open-government/
  10. https://github.com/watty62/SOD/blob/master/OGL_content.md
  11. https://davidhumeinstitute.org/research-1/research-what-is-open-data
  12. https://twitter.com/_anthonycarroll/status/1633903590740770841?s=61&t=XsLcJSnpR1TgzOpce8dhhQ

Header Image: http://www.nationalarchives.gov.uk/information-management/government-licensing/ogl-symbol.htm

This page contains quoted text from government pages which are licensed under the Open Government Licence, and a Wikipedia page which is licensed as CC-BY-SA, all of which are referenced at the foot of the article.

[Edited for clarity 11 March 2023]

Statement from SODU2022

The third annual Scottish Open Data Unconference (SODU2022) took place in Aberdeen and online on 5–6 November 2022. Over the two days, attendees, primarily from civic society, participated in 31 themed discussions. These covered diverse topics such as Open Data’s contribution to the economy; Policy, Strategy and Legislation of Open Data; The Technology behind Open Data Scotland; and The Ownership of Properties in Aberdeen City Centre. The organisers and participants wish to thank The Data Lab for their generosity in sponsoring the event, and ONE Tech Hub for providing the venue for the unconference.

The following statement was developed from the final session of the weekend. It has been collated and published on behalf of those at the unconference and in wider civic society by the organisers of the event, Code the City. 

Why open data matters

Making data (primarily government data) available under an open licence has a number of identified benefits. These may be categorised as humanitarian, social, economic, performance and environmental [1][2]. 

The Current State of Open Data In Scotland

While the Scottish Government publicly accepts the value of open data, the reality is that delivery on strategy and policy lags far behind what should be delivered. 

The introduction to the 2015 Open Data Strategy for Scotland [3] identifies many of the benefits and reasons for making data open. The Scottish Government’s participation in the international Open Government Partnership [4] is based on the social benefits of improving transparency and accountability, and enhancing citizen participation. The Scottish Government’s Digital Strategy 2021 [5] identifies the potential economic benefits of open data through innovation and entrepreneurship. In this context the value of open data to Scotland’s economy, if done well and calculated as a percentage of GDP, is estimated to be of the order of £2.23bn per annum.

That said, apart from some small and isolated instances of very good open data publishing in Scotland, the majority of public bodies (thought to number 179 but, ironically, there is no central set of open data to corroborate that), including health boards, local councils, universities, government departments etc., publish little or no open data.

It is not clear who is the lead for open data for the public sector in Scotland. Where does responsibility sit for implementation of the various strategies and plans? Whose role is it to monitor performance, report on that, or measure impact? And in each of the 179 or thereabouts public bodies, who has the responsibility to ensure that their organisation is implementing open data to deliver the multiple benefits noted above?

When open data is produced it often suffers from many issues:

  • Sporadic publishing
  • Lack of consistency or standards – for example across all 32 local authorities
  • Lack of current and up-to-date datasets  
  • Gaps in categories of published data by organisations  
  • Short-termism which sees projects commit £10,000s of funding then shut down publishing platforms and delete the data

However, in civic society there is a small but thriving community for open data. They have, in addition to organising and attending three annual unconferences for Open Data, built, maintained and enhanced the Open Data Scotland portal [6]. This project has provided a range of opportunities beyond making what open data exists in Scotland findable for the first time. It has also shared knowledge and expertise, and encouraged community participation in the development of the site.

Threats or opportunities

Working with open data increases the breadth of experiences of students, developers, analysts, businesses, and others, and increases opportunities. Participation in the development of projects such as Open Data Scotland continues to provide educational opportunities as well as societal benefits.

The lack of open data, provided to high standards, universally, and in a sustained manner is a barrier to innovation and a threat to transparency and accountability. 

Short-term data publishing projects with no longevity are a risk to business continuity for those who are dependent on them.

There is a broken feedback loop to publishers. Individuals and organisations, especially businesses, which are using open data need to be vocal about it. Otherwise, it is only natural for publishers to assume that their data is not being used, deem it to be without value, which may lead to termination of publication.

The government has created strategies and plans, but with no mandate for every public body to publish. This has been shown over seven years not to work. [7]

There is a double threat to the provision of open data in Scotland. While the Scottish Government is refreshing its 2015 Open Data Strategy, the community fear that the new version will not be fully consulted on and that it will seek to water down the commitment which the Scottish Government made to data being made ‘open by default’. This would be a retrograde step and act against the intentions of the 2021 Digital Strategy and participation in the Open Government Partnership. 

In addition, the Retained EU Law (Revocation and Reform) Bill [8] being put before the UK parliament would sweep away the UK’s Re-use of PSI Regulations 2015, on which basis data which has not been explicitly published as open data can be sought and used. As Jeni Tennison of ODI noted “[without these regulations] and with unprecedented pressures on the public purse, public bodies will start charging for data. Services using it will become more expensive. Some will close down. Others just won’t be built.” [9] This would kill open data in Scotland and stifle delivery of the Digital Strategy 2021.

What Scotland Needs

We have identified that for Open Data to flourish in Scotland we need better commitment to, and engagement from, all organisations, bodies, or people with an interest in open data and the benefits it can deliver. We also have the following specific requests. 

From Government

We need oversight of delivery of open data in the Scottish Government to be a specific role – not be spread over multiple roles and structures. 

We need the refresh of the Open Data Strategy to be fully consulted on. It needs to strengthen, not weaken, the Scottish Government’s commitment to open data publishing. It needs to introduce regulations obliging each publicly-funded body in Scotland to deliver open data. It needs to identify  a specific role in each public sector organisation which will have a responsibility for implementing the strategy. 

It needs to provide a monitoring framework for ministers on progress in implementing the strategies which involve open data. This needs to be reported on publicly.

The Scottish Government needs to join with civic society in lobbying the UK government to retain the R-PSI regulations and be prepared to introduce Scottish legislation to replace them if this fails.

For businesses to use open data to deliver new products and services as outlined in the Digital Strategy, businesses need reliable, regular publishing without changes. Charging for data is a barrier to innovation and needs to cease. 

From Business

Anyone, including businesses, who is using open data needs to be open about it. They should avoid agreeing contracts which oblige them to be silent about their use of open data in services which are only deliverable using open data. If Government is to be persuaded to make data available and keep making it available, the use needs to be seen. We need businesses to join with civic society in lobbying for more, better open data.

From Education sector

We need Scotland’s schools, colleges and universities to use open data in their curriculum. We need teaching staff to better understand what open data is, how it can be used and to use it in course work. We need colleges and universities to publish open data which can be re-used too. 

From The Third Sector

The third sector needs to engage with civic society in promoting the use and benefits of open data. They need to demand that data is published openly which they can use. They need to educate their trustees and staff about the benefits of open data – such as the use of standards and platforms including 360 Giving.[10]

From The Press

We need those in the press to participate in the conversations about open data, to get involved with civil society groups, to challenge government and be involved with the broader community. We need a press which understands the potential and power of open data and helps to share the case for it and stories about it.

From Civic Society

We need the public to understand what open data is and what it can potentially deliver. We can help by providing access to education, resources such as the Open Data Scotland portal, and success stories from using open data. Government and the education sector have a responsibility here too. We need the public to tell us about challenges and success stories. 

[1]  https://www.europeandataportal.eu/en/training/what-open-data

[2] https://opendatahandbook.org/guide/en/why-open-data/ 

[3] https://www.gov.scot/publications/open-data-strategy/ 

[4] https://www.gov.scot/policies/improving-public-services/open-government-partnership/ 

[5] https://www.gov.scot/publications/a-changing-nation-how-scotland-will-thrive-in-a-digital-world/  

[6] https://opendata.scot 

[7] https://codethecity.org/2019/11/15/scotlands-open-data-february-2019-an-update/ 

[8] https://bills.parliament.uk/bills/3340 

[9] https://twitter.com/jenit/status/1576140360820330496?s=61&t=nI2hETRy_yo0zmhc13ycCg 

[10] https://www.threesixtygiving.org/ 

Header Photo by Hannah Busing on Unsplash

The Od-Bods project: update from CTC24

Why did we run this project? 

Theoretically, with the 2015 Scottish Government commitment to data being “open by default”, we should have universal publication of appropriate data as open data. In reality Scotland is very poorly served with Open Data. Few local authorities publish any, and those who do have little consistency. Beyond councils the picture is, if anything, even worse. Finding data is all but impossible. We set out to make data more findable, identify who is making data available and, perhaps as importantly, those who are not. We began with local government.

This work is a starting point, not an end point. 

Work done before CTC24

  • We wrote this blog post about this project to accompany the work done at CTC23, the forerunner to this event. 

What we achieved at CTC24, what impact we hope it will have

What challenges we have faced/are currently facing?

  • As always, lack of (good) engagement with the public sector.
  • There is no standardisation in how and where local government publishes its data.
  • Gathering data and cleaning it to output in a presentable format is currently a manual and laborious process.

What next – how can people get involved?

  • We have a page explaining the project, what our objectives are and what the plan is: https://opendata.scot/about/
  • There’s a big list of GitHub issues to be worked on here: https://github.com/OpenDataScotland/the_od_bods/issues 
  • The current milestone for Q1 2022 is to improve the data that we have gathered so far:
    • Fix known bugs with API calls
    • Tidy up inconsistent dataset tags
    • Identify and locate any missing data from the 32 local authorities which we haven’t found yet
    • Add more data features/metadata if possible

Join us at CTC25 to work on the project issues and work toward our next milestone! 

Aberdeen Plaques – Part Two

In part one I described what we did at CTC18 to capture data and images of Commemorative Plaques in Aberdeen, and what I then did in the following three weeks.

A few people asked me why we would bother to put plaques into Wikidata and WikiCommons in this way. Why not have a council website – or why not use Open Plaques?

In this second instalment I am going to demonstrate how we can use the data which we have created to make some interesting visualisations and even do some calculations and analysis.

It can also power other new apps and services – allowing developers to create tailored routes around the city, on themes such as the arts or medicine – which is beyond the scope of this post.

Getting Started

At the time of writing we now have 132 Aberdeen Commemorative Plaques recorded in Wikidata.

I can check that with this simple query on the Wikidata Query Service:

Plaques – Query One

All that this does is ask for every instance (P31) of a commemorative plaque (Q721747) which is located in (P131) the Aberdeen City (Q62274582) area.
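The query itself appeared as a screenshot, which is not reproduced here. As a minimal sketch of a query matching that description (assuming the wd: and wdt: prefixes that the Wikidata Query Service predeclares, and with ?plaque as an illustrative variable name), it could look like this:

```sparql
# Every item that is a commemorative plaque located in the Aberdeen City area
SELECT ?plaque
WHERE {
  ?plaque wdt:P31 wd:Q721747 ;      # instance of (P31): commemorative plaque
          wdt:P131 wd:Q62274582 .   # located in (P131): Aberdeen City
}
```

The semicolon lets both statements share the same subject, ?plaque, so each result has to satisfy both conditions.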

Try It for yourself.

Click on the white-on-blue arrow at the left. See what it produces. Note the bottom half of the screen turns into a table of results, and on the centre bar there is a message ‘xxx results in xxxx milliseconds’.

How many pictures of plaques?

I can retrieve the photograph for each plaque using the following query.

Plaques – Query Two

Here I am saying give us plaques which have an image (P18). In effect this is saying ONLY those that have an image. If not all entries have an image yet, then we will get a smaller number.
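A sketch of what such a query might look like, assuming the Wikidata Query Service's predeclared prefixes and illustrative variable names (?plaque, ?image); the original screenshot may have differed in layout:

```sparql
# Plaques in Aberdeen City that have an image statement
SELECT ?plaque ?image
WHERE {
  ?plaque wdt:P31 wd:Q721747 ;      # instance of: commemorative plaque
          wdt:P131 wd:Q62274582 ;   # located in: Aberdeen City
          wdt:P18 ?image .          # image (P18) is required, so plaques without one drop out
}
```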

Try it.

As I run it I get 126 – which is six fewer than the number of plaques.

Get all plaques with images or not

Let’s modify the query to this.

Plaques – Query Three

Here I am using the OPTIONAL keyword, which has the effect of saying IF there is an image give me it, but don’t restrict the results to only those with images. When we run that we can spot the missing ones by scrolling down through the list. I get six plaques with no images. This is a useful technique to spot missing things when totals (in this case plaques and images) don’t tally.
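As a hedged sketch of that change (same assumptions: predeclared prefixes, illustrative variable names), the image statement moves inside an OPTIONAL block:

```sparql
# All plaques in Aberdeen City, with an image where one exists
SELECT ?plaque ?image
WHERE {
  ?plaque wdt:P31 wd:Q721747 ;      # instance of: commemorative plaque
          wdt:P131 wd:Q62274582 .   # located in: Aberdeen City
  OPTIONAL { ?plaque wdt:P18 ?image . }   # ?image stays empty when no image is recorded
}
```

Rows where the ?image column is blank are the plaques still missing a photograph.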

Try it.

Commemorating who or what?

As it stands the query is still not very user-friendly as all we have for the plaques is their Plaque ID. Of course we can click on those, but it would be more helpful to have the names of their subjects.

We’ll do that in two steps.

Firstly, let’s work out what the subjects are.

We can add the following line to the query and remember to add ?subject to the SELECT on the first line.

 ?plaque wdt:P547 ?subject

Note P547 is the statement “commemorates”.

Try it

If we run that we get a new column called subject and it is filled with links to subject IDs, which are the Wikidata entries for either people or things that the plaques commemorate. I note that when I run it my list has grown from 132 to 134.

Any guesses why that should be?

Some of the plaques commemorate more than one person.

Let’s make it a bit more friendly.

Add the following line just before the end of your query

 SERVICE wikibase:label {bd:serviceParam wikibase:language "en". }

And change ?subject to ?subjectLabel in the first line.

This instructs the Wikidata Query Service to use another service to retrieve labels from the items.
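Putting those two additions together, a sketch of the full query might read as follows. It assumes the predeclared Wikidata Query Service prefixes (including wikibase: and bd:) and keeps the optional image from the earlier step; the exact column list in the original screenshot is not known.

```sparql
# Plaques in Aberdeen City with the English label of whatever they commemorate
SELECT ?plaque ?subjectLabel ?image
WHERE {
  ?plaque wdt:P31 wd:Q721747 ;      # instance of: commemorative plaque
          wdt:P131 wd:Q62274582 ;   # located in: Aberdeen City
          wdt:P547 ?subject .       # commemorates (P547)
  OPTIONAL { ?plaque wdt:P18 ?image . }
  # The label service fills in ?subjectLabel for the ?subject variable
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
```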

Plaques – Query Four

The label is in effect the title of the Wikidata item. Look at this one: https://www.wikidata.org/wiki/Q80818579. Immediately below the title, and to the left, there is an edit link. Click that. See how the ‘label’ and the ‘description’ immediately below it become editable. Cancel that for now.

Try running that query to get subject names (labels) back

Now we have a name (in a subjectLabel column) for who or what is being commemorated.

Which provosts have plaques?

We can ask which of our plaques commemorates a previous Lord Provost of Aberdeen.

We use the P547 (commemorates) statement to get our subject, then use the following

 ?subject wdt:P39 wd:Q57906938 .

where P39 is Position Held, and Q57906938 is the identifier for Lord Provost of Aberdeen.
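A minimal sketch of the whole query, under the same assumptions as before (predeclared prefixes, illustrative variable names), might be:

```sparql
# Plaques in Aberdeen City whose subject held the position of Lord Provost of Aberdeen
SELECT ?plaque ?subjectLabel
WHERE {
  ?plaque wdt:P31 wd:Q721747 ;      # instance of: commemorative plaque
          wdt:P131 wd:Q62274582 ;   # located in: Aberdeen City
          wdt:P547 ?subject .       # commemorates
  ?subject wdt:P39 wd:Q57906938 .   # position held (P39): Lord Provost of Aberdeen
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
```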

Plaques – provosts?

Currently we appear to have four plaques to former Lord Provosts.

Note: the “Try it” link below has been updated to take account of subsequent work done to separate Provosts and Lord Provosts into separate categories.

Try it

A different view

At this point you might want to change the view for your query just to have a look at the images we have.

Above the table of results, on the extreme left there is an eye symbol and a drop down. Choose “Image Grid” to see the images only.

Plaques – change view

You might also have noticed that there are other options, several of which are greyed out as we don’t yet have that data in our query. These views include ‘Map’ and ‘Timeline’. We’ll come back to those.

Our Image Grid looks something like this:

Plaques – Image Grid

Remember to swap back to ‘Table’ view once you’ve finished.

Adding more data fields

We can now add more data fields to our query.

Firstly, let’s add the geographic coordinates of the plaques’ locations.

Add the following line to your code:

 OPTIONAL {?plaque wdt:P625 ?coordinates .}

and, again, add the new variable, ?coordinates, to the first line of the query too.

You will now have an extra field in the returned data table.

Try it 

Mapping results

Now change the view from Table to Map. The Wikidata query service automatically uses the coordinates to plot the results on a map which is scaled to show the results. You may need to scroll down to see all of the map. Click on one of the plotted points. You should get a pop up with the name of the person or building commemorated, plus a photo of the plaque itself, as shown below.

Plaques – map view

Note – if you add the following as the first line of your query, it will default to a map view rather than table when first run.

#defaultView:Map
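For reference, a sketch of a complete query with the coordinates and the default map view in place could look like this. It assumes the predeclared Wikidata Query Service prefixes, and the variable names are illustrative.

```sparql
#defaultView:Map
# Plaques in Aberdeen City with subject name, photo and coordinates
SELECT ?plaque ?subjectLabel ?image ?coordinates
WHERE {
  ?plaque wdt:P31 wd:Q721747 ;      # instance of: commemorative plaque
          wdt:P131 wd:Q62274582 ;   # located in: Aberdeen City
          wdt:P547 ?subject .       # commemorates
  OPTIONAL { ?plaque wdt:P18 ?image . }          # photo of the plaque, if any
  OPTIONAL { ?plaque wdt:P625 ?coordinates . }   # coordinate location (P625), if any
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
```

Plaques with no recorded coordinates still appear in the table view but cannot be plotted on the map.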

Now let’s see if we can get more data for the people for whom there are plaques.

Dates of birth and death

We can change our query to find out if there are dates of birth and death for our human subjects  (rather than buildings).

We can use P569 (date of birth) and P570 (date of death) and ascribe those to ?DOB and ?DOD respectively – again, adding those fields to our SELECT statement on line one. Your query should look like this:
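The screenshot is not reproduced here, so the following is only a sketch of a query along those lines, assuming the predeclared prefixes and that both dates are wrapped in OPTIONAL:

```sparql
# Plaques with subject names and, where the subject has them, dates of birth and death
SELECT ?plaque ?subjectLabel ?DOB ?DOD ?image ?coordinates
WHERE {
  ?plaque wdt:P31 wd:Q721747 ;      # instance of: commemorative plaque
          wdt:P131 wd:Q62274582 ;   # located in: Aberdeen City
          wdt:P547 ?subject .       # commemorates
  OPTIONAL { ?subject wdt:P569 ?DOB . }   # date of birth of the subject
  OPTIONAL { ?subject wdt:P570 ?DOD . }   # date of death of the subject
  OPTIONAL { ?plaque wdt:P18 ?image . }
  OPTIONAL { ?plaque wdt:P625 ?coordinates . }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
```

Note that the dates hang off ?subject, not ?plaque: it is the person who has a date of birth, not the plaque.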

Plaques – Query Five

Try it

Looking at our table of results we can see that we have a mix of types of results – people, bridges, buildings etc. but only the people have dates.

Table showing dates of birth

Interestingly the one subject with the DOB and DOD in the screenshot above is Elizabeth Crombie Duthie who gifted Duthie Park to the city of Aberdeen.

Remember, if you change the DOB and DOD from being OPTIONAL to just being regular requests, you can filter records to show ONLY those with dates associated with them, which will screen out not only non-human subjects but also any people with incomplete or missing dates.

Notable people

It could be argued that the fact there is a plaque to a person would indicate that they are notable, but not every person or object for which there is a plaque has a Wikipedia article. Let’s add some code to see which of our plaques has an associated article.
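One common way to do this on the Wikidata Query Service is to look for a sitelink using schema:about and schema:isPartOf. The sketch below assumes that approach and assumes English Wikipedia is the site of interest; the original screenshot may have used a different form.

```sparql
# Plaques with subject names and a link to the subject's English Wikipedia article, if any
SELECT ?plaque ?subjectLabel ?article
WHERE {
  ?plaque wdt:P31 wd:Q721747 ;      # instance of: commemorative plaque
          wdt:P131 wd:Q62274582 ;   # located in: Aberdeen City
          wdt:P547 ?subject .       # commemorates
  OPTIONAL {
    ?article schema:about ?subject ;                        # a page about the subject
             schema:isPartOf <https://en.wikipedia.org/> .  # hosted on English Wikipedia
  }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
```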

Plaques – Query Six

Try It

Changing the above so that we remove the OPTIONAL {} around the section beginning ?article, we get ONLY those with Wikipedia articles which is, as I run it, 79 plaque subjects.

If we want, we can also add the following

 ?subject wdt:P31 wd:Q5 .

Because this requires the subject’s P31 (instance of) to be Q5 (human), it screens out all of the non-people plaques.

Try it

At this point, try flipping the view to Timeline – you may have to scroll down quite a way to see all of the plaques. Many of them are concentrated at the right, spanning much of the 20th century. You should see John Barbour (1316-1395) at the extreme left.

Plaques – timeline

Finally, before we start doing some statistical analysis let’s try something more sophisticated.

Can we create a map showing only female subjects whose work was in the medical sciences?

To do that we need to select only subjects who have a P21 (sex or gender) of Q6581072 (female). Then we need to select an occupation (P106) which is an instance of (P31) Q66811410 (the medical profession) or of any of its subclasses (P279). This requires a structure that we haven’t seen before:

?occupation wdt:P31/wdt:P279* wd:Q66811410

While we are at it, let’s get an image of the subject if there is one, and find out if there is a Wikipedia article about the subject. And, since we want a map, we add that as our default view at the top.
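A sketch of how those pieces might fit together is given below. It assumes the predeclared Wikidata Query Service prefixes, uses illustrative variable names, and assumes English Wikipedia for the article lookup; SELECT DISTINCT is my addition to avoid repeated rows when a subject has several matching occupations.

```sparql
#defaultView:Map
# Plaques commemorating women whose occupation falls under the medical profession
SELECT DISTINCT ?plaque ?subjectLabel ?coordinates ?subjectImage ?article
WHERE {
  ?plaque wdt:P31 wd:Q721747 ;      # instance of: commemorative plaque
          wdt:P131 wd:Q62274582 ;   # located in: Aberdeen City
          wdt:P547 ?subject ;       # commemorates
          wdt:P625 ?coordinates .   # coordinates are required for the map
  ?subject wdt:P21 wd:Q6581072 ;    # sex or gender (P21): female
           wdt:P106 ?occupation .   # occupation (P106)
  # Property path: instance of, followed by zero or more subclass-of steps
  ?occupation wdt:P31/wdt:P279* wd:Q66811410 .
  OPTIONAL { ?subject wdt:P18 ?subjectImage . }   # portrait of the subject, if any
  OPTIONAL {
    ?article schema:about ?subject ;
             schema:isPartOf <https://en.wikipedia.org/> .
  }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
```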

Plaques – map of female medics

This gives us the following output:

Map view of female medics

Try it

Changing this query to male (Q6581097) or choosing different types of professions is straightforward.

Statistical analysis

The Wikidata Query Service allows us to move beyond visualising the data in different ways. Let’s have a look at a couple of examples.

Analysing who or what is commemorated

The following query finds out what the subject of the plaque is an instance of (P31) – line 6:

Plaque – query seven

but instead of creating a list, it uses the COUNT() function to count how many subjects there are for each instance of (P31) value.
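As a hedged sketch (predeclared prefixes assumed; ?type and ?count are illustrative names, and the line numbering of the original screenshot may differ), such a counting query could be written as:

```sparql
SELECT ?typeLabel (COUNT(?subject) AS ?count)
WHERE {
  ?plaque wdt:P31 wd:Q721747 ;      # instance of: commemorative plaque
          wdt:P131 wd:Q62274582 ;   # located in: Aberdeen City
          wdt:P547 ?subject .       # commemorates
  ?subject wdt:P31 ?type .          # what the subject is an instance of
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
GROUP BY ?typeLabel
ORDER BY DESC(?count)
```

A subject with two instance of statements contributes to two groups, which is where double counting comes from.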

Try it

We can see that we have 105 humans, 5 lanes etc. Note that some double counting occurs. Some structures, for example, are instances of two things.

We can also analyse the gender of the human subjects just by changing P31 in the above to P21 (Sex or Gender).

At present I get

Plaques by gender

That’s far from gender equality, isn’t it!

What’s in a name?

Ascertaining the most common first names on plaques is also straightforward.

We use the P735 (given name) statement, get the labels, then count and group by those.
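A minimal sketch of such a query, with the same assumptions as elsewhere (predeclared prefixes, illustrative variable names):

```sparql
# Count plaque subjects by given name, most common first
SELECT ?givenNameLabel (COUNT(?subject) AS ?count)
WHERE {
  ?plaque wdt:P31 wd:Q721747 ;      # instance of: commemorative plaque
          wdt:P131 wd:Q62274582 ;   # located in: Aberdeen City
          wdt:P547 ?subject .       # commemorates
  ?subject wdt:P735 ?givenName .    # given name (P735)
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
GROUP BY ?givenNameLabel
ORDER BY DESC(?count)
```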

Try it.

We get the following results

Plaques – given names chart

With 81% of plaques to people being for males it is hardly surprising that our league table of names begins with James, William, George, John, Alexander ….

We can do more sophisticated analysis too.

Analysing Occupations

We can add the following line to our query to get back the occupation of the subject of the plaque:

 ?subject wdt:P106 ?occupation

Bear in mind that many of our plaque subjects are true polymaths. Have a look at Robert Brown. He has 10 listed occupations!

So what are the most common occupations of those people for whom there are plaques? Any guesses?

Let’s use the following query:

Plaques – Using Count()

This uses the COUNT() function as well as a GROUP BY clause. The query looks at all of the different occupation labels and counts how many of each there are.
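Since the screenshot is not reproduced here, the following is a sketch of a query of that shape rather than the original, assuming the predeclared prefixes and using ?count as the name of the aggregated column:

```sparql
# Count plaque subjects by occupation
SELECT ?occupationLabel (COUNT(?subject) AS ?count)
WHERE {
  ?plaque wdt:P31 wd:Q721747 ;      # instance of: commemorative plaque
          wdt:P131 wd:Q62274582 ;   # located in: Aberdeen City
          wdt:P547 ?subject .       # commemorates
  ?subject wdt:P106 ?occupation .   # occupation (P106); one row per occupation held
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
GROUP BY ?occupationLabel
```

Because a person with several occupations produces several rows, the counts add up to more than the number of people.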

Try it

This returns, by default, a table of values. We can flip to a Bar Chart to make better sense of the data:

Plaques – Bar Chart of occupations

So, we can see that for those commemorated by a plaque the most common occupations are Physician, Painter, University Lecturer, Writer and so on.

We can add a couple of refinements if we wish. If we want our query to default to a BarChart when we run it we can add the following line at the start of the query:

#defaultView:BarChart

and if we want the table to be sorted by value we can add a line such as

ORDER BY DESC (?count)

Try it

What next?

Over the last month I’ve been busy gathering data, taking photographs and publishing all of those on Wikidata and Wiki Commons. That phase is not quite complete, if it ever could be considered complete. You can monitor live progress here.

There are a couple of photographs which I can’t easily take but which I know Aberdeen City Council’s Museum and Galleries team have. It would be great to see them make those available on Wiki Commons, just as I have shared the 148 plaque photos I have taken.

I know of at least 24 more plaques which I have photographed which are not listed yet in Wikidata.

When I published part one of this series I got some great feedback on Twitter. One suggestion was that we add structured data to the Wiki Commons pages for each photograph. Another was to add further data to the record for each plaque using statement P276 (location) where the plaque is on a known listed building. So far I have done that for 5 plaques – check it for yourself. There are loads more to do.
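
A query to list the plaques that already have a location statement could be sketched like this. Again, the plaque-selection lines (instance of commemorative plaque, Q721747; located in Aberdeen, Q36405) are my assumptions about the modelling.

```sparql
# Sketch: plaques that have a P276 (location) statement, with the building.
# Assumed modelling: P31 = Q721747, P131 = Q36405.
SELECT ?plaque ?plaqueLabel ?building ?buildingLabel
WHERE {
  ?plaque wdt:P31 wd:Q721747 ;
          wdt:P131 wd:Q36405 ;
          wdt:P276 ?building .     # location, e.g. a listed building
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
```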

Many of the people records that I have created in Wikidata are skeletal. They need more detail, photographs, biographical links etc. Similarly, given that people or places are noteworthy enough to merit a plaque, they should pass the notability test for Wikipedia, yet at least 68 plaque subjects have no Wikipedia entry.
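
Finding the subjects with no Wikipedia article is itself a query. A minimal sketch, assuming we only care about English Wikipedia and the same plaque modelling as before (instance of Q721747, located in Q36405, subject linked via P547):

```sparql
# Sketch: plaque subjects with no English Wikipedia article.
# Assumed modelling: P31 = Q721747, P131 = Q36405, P547 links plaque to subject.
SELECT DISTINCT ?subject ?subjectLabel
WHERE {
  ?plaque wdt:P31 wd:Q721747 ;
          wdt:P131 wd:Q36405 ;
          wdt:P547 ?subject .
  # Keep only subjects that have no sitelink to en.wikipedia.org
  FILTER NOT EXISTS {
    ?article schema:about ?subject ;
             schema:isPartOf <https://en.wikipedia.org/> .
  }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
```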

And plaques are just a start – an easy introduction to what is possible given, in this case, about 100 hours of work. While that was almost all done by one person, if we ran a Code The City weekend on a similar theme and similar sized challenge, six people could achieve the same over a weekend with a little coordination.

At Code The City, we’re about to start discussions with the local cultural institutions about setting up a more formal alliance for the city (shire?) to help shape how they use digital and data more effectively and grow volunteers with skills and tools to make that happen, which is an exciting note on which to finish this post! Watch this space, as they say.

Ian

Aberdeen Plaques – Part One

On Saturday 14th December 2019 we ran a one-day mini hack event. The idea behind it was for people to come along for a day to work on their side projects and, if they needed support, attempt to persuade others to assist them.

That’s what I did with my Aberdeen Plaques project: something I’d had on the back burner for more than a year.

Why do it?

The commemorative plaques which are dotted around the city are a perfect candidate for open data. They have a subject, usually some dates, are located somewhere, and are of different types etc. Making that all available as open data would open up a whole range of possibilities.

Some Aberdeen plaques

If we captured all of that well then we could do analysis on the data (ratio of women to men, most represented professions), create walking routes (maybe one for the arts, one for the sciences and so on), create timelines to see what periods are more represented.

Having recently trained as a Wikimedia UK trainer, and having experimented with some of the tools (Wiki Commons, Wikidata, Wikipedia, Histropedia), I was convinced that these were the right way to go.

Pre-event prep

So, I’d done a bit of prep in the two weeks running up to the hack day itself.

I’d created a spreadsheet which recorded:
* the subject (person or ‘thing’)
* the gender, if known
* the link to the now-retired city council plaques system (hidden from public view)
* the location, if known
* the geo coordinates (to be determined)
* whether the subject had a Wikipedia page (tbd)
* whether there was an image of the plaque on Wiki Commons (tbd)
* whether the subject of the plaque was represented on Wikidata (tbd)
* any identifiers on Open Plaques (tbd)
* any external links (e.g. to Flickr for photos)

I’d then populated some of the data (eg whether there were images of the plaque on Wiki Commons) as well as some other bits. But most cells were blank.

Pre-event spreadsheet

As a keen walker and photographer I had also photographed and uploaded seventeen plaque images to Wiki Commons in the lead up, so that we would have some images to work with.

How to use our time most effectively on the day?

Our aim for the day was then to find out what data / info / images existed, fill in the gaps, and explore how to use Wikidata to store and retrieve data, and how we could potentially create maps, timelines and similar new products.

What we did on the day

At the start of the event we pitched our project ideas, and I managed to persuade five others (Angela, Mike, Stephen, James and Steve) to join me in working on the plaques project.

Angela and Mike, and later Angela and Stephen, would go out and take photographs. Steve, James and I would work on the data capture: completing research on what existed, creating new entries for the data on Wikidata, and testing queries on the Wikidata Query Service.

How we did it

We used the spreadsheet that I had set up to capture all of the data we’d gathered – and as it evolved it would show progress as well as what was still lacking. We had no expectations that we would do it all on the day, but we could pick away at it in future weeks and months.

In the run-up to the event I’d discovered The Pingus’ album of plaques photographs on Flickr. Sadly these had not been published with a licence that would allow us to use them. I’d sent a request, a few days before CTC18, for them to change the licence for the Aberdeen plaques pictures to a CC-SA one. This would have allowed our republishing on Wiki Commons. Sadly it didn’t elicit a response. But the album did show that there were many more plaques than the old ACC system listed. And it was possible to get co-ordinates from them. So the number of plaques to deal with kept growing.

During the day James filled in loads of gaps in which subjects were on Wikipedia and which on Wikidata.

Steve and I experimented with capturing and querying the data. Structuring that in a way that aids retrieval through the Wikidata Query Service was an iterative process. Firstly I tried adding a ‘commemorative plaque image’ (P1801) statement to the Wikidata record for the subject, as you can see in this first example: https://www.wikidata.org/wiki/Q2095630. But that limited what we could do.

So, we discovered that we could create a new object which was an instance of commemorative plaque. Our first attempt was https://www.wikidata.org/wiki/Q78438703 and we evolved what we captured there by adding statements. Steve discovered the ‘OpenPlaques plaque ID’ (P1893) property. Incidentally, we also tried ‘OpenPlaques subject ID’ (P1430), but adding that to the plaque object throws an error: it should be added to the person record, not the plaque.
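
To illustrate how that structure can be queried, here is a minimal sketch that lists plaque objects with their Open Plaques plaque ID, and the subject’s Open Plaques subject ID where one exists. The QIDs for commemorative plaque (Q721747) and Aberdeen (Q36405), and the use of P547 (commemorates) to link a plaque to its subject, are assumptions rather than something we had settled on the day.

```sparql
# Sketch: plaque objects and where each Open Plaques identifier lives.
# Assumed modelling: P31 = Q721747, P131 = Q36405, P547 links plaque to subject.
SELECT ?plaque ?plaqueLabel ?plaqueId ?subject ?subjectLabel ?subjectId
WHERE {
  ?plaque wdt:P31 wd:Q721747 ;
          wdt:P131 wd:Q36405 .
  OPTIONAL { ?plaque wdt:P1893 ?plaqueId . }      # ID sits on the plaque
  OPTIONAL {
    ?plaque wdt:P547 ?subject .
    OPTIONAL { ?subject wdt:P1430 ?subjectId . }  # ID sits on the person
  }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
```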

At the end of CTC18

We ended the day with

  • 138 plaques listed.
  • 57 sets of co-ordinates identified
  • 68 Wikipedia articles identified as matching plaque subjects (and eleven plaque subjects who had NO Wikipedia page)
  • 36 Images in WikiCommons
  • 77 WikiData entries for the subject of the plaques (existing or created)
  • 11 new wikidata entries for the plaques themselves

This was a great leap forward in one day and would pave the way for future work.

What next?

Since CTC18 ended, I’ve got firmly stuck into this project over the Christmas break. Over the last three weeks I have photographed over a hundred plaques (plenty of walking) and have created Wikidata entries for most plaques and for their subjects.

I’ll cover all of that, and how we can now use the data, in part two, coming soon.