CTC23 – The OD Bods

Introduction

This blog post was written to accompany the work of The OD Bods team at Code the City 23 – The Future of The City.

Open data has the power to bring about economic, social, environmental, and other benefits for everyone. It should be the fuel of innovation and entrepreneurship, and provide trust and transparency in government.

But there are barriers to delivering those benefits. These include:

  • Knowing who publishes data, and where,
  • Knowing what data is being published – and when that happens, and
  • Knowing under what licence (how) the data is made available, so that you can use it, or join it together with other agencies’ data.

In a perfect world we’d have local and national portals publishing or sign-posting data that we all could use. These portals would be easy to use, rich with metadata and would use open standards at their core. And they would be federated so that data and metadata added at any level could be found further up the tree. They’d use common data schemas with a fixed vocabulary which would be used as a standard across the public sector. There would be unique identifiers for all identifiable things, and these would be used without exception. 

You could start at your child’s school’s open data presence and get an open data timetable of events, or its own-published data on air quality in the vicinity of the school (and the computing science teacher would be using that data in classes). You could move up to a web presence at the city or shire level and find the same school data alongside other schools’ data; and an aggregation or comparison of each of their data. That council would publish the budget that they spend on each school in the area, and how it is spent. It would provide all of the local authority’s schools’ catchment areas or other LA-level education-specific data sets. And if you went up to a national level you’d see all of that data gathered upwards: and see all Scottish Schools and also see the national data such as SQA results, school inspection reports – all as open data.

But this is Scotland and it’s only six years since the Scottish Government published a national Open Data Strategy; one which committed that data publication would be open by default.

Looking at the lowest units – the 32 local authorities – only 10, or less than a third, even have any open data. Beyond local government, of the fourteen health boards none publishes open data, and we note that of the thirty Health and Social Care Partnerships only one has open data. Further, in 2020 it was found that of an assumed 147 business units comprising Scottish Government (just try getting data of what comprises what is in the Scottish Government) – 120 have published no data.

And, of course, there are no regional or national open data portals. Why would Scottish Government bother? Apart, that is, from that six-year-old national strategy and an EU report in 2020 from which it was clear that OD done well would benefit the Scottish economy by around £2.21bn per annum? Both of these are referred to in the Digital Strategy for Scotland 2021.

Why there is no national clamour around this is baffling. 

And despite there being a clear remit at Scottish Government for implementing the OD Strategy no-one, we are told, measures or counts the performance nationally. Because if you were doing this poorly, you’d want to hide that too, wouldn’t you? 

And, for now, there is no national portal. There isn’t even one for the seven cities, let alone all 32 councils. Which means there is:

  • no facility to aggregate open data on, say, planning, across all 32 councils. 
  • no way to download all of the bits of the national cycle paths from their custodians. 
  • no way to find out how much each spends on taxis etc or the amount per pupil per school meal. 

There is, of course, the Spatial Hub for Scotland, the very business model of which is designed (as a perfect example of the law of unintended consequences) to stifle the publication of open data by local government. 

So, if we don’t have these things, what do we have?

What might we expect?

What should we expect from our councils – or even our cities? 

Here are some comparators

Remember, back in about 2013, both Aberdeen and Edinburgh councils received funding from Nesta Scotland to be part of Code For Europe where they learned from those cities above. One might have expected that by now they’d have reached the same publication levels as these great European cities? We’ll see soon. 

But let’s be generous. Assume that each local authority in Scotland could produce somewhere between 100 and 200 open data sets. 

  • Scotland has 32 local authorities 
  • Each should be able to produce 100  – 200 datasets per authority  – say 150 average

= 150 x 32 = 4800 data sets.

The status quo

Over the weekend our aim was to look in detail at each of Scotland’s 32 local authorities and see which was publishing their data openly – to conform with the 2015 Open Data Strategy for Scotland. What did we find?

Our approach

As we’ve noted above there is no national portal. And no-one in Scottish Government is counting or publishing this data. So, following the good old adage, “if you want something done, do it yourself”, a few of us set about trying to pull together a list of all the open datasets for Scotland’s 7 cities and the other 25 authorities. For the naive amongst us, it sounded like an easy thing to do. But even getting started proved problematic. Why?

  1. Only some councils had any open data – but which?
  2. Only some of those had a landing page for Open Data. Some had a portal. Some used their GIS systems. 
  3. Those that did provide data used different categories. There was no standardised schema. 
  4. Others had a landing page, but additional datasets were then found elsewhere on their websites.
  5. Pages carried contradictory licence references – was the data open or not?

We also looked to see if there was already a central hub of sorts upon which we could build. We found reference to Open Data on the Scottish Cities Alliance website but couldn’t find any links to open data. 

Curiosity then came into play: why were some councils prepared to publish some data and others so reluctant? What was causing the reluctance? And for those publishing, why were not all datasets made open, and what was the reason for selecting the ones they had chosen?

What we did

Our starting point was to create a file to allow us to log the source of data found. As a group, we decided upon headers in the file, such as the type of file and the date last updated, to name but a few.

From previous CTC events which we attended we knew that Ian had put a lot of effort into creating a list of council datasets – IW’s work of 2019 and 2020 – which became our starting source. We also knew that Glasgow and Edinburgh were famous for having large, but very out of date, open data portals which were at some point simply switched off. 


We were also made aware of another previous attempt from the end of 2020 to map out the cities’ open data. The screenshot below (Fig 1) is from a PDF by Frank Kelly of DDI Edinburgh which compared datasets across cities in Scotland. You can view the full file here.

Fig 1 From an analysis of Scottish cities’ open data by Frank Kelly of DDI Edinburgh, late 2020 or early 2021

For some councils, we were able to pull in a list of datasets using the CKAN API. That worked best of all with a quick bit of scripting to gather the info we needed. If all cities, and other authorities did the same we’d have cracked it all in a few hours! But it appears that there is no joined up thinking, no sharing of best practices, no pooling of resources at play in Scotland. Surely COSLA, SCA, SOCITM and other groups could get their heads together and tackle this? 
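As a rough illustration of the kind of scripting involved, here is a minimal Python sketch that lists dataset names from a CKAN portal using CKAN's standard package_list action. It is not the script we used on the day, and the portal address in the usage note is a placeholder rather than one of the portals we surveyed.

```python
import json
import urllib.request


def package_list_url(portal_base_url):
    """Build the URL for CKAN's package_list action on a given portal."""
    return portal_base_url.rstrip("/") + "/api/3/action/package_list"


def parse_package_list(payload):
    """Return the dataset names from a decoded package_list response."""
    if not payload.get("success"):
        raise ValueError("CKAN reported a failed request")
    return list(payload["result"])


def fetch_dataset_names(portal_base_url, timeout=30):
    """Fetch dataset names from a live portal (needs network access)."""
    url = package_list_url(portal_base_url)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return parse_package_list(json.load(response))


# Offline demonstration with a hand-made response in CKAN's response shape.
# The dataset names are invented for illustration.
sample_response = {"success": True, "result": ["air-quality", "school-catchments"]}
dataset_names = parse_package_list(sample_response)
```

Against a real portal you would call something like fetch_dataset_names("https://data.example.org"), swapping in the council's own portal address.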

For others there were varying degrees of friction. We could use the ArcGIS API to gather a list of data sets. But the ArcGIS API tied us up in knots trying to get past the sign-in process, i.e. did we need an account or could we use it anonymously – it was difficult to tell. Luckily with an experienced coder in our team we were able to make calls to the API and get responses – even if these were verbose and needed manual processing afterwards. This post from Terence Eden “What’s your API’s “Time To 200”?” is really relevant here! 

For the rest it was a manual process of going into each city/council website and listing files. With three of us working on it for several hours, we succeeded in pulling together the datasets from the different sources into our csv file.

One council was trying to publish open data, but the quality and the up-to-date-ness were questionable

Ultimately, the sources were so varied and difficult to navigate that it took 5 digitally-skilled individuals a full day, that is 30 man-hours, to pull this data together. Yet if we have missed any, as we are sure to have done, it may be because they have moved or are hidden away. Let us know if there are more. 

From this output it became clear that there was no consistency in the types of files in which the data was being provided and no consistency in the refresh frequency. This makes it difficult to see a comprehensive view in a particular subject across Scotland (because there are huge gaps) and makes it difficult for someone not well versed in data manipulation to aggregate datasets, hence reducing usability and accessibility. After all, we want everyone to be able to use the data and not put barriers in the way.

We have a list, now what?

We now had a list of datasets in a csv file, so it was time to work on understanding what was in it. Using Python in Jupyter Notebooks, graphs were used to analyse the available datasets by file type, the councils which provided it, and how the data is accessed. This made it clear that even among the few councils which provide any data, there is a huge variation in how they do that. There is so much to say about the findings of this analysis, that we are going to follow it up with a blog post of its own.

Unique Datasets by Council
Unique datasets by council and filetype
Average filetypes provided for each data set by Council
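To give a flavour of that analysis, the following is a small sketch using only Python's standard library rather than the notebook code itself. The column names (council, dataset, filetype) and the sample rows are invented for illustration; our real csv file had more headers.

```python
import csv
import io
from collections import defaultdict

# Invented sample rows standing in for the real csv file.
SAMPLE_CSV = """council,dataset,filetype
Council A,Air quality,CSV
Council A,Air quality,JSON
Council A,School catchments,SHP
Council B,Car parks,CSV
"""


def summarise(csv_text):
    """Count unique datasets, and average file types per dataset, by council."""
    datasets = defaultdict(set)  # council -> unique dataset titles
    formats = defaultdict(set)   # council -> unique (dataset, filetype) pairs
    for row in csv.DictReader(io.StringIO(csv_text)):
        datasets[row["council"]].add(row["dataset"])
        formats[row["council"]].add((row["dataset"], row["filetype"]))
    return {
        council: {
            "unique_datasets": len(titles),
            "avg_filetypes_per_dataset": len(formats[council]) / len(titles),
        }
        for council, titles in datasets.items()
    }


summary = summarise(SAMPLE_CSV)
```

Using sets means that a dataset listed twice with the same file type is only counted once, which matters when the same file is linked from more than one page.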

One of our team also worked on creating a webpage (not currently publicly-accessible) to show the data listings and the graphs from the analysis. It also includes a progress bar to show the number of datasets found against an estimated number of datasets which could be made available – this figure was arbitrary but based on a modest expectation of what any local authority could produce. As you saw above, we set this figure much lower than we see from major cities on the continent.
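The arithmetic behind that progress bar is simple. Here is a sketch, using the estimate of 150 datasets per authority from earlier in this post; the count of datasets found is a made-up placeholder, not our actual total, and the function is illustrative rather than taken from the webpage's code.

```python
AUTHORITIES = 32
EXPECTED_PER_AUTHORITY = 150  # midpoint of the 100 to 200 range assumed above


def progress(found, authorities=AUTHORITIES, per_authority=EXPECTED_PER_AUTHORITY):
    """Return (target, percentage of target reached), capped at 100%."""
    target = authorities * per_authority
    return target, min(100.0, 100.0 * found / target)


target, percent = progress(found=480)  # 480 is a placeholder count
```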

What did we hope to achieve?

A one stop location where links to all council datasets could be found. 

Consistent categories and tags such that datasets containing similar data could be found together. 

But importantly we wanted to take action – no need for plans and strategies, instead we took the first step.

What next?

As we noted at the start of this blog post, Scotland’s approach to Open Data is not working. There is a widely-ignored national strategy. There is no responsibility for delivery, no measure of ongoing progress, no penalty for doing nothing and some initiatives which actually work against the drive to get data open. 

Despite the recognised economic value of open data – which is highlighted in the 2021 Digital Strategy but was also a driver for the 2015 strategy! – we still have those in government asking why they should publish and looking specifically to Scotland (a failed state for OD) for success stories rather than overseas. 

We’ve seen closed APIs being used, we assume, to try to measure use. We suspect the thinking goes something like this:

A common circular argument

In order for open data to be a success in Scotland we need it to be useful, usable, and used. 

Useful

That means the data needs to be geared towards those who will be using it: students, lecturers, developers, entrepreneurs, data journalists, infomediaries. Think of the campaign in 2020 led by Ian to get Scottish Government to publish Covid data as open data, and what has been made of it by Travelling Tabby and others to turn raw data into something of use to the public.

Usable

The data needs to be findable, accessible, and well structured. It needs to follow common standards for data and the metadata. Publishers need to collaborate – coordinate data releases across all cities, all local authorities. ‘Things’ in the data need to use common identifiers across data sets so that they can be joined together, but the data needs to be usable by humans too. 

Used

The data will only be used if the foregoing conditions are met. But government needs to do much more to stimulate its use: to encourage, advertise, train, fund, and invest in potential users. 

The potential GDP rewards for Scotland are huge (est £2.21bn per annum) if done well. But that will not happen by chance. If the same lacklustre, uninterested, unimaginative mindsets are allowed to persist; and no coordination applied to cities and other authorities, then we’ll see no more progress in the next six years than we’ve seen in the last. 

While the OGP process is useful, bringing a transparency lens to government, it is too limited. Government needs to see this as the economic issue it is, and one which the current hands-off approach is failing. We also need civic society to get behind this, be active, visible, militant and hold government to account. What we’ve seen so far from civic society is at best complacent apathy. 

Scotland could be great at this – but the signs, so far, are far from encouraging!

Team OD Bods (Karen, Pauline, Rob, Jack, Stephen and Ian)

Swift use of Doric Place Names

Introduction

One of the Code the City 21 projects was looking at providing Scots translations of Aberdeenshire place names for displaying on an OpenStreetMap map. The outcomes for that project included a list of translated place names and, potentially, an audio version of each name as a guide to pronunciation.

I’m a firm believer that Open Data shouldn’t just become “dusty data left on the digital shelf” and to “show don’t tell”. This led me to decide to show just how easy it is to do something with the data created as part of the weekend’s activities and to make use of outcomes from a previous CTC event (Aberdeenshire Settlements on Wikidata and Wikipedia) and thus take that data off the digital shelf.

My plan was to build a simple iOS app, using SwiftUI, that would allow the following:

  • Listing of place names in English and their Scots translation
  • View details about a place including its translation, location and photo
  • Map showing all the places and indicating if a translation exists or not

I used SwiftUI as it is fun (always an important consideration) to play with and quick to get visible results. It also provides the future option to run the app as a Mac desktop app.

Playing along at home

Anyone with a Mac running at least Catalina (macOS 10.15) can install Xcode 12 and run the app on the Simulator. The source code can be found in GitHub.

Getting the source data

Knowing that work had previously been done on populating Wikidata with a list of Aberdeenshire Settlements and providing photos for them, I turned to Wikidata for sourcing the data to use in the app.

# Get list of places in Aberdeenshire, name in English and Scots, single image, lat and long

 
SELECT  ?place (SAMPLE(?place_EN) as ?place_EN) (SAMPLE(?place_SCO) as ?place_SCO) (SAMPLE(?image) as ?image) (SAMPLE(?longitude) as ?longitude)  (SAMPLE(?latitude) as ?latitude)
  WHERE {
    ?place wdt:P31/wdt:P279* wd:Q486972 .
    ?place wdt:P131 wd:Q189912 .
    ?place p:P625 ?coordinate.
    ?coordinate psv:P625 ?coordinate_node .
    ?coordinate_node wikibase:geoLongitude ?longitude .
    ?coordinate_node wikibase:geoLatitude ?latitude .
    OPTIONAL { ?place wdt:P18 ?image }.
    OPTIONAL { ?place rdfs:label ?place_EN filter (lang(?place_EN) = "en" )}.
    OPTIONAL { ?place rdfs:label ?place_SCO filter (lang(?place_SCO) = "sco" )}.
    }
GROUP BY ?place
ORDER By ?place_EN

The query can be found in the CTC21 Doric Tiles GitHub repository and run via the Wikidata Query Service.

The query returned a dataset that consisted of:

  • Place name in English
  • Place name in Scots (if it exists)
  • Single image for the place (some places have multiple images so had to be restricted to single image)
  • Latitude of place
  • Longitude of place

Just requesting the coordinate for each place resulted in a text string, such as Point(-2.63004 57.5583), which complicated the use later on. Adding the relevant code

?coordinate psv:P625 ?coordinate_node .
?coordinate_node wikibase:geoLongitude ?longitude .
?coordinate_node wikibase:geoLatitude ?latitude .

to the query to generate latitude and longitude values simplified the data reuse at the next stage.
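To see why the plain coordinate string is awkward, consider what an app would otherwise have to do with a value such as Point(-2.63004 57.5583). The sketch below is in Python purely for illustration (the app itself is Swift) and the function name is my own. Note that the string puts longitude first, which is an easy thing to get wrong.

```python
import re

# Matches strings of the form "Point(<longitude> <latitude>)".
POINT_PATTERN = re.compile(r"^Point\((-?\d+(?:\.\d+)?) (-?\d+(?:\.\d+)?)\)$")


def parse_point(text):
    """Split a 'Point(longitude latitude)' string into (latitude, longitude)."""
    match = POINT_PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"Unrecognised coordinate string: {text!r}")
    longitude, latitude = (float(value) for value in match.groups())
    return latitude, longitude


latitude, longitude = parse_point("Point(-2.63004 57.5583)")
```

Asking the query for separate latitude and longitude values removes the need for this kind of string handling altogether.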

The results returned by the query were exported as a JSON file that could be dropped straight into the Xcode project.

The App

SwiftUI allows data driven apps to be quickly pulled together. The data powering the app was a collection of Place structures populated with the contents of the JSON exported from Wikidata.

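import CoreLocation

// One settlement, decoded from the JSON exported from Wikidata
struct Place: Codable, Identifiable {
     let place: String        // Wikidata entity URI
     let place_EN: String
     let place_SCO: String?   // nil when no Scots label exists
     let image: String?       // nil when the place has no photo
     var latitude: String
     var longitude: String
     
     // Computed Property
     var id: String { return place }
     // The force-unwraps assume the exported values are always numeric strings
     var location: CLLocationCoordinate2D {
         CLLocationCoordinate2D(latitude: Double(latitude)!, longitude: Double(longitude)!)
     }
 }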

The app itself was split into three parts: Places list, Map, Settings. The Places list drills down to a Place details view.

List view of Places showing English and Scots translation.
List of places in English and their Scots translation if included in the data
Details view showing place name, photo, translation and map.
Details screen about a place
Map showing places and indication if they have been translated into Scots or not.
Map showing places and indicating if they have Scots translation (yellow) or not (red)

The Settings screen just displays some about information and where the data came from. It acts partially as a placeholder for now with the room to expand as the app evolves.

Next Steps

The app created over the weekend was very much a proof of concept and so has room for many improvements. The list includes:

  • Caching the location photos on the device
  • Displaying additional information about the place
  • Adding search to the list and map
  • Adding audio pronunciation of names (the related Doric Tiles project did not manage to add audio during the CTC21 event)
  • Modifying the app to run on Mac desktop
  • Adding the ability to request an updated list of places and translations

The final item on the above list, the ability to request an updated list of places, is in theory straightforward. All that would be required is to send the query to the Wikidata Query Service and process the results within the app. The problem is that the query takes a long time to run (nearly 45 seconds) and there may be timeout issues before the results arrive.
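As a hedged sketch of what such a request might look like, here it is in Python rather than Swift, to keep it short. The endpoint is the public Wikidata Query Service SPARQL endpoint; the User-Agent string, the 60 second timeout and the function names are my own assumptions, and the query shown is a trimmed placeholder rather than the full one above.

```python
import json
import urllib.parse
import urllib.request

WDQS_ENDPOINT = "https://query.wikidata.org/sparql"


def build_request(sparql):
    """Build a GET request asking the query service for JSON results."""
    query_string = urllib.parse.urlencode({"query": sparql, "format": "json"})
    return urllib.request.Request(
        WDQS_ENDPOINT + "?" + query_string,
        # Illustrative User-Agent; a real client should identify itself properly.
        headers={"User-Agent": "doric-places-example/0.1 (illustrative)"},
    )


def run_query(sparql, timeout=60):
    """Run the query (needs network access).

    The timeout applies to each blocking network operation, so the call
    gives up if the server stays silent for longer than `timeout` seconds.
    """
    with urllib.request.urlopen(build_request(sparql), timeout=timeout) as response:
        return json.load(response)["results"]["bindings"]


# Placeholder query, much smaller than the one used for the app.
request = build_request("SELECT ?place WHERE { ?place wdt:P131 wd:Q189912 } LIMIT 5")
```

Setting the timeout comfortably above the 45 seconds the query currently takes, and telling the user that a refresh is in progress, would be one way to live with the slow response.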

Ten Years After

“Hear me calling, hear me calling loud, 
If you don’t come soon, I’ll be wearing a shroud.” – Ten Years After (1969)

Introduction

Today marks the tenth anniversary of my involvement with Open Data in Scotland. As I wrote here, back in 2009-2010 I’d been following the work that Chris Taggart and others were doing with open data, and was inspired by them to create what I now believe to have been the first open data published in the public sector in Scotland.

This piece is a reflection of my own views. These views may be the same as those held by colleagues at Code The City or indeed on the civic side of the Open Government Partnership. I’ve not specifically asked other individuals in either group.

While my involvement in, and championing of, open data in Scotland is now a decade long, my enthusiasm for the subject, and for the social and economic benefits it can deliver, is undiminished by my leaving the public sector in 2017 after thirty four years. In fact the opposite is true: the more I am involved in the OD movement, and study what is being achieved beyond Scotland’s narrow borders, the more I am convinced that we are a country intent on squandering a rich opportunity, regardless of our politicians’ public pronouncements.

But the journey has not been easy, primarily due to a lack of direction from Scottish Government and little commitment, resource or engagement at all levels of public service. A friend who reviewed this blog post suggested that I should replace the picture of a birthday cake (above) with one of a naked human back bearing bleeding scars from our battles. He’s right – it is STILL a battle ten years on.

It is not as if the position in Scotland is getting better. We are moving at a glacial pace. The gap between Scotland and other countries in this regard is widening. I gave a talk earlier this year in which I showed assessments of Scotland, Romania and Kenya’s performance in Open Government (source: https://www.opengovpartnership.org/campaigns/global-report/ Vol 2) and asked the audience to identify which was Scotland.

Extracts from https://www.opengovpartnership.org/campaigns/global-report/ Vol 2
Extracts from Vol 2 of the Open Gov Partnership report

Show full version of graphic

Question: Which is Scotland? (Answer)

Economic opportunities

In February 2020 the European Data Portal published a report – The Economic Impact of Open Data – which sets out a clear economic case for open data. That paper looks at 15 previous studies between 1999 and 2020 which have examined the market size of open data at national and international levels, measured in terms of GDP of each study’s geographical area.

Taking the average and median values from those reports (1.33% and 1.19% respectively) and an estimated GDP for Scotland (2018) of £170.4bn we can see that the missed opportunity for Scotland is of the order of £2.027bn to £2.266bn per annum. What is the actual value of the local market created by Scottish-created open data? If pushed for a figure I would estimate that it is currently worth a few hundred thousand pounds per annum, and no more. Quite a gap!
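For anyone who wants to check the arithmetic, here is the calculation as a few lines of Python. The percentages and the GDP figure are the ones quoted above; nothing else is assumed.

```python
SCOTLAND_GDP_BN = 170.4  # estimated GDP for Scotland (2018), in £bn

# Open data market size as a percentage of GDP, from the studies reviewed.
shares = {"median": 1.19, "average": 1.33}

# Convert each percentage share into £bn per annum.
opportunity_bn = {name: SCOTLAND_GDP_BN * pct / 100 for name, pct in shares.items()}

for name, value in opportunity_bn.items():
    print(f"{name}: £{value:.3f}bn per annum")
```

In round terms that is a little over £2bn a year at the median and nearer £2.3bn at the average.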

Meantime we have the usual suspects, the consultants, whispering sweetly in the ears of ministers, senior civil servants and council bosses that we should be monetising data, creating markets, selling it. There will be no mention, I suspect, of the heavily-subsidised, private sector led, yet failed Copenhagen Data Exchange. (Maybe they can make a few bob back selling the domain name!)

You can buy the failed CityDataExchange.com for just $5195

While this commercial approach to data may plug small gaps in annual funding for Scotland, and line the pockets of some big companies in the process, it won’t deliver the financial benefits at a national level of anything like the figures suggested by that EU Data Portal report but it will, in the process, actively hamper innovation and inhibit societal benefits.

I hear lots of institutions saying “we need to sell data” or “we need to sell access rights to these photos” or similar. Yet, in so many cases, the operation of the mechanisms of control; the staffing, administration, payment processing etc. far outstrips any generated income. When I challenged ex colleagues in local government about this behaviour their response was “but our managers want to see an income line”  to which we could add “no matter how much it is costing us.” And this tweet from The Ferret on Tuesday of this week is another excellent example of this!

I have also heard lots of political proclamations of “open and transparent” government in Scotland since 2014. Yet most of the evidence points in exactly the opposite direction. Don’t forget, when Covid 19 struck, Scotland’s government was reportedly the only political administration apart from Bolsonaro’s far right one in Brazil to use the opportunity to limit Freedom of Information.

Openness, really?

It is clear that there is little or no commitment to open data in any meaningful way at a Scottish Government level, in local authorities, or among national agencies. This is not to say that there aren’t civil servants who are doing their best, often fighting against political or senior administration’s actions. Public declarations are rarely matched by delivery of anything of substance and conversations with people in those agencies (of which I have had many) paint a grim picture of political masters saying one thing and doing another, of senior management not backing up public statements of intent with the necessary resource commitment and, on more than one occasion, suggestions of bad actors actually going against what is official policy.

I mention below that I joined the Open Government Partnership late in 2019. Initially I was enthusiastic about what we might achieve. While there are civil servants working dedicatedly on open government who want to make it work, I am unconvinced about political commitment to it. We really need to get some positive and practical demonstration that Scottish Government are behind us – otherwise I and the other civil society representatives are just assisting in an open-washing exercise.

In my view (and that of others) the press in Scotland does not provide adequate scrutiny and challenge of government. We have a remarkably ineffective political opposition. We also have a network of agencies and quangos which are reliant on the Scottish Government for funding who are unwilling to push back. All of this gives the political side a free pass to spout encouraging words of “open and transparent” yet do the minimum at all times.

We may have an existing Open Data Strategy for Scotland (2015) stating that Scotland’s data is “open by default”, yet my 2019 calculation was that over 95% of the data that could and should be open was still locked up. And there is little movement on fixing that.

We have many examples of agencies doing one thing and saying another, such as  Scottish Enterprise extolling the virtues of  Open Data yet producing none. Its one API has been broken for many months, I am told.

My good friends at The Data Lab do amazing work on funding MSc and PhD places, and providing funding for industrial research in the application of data science. Their mission is “to help Scotland maximise value from data …” yet they currently offer no guidance on open data, no targeted programme of support, no championing of open data at all, despite the widely-accepted economic advantages which it can deliver. There is the potential for The Data Lab to lead on how Scotland makes the most of open data and to guide government thinking on this!

All of this is not to pick on specific organisations, or hard working and dedicated employees within them. But it does highlight systemic failures in Scotland from the top of government downwards.

Fixing this is an enormous task: one which can only be done by the development of a fresh strategy for open data in Scotland, which is mandated for all public sector bodies, is funded as an investment (recognising the economic potential), and which is rigorously monitored and enforced.

I could go on…. but let’s look at this year’s survey.

(skip to summary)

Another year with what to show for it?

In February 2019 I conducted a survey of the state of open data in Scotland. It didn’t paint an encouraging picture. The data behind that survey has been preserved here. A year on, I started thinking about repeating the review.

In the intervening year I’d been involved in quite a bit of activity around open data. I had

  • joined the civic side of the Open Government group for Scotland and was asked to lead for the next iteration of the plan on Commitment Three (sharing information and data),
  • joined the steering group of Stirling University’s research project, Data Commons Scotland,
  • trained as a trainer for Wikimedia UK, delivering training in Wikidata, Wikipedia and Wiki Commons, and running multiple sessions for Code The City with a focus on Wikidata,
  • created an open Slack Group  for the open data community in Scotland to engage with one another,
  • created an Open Data Scotland twitter account which has gained almost 500 followers, and
  • initiated the first Scottish Open Data Unconference (SODU) 2020 which had been scheduled to take place as a physical event in March this year. That has now been reconfigured as an online unconference which will happen on 5th and 6th September 2020.

In restarting this year’s review of open data publishing in Scotland my aims were to see what had changed in the intervening 12 months and to increase the coverage of the survey: going broader and deeper and developing an even more accurate picture. That work spilled into March at which point Covid-19 struck. During lockdown I was distracted by various pieces of work. It wasn’t until August, and with a growing sense of the imminence of this 10-year anniversary, that I was galvanised to finish that review.

I am conscious that the methodology employed here is not the cleverest – one person counting only the numbers of datasets produced.  This is something I return to later.

The picture in 2020

I broke the review down into sectoral groupings to make it more manageable to conduct. By sticking to that I hope to make this overview more readable. The updated GitHub repo in which I noted my findings is available publicly, and I encourage anyone who spots errors or omissions to make a pull request to correct them. Each heading below has a link to the GitHub page for the research.

Overall there is little significant positive change. This is one factor which gives rise to concerns about government’s commitment to openness generally and open data specifically; and to a growing cynicism in the civic community about where we go from here.

Local Government

(Source data here)

I reviewed this area in February 2020 and rechecked it in August.  Sadly there has been no significant change in the publication of open data by local government in the eighteen months since I last reviewed this. More than a third of councils (13 out of a total of 32) still make no open data provision.

While the big gain is that Renfrewshire Council have launched a new data portal with over fifty datasets, most councils have shown little or no change.

Sadly the Highland Council portal, procured as part of the Scottish Cities Alliance’s Data Cluster programme at a cost of tens of thousands of pounds, has vanished. I don’t think it ever saw a dataset being added to it. Searching Highland Council’s website for open data finds nothing.

While big numbers of data sets don’t mean much by themselves, the City of Edinburgh Council has a mighty 236 datasets. Brilliant! BUT … none of them are remotely current. The last update to any of them was September 2019. Over 90% of them haven’t been updated since 2016 or earlier.

Similarly Glasgow, which has 95 datasets listed, has a portal which is repeatedly offline for days at a time. A portal which won’t load is useless.

Dundee, Perth and Stirling continue to do well. Their offerings are growing and they demonstrate commitment to the long-haul.

Aberdeen launched a portal, more than three years in the planning, populated it with 16 datasets and immediately let their open data officer leave at the end of a short-term contract. Some of their datasets are interesting and useful – but there was no consultation with the local data community about what they would find useful, or deliver benefits locally; all despite multiple invitations from me to interact with that community at the local data meet-ups which I was running in the city.

It was hoped that the programme under the Scottish Cities Alliance would yield uniform datasets, prioritised across all seven Scottish cities, but sadly there is no sign of that happening. So what you find on all portals or platforms is pretty much a pot-luck draw.

Where common standards exist – such as the 360 Giving standard for the publication of support for charities – organisations should be universally adopting these. Yet this is only used by two of 32 authorities, all of whom have grant-making services. Surely, during a pandemic especially,  it would be advantageous to funders and recipients to know who is funding which body to deliver what project?

Councils – Open Government Licence and RPSI

This is a slight aside from the publication of open data, but an important one. If the Scottish Authorities were to adopt an OGL approach to the publication of data and information on their websites (as both the Scottish Government’s core site and the Information Commissioner for Scotland do) then we would be able to at least reuse data obtained from those sites. This is not a replacement for publishing proper open data but it would be a tiny step forward.

The table below (source and review data here)  shows the current permissions to reuse the content of Scottish Local Authorities’ websites. Many are lacking in clarity, have messy wording, are vague or misunderstand terminologies. They also, in the main, ignore legislation on fair re-use.

Table of local authority adoption of OGL and RPSI

Open Government Licence

The Scottish Government’s own site is excellent and clear: permitting all content except logos to be reused under the Open Government Licence. This is not true for local authorities. At present only Falkirk and Orkney Councils – two of the smaller ones – allow, and promote, OGL re-use of content. There is no good reason why all of the public sector, including local government, should not be compelled to adopt the terms of OGL.

Re-use of Public Sector Information (RPSI) Regulations

Since 2015 the public sector has been obliged by the RPSI Regulations to permit reasonable reuse of information held by local authorities. So, even if Scottish LAs have not yet adopted OGL for all website content, they should have been making it clear for the last five years how a citizen can re-use their data and information from their website.

In my latest trawl through the T&Cs and Copyright Statements of 32 Scottish Local Authorities, I found only 7 referencing RPSI rights there, with 25 not doing so (see the full table above). I am fairly sure that these authorities are breaking the legal obligation on public bodies to provide that information.

Finally, given the presence of COSLA on the Open Government Scotland steering group, the situation with no open data; poor, missing or outdated data; and OGL and RPSI issues needs to be raised there and some reassurance sought that they will work with their member organisations to fix these issues.

Health

(Source data here)

The NHS Scotland Open Data platform continues to be developed as a very useful resource. The number of datasets  there has more than doubled since last year (from 26 to 73).

None of the fourteen Health Boards publish their own open data beyond what is on the NHS Scotland portal.

Only one of the thirty Health and Social Care Partnerships (HSCPs) publishes anything resembling open data: Angus HSCP.

COVID-19 and open data

While we are on health, I wrote (here and here) early in the pandemic about the need for open data to support better public understanding of the situation, and to stimulate innovative responses to the crisis. The statistics team at Scottish Government responded well to this and we’ve started to develop a good relationship. I’ve not followed that up with a retrospective about what did happen. Perhaps I will in time.

It was clear that the need for open data in the Covid-19 situation caught government and the health sector napping. The response was slower than it should have been and patchy, and there are still gaps. People find it difficult to locate data when it is on multiple platforms, spread across Scots Govt, Health and NRS. That is, in a microcosm, one of the real challenges of OD in Scotland.

With an open Slack group for Open Data Scotland there is a direct channel that data providers could use to engage the open data community on their plans and proposals. They could also sound out what data analysts and dataviz specialists would find useful. That opportunity was not taken during the Covid crisis, and while I was OK in the short term with being used as a human conduit to that group, it was neither efficient nor sustainable. My hope is that post SODU 2020, and as the next iteration of the Open Gov Scotland plan comes together, we will see better, more frequent, direct engagement with the data community on the outside of Government, and a more porous border altogether.

Further and Higher Education

(Source data here)

There is no significant change across the sector in the past 18 months. The vast majority of institutions make no provision of open data. Some have vague plans, many of them historic – going back four years or more – and not acted on.

Lumping Universities and Colleges together, one might expect at a minimum properly structured and licensed open data from every institution on:

  • courses
  • modules
  • events
  • performance (perhaps some of this is on HESA and SFC sites?)
  • physical assets
  • environmental performance
  • KPI targets and achievements etc.

Of course, there is none of that.

Universities and colleges

I reviewed open data provision of Universities and Colleges around 17 February 2020. I revisited this on 11 August 2020, making minor changes to the numbers of data sets found.

While five of fifteen universities are publishing increasing amounts of data in relation to research projects, most of which are on a CC-0 or other open basis, there continues to be a very limited amount of real operational open data across the sector with loads of promises and statements of intent, some going back several years.

The Higher Education Statistics Agency publishes a range of potentially useful-looking Open Data under a CC-BY-4.0 licence. This is data about institutions, courses, students etc – and not data published by the institutions themselves. But I could identify none of that. Overall, this was very disappointing.

Further, while there are 20 FE colleges, none produces anything that might be classed as open data. Few have anything beyond a vague statement of intent. Perhaps City of Glasgow College comes closest, in that it does link to some sources of info and data.

The Crichton Observatory

While doing all of this, I was reminded of the Crichton Institute’s Regional Observatory which was announced to loud fanfares in 2013 and appears to have quietly been shut down in 2017. Two of the team involved say in their LinkedIn profiles that they left at the end of the project. Even the domain name to which articles point is now up for grabs (Feb 2020).

It now appears (Aug 2020) that the total initial budget for the project was more than £1.1m. Given that the purpose of the observatory was to amass a great deal of open data, I have also attempted to find out where the data that it collected is, and where the knowledge and learning arising from the project has been published for posterity. I can’t locate either. This FOI request may help. The big question: what benefits did the £1.1m+ deliver?

Scottish Parliament

(Source data here).

In February 2019 I found that The Scottish Parliament had released 121 data sets. This covers motions, petitions, Bills and other procedural data, and is very interesting. This year we find that they still have 121 data sets, so there are no new data sources.

In fact that number is misleading. In February 2020 I discovered that while 75 of these have been updated with new data, the remaining 46 (marked BETA) no longer work. As of August 2020 this is still the case. Why not fix them, or at worst clear them out to simplify the findability of working data?

Some of these BETA datasets should contain potentially more interesting / useful data, e.g. the Register of Members’ Interests, but they just don’t work, returning: [“{message: ‘Data is presently unavailable’}”]

I didn’t note the availability of APIs last year, but there are 186 API calls available. Many of these are year-specific. I tested half a dozen and about a third of those returned error messages. I suspect some of these align with the non-functioning historic BETAs.
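
A spot check like this can be scripted so that it is repeatable from one review to the next. What follows is a minimal sketch and not the method used for this review: the `fetch`, `classify`, `check_endpoints` and `failure_rate` helpers are hypothetical, any URLs passed in would need to be the real API calls, and the only detail taken from the Parliament’s responses is the “Data is presently unavailable” message quoted above.

```python
import urllib.error
import urllib.request

# The message returned by the non-working BETA datasets, as quoted above.
UNAVAILABLE_MARKER = "Data is presently unavailable"


def fetch(url, timeout=30):
    """Return (status_code, body_text) for a URL; network failures become status 0."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            body = response.read().decode("utf-8", errors="replace")
            return response.status, body
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 404 or 500.
        return err.code, ""
    except OSError as err:
        # Covers URLError (DNS failure, refused connection) and timeouts.
        return 0, str(err)


def classify(status, body):
    """Label one response as 'ok', 'unavailable' or 'error'."""
    if status != 200:
        return "error"
    if UNAVAILABLE_MARKER in body:
        return "unavailable"
    return "ok"


def check_endpoints(urls, fetcher=fetch):
    """Check each URL and return a dict mapping URL to its label."""
    return {url: classify(*fetcher(url)) for url in urls}


def failure_rate(results):
    """Share of endpoints that did not return usable data."""
    if not results:
        return 0.0
    failed = sum(1 for label in results.values() if label != "ok")
    return failed / len(results)
```

Passing the fetcher in as a parameter means the classification logic can be tried out against canned responses before it is pointed at the live API, and checking all 186 calls rather than half a dozen would give a firmer figure than “about a third”.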

Sadly the issues raised a year ago about the lack of clarity of the licensing of the data are unchanged. To find the licence, you have to go to Notes > Policy on Use of SPCB Copyright Material. Following the first link there (to a PDF) you see that you have to add “Contains information licenced under the Scottish Parliament Copyright Licence.” to anything you make with it, which is OK. But if you go to the second link “Scottish Parliament Copyright Licence” (another PDF) the wording (slightly) contradicts that obligation. It then has a chunk about OGL but says, “This Scottish Parliament Licence is aligned with OGLv3.0” whatever that means. Why not just license all of the data under OGL? I can’t see what they are trying to do.

Scottish Government

(Source data here)

Trying to work out the business units within the structure of Scottish Government is a significant challenge in itself. Attempting to then establish which have published open data, what those data sets are, and how they are licensed, is an almost impossible task. If my checking and arithmetic are right, then of 147 discrete business units, only 27 have published any open data and 120 have published none.

So we can say with some confidence  that the issue with findability of data raised in Feb 2019 is unchanged, there being no central portal for open data in the Scottish public sector or even for Scottish Government. Searching the main Scottish Government website for open data yields 633 results, none of which are links to data on the first four screenfuls. I didn’t go deeper than that.

The Scottish Government’s Statistics Team have a very good portal with 295 Data Sets from multiple organisational-providers. This is up by 46 datasets on last year and includes two new organisations: The Care Inspectorate and Registers of Scotland. The latter, so far (Aug 2020), has no datasets on the portal.

There are some interesting new entrants into the list of those parts of Scottish Government publishing data, such as David MacBrayne Limited which is, I believe, wholly owned by SG and is the parent, or operator, of Calmac Ferries Limited. On 1st March 2020 they released a new data platform to get data about their 29 ferry routes. This is very welcome. After choosing the dates, routes and traffic types you can download a CSV of results. While their intent appears to be to make it Open Data, the website is copyright and there is no specific licensing of the data. This is easily fixable.

It is also interesting to contrast Transport Scotland with work going on in England. Transport Scotland’s publication scheme says of open data “Open data made available by the authority as described by the Scottish Government’s Open Data Strategy and Resource Pack, available under an open licence. We comply with the guidance above when publishing data and other information to our website. Details of publications and statistics can be found in the body of this document or on the Publications section of our website.” I searched both without success for any OD. Why not say “we don’t publish any Open Data”? Compare this complete absence of open data with even the single project Open Bus Data for England. Read the story here. Scotland is yet again so far behind!

Summary

In the review of data I’ve shown that little has changed in 18 months. Very few branches of government are publishing open data at all. The landscape is littered with outdated and forgotten statements of good intent which are not acted on; broken links; portals that vanish or don’t work; out of date data; yawning gaps in publication and so on.

The claim of “Open By Default” in the current (2015) Open Data Strategy is misleading and mostly ignored without consequence. The First Minister may frequently repeat the mantra of “Open and Transparent” when speaking or questioned by journalists, but it is easily demonstrable that the administration frequently acts in the directly opposite way.

The recent situations with Covid-19 and the SQA exams results show Scotland would have found itself in a much better place this year with a mature and well-developed approach to open data: an approach one might have reasonably expected after five full years of “open by default”.

The social and economic arguments for open data are indisputable. These have been accepted by most other governments of the developed world. Importantly, they have also been taken up and acted on by developing nations who have in many cases overtaken Scotland in their delivery of their Open Government plans.

The approach I have taken in 2019 and in this review is not a sustainable one – i.e. one single volunteer monitoring the activity of every branch and level of government in Scotland. And the methodology is limited to what is achievable by an individual.

A country which was serious about Open Data would have targets and measures, monitoring and open reporting of progress.

  • It wouldn’t just count datasets published. It would be looking at engagement, the usefulness of data and its integration into education.
  • It would fund innovation: specifically in the use of open data; in the creation of tools; in developing services to both support government in creating data pipelines, and in helping citizens in data use.
  • It would co-develop and mandate the use of data standards across the public sector.
  • It would develop and share canonical lists of ‘things’ with unique identifiers allowing data sets to be integrated.
  • It would adopt the concept of data as infrastructure on which new products, services, apps, and insights could be built.

I really want Scotland to make the most of the opportunities afforded by Open Data. I wouldn’t have spent ten years at this if I didn’t believe in the potential this offers; nor if I didn’t have the evidence to show that this can be done. I wouldn’t be giving up my time year-on-year researching this, giving talks, organising groups and creating opportunities for engagement.

What is fundamentally lacking here is some honesty from Scottish Government ministers instead of their pretence of support for open data.

 

Ian Watt
20 August 2020

Link to an index of pieces I have written on Open Data:
http://watty62.co.uk/2019/02/open-data-index-of-pieces-that-i-have-written/

Answer to quiz

Scotland is B, in the centre. Kenya is A, and Romania C.
I could have chosen Mexico, Honduras, Paraguay, Uruguay – or others. All are doing better than Scotland.

Header Image by David Ballew on Unsplash.

Scotland’s Covid-19 Open Data

We are in unprecedented times. People are trying to make sense of what is going on around them and the demand for up-to-date, even up-to-the-minute, information is greater than ever before. Journalists, data scientists, immunologists, epidemiologists and others are looking for data to use to develop that information for the broader public, as well as to feed into predictive modelling. That means that governments and Health Services at all levels (UK and Scotland) need to be publishing that data quickly, consistently, and in a way that makes it easy for the data users to consume it. They need to look at best practice and quickly adopt those standards and approaches.

Let’s start with what this post is not. It is not a criticism of some very hard pressed people in NHS Scotland and Scottish Government who are trying very hard to do the right thing.

So, what is it? It is an honest suggestion of how the Scottish Government must adapt in how it publishes data on the most pressing issue of modern times.

The last five days

Last Sunday, 15th March, as the number of people in Scotland with Covid-19 started to climb (even if numbers were still low in comparison to other EU countries) I went looking for open data on which I could start to plan some analysis and visualisation. And I found none.

What I did find was a static HTML webpage. This had the figures for that day: the total number of tests conducted, the total number of negative results, and the number of positive cases for each Health Board. This page is then overwritten at 2pm the next day. This is an awful practice, also used by Scotrail to hide its performance month on month.

I was able, using the Internet Archive’s Wayback Machine, to fill in some gaps back to 5th March but that was far from complete. I published what I could on GitHub and mentioned that on Twitter and in a couple of Slack Groups. Thankfully a friend, Lesley, was ahead of me in terms of data collection for her work as a data journalist, and was able to furnish testing data back to the start on 24 January 2020. Since then I’ve updated the GitHub repo daily – usually when the data is published at 2pm.
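
Because the page is overwritten every day, the only way to keep a history is to record each day’s figures somewhere before they disappear. A rough sketch of that idea follows, assuming the figures have already been read off the page by hand or by a scraper. The column names and the `record_snapshot` helper are illustrative only and are not taken from the actual repo.

```python
import csv
from pathlib import Path

# Assumed column names for this sketch; dates are ISO strings (YYYY-MM-DD)
# so that sorting them as text also sorts them chronologically.
FIELDS = ["date", "health_board", "positive_cases"]


def record_snapshot(path, date, cases_by_board):
    """Add one day's figures to a CSV, replacing any rows already held for that date.

    Returns the total number of data rows in the file afterwards.
    """
    path = Path(path)
    rows = []
    if path.exists():
        with path.open(newline="", encoding="utf-8") as handle:
            # Drop existing rows for this date so a corrected figure replaces the old one.
            rows = [row for row in csv.DictReader(handle) if row["date"] != date]
    for board, cases in cases_by_board.items():
        rows.append({"date": date, "health_board": board, "positive_cases": str(cases)})
    rows.sort(key=lambda row: (row["date"], row["health_board"]))
    with path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)
```

Keeping one row per date and Health Board in a plain CSV under version control means earlier days are never lost, and a republished, corrected figure simply replaces the row for that date while the commit history still shows what was originally published.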

Almost as soon as I began, a couple of people started to build visualisations based on what I had put in GitHub, including this one. Some said that they were waiting for the numbers to climb to more significant levels, particularly deaths, before they would start to use the data.

Two or three times the data has been published then corrected with some test results for Shetland / Grampian being reassigned between the two. This is understandable given the current circumstances.

SG webpage with table of Covid19 daily cases

On 19 March 2020, the 2pm publication was delayed, with the number of fatalities and positive results being published after 3.30pm and the total number of tests being published after 7pm. Again – this is understandable. The present circumstances are unprecedented, and processes are being developed. Up to now much of Scotland’s open data publication has been done, if at all, at a more leisurely and considered pace. It does make one wonder how, as the numbers rise exponentially, as they surely will, the processes will cope.

Why is this important?

At this time the public are trying to make sense of a very difficult situation. Journalists, scientists and others are trying to assist in that by interpreting what data there is for them, including building visualisations of that. People are also seeking reassurances – that the UK and Scottish Government are on top of the situation. Transparency around government activity such as testing, and the spread of the virus, would build trust. Indeed there is real concern that Scotland, and the UK as a whole, is not meeting WHO guidance on testing and tracing cases.

But with a static web page, with limited range of data that is erased daily, this is not possible. Even setting up a scraper to grab the essential content from that page is not feasible if the data is only partially published for long periods.

We have some useful data visualisations such as this set by Lesley herself. What can be done is limited. Deaths per health board are collected, we’ve been told, but they are not published – only a Scotland-level total.

I’ve had it confirmed by someone I know in the Scottish Government that they are looking at creating and posting Linked Open Data which I suspect will be on their platform, which is a great resource but which is seen by many as a barrier to actually getting data quickly and simply.

Italian government GitHub repo

Compare this with the Italian Government who have won plaudits from the data science, journalism and developer communities for making their data available quickly and simply using GitHub  as the platform. This is one that is familiar to the end-users. They also have a great range of background information (look at it in Chrome which will translate it). On that platform they publish daily national and regional statistics for

  • date
  • state
  • hospitalised with symptoms
  • intensive care
  • total hospitalised
  • home isolation
  • total currently positive
  • new currently positive
  • discharged healed
  • deceased
  • total cases
  • swabs tests.
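
Publishing one row per day with running totals, as in the list above, makes derived figures easy to compute. As a rough illustration, assume a CSV with the hypothetical English column names `date` and `total_cases` (the real repository uses its own headers) and placeholder numbers that are not real data. Daily new cases then fall out as the difference between consecutive totals:

```python
import csv
import io


def daily_new_cases(csv_text, date_col="date", total_col="total_cases"):
    """Return [(date, new_cases)] computed from cumulative totals, in date order.

    The first date has no previous total to compare with, so it is omitted.
    """
    rows = sorted(csv.DictReader(io.StringIO(csv_text)), key=lambda r: r[date_col])
    result = []
    previous = None
    for row in rows:
        total = int(row[total_col])
        if previous is not None:
            result.append((row[date_col], total - previous))
        previous = total
    return result


# Placeholder values for illustration only; these are not real figures.
SAMPLE = """date,total_cases
2020-01-01,10
2020-01-02,14
2020-01-03,23
"""

print(daily_new_cases(SAMPLE))
```

None of this is possible from a page that shows only today’s numbers, which is why a dated, append-only file matters so much to anyone building charts or models.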

Not only is the data feeding the larger, world-wide analysis such as that by Johns Hopkins University, but people at a national level are using that data to create some compelling, interactive visualisations such as this one. As each country starts to recover and infections and deaths start to slow, having ways of visualising that depends on data to drive those views.

[edited] Wouldn’t a dashboard such as this one for Singapore, built by volunteers, be a good thing for Scotland? We could do it with the right data supplied.

Singapore dashboard

[/edited]

So, this is a suggestion, or rather a request, to NHS Scotland and the Scottish Government to put in place a better set of published data, which is made available in as simple and as timely a fashion as can be accomplished under the present circumstances. Give us the data and we’ll crowd-source some useful tools built on it.

How to do that?

The Scottish Government should look to fork one of the current repositories and use that as a starting point. In an ideal world that would be the Italian one – but even starting with my simple one (if the former is too much) would be a step forward.

Also, I would encourage the government to get involved in the conversations that are already happening – here for example in the Scottish Open Data group.

There is a large and growing community there, composed of open data practitioners, enthusiasts and consumers, across many disciplines, who can help and are willing to support the government’s work in this area.

SODU2020 – a guest post by Sarah Roberts of Swirrl

Scottish Open Data Unconference

It’s all going on in Scotland in March. As we spring into Spring (nearly there!), we’re very excited to be sponsoring, and going to, the Scottish Open Data Unconference in Aberdeen on 14th and 15th March. Topics are pitched in the morning of each day, an agenda is created and participants talk as much as the chair. 

Our colleague Jamie Whyte is lucky enough to have a ticket, so if you spot him do say hi! Here are some recent open data happenings we’ve picked up on our radar…

Scottish Index of Multiple Deprivation

The Scottish Index of Multiple Deprivation was released late January and we loved the accompanying briefing document, which put the numbers into context (find it here). The data’s also available on the Scottish Government’s Open Data site, where you can use the Atlas section to find key data zones and see key facts about them. The below screenshot is of the data zone which is ranked as the most deprived in the 2020 SIMD.

SIMD - Greenock Town centre

People are already making stuff with the data – below is a screenshot of Jamie’s lava lamp visualisation of the data.

Commentary, explanation and analysis from others include: Alasdair Rae’s summary matrix of the SIMD data by council area, a story graphic of the data, an interactive mapping tool, an analysis blog post from Scottish parliament information centre and news articles, like this one from the BBC.

Jamie Whyte - King of the Lava Lamp

W3C Community Group

Another thing we’ve noticed is that there’s preliminary work happening on GraphQL and RDF, which aims to serve as a case for future standardisation. More on this here, where you can send a request to join the group if this is your bag. It’s definitely ours! 

Collaborative work with data

Last, but not least, collaboration. This is a wide concept but it’s also a trend that’s cropping up in different aspects of working with open data. Here are some we’ve noticed:

“promote trust and co-operation between government and civil society.”

  • The Office for National Statistics is publishing data in a collaborative project across a spread of organisations including ONS, HMRC, MHCLG, DWP and DIT. The Connected Open Government Statistics (COGS) project involves a lot of technical collaborative work in harmonising codelists, as well as harmonising a data model and all the processes that go into it. More on this project here on the GSS blog site. 
  • 2019 saw a growing, collaborative API community, with API events involving government and people working with government. We went to one in Newcastle and another one’s arranged for March 16th (if you’re still hungry for more after the unconference!) 
  • The Open Data Institute have been busy, busy, busy. Jeni Tennison spoke about the idea of how collaboration is key for new institutions of the data age, at our Power of Data conference in October (catch that video here). The ODI have also been working on a data and public services toolkit & there’s an introductory event to this in Edinburgh just a few days before the Scottish Open Data Unconference. 

Thanks for reading! If you’d like to find out a bit more about who we are and what we do, take a look at our website, our blog, our latest newsletter and / or our twitter stream. We’ve just been named as one of the FT1000 fastest growing companies in Europe and we’re still hiring, so if you think you can help us we’d love to hear from you. 

We love data and we’re delighted to be sponsoring the Scottish Open Data Unconference. See you there.