Scotland’s Open Data, February 2019. An update.

Note: this blog post first appeared on codethecity.co.uk in February 2019 and has been archived here with a redirect from the original URL.

Scotland’s provision of open data may be slowly improving, but it is a long way behind the rest of the UK. In my most recent trawl through websites and portals I found a few minor improvements, which are positive, but progress is too slow; some data providers are slipping backwards; and most others are still ignoring the issue altogether. Now is the time for the Scottish Government to act to fix this drag on the Scottish economy and society, and stop inhibiting innovation.

Latest review

Over the last week, I have conducted yet another trawl of Scottish Open Data websites and portals. I keep this updated on this GitHub repo. I’ve carried out this research without assistance, in my own time. The review could be more comprehensive, frequent and robust if I were supported to do it.

This work builds on previous pieces of research I’ve carried out and articles that I have written. Recently, I’ve created an index of those blog posts here as much for my own convenience of finding and linking to them as anything.

During this latest trawl, I’ve tried to better capture the wide spread of Scottish Government departments, agencies, non-departmental public bodies, health boards, local authorities, health and social care partnerships and academic institutions;  and assess each sector using quite conservative measures.

The output of that, as we will see below, does not paint a good picture of Scotland’s performance, even though there are a few very good examples of people doing good work in the face of a clear policy gap.

Let us look at this sector by sector, following the list of findings here.

Local Authorities

Of Scotland’s 32 local authorities, only 19 produce open data of any kind.  This group uses a mixture of open data portals (10), web landing pages (7) and GIS systems (2). This leaves 13 who produce no open data whatsoever.

Those 19 councils (ignoring the other 13) produce a total of 731 datasets, giving a mean for the group of 38 and a median of 17 datasets. This total is only six more than I found three months ago, despite Dumfries and Galloway launching a new portal with 33 datasets!

Also, stagnation is a real issue. For example, it is worth noting once again that while Edinburgh produces an impressive 234 open data sets, only five of those have been updated in the last six months, and 228 of them date from 2014-2017. While there is value in retaining historic data (allowing comparisons, trends etc. to be analysed), the value of data which is not being updated diminishes rapidly.

When I ran the OD programme for Aberdeen City Council (which, like all Scottish councils, is a unitary authority), based on some back-of-the-envelope calculations I reckoned that we could reasonably expect to have about 250 data sets. So, if each of the 32 did the same, as we would expect, then we’d have 8,000 datasets from local authorities alone. This puts the 731 current figure into perspective.

Scottish Government

So far, I have found the following open data being produced:

  • 248 datasets on the excellent, and expanding, Statistics.Gov.Scot portal  covering a number of departments, agencies and NDPBs,
  • 54 datasets on the Scottish Natural Heritage portal, 53 of which are explicitly covered by OGL and one marked “free to use data.”
  • At least 43 OGL-licensed mapping layers on the Marine Scotland portal
  • Just four geospatial datasets for download on the Spatial Hub
  • Six Linked open data sets, licensed under OGL, on the SEPA site.
  • Great interactive mapping of the Scottish Indices of Multiple Deprivation, for which the source data is included on the Statistics portal mentioned above.

That makes a total of 353 datasets. I’ve not tracked these numbers previously, so can’t say if they are rising, but there certainly appears to be good progress and some good quality work going on to make Scottish Government data available openly. This includes the four newly-opened sets of boundary data by the Spatial Hub, out of 33 data sets.

However, if we look at the breadth of agencies etc. that comprise the Scottish Government, it is clear that there are many gaps. In addition to the parent body of the Scottish Government there are a further 33 Directorates, 9 Agencies, and 92 Non-Departmental Public Bodies. That’s a total of 135 business units.

If we assume that each of them could produce a conservative 80 data sets (and it is arguable that the figure should be considerably higher), then we’d expect 10,800 datasets to be released. Suddenly, 353 doesn’t seem that great.

Health

Scotland’s Health service is composed, in addition to the parent NHS Scotland body, of 14 Health Boards and 30 joint Health and Social Care Partnerships. That gives a total of 45 bodies.

Again, taking the same modest yardstick, of 80 open data sets for each, we would expect to see 3,600 data sets released.

What I found was 26 data sets on the new NHS Scotland open data portal. This is a great, high-quality resource, and I know from conversations with the team behind it that they are committed to expanding the range of data it provides.

However, given our yardstick above, we are still 3,574 data sets short on Scottish Health data.

Higher and Further education

Scotland’s HE / FE landscape comprises 35 universities and colleges.

Glasgow and Edinburgh Universities each have an open data publication mechanism for data arising out of a business operation, which contain interesting and useful data.

Despite that, there is no operational, statistical or other open data being created by any universities or colleges that I could identify. Again, using the same measure as above, that produces a deficit of (80 x 35) or 2,800 datasets.

Supply versus expectation

If we accept for the moment that the approximate number of data sets that we might expect in the Scottish public sector is as set out above, and that the current provision is, or is close to, what I have found in this trawl, then what is the overall picture?

Sector               Published  Expected  Deficit
Local Government           731     8,000    7,269
Scottish Government        353    10,800   10,447
Health                      26     3,600    3,574
FE / HE                      0     2,800    2,800
Totals                   1,110    25,200   24,090

Table 1: Supply versus expectation of Scottish public sector Open Data

As we can see from the table above, it appears that the Scottish public sector is currently publishing 1,110 of 25,200 expected open data sets. This is just 4.4%. So, by those calculations, more than 95% of the data that we might reasonably expect to see published as Open Data is not being released.
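The arithmetic behind Table 1 can be reproduced with a short sketch. The body counts, per-body yardsticks and published totals are the figures quoted in this post; the function and variable names are illustrative only. The published share is taken as published divided by expected (not divided by the deficit).

```python
# Figures are those quoted in the post; names are illustrative.
SECTORS = {
    # sector: (number of bodies, expected datasets per body, datasets found)
    "Local Government": (32, 250, 731),
    "Scottish Government": (135, 80, 353),
    "Health": (45, 80, 26),
    "FE / HE": (35, 80, 0),
}


def summarise(sectors):
    """Return {sector: (published, expected, deficit)} plus a 'Totals' row."""
    rows = {}
    for name, (bodies, per_body, published) in sectors.items():
        expected = bodies * per_body
        rows[name] = (published, expected, expected - published)
    # Sum each column across the sector rows to build the totals row.
    rows["Totals"] = tuple(sum(column) for column in zip(*rows.values()))
    return rows


if __name__ == "__main__":
    table = summarise(SECTORS)
    for name, (published, expected, deficit) in table.items():
        print(f"{name:<20} {published:>9,} {expected:>9,} {deficit:>9,}")
    published, expected, _ = table["Totals"]
    print(f"Share published: {published / expected:.1%}")
```

Changing a yardstick (for example, raising the 80 datasets per body) only requires editing the `SECTORS` entries, which makes it easy to test how sensitive the overall picture is to the assumptions.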

Scotland is behind the UK generally

Whether you agree with the exact figures or not, and I am open to challenge and discussion, it is clear that we are failing to produce the data that is badly needed to stimulate innovation and deliver the economic and social benefits that we expected when we set out to deliver open data for Scotland.

I’ve long argued that in terms of the UK’s performance in Open Data league tables, such as the Open Data Barometer, Scotland is a drag on the UK’s performance, with Scotland’s meagre output falling well short of the rest of the UK’s Open Data. In addition to existing approaches, we should see Scotland’s OD assessed separately, using the same methodology, in order to be able to compare Scotland with the UK as a whole. That would allow us to measure Scotland’s performance on a like-for-like basis, identify shortfalls and target remedial action where needed.

Policy underpinning

I have argued previously that a significant issue which stops the Scottish public sector getting behind open data is the lack of public policy to make it happen, as well as an ignorance, or denial, of the potential economic and social benefits that it would bring. While I was part of the group who wrote the Scottish Government’s 2015 Open Data Strategy, it was, in its final form, toothless and not underpinned by policy.

We now have an Open Government Action Plan for Scotland 2018-2020 (PDF). This is a great step forward but unfortunately it is almost entirely silent on Open Data, as pointed out in my response to the draft in November 2018.

Even when Open Data does make an appearance, on page 19, it is in relation to a broader topic rather than forming actions on its own merits. The position is similar in the plan’s detailed commitments. This is not to denigrate the work that has gone into these, and the early positive engagement between Scottish Government and civic groups, but this is a huge missed opportunity – and we should not have to wait until 2020 to rectify it.

At this point, it is worth contrasting this with the Welsh Government’s Open Government plan 2016-2018 which was reviewed recently (PDF). In that plan, Open Data was the entire focus of the first two sections, and covered pages 4 to 6 of the plan. This was no afterthought: it was a significant driver and a central plank of their open government plan.

The broader community

Scotland still lacks a developed Open Data community. This will come in time as data is made more widely available, is more usable and useful – and also through the engagement with the Open Government process  – but we all need to work to develop that and accelerate the process. I set out suggestions for this in a previous post.

There are significant opportunities to grow the use of open data through the opening of private sector and community-generated and -curated data.

The universities and colleges in Scotland should be adopting open data in their curriculum, raising awareness among students, creating entrepreneurs who can establish businesses on the back of open data.

Schools should be using open data to get their classes involved: using it to explain their environment, climate, and transport system; to understand local demographics, the distribution of local government spending, or comparative attainment of schools.

Government should be  developing the curriculum to use open data to foster a better understanding of data and how it underpins modern society.

There are some positive things going on: the roadshows that the Scottish Government are doing, as well as other Data Fest Fringe events; the regular data hack weekends we’ve been doing in Aberdeen under the Code The City banner; and the major long-term project to build and deploy community-hosted air quality monitoring sensors which provide open data for the local community. These need to become the norm – and to be happening across the country.

Organisations such as The Data Lab, Censis and other innovation centres have a great opportunity here to advance their work, whether in education, community building or fostering innovation, and to support this to achieve their organisational missions.

Bringing people together

Having earlier created a Twitter account for a nascent Scottish Open Data Action Group (@Soda_group), I have reconsidered that. Instead of an action group to pressure, shame or coerce the Scottish Government into action, what we need is a common group that has the Scottish Government onside – and everyone works together. So I have renamed it @opendata_sco. It already has 179 followers and I hope that we can grow that quickly, and use that to generate more interest and engagement.

I have also launched a new open Slack channel for Open Data Scotland, so that a community can better communicate with one another.

Please join, using this form.

As I have said previously this isn’t a them-and-us, supply-and-demand relationship. We’re all in it together, and the better we collaborate as a community the better, and quicker, society as a whole benefits from it.

========================================

Header photo by Andrew Amistad on Unsplash

Boundaries, not barriers

Note: This blogpost first appeared on codethecity.co.uk in January 2019 and has been archived here with a redirect from the original URL. 

I wrote some recent articles about the state of open data in Scotland. Those highlighted the poor current provision and set out some thoughts on how to improve the situation. This post is about a concrete example of the impact of government doing things poorly.

Ennui: a great spur to experimentation

As the Christmas break ticked by I started to get restless. Rather than watch a third rerun of Elf, I decided I wanted to practise some new skills in mapping data: specifically how to make choropleth maps. Rather than slavishly follow some online tutorials and show unemployment per US state, I thought it would be more interesting to plot some data for Scotland’s 32 local authorities.

Where to get the council boundaries?

If you search Google for “boundary data Scottish Local Authorities”  you will be taken to this page on the data.gov.uk website. It is titled “Scottish Local Authority Areas”  and the description explains the background to local government boundaries in Scotland. The publisher of the data is the Scottish Government Spatial Data Infrastructure (SDI). Had I started on their home page, which is far from user-friendly, and filtered and searched, I would have eventually been taken back to the page on the data.gov.uk data portal.

The latter page offers a link to “Download via OS OpenData” which sounds encouraging.

Download via OS Open Data

This takes you to a page headed, alarmingly, “Order OS Open Data.” After some lengthy text (which warns that DVDs will take about 28 days to arrive but that downloads will normally arrive within an hour), there then follows a list of fifteen data sets to choose from. The Boundary Line option looked most appropriate after reading the descriptions.

This was described as being in the proprietary ESRI shapefile format, and being 754MB of files, with another version in the also proprietary MapInfo format. Importantly, there was no option for downloading data for Scotland only, which I wanted. In order to download it, I had to give some minimal details, and complete a captcha. On completion, I got the message, “Your email containing download links may take up to 2 hours to arrive.”

There was a very welcome message at the foot of the page: “OS OpenData products are free under the Open Government Licence.” This linked not to the usual National Archives definition, but to a page on the OS site itself with some extra, but non-onerous reminders.

Once the link arrived (actually within a few minutes) I then clicked to download the data as a Zip file. Thankfully, I have a reasonably fast connection, and within a few minutes I received and unzipped twelve sets of 4 files each, which now took up 1.13GB on my hard drive.

Partial directory listing of downloaded files

Two sets of files looked relevant: scotland_and_wales_region.shp and scotland_and_wales_const_region.shp. I couldn’t work out what the differences were in these, and it wasn’t clear why Wales data is also bundled with Scotland – but these looked useful.

Wrong data in the wrong format

My first challenge was that I didn’t want shapefiles, but these were the only thing on offer, it appeared. The tutorials I was going to follow and adapt used a library called Folium, which called for data as GeoJSON, which is a neutral, lightweight and human-readable file format.

I needed to find a way to check the contents of the Shapefiles: were they even the ones I wanted? If so, then perhaps I could convert them in some way.

To check the shapefile contents, I settled on a library called GeoPandas. One after the other I loaded scotland_and_wales_region.shp and scotland_and_wales_const_region.shp. After viewing the data in tabular form, I could see that these are not what I was looking for.

So, I searched again on the Scottish Spatial Data Infrastructure site and found this page. It has a Download link at the top right. I must have missed that.

SSI Download Link

But when you click on Download it turns out to be a download of the metadata associated with the data, not the data files. Clicking the “Download via OS OpenData” link, further down the page, takes you back to the very same link as above.

I did further searching. It appeared that the Scottish Local Government Boundary Commission offered data for wards within councils but not the councils’ own boundaries themselves. For admin boundaries, there were links to OS’ Boundary Line site where I was confronted by the same choices as earlier.

Eventually, through frustration, I started to check the others of the twelve previously-downloaded Boundary Line data sets and found there was a shapefile called “district_borough_unitary_region.shp”. On inspection in GeoPandas it appeared that this was what I needed – despite Scottish Local Authorities being neither districts nor boroughs – except that it contained all local authority boundaries for the UK: some 380, not just the 32 that I needed.

Converting the data

Having downloaded the data I then had to find a way to convert it from Shapefile to GeoJSON (adapting some code I had discovered on StackOverflow), then subset the data to throw away almost 350 of the 380 boundaries. This was a two-stage process: use a conversion script to read in shapefiles, process them and spit out GeoJSON; then write some code to read in the GeoJSON, convert it to a Python dictionary, match elements against a list of Scottish LAs, and write the subset of boundaries back out as a GeoJSON text file.
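The second stage described above (read the GeoJSON, match features against a list of Scottish LAs, write the subset back out) might look something like the following. This is a minimal sketch using only the standard library, not the script used at the time: the property key `NAME` and the function names are assumptions, and the real key would need to be checked against the converted file.

```python
import json

# Hypothetical property key holding the authority name; check the actual
# file, as the key depends on how the shapefile was converted.
NAME_KEY = "NAME"


def subset_features(collection, wanted_names, name_key=NAME_KEY):
    """Return a new FeatureCollection containing only the wanted features."""
    wanted = {name.casefold() for name in wanted_names}
    kept = [
        feature
        for feature in collection.get("features", [])
        # "properties" may legitimately be null in GeoJSON, hence the "or {}".
        if str((feature.get("properties") or {}).get(name_key, "")).casefold() in wanted
    ]
    return {"type": "FeatureCollection", "features": kept}


def subset_file(src_path, dst_path, wanted_names, name_key=NAME_KEY):
    """Read a GeoJSON file, keep the wanted features, and write the subset."""
    with open(src_path, encoding="utf-8") as src:
        collection = json.load(src)
    subset = subset_features(collection, wanted_names, name_key)
    with open(dst_path, "w", encoding="utf-8") as dst:
        json.dump(subset, dst)
    return len(subset["features"])
```

Calling `subset_file` with the list of 32 council names should return 32; any other count suggests that names in the list do not match the spelling used in the source file.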

Code to convert shapefiles to geojson

Using the Geojson to create a choropleth map

I’ll spare the details here, but I then spent many, many hours trying to get the GeoJSON which I had generated to work with the Folium library. Eventually it dawned on me that while the converted GeoJSON looked OK, it was not in fact correct: the conversion routine was not producing valid output.
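The post does not record what exactly was wrong with the converted file. As an illustration only, the sketch below runs a few basic sanity checks on a boundary file before handing it to a mapping library. It assumes, without having confirmed it for this case, that two common failure modes are worth testing: linear rings that are not closed, and coordinates that are projected values (such as eastings and northings in metres) rather than the longitude and latitude in degrees that the GeoJSON specification (RFC 7946) expects. The function names are hypothetical.

```python
def iter_positions(coords):
    """Yield every position from arbitrarily nested coordinate arrays."""
    if coords and isinstance(coords[0], (int, float)):
        yield coords
    else:
        for part in coords:
            yield from iter_positions(part)


def check_feature_collection(collection):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    if collection.get("type") != "FeatureCollection":
        problems.append("top-level type is not FeatureCollection")
    for index, feature in enumerate(collection.get("features", [])):
        geometry = feature.get("geometry") or {}
        kind = geometry.get("type")
        if kind not in ("Polygon", "MultiPolygon"):
            problems.append(f"feature {index}: unexpected geometry type {kind!r}")
            continue
        coords = geometry.get("coordinates", [])
        # A MultiPolygon is a list of polygons; a Polygon is a list of rings.
        polygons = coords if kind == "MultiPolygon" else [coords]
        for polygon in polygons:
            for ring in polygon:
                # A valid ring has at least four positions, first equal to last.
                if len(ring) < 4 or ring[0] != ring[-1]:
                    problems.append(f"feature {index}: ring is not closed")
        for x, y, *_ in iter_positions(coords):
            if not (-180 <= x <= 180 and -90 <= y <= 90):
                problems.append(f"feature {index}: position outside lon/lat range")
                break
    return problems
```

A check like this will not catch every defect, but an out-of-range warning on every feature would point towards a coordinate reference system problem in the conversion step rather than a problem in the mapping code.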

Another source

Having returned to this about 10 days after my first attempts, and done more hunting around (surely someone else had tried to use Scottish LAs as GeoJSON!), I discovered that Martin Crowley had republished on GitHub boundaries for UK Administrations as GeoJSON. This was something that I had intended to do for myself later, once I had working conversions, since the OGL licence permits republishing with attribution.

Had I had access to these two weeks ago, I could have used them. With the Scottish data downloaded as GeoJSON, producing a simple choropleth map as a test took less than ten minutes!

Choropleth map of Scottish Local Authorities

While there is some tidying to do on the scale of the key, and the shading, the general principle works very well. I will share the code for this in a future post.

Some questions

There is something decidedly user-unfriendly about the SDI approach which is reflective of the Scottish public sector at large when it comes to open data. This raises some specific, and some general questions.

  1. Why can’t the Scottish Government’s SDI team publish data themselves, as the OGL facilitates, rather than have a reliance on OS publishing?
  2. Why are boundary data, and by the looks of it other geographic data, published as ESRI GIS shapefiles or MapInfo formats rather than the generally more usable, and much smaller, GeoJSON format?
  3. Why can’t we have Scottish (and English, and Welsh) authority boundaries as individual downloads, rather than bundled as UK-level data, forcing the developer to download unnecessary files? I ended up with 1.13GB (and 48 files) of data instead of a single 8.1MB Scottish GeoJSON file.
  4. What engagement with the wider data science / open community has the SDI team done to establish how their data could be useful, usable and used?
  5. How do we, as the broader Open Data community, share or signpost resources? Is it all down to government? Should we actively and routinely push things to Google Dataset Search? Had there been a place for me to look, then I would have found the GitHub repo of council boundaries in minutes, and been done in time to see the second half of Elf!

And finally

I am always up for a conversation about how we make open data work as it should in Scotland. If you want to make the right things happen, and need advice, or guidance, for your organisation, business or community, then we can help you. Please get in touch. You can find me here or here or fill in this contact form and we will respond promptly.

It is easier to recycle a fridge than reuse Scottish public sector website content and data!

During the course of  Code The City 17: Make Aberdeen Better this weekend we made a startling discovery. It is easier to recycle your old fridge-freezer than to get data and content for re-use from Scottish public sector websites. As a consequence, innovating new solutions to common problems and helping make things easier for citizens is made immeasurably more difficult.  

One of the challenges posed at the event was “How do we easily help citizens to find where to recycle item ‘x’ in the most convenient fashion?” That was quickly broadened out to “dispose of an item”, since not everything can be recycled – some items might be better reused, and others treated as waste if they can’t be reused or recycled. With limited kerbside collections, getting rid of domestic items mainly involves taking them somewhere – but where?

With climate change, and the environment on most people’s minds at the moment, and legislative and financial pressures on local authorities to put less to landfill, surely it is in everyone’s interest to make it work as well as it can.

To test how well advice and guidance could help people to help themselves, we came up with a list of 12 test items – including a fridge, a phone charger, a glass bottle, and a Tetra Pak carton. On the face of it this should be simple, and has probably been solved already.

The Github Repo

All Code The City hack weekend projects are based on open data and open source code. We use GitHub to share that code – and any other digital artefacts created as part of the project. All of this one’s outputs can be found (and shared openly) here.

Initial research

That was where we started: looking to see if the problem has already been solved.  There is no point in reinventing the wheel.

We looked for two things – apps for mobile phones, and websites with appropriate guidance.

Aberdeen specific information?

Since we were at an event in Aberdeen we first looked at Aberdeen City Council’s website. What could we find out there?

Not much as it turned out – and certainly not anything useful in an easy-to-use fashion. On the front page there was an icon and group of suggested services for Bins and recycling; none of which were what we were looking for.

ACC Bins and recycling

Typing recycling into the search box (and note we didn’t at this stage know if our hypothetical item could be recycled) returned the first 15 of 33 results, as shown below.

Search results for recycling

The results were a strangely unordered list – neither sorted alphabetically nor by obvious themes. So relevant items could be on page 3 of the results. Who wants to read policies if they are trying to dispose of a sofa? Why are two of (we later discovered) five recycling centres shown but three others not? Why would I as a citizen want to find out about trade waste when I just want to get rid of a dodgy phone charger?

Why is there a link to all recycling points (smaller facilities in supermarket car parks or suchlike, with limited acceptance of items), but apparently not to all centres, which accept many more items? Actually there is a link, ‘Find Your Nearest Recycling Centre’ (but nothing for your nearest recycling point, although points are much more numerous). This takes you to a map and tabular list of centres and what they accept, and it is easy to miss the search box between the two. No such facility exists for the recycling points.

Open Data?

Perhaps there is open data on the ACC Data portal that we could re-purpose – allowing us to build our own solution? Sadly not – the portal has had the same five data sets for almost two years, and every one of those has a broken link to the WMSes.

If we were in Dundee we could download and use freely their recycling centre data. But not in Aberdeen.

Dundee recycling Open Data

Apps to the rescue?

There are some apps and services that do most of what we are trying to do. For example, iRecycle is a nice app for Android and iOS that would work were it not limited to US locations.

We couldn’t find something for Scotland that worked as an App.

Other sources of information?

Since we had drawn a blank with both Aberdeen City Council and any usable apps, we widened our search.

Recycle For Scotland

The website Recycle For Scotland (RFS) is, on the face of it, a useful means to identify what to do with a piece of domestic waste. Oddly, we could not find a link to it from any of the ACC recycling pages.

But it doesn’t work as well as it could, and the content and data behind it have no clear licence to permit reuse.

The Issues with RFS

Searching the site, or navigating by the menus, for Electrical Items results in a page that is headed “This content was archived on 13th August 2018” – hardly inspiring confidence. No alternative page appears to exist and this page is the one turned up in navigation on the site.

Recycle For Scotland Archived content

Searching for what to do with batteries in Aberdeen results in a list of shops at least one of which closed down about 18 months ago. Entering a search means entering your location manually – every time you search! This quickly becomes wearing.

While the air of neglect is strong, the site is at least useful compared to the ACC website. But it doesn’t do what we want. Perhaps we could re-use some of the content? No – there is no clear licence regarding reuse of the website’s content.

The site appears to be a rebadged version of Recycle Now, built for Zero Waste Scotland (ZWS). According to ZWS’s Terms and Conditions on their own site, and deeply ironically, you can’t (re)use any materials from that site.

Zero Waste Scotland - zero re-use

ZWS are publicly funded by the Scottish Government and the European Regional Development Fund – all public money.

Scottish Government Fund ZWS

Public funding should equal open licences

We argue that any website operated by a government agency, or department, or NDPB, should automatically be licensed under the Open Government Licence (OGL). And any data behind that site should be licensed as Open Data.

The Scottish Government’s own website is fully licensed under OGL.

Changing the licensing of the Recycle For Scotland website, making its code open source, and making its data open would have many benefits:

  • its functionality could be improved on by anyone
  • the data could be repurposed in new applications
  • errors could be corrected by a larger group than a single company maintaining it.

Where did this leave us?

Having failed to identify either an app that worked for Scotland or interactive guidance on the ACC website, we tried the patchy and, on the face of it, unreliable RFS site. We then turned to the data, and to whether we could construct something usable from open data and repurposed, fixed content over the weekend – this was a hack event after all.

But in this we were defeated – data is wrapped up in web pages: formatted for human readability, not reuse in new apps.
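To illustrate what openly licensed, machine-readable data would have made possible over a weekend, here is a minimal sketch of the lookup we had in mind: given an item and a location, find the nearest facility that accepts it. The records are placeholders invented for the example and do not describe real facilities; the field names and function names are likewise assumptions about what such a dataset could look like.

```python
from math import asin, cos, radians, sin, sqrt

# Placeholder records only: names, coordinates and accepted items are
# invented to show the shape such open data might take.
CENTRES = [
    {"name": "Example Centre A", "lat": 57.15, "lon": -2.10, "accepts": {"fridge", "glass bottle"}},
    {"name": "Example Point B", "lat": 57.12, "lon": -2.15, "accepts": {"glass bottle"}},
    {"name": "Example Centre C", "lat": 57.19, "lon": -2.09, "accepts": {"fridge", "phone charger"}},
]


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))


def nearest_accepting(item, lat, lon, centres=CENTRES):
    """Return the closest centre that accepts the item, or None if none does."""
    candidates = [centre for centre in centres if item in centre["accepts"]]
    if not candidates:
        return None
    return min(candidates, key=lambda c: haversine_km(lat, lon, c["lat"], c["lon"]))
```

With real data published under an open licence, the placeholder list would simply be replaced by a download of the council’s own records, and the same few lines would serve any authority that published in the same shape.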

Websites which were set up to encourage re-use and recycling ironically prohibit that as far as their content and data is concerned, and deliberately stifle innovation.

Public funding from the City Council, the Scottish Government and the European Regional Development Fund is used to fund sites which you have paid for, but whose content you cannot reuse yourself.

Finally

At a time of climate crisis, which the Scottish Government has announced is a priority for action, it can’t be right that not only is it difficult to find ways to divert domestic items from landfill, but also that these Government-funded websites have deliberate measures in place to stop us innovating in order to make reuse and recycling easier!

Hopefully politicians, ministers and councillors will read this (please draw it to their attention) and wake up to the fact that Scotland deserves, and needs, better than this.

Only by having an Open Data by default policy for the whole of the Scottish Public Sector, and an open government licence on all websites can we fix these problems through innovation.

After all, if the non-functioning Northern Ireland Assembly can come up with an open data strategy that commits the region to open data by default, why on earth can’t Scotland?

See below:

“Northern Ireland public sector data is open by default. Open by default is the first guiding principle that will facilitate and accelerate Open Data publication.”

NI Open Data principles

[Edit – Added 12-Nov-2019]

Postscript

If you want to read more about the poor state of Scottish Open Data, you might be interested in this post I wrote in February 2019, which also contains links to other posts on the subject:

Scotland’s Open Data, February 2019. An Update.

Sadly, not much has changed in the intervening nine months.

[/Edit]