Meet Your Next MSP at CTC22

At Code The City 22 we started Meet Your Next MSP, a project to list hustings for the Scottish Parliamentary election. The team comprised James Baster and Johnny Mckenzie.

James Baster had prior experience of a similar project for the 2015 UK general election, which listed over 1,000 events and was cited by many charities and campaigns. This showed him that there was interest in such a project. It also showed that many people don’t even know what a hustings is, so the project deliberately tries to be accessible in order to introduce others to this type of event.

At Code The City 22 we built a basic working prototype: a git repository to hold the data; a Python tool to parse the files in the git repository into a SQLite database; and a Python Flask web app to serve that SQLite database as a friendly website to the public. The website invites submissions to the crowd-sourced dataset by means of a Google form.
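That parse-to-SQLite step might look roughly like this. The table and column names are illustrative, not the project's actual schema, and the Flask layer that serves the database is omitted:

```python
import sqlite3

# Load parsed event records into SQLite. The schema below is a guess
# for illustration, not the project's real one.
def build_database(events, db_path=":memory:"):
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS hustings "
        "(id INTEGER PRIMARY KEY, title TEXT, postcode TEXT, starts TEXT)"
    )
    conn.executemany(
        "INSERT INTO hustings (title, postcode, starts) VALUES (?, ?, ?)",
        [(e["title"], e["postcode"], e["starts"]) for e in events],
    )
    conn.commit()
    return conn

conn = build_database([
    {"title": "Aberdeen Central hustings", "postcode": "AB10 1AB",
     "starts": "2021-04-20T19:00"},
])
rows = conn.execute("SELECT title FROM hustings").fetchall()
```

Because the database is rebuilt from the files in the repository, the git history remains the single source of truth and the SQLite file is disposable.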

Thanks go to Johnny, who wrangled data from National Records of Scotland into a dataset mapping postcodes to areas – vital for powering the postcode lookup box on the home page of the site.
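The lookup itself can be sketched as a dictionary keyed on normalised postcodes; the postcode/area pairs below are made up for illustration, and the real National Records of Scotland dataset is far larger:

```python
# Illustrative postcode-to-constituency mapping (made-up entries).
POSTCODE_TO_AREA = {
    "AB101AB": "Aberdeen Central",
    "EH11AA": "Edinburgh Central",
}

def lookup_area(postcode):
    """Normalise user input (case, internal spaces) before looking up."""
    key = postcode.replace(" ", "").upper()
    return POSTCODE_TO_AREA.get(key)
```

For example, `lookup_area("ab10 1ab")` would match despite the odd casing and spacing typed by the user.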

Storing data in a git repository is an interesting approach; it has some drawbacks but also clear advantages (moderation via pull requests and a full history for free). It’s not a new idea – many people already do this – so it will be interesting to learn more about the approach.

Since the hackathon, the website has been tweaked, the Google form replaced with a better custom form and the website is now live!

We will run this over the next month and see how it goes.

And after the election, the lessons won’t be lost. What we are essentially building is a set of tools that let a community of people list events of interest together, with the data stored in a git repository. We think this tool could be applicable to many different situations.

Nautical Wrecks

This project started as part of CTC21: Put Your City on the Map, which ran on Saturday 28th and Sunday 29th November 2020. You can find our code on GitHub.

There are thousands of shipwrecks off the coast of Scotland, which can be seen on Marine Scotland’s website.

Marine Scotland map of wrecks

In Wikidata the picture was quite different, with only a few wrecks logged. The image below was derived by running the following query in Wikidata: https://w.wiki/nDt

Initial map of Wikidata shipwrecks

Day one – sourcing information on the wrecks

The project started by researching various websites to obtain the raw data required. Maps with shipwrecks plotted were found, but finding the underlying data source was not so easy.

Data from Marine Scotland, Aberdeenshire Council’s website and the Canmore website was considered.

Once data was found, the next stage was establishing the licensing terms and whether the data could be downloaded and legitimately reused. The data on Canmore’s website was provided under an Open Government Licence, and hence could be uploaded to Wikidata. This was the data source used on day two of the project.

A training session on Wikidata was also needed on day one, so that the team understood how to upload the data and how the identifiers etc. worked.

Day two – cleaning and uploading the data to Wikidata

Deciding which Wikidata identifiers to use was the starting point; then the data had to be cleaned and manipulated. This involved translating easting and northing coordinates to latitude and longitude, matching ship types between the Canmore file and Wikidata, extracting the ship’s reference from Canmore’s URL, and a general common-sense review of the data. To aid with this work a Python script was created. It produced a tab-separated file with the necessary statements to upload to Wikidata via Quickstatements.
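A minimal sketch of the final step – emitting tab-separated Quickstatements rows – might look like this. The field names are illustrative, the easting/northing conversion is assumed to have been done already (e.g. with a library such as pyproj), and P625 ("coordinate location") and Q852190 ("shipwreck") are Wikidata's identifiers for those concepts; the actual statements the team's script produced may have differed:

```python
def quickstatements_rows(wrecks):
    """Yield Quickstatements rows for each wreck: CREATE a new item,
    then set its English label, description, type and coordinates.
    Coordinates are assumed to be already converted to lat/long."""
    for w in wrecks:
        yield ["CREATE"]
        yield ["LAST", "Len", f'"{w["name"]}"']
        yield ["LAST", "Den", f'"Shipwreck (Canmore record {w["canmore_id"]})"']
        yield ["LAST", "P31", "Q852190"]  # instance of: shipwreck
        yield ["LAST", "P625", f'@{w["lat"]}/{w["lon"]}']  # coordinate location

def to_tsv(wrecks):
    # Plain tab-joining rather than the csv module, so the double quotes
    # Quickstatements expects around strings are not escaped.
    return "\n".join("\t".join(row) for row in quickstatements_rows(wrecks))

tsv = to_tsv([{"name": "Hope", "canmore_id": "102357",
               "lat": 57.1, "lon": -2.1}])
```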

A screenshot of the output text file.

The team members were new to Wikidata and were unable to create batch uploads: their accounts were less than four days old and lacked the 50 manual edits required – a safeguard to stop new accounts running damaging scripts.

We asked Ian from Code The City to assist, as he has a long editing history. He continues this blog post. 

Next steps

I downloaded the output.txt file and checked whether it could be uploaded straight to Quickstatements. There were minor problems with the text encoding of strings, so I imported the file into Google Docs. There I ensured that the Label, Description and Canmore links were surrounded by double quotation marks; a quick find-and-replace did this.

I tested an upload of five or six entries and these all ran smoothly. I then did several hundred, which turned up some errors. I spotted loads of ships with the label “unknown”, and every wreck had the same description. I returned to the Python script and tweaked it to concatenate the word “Unknown” with the Canmore ID, which fixed the problem. I also needed a way of checking whether a ship had already been uploaded. I did this by downloading the Canmore IDs of all successfully uploaded ships and filtering those out before re-creating the output.txt file.

I then generated the bulk of the 24,185 records to be uploaded, and noticed a fairly high error rate. This was due to an issue similar to the unknown-named ships: the output.txt file was trying to upload multiple ships with the same name (e.g. over 50 ships named Hope). I solved this in the same manner as the unknown-named wrecks, concatenating ship names with “Canmore nnnnnn.”
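The fix for both problems can be sketched in one pass: any label that is “Unknown” or duplicated gets its Canmore ID appended (field names and label format here are illustrative):

```python
from collections import Counter

def disambiguate_labels(wrecks):
    """Append the Canmore ID to labels that are unknown or duplicated,
    mirroring the fix described above. Unique names pass through."""
    counts = Counter(w["name"] for w in wrecks)
    labels = []
    for w in wrecks:
        if w["name"].lower() == "unknown" or counts[w["name"]] > 1:
            labels.append(f'{w["name"]} (Canmore {w["canmore_id"]})')
        else:
            labels.append(w["name"])
    return labels

labels = disambiguate_labels([
    {"name": "Hope", "canmore_id": "100001"},
    {"name": "Hope", "canmore_id": "100002"},
    {"name": "Unknown", "canmore_id": "100003"},
    {"name": "Verona", "canmore_id": "100004"},
])
```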

I prepared this even as the bulk upload was running. Filtering out the recently uploaded ships and re-running the creation of the output.txt file meant that within a few minutes I had the corrective upload ready. Running this a final time resulted in all shipwrecks being added to Wikidata, albeit with some issues still to fix. This had taken about a day to run, refine and rerun.

The following day I set out to refine the quality of the data. The names of shipwrecks had been left in sentence case: an initial capital and everything else in lower case. I downloaded a CSV of the records we’d created and changed the labels to proper case. I also took the opportunity to amend the descriptions to reflect the provenance of the records from Canmore. I set one browser the task of changing labels, and another the descriptions. That was 24,185 changes each, and took many hours to run. I noticed several hundred failed updates, which appeared simply as “The save has failed” messages; I checked those and reran them. Having no means of exporting errors from Quickstatements (that I know of) makes fixing errors more difficult than it should be.
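The relabelling step amounts to something like this deliberately naive sketch (the actual change was applied through Quickstatements batches, not this code):

```python
def proper_case(label):
    """Capitalise each word of an all-lower-case label. Both this and
    str.title() mangle names like "HMS" or "McIntosh", so real data
    still needs a manual check afterwards."""
    return " ".join(word.capitalize() for word in label.split())

fixed = proper_case("jolly harvester")
```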

Finally, I noticed by chance that a good number of records (estimated at 400) are not shipwrecks at all but wrecks of aircraft. Most, if not all, are prefixed “A/C” in the label.

I created a batch to remove the statements identifying these as ships and shipwrecks, and to add statements saying they are instances of crash sites. I also scripted a change to the descriptions, identifying them as aircraft wrecks rather than shipwrecks.

This query https://w.wiki/pjA now identifies and maps all aircraft wrecks.

All aircraft wrecks uploaded from Canmore

This query https://w.wiki/pSy maps all shipwrecks.

The location of all shipwrecks uploaded to Wikidata from Canmore.

Next steps?

I’ve noted the following things that the team could do to enhance and refine the data further:

  • Check what other data is available by download or scraping from Canmore (such as date of sinking, depth, dimensions) and add that to the Wikidata records
  • Attempt to reconcile the data on Aberdeen-built ships uploaded at CTC19 with these wrecks – there may be quite a few to merge

Finally, in the process of cleaning this uploaded data I noticed that the data model on Wikidata supporting it is not well structured.

This was what I sketched out as I attempted to understand it.

The confusing data model in Wikidata

Before I changed the aircraft wrecks to “crash site” I merged the two items, which works with the queries above. But this needs more work.

  • Should the remains of a crashed aircraft be something other than a crash site? The latter could be cleared of debris and still be the crash site. The term shipwreck more clearly describes where a wreck is, whether buried, on land, or beneath the sea.
  • Why is a shipwreck a facet of a ship, but a crash site a subclass of aircraft?
  • And disaster remains seems like the wrong term for what might be a non-disastrous event (say, a ship from the Middle Ages that gently settled into mud over the centuries and was forgotten about) – and it certainly isn’t a subclass of conservation status, anyway.

I’d be happy to work with anyone else on better working out an ontology for this.

CTC14 Archaeology Weekend – write up

We held our CTC14 Archaeology weekend, sponsored by Aberdeenshire Council Archaeology Service, on the weekend of 15 and 16 September 2018.
All code, and some data and documentation, created over the weekend has been published in GitHub repos.

Background

Throughout 2006 an archaeological dig of the East Kirk of St Nicholas Church was conducted by a team led by the archaeology service of Aberdeen City Council. You can read more of the history here. A large number of skeletal remains and other artefacts were recovered. Written records were created in the form of plans and log books, some of which were drawn and then scanned, and an MS Access 2 database was also created.

Since the end of the dig, some post-excavation analysis of skeletal remains, and other artefacts, has been conducted, but this is far from complete due to a lack of funds.

Saturday – getting started

Following an introduction from Ali Cameron, the dig director, challenges were identified, ideas for tackling them proposed, and teams formed around those ideas.

The teams and their projects formed a pipeline, one feeding the next.

Below we introduce the teams. Each of these will shortly be linked to individual blog posts for each team.

The Teams

Team Scoliosis

They had two aims:

  • to re-label photos and bring them together for the ‘Skelocator’ team,
  • to lay out skeletons and explore options for 3D scanning using mobile apps and cameras with photos offloaded to laptops for processing.

Team Skelocator

Working from CSV files (derived from an Access 2 MDB file), JPEG diagrams, Corel Photopaint files, and even the original hand-drawn plans and log books, this team aimed to create a complete dataset of all excavated remains, allowing them to be plotted in 3D space using X, Y and Z coordinates.

Team Skeleton Bridge

This team set out to create a schema and mesh diagram which would put the data from Team Skelocator into a format that Team Unity Burials could use.

Team Unity Burials

The members of this team wanted to create a 3D model of the church interior in Unity, and to place skeletal remains accurately in the 3D space, moving from block models to accurate ones. Their initial focus was on setting up deployment of the basics to GitHub, and on speaking to the other teams about the formats of data they could work with, in order to move this along faster.

Team PR and Marketing

This team looked at stories that could help drive fundraising later, and explored what data might be usable for visualisations.

Saturday 5pm update

The second round of updates at 5pm on Saturday saw each team make progress.

Team Scoliosis

They were using Qlone to scan from mobile phones and found it takes only a few minutes per bone – this is also the app recommended for this by Historic Scotland. However, they were blocked by the size of the paper grid needed under the object being scanned: an A3 sheet is not quite big enough for larger bones. Another group found it took 20 minutes with a laptop to produce a low-res version from camera photos, and all tried pushing completed models to the Sketchfab site.

Team Skelocator

They explored raw files to see what could be extracted with different tools, looking at EXIF data and whether it could be supplemented with data from the dig books if necessary.

Team Skeleton Bridge

It appeared that there was no need for this team, as Skelocator could output neutral-format files, with agreed content, directly to Team Unity Burials. So the team disbanded and its members were absorbed into other teams.

Team Unity Burials

They experimented with how to move data from one app to another to provide reference points for when they do have skeletons to place in the model, and with how they might show each skeleton’s metadata too.

Team PR and Marketing

This team explored using Microsoft ML libraries to build a chatbot that would use FAQ information about the dig to answer questions.
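As an illustration of the underlying idea (not the Microsoft libraries the team actually used), a bare-bones FAQ matcher can be as simple as picking the stored question with the most word overlap; the questions and answers below are invented for the sketch:

```python
# Minimal FAQ matcher: answer with the stored question sharing the
# most words with the user's question. Entries are illustrative.
FAQ = {
    "when did the dig take place": "The excavation ran through 2006.",
    "where was the dig": "The East Kirk of St Nicholas, Aberdeen.",
}

def answer(question):
    words = set(question.lower().split())
    best = max(FAQ, key=lambda q: len(words & set(q.split())))
    if not words & set(best.split()):
        return "Sorry, I don't know."
    return FAQ[best]

reply = answer("When did the dig happen?")
```

Real chatbot frameworks add language understanding on top, but the retrieve-the-closest-FAQ core is much the same.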

Sunday morning

On Sunday the teams came together from 9:30-10:30, and most people returned, which was good. We saw an update around noon.

Team Scoliosis

The team used Qlone to scan more bones with mobiles. They discovered the resolution was high enough, even on smaller bones, to notice things which hadn’t previously been spotted, such as what appear to be sword cuts (white marks) on a rib bone of a person known to have been stabbed in the head.

This highlights the importance of scanning the remains while they are available, before re-interment at a later date. The larger scans with Qlone were found to be too big for detail, as you had to stand further away and thus lost resolution.

Team Skelocator

They are now using the Python library Sloth to bring the images to life, extracting text from them into a JSON file. They also found this enabled a way to position each of the skeletons, by marking their locations on the image and then creating a reference grid from a known fixed location, which the Unity Burials team could also use.

Team Unity Burials

One member worked out a convoluted, but workable, process to get images into Unity from other file formats, while another team member worked on a UI for the public to navigate the model.

Team PR and Marketing (aka skeleton)

They started a chatbot, but found it needs its data in a better format. They also began cleaning up a list of skeletons for use with dc.js visualisations, as well as a webpage for holding the data from the other teams.

Final presentations on Sunday afternoon – 4pm

Team Scoliosis

Everything is being brought together in Sketchfab so that all models can be found.

https://sketchfab.com/models/6edc98b057c740b5a66d34276ee261da/embed St Nicholas Kirk Skeleton SK820 Skull by Moira Blackmore on Sketchfab

They are exploring how to combine smaller models into bigger ones by exporting them to another app. They discovered the limits of scaling the grid in Qlone to get the best resolution with their devices, and how to use photos and a laptop to scan a whole skeleton using the right background cloth.

Team Skelocator

They pulled further data from processed photos, with x, y, z locations from a superimposed grid – a process that could be automated with human double-checking, making up for the lack of GPS availability for their tools in 2006. This can then be handed to Team Unity Burials.

Team Unity Burials

One member found more ways to bring in scanned data from Team Scoliosis, while the other improved the UI for the VR version and demoed the basic model to people with the VR headset.

Team PR and Marketing (aka skeleton)

They updated their Google spreadsheet and pulled it into pages at GitHub.io so it could be queried for skeleton info, and finished adding basic visualisations of this data with dc.js. The skelebot was improved with some personality, but wasn’t as useful as they had hoped.

Take-aways from the weekend

  1. We saw how well cross-functional teams worked: there was usually someone around during the event who could help with something, and within each team people brought diverse experience to bear on issues. This was most apparent during the ‘round-ups’ of team effort, when people heard what others were doing or trying to do.
  2. We learned that skeletons hundreds of years old, and paper records from 12 years ago, ensured durability of information, while a two-year-old tech gadget from Google was useless after it updated itself (Google having ended the project), so its built-in two-camera scanner couldn’t be used. Similarly, file formats which were common at the time of the dig, just 12 years ago, were challenging to access.
  3. We also found that these ‘strongly themed’ events work well for participation. We had 28 attendees on Saturday and 25 on Sunday.

Chatbots and AI – #CTC8

Code the City #8, which will take place on Saturday 25th to Sunday 26th February 2017, will be an exploration of the world of chatbots and AI (artificial intelligence), identifying problems to tackle and quickly prototyping solutions.

>>> Book a ticket on our Eventbrite page

What are chatbots?

A chatbot is a piece of software that interacts with a customer or user to directly answer their questions. It uses existing data or information coupled with artificial intelligence to respond in a human-like way, guiding the user to a solution.

There are many examples of live chatbots in this exciting, emerging field. A chatbot could give you travel directions, tell you when it’s next going to rain in your area, or help you contest parking tickets. It could book you a flight and hotel, or act as a free lawyer to help the homeless get housing. The HBO series Westworld has even launched a bot to help you interact with the (fictional) holiday park!

If you are new to this field and want to get started, we suggest you read the Complete Beginner’s Guide to Chatbots (and some of the links at the end of this article).

Example Travel Bot

Example Waste Bot

How will the weekend run?

We’ll apply our usual Code The City methodology:

  • Bring together a diverse range of people from various backgrounds, to form teams.
  • Identify problems that we’d like to apply chatbots to solve.
  • Identify approaches, information and data to guide how we develop and train the bots
  • Mix academic thinking, and user need, with open source technology and open data to develop new services
  • Iterate quickly, testing ideas, failing fast and refining our approaches.
  • Prototype and demonstrate solutions to an interested audience

Who should attend?

  • Service owners – and service providers
  • Academics and students in the field of chatbots and artificial intelligence
  • Coders
  • Data specialists
  • Front-end and UX designers
  • Bloggers and social media practitioners
  • Anyone with an interest in getting involved in creating bots even for fun!

What will you do?

You will form mixed teams to workshop chatbot solutions to real-world issues. Maybe these will build on the outputs of previous work we’ve done at Code The City. Through rapid prototyping you will create new applications and have some fun in the process.

We’ll show you new techniques for service design, idea generation, prototyping, and rapid iterative application development – and you will show other participants some tricks and approaches, too. We’ll share knowledge and learning.

You might even get a T-shirt, and we can guarantee the best catering of any weekend workshop in the city!

To book a free ticket, visit our Eventbrite page. But be quick – tickets will go swiftly!

All attendees will get a year’s free membership of the Open Data Institute.

You can find out more about previous events on Tumblr, on Eventifier, and on Flickr.

If you have any questions please get in touch.

How can I support this event?

If you are interested in sponsoring this event, or providing other support such as access to online tools or services, please get in touch.

Useful Articles and Resources

>>> Book a ticket on our  Eventbrite page

Team pitches – catchup session

Here’s where the teams’ ideas have got to, at around 12:00.

rashban – gamify throwing away rubbish by turning the trash can into an actual monster and creating an app that gives you points according to what you throw away

Land Revival – encourage community projects and parks inside derelict and vacant sites using OS open data, and Edinburgh Council data – one website to see all the information you need

Edinbro – getting people together to help each other out and bring their community closer. Put up tasks and carry them out, with a leaderboard of who is helping most.

Biodiversity app (maybe called FLY) – a cool social media app to help parents and young people learn about biodiversity, with a daily challenge from easy to hard, and more points for harder ones. Other folk can comment on whether a sighting is true or false. They also want to build heat maps, updated monthly, showing where species have been found.

The shower idea has changed – the team is now thinking about other services, such as bike pumps, and mapping where these are available.

A bin that eats rubbish – working out how to use Arduino for this.