You Don’t Need to Know How to Code to Enjoy Our Events

While the projects we carry out may seem a bit daunting to newcomers, everyone involved in our events plays a vital role, as the experiences of some past attendees show.

As part of our commitment to making data accessible to all, we set out to digitally archive many 19th- and 20th-century records that were only available physically or were missing information online. This was the main focus of our 19th and 20th events, known as ‘History + Data = Innovation’ and ‘History and Culture’ respectively. These primarily involved uploading data to Wikidata on a multitude of subjects, such as listed buildings, convict registers and March Stones, which marked the boundaries of crofts in Aberdeen, primarily in the 16th century.

Heather Black initially found out about the project through the Aberdeen memories Facebook page and got involved in transcribing part of the Aberdeen harbour logbooks: specifically the ships’ names, their registered countries, the captains’ names and the cargo being shipped.

A page of the harbour arrivals register for 11 Nov 1920.

Having a keen interest in local history, she was immediately drawn to the project. When asked about the event, she said that “It was a good distraction while being on furlough from my work at the time, and very interesting too”. She is looking to participate in future events that catch her interest.

Sheila Watt was similarly involved in this project and, despite not having much interest initially, ended up having a great experience. Her daughter joined Code the City after looking for a volunteering project to take part in for a Duke of Edinburgh Award last year. Sheila eventually joined as well after hearing about the Aberdeen Harbour Arrivals project and, thanks to her interest in local history, greatly enjoyed it, describing it as the perfect project to keep her interested during the first lockdown last year.

Taking on the role of a volunteer transcriber, she enjoyed her experience so much that she took part again in the Returned Prisoner Project, in the same role, and similarly had a great time. She now regularly checks on our projects through the Slack channel and will hopefully be involved in future projects.

Although some participants are apprehensive about the tech side of things, there are lots of ways to contribute to each event, and you’ll definitely find something that plays to your strengths and skills. If you’re interested in any of the event topics, you’ll find the work engaging too, and get to talk to some friendly, like-minded people.

Check this post to learn more about the Harbour archival process as well as the wider project. 

Our next event is CTC24 Open in Practice, where we will have several projects that will appeal to non-coders, offering the opportunity to gather more data and open it up for public benefit.

Aberdeen Built Ships – an update at CTC20

This project commenced at CTC19 on 11th-12th April. The aim was to import a complete set of data on those 3000+ ships from Aberdeen Built Ships (with the permission of the Galleries and Museums Service, who operate it) into Wikidata in as clean and well-formatted a state as possible.

We got part of the way there at CTC19, and in work done in the following weeks, but the data had still not been imported.

CTC20 progress

In the weeks since CTC19, we had identified issues with two significant aspects of the data in the core ABS system: a lack of standardisation of ship types (meaning that there were up to nine variants of a single type) and a similar issue with ship builders.
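One possible way to surface such variants is to group the raw values by a normalised key. The following is a minimal Python sketch, not the project's own code: the function names and the sample values are hypothetical, and the normalisation rule (lower-casing, treating punctuation as spaces, collapsing whitespace) is an assumption about what counts as "the same" type.

```python
import re
from collections import defaultdict


def normalise(value: str) -> str:
    """Lower-case, turn punctuation into spaces, and collapse whitespace."""
    cleaned = re.sub(r"[^a-z0-9]+", " ", value.lower())
    return cleaned.strip()


def group_variants(values):
    """Map each normalised key to the sorted raw spellings seen for it."""
    groups = defaultdict(set)
    for value in values:
        key = normalise(value)
        if key:  # ignore blank entries
            groups[key].add(value)
    return {key: sorted(raw) for key, raw in groups.items()}


# Hypothetical sample values, for illustration only (not taken from ABS).
sample_types = ["Steam Trawler", "steam trawler", "Steam-trawler", "STEAM TRAWLER ", "Brig"]
variants = group_variants(sample_types)

# Keys with more than one raw spelling are candidates for merging.
needs_review = {key: raw for key, raw in variants.items() if len(raw) > 1}
```

A grouping like this only catches spelling and punctuation differences; genuinely different names for the same type or builder would still need a manual look.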

For the purposes of CTC20 we agreed to set these aside and press ahead with an import of core data for each ship we could – and to revisit the specific details above later.

What was done

Core data was imported into Wikidata for most of the ships. We excluded some ships from the import if the name field was blank or UNKNOWN or UNNAMED. Other, existing, ships had an ABS ID added to their item. This has resulted in 3085 ships in Wikidata with an ABS ID at the time of writing.
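The exclusion rule above could be expressed as a small filter. This is a sketch only: the function name is hypothetical, and matching the placeholder names case-insensitively is an assumption rather than something we know the import script did.

```python
# Placeholder names that were excluded from the import, per the rule above.
EXCLUDED_NAMES = {"", "UNKNOWN", "UNNAMED"}


def is_importable(name):
    """Return True when a ship name is usable as a label for the import."""
    if name is None:
        return False
    # Assumption: compare after trimming whitespace and upper-casing.
    return name.strip().upper() not in EXCLUDED_NAMES


# Hypothetical names, for illustration only.
candidate_names = ["Example Ship", "", "UNKNOWN", "Unnamed", "   "]
kept = [name for name in candidate_names if is_importable(name)]
```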

Screenshot of Samuel Plimsoll

Method

We initially tried to use the CSV format for Wikidata QuickStatements, but couldn’t get this to work, so switched to the TSV version. A Python script was written to generate the QuickStatements file, whose contents could then be copied into the QuickStatements batch import tool. The import produced 2 errors, for ships that had a range of years in the Date field, which generated invalid dates in the QuickStatements output. These (and 2 duplicates that I noticed after the import) are noted for later correction.
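For readers curious what such a script might look like, here is a minimal sketch in Python. It is not the script used at the event. It assumes the tab-separated QuickStatements (version 1) command format, uses P8260 for the ABS ID as noted in this post, and makes two further assumptions: that new items are typed as ship (Q11446) via instance of (P31), and that the year is stored as inception (P571). The record fields (name, abs_id, year) and the sample ship are hypothetical. The date helper skips anything that is not a single four-digit year, which is one way to avoid the invalid dates described above.

```python
import re

ABS_ID_PROPERTY = "P8260"  # ABS ID property, as given in this post
INSTANCE_OF = "P31"        # Wikidata "instance of"
SHIP_QID = "Q11446"        # Wikidata item "ship" (assumed choice of type)
INCEPTION = "P571"         # Wikidata "inception" (assumed choice of date property)


def year_to_qs_date(year_text):
    """Return a year-precision QuickStatements date, or None if not a single year."""
    text = year_text.strip()
    if not re.fullmatch(r"\d{4}", text):
        return None  # e.g. a range such as "1850-1852" cannot be used as one date
    return f"+{text}-01-01T00:00:00Z/9"


def ship_to_commands(ship):
    """Build the tab-separated command lines that create one new ship item.

    Names containing double quotes are not handled in this sketch.
    """
    lines = [
        "CREATE",
        f'LAST\tLen\t"{ship["name"]}"',
        f"LAST\t{INSTANCE_OF}\t{SHIP_QID}",
        f'LAST\t{ABS_ID_PROPERTY}\t"{ship["abs_id"]}"',
    ]
    date = year_to_qs_date(ship.get("year", ""))
    if date is not None:
        lines.append(f"LAST\t{INCEPTION}\t{date}")
    return lines


def build_batch(ships):
    """Join the commands for all ships into one block of text for the batch tool."""
    return "\n".join(line for ship in ships for line in ship_to_commands(ship))


# Hypothetical record, for illustration only.
example = {"name": "Example Ship", "abs_id": "12345", "year": "1869"}
batch_text = build_batch([example])
```

Ships whose date is skipped still get an item and an ABS ID, so the date can be added by hand afterwards.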

The ABS ID property (P8260) was manually added to the ships that already existed in Wikidata.

The mapping between QIDs and ABS IDs was obtained with this SPARQL query:

SELECT ?qid ?absid
WHERE
{
  ?qid wdt:P8260 ?absid.
}
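The query above can be run by hand in the Wikidata Query Service. If it needs to be run from a script, something like the following Python sketch could work. The function names and the User-Agent string are hypothetical, and the code assumes the service's standard SPARQL JSON results format.

```python
import json
import urllib.parse
import urllib.request

ENDPOINT = "https://query.wikidata.org/sparql"
QUERY = "SELECT ?qid ?absid WHERE { ?qid wdt:P8260 ?absid. }"


def fetch_results(query=QUERY):
    """Run the query against the Wikidata Query Service and return parsed JSON."""
    url = ENDPOINT + "?" + urllib.parse.urlencode({"query": query, "format": "json"})
    # Hypothetical User-Agent; replace with one that identifies your own tool.
    request = urllib.request.Request(url, headers={"User-Agent": "abs-mapping-sketch/0.1"})
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def to_mapping(results):
    """Turn SPARQL JSON results into a dictionary of {ABS ID: QID}."""
    mapping = {}
    for row in results["results"]["bindings"]:
        # The item comes back as a full entity URI; keep only the trailing QID.
        qid = row["qid"]["value"].rsplit("/", 1)[-1]
        mapping[row["absid"]["value"]] = qid
    return mapping
```

Calling to_mapping(fetch_results()) would then give a lookup from ABS ID to QID for every ship that has the property set.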

Next Steps?

To complete the project, the following needs to be done:

  • Add Country of Origin (P495) to all existing Aberdeen-built ships in Wikidata. This will suppress the warning messages when viewing each ship.
  • Rationalise all ship builders that exist in ship_builders.csv – deduplicating these and creating Wikidata entries for each we will use.
  • Rationalise all ship types that exist in ship_types.csv – deduplicating these and creating Wikidata entries for each we will use.
  • Update each ship with specific type and ship builder.
  • Extract / rationalise data from some of the fields, e.g. we have one dimensions field rather than separate fields for length/beam/draft/… and what’s there is inconsistent
  • Isolate ships that have no Wikidata identifier – i.e. any one not in the list of 59 positive matches. Set aside those which have entries for later processing.
  • Source and add pictures of the ships in ABS (see below)
  • Develop a means of monitoring both the original ABS system (rescrape periodically and diff the file in some way?) and Wikidata for changes to the ships’ records (a Wikidata query, executed periodically, generating a CSV download that is checked for differences from previous runs?) to feed back to ABS.
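The monitoring idea in the last item could start as a simple comparison of two CSV snapshots. The sketch below is one possible approach in Python, not an agreed design: the column names (absid, label), the function names and the sample rows are all hypothetical.

```python
import csv
import io


def load_snapshot(csv_text, key="absid"):
    """Read CSV text into a dictionary of {key value: row}."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row[key]: row for row in reader}


def diff_snapshots(old, new):
    """Report which keys were added, removed, or changed between two snapshots."""
    added = sorted(new.keys() - old.keys())
    removed = sorted(old.keys() - new.keys())
    changed = sorted(k for k in old.keys() & new.keys() if old[k] != new[k])
    return {"added": added, "removed": removed, "changed": changed}


# Hypothetical snapshots from two runs, for illustration only.
previous_run = "absid,label\n1,Alpha\n2,Beta\n3,Gamma\n"
latest_run = "absid,label\n1,Alpha\n2,Beta II\n4,Delta\n"
report = diff_snapshots(load_snapshot(previous_run), load_snapshot(latest_run))
```

The same comparison would work for a periodic rescrape of ABS or for a periodic Wikidata query download, as long as each run is saved with a stable identifier column.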

Images of ships

Ships with images

Although there are now 3,085 Aberdeen-built ships in Wikidata, only 12 of these (or 0.389%) have a picture associated with them. There is a significant opportunity to work with Aberdeen Museums to add images from their extensive collection to Wikimedia Commons and associate these with the ships now in Wikidata.

Header image Twice & Rinina25 / CC BY-SA https://upload.wikimedia.org/wikipedia/commons/thumb/a/aa/Genova-Tall_Ship-IMG_1509.JPG/512px-Genova-Tall_Ship-IMG_1509.JPG

Aberdeen Built Ships

This project was one of several initiated at the fully-online Code the City 19 History and Data event.

Its purpose is to gather data on Aberdeen-built ships, with the permission of the site’s owners, and to push that refined bulk data, with added structure, onto Wikidata as open data, with links back to the Aberdeen Ships site through a new identifier.

By adding the data for the Aberdeen Built Ships to Wikidata we will be able to do several things, including:

  • Create a timeline of ship building
  • Create maps, charts and graphs of the data (e.g. showing the change in sizes and types of ships over time)
  • Show the relative activity of the many shipbuilders and how that changed
  • Link ship data to external data sources
  • Improve the data quality
  • Increase engagement with the ships database.
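As a flavour of the first item, a timeline can begin as a simple count of ships per decade. This is a minimal Python sketch with a hypothetical function name and made-up years, not real figures from the database.

```python
from collections import Counter


def ships_per_decade(years):
    """Count build years by decade, returned in chronological order."""
    counts = Counter((year // 10) * 10 for year in years)
    return dict(sorted(counts.items()))


# Made-up build years, for illustration only.
sample_years = [1861, 1869, 1875, 1902]
timeline = ships_per_decade(sample_years)
```

The resulting counts could then be fed into any charting tool to draw the timeline.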

The description below is largely borrowed from the README file of the project’s GitHub repo.

Progress to date

So far the following has been accomplished, mainly during the course of the weekend.

Next Steps?

To complete the project, the following needs to be done:

  • Ensure that the request for an identifier for ABS is created for use by us in adding ships to Wikidata. A request to create an identifier for Aberdeen Ships is currently pending.
  • Create Wikidata entities for all shipbuilders and note the QID for each. We’ve already loaded nine of these into Wikidata.
  • Decide on how to deal with the list of ships that MAY be already in Wikidata. This may have to be a manual process. Think about how we reconcile this – name / year / tonnage may all be useful.
  • Decide on the best route to bulk upload – e.g. QuickStatements. This may be useful: Wikidata Import Guide
  • Agree a core set of data for each ship that will be parsed from ships.json to be added to Wikidata – e.g. name, year, builder, tonnage, length etc.
  • Create a script to output text that can be dropped into a CSV or other file to be used by QuickStatements (assuming that to be the right tool) for bulk input ensuring links for shipbuilder IDs and ABS identifiers are used.
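For the reconciliation step in the list above, a first pass could shortlist possible matches automatically and leave the final decision to a person. The Python sketch below matches on name and year only. The function names, record fields (name, year, label, qid), the one-year tolerance and the sample records are all hypothetical, and tonnage could be compared in the same way.

```python
def name_key(name):
    """Lower-case a name and collapse whitespace so trivial differences are ignored."""
    return " ".join(name.lower().split())


def find_candidates(ship, wikidata_items, year_tolerance=1):
    """Return QIDs of items whose label matches and whose year, if known, is close."""
    matches = []
    for item in wikidata_items:
        if name_key(item["label"]) != name_key(ship["name"]):
            continue
        ship_year, item_year = ship.get("year"), item.get("year")
        if ship_year is not None and item_year is not None:
            if abs(item_year - ship_year) > year_tolerance:
                continue  # same name but built too far apart to be the same ship
        matches.append(item["qid"])
    return matches


# Hypothetical records, for illustration only.
abs_ship = {"name": "Example  Ship", "year": 1869}
existing_items = [
    {"qid": "Q111111111", "label": "Example Ship", "year": 1868},
    {"qid": "Q222222222", "label": "Example Ship", "year": 1920},
    {"qid": "Q333333333", "label": "Other Ship", "year": 1869},
]
possible_matches = find_candidates(abs_ship, existing_items)
```

Anything this returns is only a suggestion; ships with zero or several candidates would still go to manual review.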

We will also be looking to get pictures of the ships published on Wikimedia Commons with permissive licences, link these to Wikidata, and, in the longer term, increase the number and improve the quality of Wikipedia articles on Aberdeen Ships.

Header Image of a Scale Model of Thermopylae at Aberdeen Maritime Museum By Stephencdickson – Own work, CC BY-SA 4.0