GLAM/Newsletter/December 2020/Contents/WMF GLAM report
The GLAM & Culture office hours
As some GLAM newsletter readers may already know, for the last three months the GLAM and Culture team at the Wikimedia Foundation has been holding monthly office hours. There is a specific topic each month with a Foundation host and guest presenters from the movement. The meetings are intended as an open forum, a place for Wikimedians and GLAM professionals to share ideas, conversations, and creativity.
The meetings usually take place on Zoom and there are two slots each month to reach different time zones. January's GLAM & Culture office hours will introduce project grants for community organizing. Join us on Monday 25 January 4.30-5.30pm UTC or Tuesday 26 January 9.30-10.30am UTC.
Stay tuned to this page for joining details and information about the following months.
Below, you can find documentation of last year's events:
September: IIIF on Wikimedia Commons
The first GLAM & Culture office hours took place in September and covered potential GLAM use cases for IIIF (the International Image Interoperability Framework), especially on Wikimedia Commons. At the Monday meeting, the presenter was Evan Prodromou, Product Manager in the Foundation's Platform team. Evan gave a general introduction to the Foundation's new API service. On Tuesday, Jason Evans, National Wikimedian at the National Library of Wales, shared his experience of IIIF.
The National Library of Wales has been contributing to Wikimedia for five years and there are already 20,000 images from its collections available on Wikimedia Commons. When Wikidata approved a property for the IIIF manifest, the institution contributed 15,000 items to Wikidata with IIIF manifests.
Others have since started to use the IIIF metadata, including an Italian website that now displays all the NLW images without actually holding a digital copy of those works, and also pulls in all the metadata associated with those manifests.
Jason also demonstrated how IIIF allows you to clearly state where in an image something is depicted. The image position can be saved in Wikidata and then queried, or even extracted in a IIIF manifest to be used on other platforms.
According to Jason, the use of IIIF would therefore accomplish three goals: enhance Commons data, improve import and export options, and increase the potential for reuse.
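To make the image-position idea concrete, here is a minimal Python sketch. It assumes the position is stored as a IIIF-style percentage region string of the form `pct:x,y,w,h` (the region syntax defined by the IIIF Image API); whether a given Wikidata statement uses exactly this form is an assumption, and the function name `pct_region_to_pixels` is hypothetical.

```python
from typing import NamedTuple


class PixelBox(NamedTuple):
    x: int
    y: int
    width: int
    height: int


def pct_region_to_pixels(region: str, image_width: int, image_height: int) -> PixelBox:
    """Convert a 'pct:x,y,w,h' region into a pixel box for an image of known size.

    Assumption: the region follows the IIIF Image API percentage syntax,
    where x, y, w and h are percentages of the full image dimensions.
    """
    prefix = "pct:"
    if not region.startswith(prefix):
        raise ValueError(f"expected a 'pct:' region, got {region!r}")
    parts = region[len(prefix):].split(",")
    if len(parts) != 4:
        raise ValueError(f"expected four comma-separated values, got {region!r}")
    x, y, w, h = (float(p) for p in parts)
    return PixelBox(
        x=round(image_width * x / 100),
        y=round(image_height * y / 100),
        width=round(image_width * w / 100),
        height=round(image_height * h / 100),
    )


# Example: a detail occupying the middle band of a 4000x3000 scan.
box = pct_region_to_pixels("pct:25,10,50,20", 4000, 3000)
```

Because the region is expressed in percentages, the same stored value can be reused for any resolution of the same image, which is part of what makes it portable across platforms.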
This subject attracted more than 60 participants across the two meetings, including Wikimedians, affiliate staff, and GLAM professionals from the Smithsonian, Metropolitan Museum, DPLA, The Getty, Wellcome Collection, National Gallery UK, Huntington Art Gallery, Harvard Library, V&A, British Library, and Princeton University Library. These attendees were polled to determine the most interesting IIIF use cases for Wikimedia Commons.
Top three from Monday’s meeting:
- Dynamic redisplay of images (e.g. zoom, crops, etc) for reuse on Wikipedia and elsewhere: 65%
- Aggregating Wikimedia images with other IIIF-compatible sources: 48%
- Wikimedia Commons as a free IIIF server for GLAMs and other contributors: 42%
Top three from Tuesday’s meeting:
- Wikimedia Commons as a free IIIF server for GLAMs and other contributors: 83%
- Simplifying bulk contribution of images to Commons: 58%
- Directly annotating media on Wikimedia Commons and Wikipedia: 50%
You can find the collective notes and chat transcripts for both meetings in this document.
October: Structured Data Across Wikimedia
For the October event, we focused on Structured Data, with presentations from Carly Bogen, the Foundation's Program Manager for Structured Data, and Alicia Fagerving and David Haskiya from Wikimedia Sweden.
Carly's presentation introduced the Structured Data Across Wikimedia (SDAW) project, which has the following goals:
- Allow machines to recognize Wikimedia content and suggest relations to other Wikimedia content.
- Design a way to structure articles and pages to enable new content formats.
- Give Wikimedia users a more inviting, more efficient way to search and find content.
So far, the product team has focused on an improved Media Search for Commons. The new Media Search:
- Has an image-focused user interface that will make it easier to find what you're looking for and to discover new things.
- Generates a set of search results that utilizes structured data and is more language-agnostic.
- Has filters for media types and tabs for audio, videos, and categories results.
Carly explained how Wikimedians and institutions can improve the search relevance of a file:
- Add a descriptive title
- Add captions in multiple languages
- Add a detailed description
- Add the file to the relevant categories
- Add depicts statements
This is described in more detail in a new Help page for Media Search.
Carly noted that the “mark as prominent” feature needs to be used more consistently if it is to be included in the search algorithm.
The structured data team will soon add a license filter, which will only use license data contained in structured data statements, so it’s very important that the GLAM community adds license information to the statements.
Finally, Carly posed two questions for the community:
- Should the depicts statements on Wikidata be added to the Commons search index?
- What other statements could most usefully be added to search?
Wikimedia Sweden’s presentation introduced a project that will use Structured Data on Commons to improve discovery and use of Wiki Loves Monuments images. WMSE will add at least 250,000 new “Depicts” and “Participant in” statements to Sweden’s Wiki Loves Monuments entries, focusing on those that have relevant Wikidata items. They will share their process and tools, which could be useful for other Wiki Loves campaigns, or for museums that have added depicts information to Wikidata.
There were more than 30 attendees across the two sessions, with good representation from cultural institutions, affiliates, and the broader community. At the end, there was a conversation about less useful depicts statements being added to Commons and how to prioritize the creation of tools to help with quality control and maintenance.
You can find the collective notes and chat transcripts for both meetings in this document.
November: Wikisource
The November event was dedicated to Wikisource. Satdeep Gill, Program Officer for GLAM and Culture, shared how he’s coordinating across movement stakeholders to improve Wikisource infrastructure. There were also guest presentations by David Kamholz from PanLex, and Sara Thomas from Wikimedia UK.
Satdeep opened his remarks by saying that he believes Wikisource is a major part of the essential infrastructure for free knowledge. It is imperative to have a really good transcription platform, especially so that underrepresented languages can have their own digital library. Wikisource hasn’t had a lot of investment in its infrastructure and has been mainly volunteer built.
Satdeep shared an overview of the Wikisource workflow, with an overlay of projects that are being worked on this year. These small projects have been supported in different ways: some via the Community Wishlist Survey 2020, others by Wikimedia Foundation grants, and one through a Google Summer of Code mentorship.
Satdeep shared the recently launched Pagelist Widget, which improves the visualization of files and pages, as well as user experience and editing. It has already been enabled on 25 Wikisources. He also previewed Wikisource Export and encouraged people to give their feedback on the proposed designs.
David’s presentation was about a grant-funded project to develop a Balinese palm-leaf transcription platform on Wikisource.
The starting point for this project was the Balinese Digital Library, which was created in 2011 by the Internet Archive in partnership with major Balinese collections. It made available digital photographs of 3,000 works, containing 130,000 leaves, and covering all aspects of Balinese culture for centuries. However, it turned out that images alone were not enough. They were hard to use, read, and share, and the Internet Archive wanted to create something more useful and engage the community.
PanLex applied for a Wikimedia Foundation project grant to have a new Balinese Wikisource as the long-term home for the transcription platform and its works. The project encompassed:
- Uploading scans to Commons
- Importing existing work from Palm Leaf Wiki
- Adding Balinese fonts to the Universal Language Selector
- ProofreadPage improvements for content language
- Balinese Language Converter for transliteration
- User script to activate transcription interface and transliteration
Interestingly, the Balinese Wikisource editing interface uses IIIF to retrieve a high resolution tile for only the part of the image that is being transcribed, reducing data usage in low resource contexts.
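As a rough illustration of the region-only request described above, the sketch below builds a IIIF Image API URL that asks the server for one rectangular part of an image, optionally scaled down. The URL layout (`identifier/region/size/rotation/quality.format`) comes from the IIIF Image API; the server address, the identifier, and the helper name `iiif_region_url` are hypothetical, and this is not a description of how the Balinese Wikisource interface is actually implemented. The `max` size keyword assumes Image API 3.0.

```python
from typing import Optional
from urllib.parse import quote


def iiif_region_url(
    base_url: str,
    identifier: str,
    x: int,
    y: int,
    width: int,
    height: int,
    max_width: Optional[int] = None,
    quality: str = "default",
    fmt: str = "jpg",
) -> str:
    """Build a IIIF Image API URL for one pixel region of an image.

    base_url and identifier are placeholders for a real IIIF image service.
    """
    region = f"{x},{y},{width},{height}"
    # "max" returns the region at full available resolution (Image API 3.0);
    # "w," scales the region to the given width, keeping the aspect ratio.
    size = "max" if max_width is None else f"{max_width},"
    rotation = "0"
    # The identifier must be URL-encoded, including any slashes it contains.
    encoded_id = quote(identifier, safe="")
    return f"{base_url.rstrip('/')}/{encoded_id}/{region}/{size}/{rotation}/{quality}.{fmt}"


# Example: fetch only the strip of a palm leaf currently being transcribed,
# scaled to 1000 pixels wide to limit data usage.
url = iiif_region_url(
    "https://iiif.example.org/image", "leaf 12.jpg", 0, 500, 2000, 400, max_width=1000
)
```

Requesting a small region at a reduced size means the client never downloads the full high-resolution scan, which is the data-saving effect the paragraph above describes.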
Sara’s presentation, “Responding to Covid: The National Library of Scotland & Wikisource”, introduced WikiProject NLS.
In 2020, the National Library of Scotland’s building closed due to the Covid-19 pandemic and they wanted a productive and valuable work-from-home activity for staff. They had a longstanding collaboration with Wikimedia UK and had already considered using Wikisource as an alternative to their in-house OCR, which is auto-generated with no facility to correct it. They decided to correct the OCR for a collection of over 3,000 Scottish chapbooks, which had been recently digitized and made available on the Library's Digital Gallery. The chapbooks covered a wide range of topics and, at just 10-20 pages per book, could be transcribed in a day.
With more than 70 staff members, it was one of the largest professional cohorts Wikimedia UK has ever engaged, and most staff hadn’t tried Wikimedia projects before. The library wanted to complete all 3,000 books so they worked with two members of the Wikisource community to agree on a more limited use of Wikisource templates, striking a balance between completeness and speed.
Library staff reported enjoying the work and the project brought this important collection to a broader audience. Sara concluded that Wikisource probably isn’t a replacement for a better in-house OCR, and ultimately the main benefit of the project was staff learning how to use Wikimedia platforms.
More than 30 participants joined the November meetings, including British Library staff who expressed an interest in learning more about the Balinese palm-leaf project by PanLex. Staff from the Foundation were able to address some of the specific issues encountered on the National Library of Scotland project, noting that Google OCR limits can be removed, and committing to fixing the .txt export issue. The new Pagelist Widget presented by Satdeep addressed another of the challenges.
You can find the collective notes and chat transcripts for both meetings in this document.