2019:Transcription/Subtitling the Chaos Communication Congress: An Experience Report
This is an Accepted submission for the Transcription space at Wikimania 2019.
Description
For the past few years I have been part of the c3subtitles team, which aims to create high-quality transcripts of the conference recordings of the annual Chaos Communication Congress. I will describe how we process recordings, which tools and infrastructure we use, and which challenges we encountered along the way.
Relationship to the theme
This session will address the conference theme — Wikimedia, Free Knowledge and the Sustainable Development Goals — in the following manner: Video transcripts help make videos accessible to a wider audience, thus improving the quality of education (4) and reducing inequalities (10). Furthermore, transcripts enable indexing and searching video content (9).
Session outcomes
At the end of the session, the following will have been achieved: The audience understands which tools and techniques are used to transcribe the recordings of the Chaos Communication Congress.
Session leader(s)
Session type
Each Space at Wikimania 2019 will have specific format requests. The program design prioritises submissions which are future-oriented and directly engage the audience. The format of this submission is a:
- Lightning talk
Requirements
The session will work best with these conditions:
- Room: projector + screen
- Audience: This might be a bit of a niche topic, so I'm not sure about the appropriate size of the audience. No prior knowledge is required.
- Recording: appropriate for recording
Session summary
CCC, the Chaos Communication Congress, takes place between Christmas and New Year, now in Leipzig, with around 17,000 visitors.
Four days of presentations and much more.
It is run mostly by volunteers and is a lot of fun.
The goal is to provide real-time captions
- with minimal delay
- but people speak fast, so a single typist cannot keep up
- speakers reach around 200 words per minute (WPM), 3-4 times faster than typing
- that corresponds to roughly 1200 strokes per minute (SPM); see the quick calculation after this list
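As a rough sanity check of those speed figures, here is a minimal calculation; the keystrokes-per-word and typing-speed values are assumptions for illustration, not figures from the talk.

```python
# Back-of-the-envelope check of the captioning speed figures.
speech_wpm = 200        # speaking rate mentioned in the talk
strokes_per_word = 6    # assumed: ~5 letters plus a space per word
typing_wpm = 60         # assumed: a good, but not professional, typing speed

print("Keystrokes per minute needed:", speech_wpm * strokes_per_word)   # -> 1200
print("Speech vs. typing speedup: %.1fx" % (speech_wpm / typing_wpm))   # -> 3.3x
```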
Different solutions:
- Respeaking based on speech recognition: works well for clearly defined vocabularies, but not at all for vocabularies that are too large or too specific
- A stenographer can reach about 300 WPM
- Neither of these works for them
Real-time captions
- Our solution: if one person is too slow, just use four people.
- This can work if the typists coordinate; it still requires focus, and errors will happen anyway (a sketch of one possible rotation scheme follows this list).
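The talk does not spell out how the four typists share the work; purely as a hypothetical illustration, the sketch below rotates them in fixed time slices, so each one types for a short stretch and then has three slices to catch up and correct. The slice length and names are made up.

```python
from itertools import cycle

TYPISTS = ["Angel 1", "Angel 2", "Angel 3", "Angel 4"]  # hypothetical names
SLICE_SECONDS = 15  # assumed slice length; each typist then rests for three slices

def rotation_schedule(talk_length_seconds):
    """Yield (start, end, typist) slices covering the whole talk in round-robin order."""
    typists = cycle(TYPISTS)
    for start in range(0, talk_length_seconds, SLICE_SECONDS):
        end = min(start + SLICE_SECONDS, talk_length_seconds)
        yield start, end, next(typists)

for start, end, who in rotation_schedule(90):
    print(f"{start:3d}s-{end:3d}s  {who}")
```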
Offline subtitles are produced afterwards, for the high-quality video releases only; the goal is no longer speed, as there is time to do it properly.
Workflow
- Speech recognition software produces a raw transcript.
- Angels (volunteers) turn it into a proper transcript; it does not need too much correction.
- A quality control step fixes spelling errors and unifies the style and alignment.
- The transcript and audio are aligned using YouTube (a sketch of what such alignment output could look like is below).
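The talk only says that YouTube is used for the alignment; as a hypothetical illustration of what the end of this workflow produces, the sketch below turns word-level timestamps (as an alignment tool might return them) into SRT subtitle cues. The input format and cue size are assumptions, not part of the c3subtitles tooling.

```python
# Hypothetical illustration: turning word-level timestamps into SRT cues.

def fmt(seconds):
    """Format seconds as an SRT timestamp, e.g. 00:01:02,500."""
    ms = int(round(seconds * 1000))
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1_000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

def words_to_srt(words, max_words_per_cue=8):
    """words: list of (text, start_seconds, end_seconds) tuples."""
    cues = []
    for i in range(0, len(words), max_words_per_cue):
        chunk = words[i:i + max_words_per_cue]
        start, end = chunk[0][1], chunk[-1][2]
        text = " ".join(w[0] for w in chunk)
        cues.append(f"{i // max_words_per_cue + 1}\n{fmt(start)} --> {fmt(end)}\n{text}\n")
    return "\n".join(cues)

# Tiny made-up example input:
words = [("Welcome", 0.0, 0.4), ("to", 0.4, 0.5), ("the", 0.5, 0.6),
         ("Chaos", 0.6, 1.0), ("Communication", 1.0, 1.8), ("Congress", 1.8, 2.4)]
print(words_to_srt(words))
```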