Let an AI Caption Meetings

So-called artificial intelligence (AI) has arrived, but in the usual unexpected form. Instead of an electronic brain capable of all tasks, the most-advanced AI comes in the form of deep learning, a way to train an algorithm to pick things that are very like other things. This lets machine-learning systems identify cats in photos, predict upcoming weather conditions based on radar imagery, and turn spoken words into text—all with shockingly good accuracy.

There are three general types of AI-based conversion of speech into text available:

  • Live transcription: While people talk, the service creates a transcript which can be viewed as it’s created. It’s often just concatenating snippets of live captioning, but it typically attempts to uniquely number and identify speakers.

  • Live captioning or closed captioning: While people speak, a text version of what they say is posted live in the videoconference feed, just as if it were a captioned video or TV program. The quality can be quite high, but because it’s real time, it’s often worse than offline processing. This is often provided free (as in Skype and Google Meet), as part of a business plan (as in Microsoft Teams), or as a third-party subscription add-on (as with Otter.ai, one of the first such services, and others).

  • Post-meeting transcription: Offline processing of audio can produce better results, because it’s not trying to keep up with the demands of nearly instantaneous conversion. The resulting transcript tends to be more accurate and to identify multiple speakers better.
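
To make the live transcription item above concrete, here is a minimal Python sketch of how caption snippets might be concatenated into a running transcript with numbered speakers. It is only an illustration: the function name build_transcript, the (speaker, text) snippet format, and the sample data are assumptions made for this example, not how Zoom or any other service actually works.

```python
def build_transcript(snippets):
    """Merge consecutive caption snippets from the same speaker into one line each.

    `snippets` is a hypothetical list of (speaker_label, text) pairs in the
    order they were captioned.
    """
    merged = []
    for speaker, text in snippets:
        if merged and merged[-1][0] == speaker:
            # Same speaker is still talking: append to their current line.
            merged[-1] = (speaker, merged[-1][1] + " " + text)
        else:
            # A different speaker started: begin a new transcript line.
            merged.append((speaker, text))
    return [f"{speaker}: {text}" for speaker, text in merged]


# Made-up sample captions, for illustration only.
snippets = [
    ("Speaker 1", "Welcome, everyone."),
    ("Speaker 1", "Let's begin."),
    ("Speaker 2", "Thanks for having me."),
]
transcript_lines = build_transcript(snippets)
```

Here the first two snippets collapse into a single line for Speaker 1, and Speaker 2 gets a line of their own.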

Live captioning can be an advantage for any attendee, but especially for participants who have a hearing impairment. It may also help attendees in noisy situations where they can’t play audio or wear headphones that let them hear clearly what’s being said. Live transcripts during a session can be useful to follow along for the same reason or to reinforce what’s being said.

A post-meeting transcription with extra processing can provide a close-enough record of a meeting for later searching. These post-meeting documents also almost always let you assign speakers’ names (which then propagate for that identified speaker throughout the transcript) and clean up and annotate the text. That allows production of a more polished or even verbatim text record.
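
The way an assigned name propagates through a transcript can be sketched in a few lines of Python. This is a hedged illustration, not any transcription service's real data model: the Segment class, the assign_names function, and the sample names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    speaker: str  # generic label such as "Speaker 1" until a name is assigned
    text: str


def assign_names(segments, names):
    """Return new segments with generic labels replaced wherever a name is known.

    `names` maps a generic label to a real name; unmapped labels are kept.
    """
    return [Segment(names.get(s.speaker, s.speaker), s.text) for s in segments]


# Made-up transcript, for illustration only.
transcript = [
    Segment("Speaker 1", "Let's get started."),
    Segment("Speaker 2", "I have the agenda."),
    Segment("Speaker 1", "Great, go ahead."),
]

# Naming "Speaker 1" once updates every segment that speaker said.
named = assign_names(transcript, {"Speaker 1": "Alex"})
```

After this call, both of the first speaker's segments carry the name Alex, while Speaker 2 stays unnamed until someone identifies them.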

Zoom’s service offers live captioning and live transcripts during meetings for both free and paid tiers. The service offers no post-processing phase or editing interface. You can turn to third-party services that integrate with Zoom or that let you upload audio after an event to provide more advanced, flexible, or different features than Zoom’s included ones.

To manage captions, a host or co-host clicks the Live Transcript icon in a desktop app and clicks the Enable button in the panel that appears.

Participants immediately see live captioning and a live transcript in all apps (Figure 125):

  • If captions don’t appear, in a desktop or web app they can click the Live Transcript icon in meeting controls and select Show Subtitles. In iOS or iPadOS, tap More, tap Meeting Settings, and enable Closed Captioning. (Captions are always enabled in Android.)

    Figure 125: Captions scroll as people speak and slowly disappear.
  • To see the live transcript, click the Live Transcript icon in meeting controls (desktop/web) or tap More in mobile apps and select View Full Transcript.

Individual attendees can control the type size of captions in the desktop apps through Settings > Accessibility.

A host can also keep or select the checkbox marked “Allow participants to request Live Transcription.” Then any participant who wants it can click the Live Transcript icon in a desktop or web app or tap More in mobile apps and click or tap Request Live Transcription.

This results in a prompt in all apps asking to confirm the request (Figure 126). In desktop and web apps, a participant can check the “Ask anonymously” box; mobile app users can tap Request Anonymously. That takes the onus off an individual requesting captioning.

Figure 126: Any participant can request Live Transcription if the host allows it.

A host then sees either a participant’s name or “a participant” requesting Live Transcription. The host can select Enable, Decline, or the pretty severe “Decline and don’t ask again” (Figure 127)!

Figure 127: A host can approve the Live Transcription request.