AI meeting & transcription glossary

Plain-language definitions of the terms you meet when choosing an AI meeting or transcription tool.

Automatic Speech Recognition (ASR)

Automatic Speech Recognition is the technology that converts spoken audio into written text. It is the engine behind every meeting transcription tool, and its accuracy is usually measured by word error rate (WER).

Transcription

Transcription is the process of turning meeting audio into a written record of what was said, either in real time or after the call. AI transcription tools typically add speaker labels, timestamps and search on top of the raw text.

Real-time transcription

Real-time transcription converts speech to text live during the meeting, enabling captions and follow-along notes. It is harder than post-call transcription because the system must produce text with low latency and cannot use later parts of the conversation to resolve ambiguous words.

Speaker diarization

Speaker diarization is the process of detecting who spoke when and labelling each segment of a transcript with the correct participant. Good diarization is essential for readable minutes and accurate action-item assignment.
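
The labelling step can be illustrated with a small Python sketch. It assumes the diarization model has already produced speaker turns with start and end times, and that the transcript is split into timed segments; each segment is then given the speaker whose turn overlaps it the most. The names Turn, Segment and label_segments are hypothetical and do not come from any particular tool.

```python
from dataclasses import dataclass


@dataclass
class Turn:
    """A stretch of audio attributed to one speaker (hypothetical structure)."""
    speaker: str
    start: float  # seconds
    end: float    # seconds


@dataclass
class Segment:
    """A timed piece of transcript text (hypothetical structure)."""
    text: str
    start: float  # seconds
    end: float    # seconds


def overlap(seg: Segment, turn: Turn) -> float:
    """Return how many seconds the segment and the turn share."""
    return max(0.0, min(seg.end, turn.end) - max(seg.start, turn.start))


def label_segments(segments, turns, unknown="Unknown"):
    """Pair each segment with the speaker whose turn overlaps it most."""
    labelled = []
    for seg in segments:
        best = max(turns, key=lambda t: overlap(seg, t), default=None)
        if best is None or overlap(seg, best) == 0.0:
            # No turn covers this segment at all.
            labelled.append((unknown, seg.text))
        else:
            labelled.append((best.speaker, seg.text))
    return labelled


turns = [Turn("Ana", 0.0, 4.0), Turn("Ben", 4.0, 9.0)]
segments = [
    Segment("Let's ship on Friday.", 0.5, 3.8),
    Segment("I'll update the docs.", 3.9, 7.0),
    Segment("(background noise)", 10.0, 11.0),
]
for speaker, text in label_segments(segments, turns):
    print(f"{speaker}: {text}")
```

In this made-up example the second segment starts slightly before Ben's turn, but most of it falls inside that turn, so it is attributed to Ben. Real tools may use finer-grained, word-level alignment instead.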

AI meeting summary

An AI meeting summary is a short, structured recap generated by a language model from the transcript, typically covering key points, decisions and next steps. Quality depends on transcript accuracy and the underlying model.

Action items

Action items are the concrete tasks a meeting tool extracts from the discussion, usually with an owner and a due date. Reliable extraction is what turns a passive transcript into a follow-up workflow.

Notetaker bot

A notetaker bot is an automated participant that joins a video call to record and transcribe it. Some tools avoid the bot by capturing system audio on-device, which can be less intrusive and easier to use in regulated settings.

Large language model (LLM)

A large language model is an AI model trained on huge amounts of text that can summarize, classify and rewrite language. Meeting tools use LLMs to turn raw transcripts into summaries, action items and answers to questions.

Word error rate (WER)

Word error rate is the standard metric for transcription accuracy: the number of substituted, missing and inserted words, divided by the number of words in a reference transcript. A lower WER means a more accurate transcript, and because inserted words count as errors, WER can exceed 100%. Numbers vary widely by language and audio quality.
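
As a rough illustration, WER can be computed as a word-level edit distance divided by the length of the reference. The Python sketch below assumes simple lowercase, whitespace-based tokenization; real evaluations usually also normalize punctuation, numbers and spelling variants. The function name word_error_rate is only illustrative.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    # Assumption: lowercase + whitespace split is enough normalization here.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # reference word missing from hypothesis
                curr[j - 1] + 1,     # extra word inserted in hypothesis
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


reference = "please send the report by friday"
hypothesis = "please send a report friday"
print(f"WER: {word_error_rate(reference, hypothesis):.2f}")
```

Here the hypothesis has one wrong word ("a" for "the") and one missing word ("by"), so the result is 2 errors over 6 reference words, about 0.33.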

GDPR

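The General Data Protection Regulation is the EU law governing how personal data is processed. It matters for meeting tools because recordings and transcripts contain personal data: processing them requires a lawful basis, and using a vendor that processes them on your behalf requires a data processing agreement. The GDPR does not mandate EU hosting, but it restricts transfers of personal data outside the European Economic Area, so many organizations ask for it.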

Data residency

Data residency is the geographic location where your data is stored and processed. EU data residency means recordings and transcripts stay on servers within the European Union, which many organizations require for compliance.

Data processing agreement (DPA)

A data processing agreement is a contract between a company (the controller) and a vendor (the processor) that handles personal data on the company's behalf. The GDPR requires one under Article 28. Before rolling out a meeting tool, organizations should sign a DPA covering recordings and transcripts.