Recording Flow

Understand the complete flow from recording to actionable insights.

Overview

EdgeNote AI processes your audio through a streamlined pipeline that transforms speech into structured, actionable content.

Recording

Capture Audio

Mic + System Audio
Meeting Detection
Audio Levels
Timer Display

Transcription

Whisper AI

Speech-to-Text
Speaker Diarization
Timestamps
Language Detection

Summarization

LLM Analysis

Key Points
Action Items
Decisions
Goals

Outputs

Results

Summary
Transcript
Insights
Export

Step 1: Recording

The journey begins when you start a recording. EdgeNote AI captures audio from your selected sources and displays real-time feedback.

Audio Sources

Microphone input (your voice)
System audio (meeting participants)
Imported audio files (as an alternative to live capture)

During Recording

Live audio level visualization
Recording timer
Pause/resume capability

Recording interface with audio levels
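
The live level display can be thought of as a loudness measurement repeated over short blocks of audio. The following is a minimal sketch, assuming samples arrive as floats normalized to the range -1.0 to 1.0. The function names `level_dbfs` and `meter_fraction` and the -60 dB floor are illustrative assumptions, not EdgeNote AI's actual implementation.

```python
import math
from typing import Sequence


def level_dbfs(samples: Sequence[float], floor_db: float = -60.0) -> float:
    """Return the RMS level of one audio block in dBFS (0 dB = full scale).

    Silence and empty blocks are clamped to floor_db (an assumed display floor).
    """
    if not samples:
        return floor_db
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    if rms <= 0.0:
        return floor_db
    return max(floor_db, 20.0 * math.log10(rms))


def meter_fraction(db: float, floor_db: float = -60.0) -> float:
    """Map a dBFS value onto a 0..1 meter position (0 = floor, 1 = full scale)."""
    return min(1.0, max(0.0, 1.0 - db / floor_db))
```

A block of half-scale samples measures roughly -6 dBFS, and `meter_fraction(-30.0)` places the meter at the halfway mark with the default floor.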

Step 2: Transcription

When recording ends (or you import audio), EdgeNote AI runs the Whisper model to convert speech to text.

What Happens:

Audio Processing

Audio is processed by the Whisper model using GPU acceleration when available.

Language Detection

Large models automatically detect the spoken language (99 languages supported).

Speaker Diarization

When enabled, the system identifies different speakers and labels their segments.

Timestamped Segments

Output includes precise timestamps for each segment of speech.

Complete transcription with speaker labels and timestamps
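
To make the shape of the transcription output concrete, here is a small sketch of a timestamped segment with an optional speaker label. The `Segment` class, its field names, and the `HH:MM:SS` display format are assumptions made for illustration; the actual data model used by EdgeNote AI is not documented here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Segment:
    start: float                   # seconds from the start of the recording
    end: float                     # seconds from the start of the recording
    text: str
    speaker: Optional[str] = None  # populated only when diarization is enabled


def format_timestamp(seconds: float) -> str:
    """Format a time offset in seconds as HH:MM:SS (fractions are truncated)."""
    total = int(seconds)
    hours, rem = divmod(total, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def render_segment(seg: Segment) -> str:
    """Render one segment as a single transcript line."""
    label = f"{seg.speaker}: " if seg.speaker else ""
    return f"[{format_timestamp(seg.start)}] {label}{seg.text}"
```

For example, `render_segment(Segment(65.0, 70.0, "Hello", "Speaker 1"))` produces `[00:01:05] Speaker 1: Hello`. With diarization disabled, the speaker label is simply omitted.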

Step 3: Summarization

The transcript is then analyzed by a local LLM to generate structured summaries and extract insights.

Summary Generation

The LLM reads the full transcript and generates a concise summary following your selected template format.

Insight Extraction

Key points, action items, decisions, and goals are automatically identified and categorized.

Actions: Tasks to complete, with owners and due dates

Decisions: Choices and agreements made

Goals: Strategic objectives set

AI-generated summary with extracted insights
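
The extracted insights can be pictured as a small structured record. The sketch below assumes the LLM returns JSON with the keys `key_points`, `actions`, `decisions`, and `goals`. That JSON layout, the class names, and the helper functions are all hypothetical and serve only to illustrate how categorized insights might be held in code.

```python
import json
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ActionItem:
    task: str
    owner: Optional[str] = None  # who is responsible, if the transcript says so
    due: Optional[str] = None    # due date as text, if one was mentioned


@dataclass
class Insights:
    key_points: List[str] = field(default_factory=list)
    actions: List[ActionItem] = field(default_factory=list)
    decisions: List[str] = field(default_factory=list)
    goals: List[str] = field(default_factory=list)


def parse_insights(raw: str) -> Insights:
    """Build an Insights record from a JSON string (assumed response layout)."""
    data = json.loads(raw)
    return Insights(
        key_points=list(data.get("key_points", [])),
        actions=[
            ActionItem(task=a["task"], owner=a.get("owner"), due=a.get("due"))
            for a in data.get("actions", [])
        ],
        decisions=list(data.get("decisions", [])),
        goals=list(data.get("goals", [])),
    )


def unassigned(insights: Insights) -> List[ActionItem]:
    """Return action items that have no owner yet."""
    return [a for a in insights.actions if a.owner is None]
```

Keeping owners and due dates optional reflects that a conversation does not always state them. A helper such as `unassigned` makes those gaps easy to review afterwards.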

Step 4: Outputs

The final results are saved and made available for review, search, and export.

Summary

Structured overview

Transcript

Full text with timestamps

Export

Text, Markdown, PDF
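
As an illustration of the Markdown export, the sketch below assembles a summary and a timestamped transcript into one Markdown document. The function name `export_markdown`, the heading layout, and the bullet format are assumptions. The files EdgeNote AI actually produces may be structured differently.

```python
from typing import List, Tuple


def export_markdown(title: str, summary: str, transcript: List[Tuple[str, str]]) -> str:
    """Build a Markdown document from a summary and (timestamp, text) pairs."""
    lines = [
        f"# {title}",
        "",
        "## Summary",
        "",
        summary.strip(),
        "",
        "## Transcript",
        "",
    ]
    for timestamp, text in transcript:
        # One bullet per segment, with the timestamp in bold.
        lines.append(f"- **{timestamp}** {text}")
    return "\n".join(lines) + "\n"
```

Calling `export_markdown("Standup", "Short sync.", [("00:00:05", "Hello")])` returns a string that can be written directly to a `.md` file.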

Processing Time

Processing speed depends on your hardware and model choices:

Hardware	Time to Process a 1-Hour Recording	Notes
Apple Silicon (M1-M4)	2-5 minutes	Full GPU acceleration
NVIDIA RTX GPU	2-5 minutes	CUDA acceleration
AMD/Intel GPU	4-10 minutes	Vulkan acceleration
CPU Only	20+ minutes	Use smaller models
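
The figures in the table can be restated as a speed-up over real time, and scaled to other recording lengths. The sketch below assumes processing time grows linearly with audio duration, which is a simplification. Actual times also depend on the model chosen and the content of the audio. Both function names are illustrative.

```python
from typing import Tuple


def speedup(audio_minutes: float, processing_minutes: float) -> float:
    """Return how many times faster than real time the processing runs."""
    return audio_minutes / processing_minutes


def estimate_range(audio_minutes: float, per_hour_range: Tuple[float, float]) -> Tuple[float, float]:
    """Scale a (min, max) processing time for a 1-hour recording to another length.

    Assumes linear scaling with duration.
    """
    low, high = per_hour_range
    scale = audio_minutes / 60.0
    return (low * scale, high * scale)
```

Under this assumption, 2-5 minutes per hour of audio corresponds to roughly 12x to 30x real time, and a 30-minute recording on the same hardware would take about 1 to 2.5 minutes.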