Professional-grade transcription for filmmakers. Generate broadcast-accurate SRT and VTT files in seconds using OpenAI Whisper via fal.ai, complete with speaker diarization and QA tools.
Quick start
CineScribe works in two stages: a Setup screen where you import and configure, then an Editor where you refine and export. Once transcription finishes, CineScribe opens the Editor automatically.
On the Setup screen, drop video or audio files (MP4, MOV, WAV, or any other format listed under Preparing your media) onto the dropzone, or use Select Files. Batch import is supported, so you can queue a whole scene's clips at once.
Still in Setup, choose your quality tier, language, and output (toggle Speaker Labels if you need them), then click Generate Subtitles. CineScribe uploads, transcribes, and lands you in the editor when it's done.
In the editor, fix cues inline and check the QA tab for reading-speed and line-length flags. When it's clean, use the export bar to download SRT, VTT, TXT — or a PPTX supertitle deck for live performance. Use + Add media to head back to Setup for more clips.
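For reference, the SRT and VTT exports mentioned above differ mainly in their header and timestamp punctuation. The following is a minimal Python sketch of both formats, assuming cues are held as start time, end time, and text. The `Cue` type and the function names are illustrative only and say nothing about how CineScribe builds its exports internally.

```python
from typing import NamedTuple


class Cue(NamedTuple):
    """Hypothetical cue record: times in seconds plus the subtitle text."""
    start: float
    end: float
    text: str


def timestamp(seconds: float, sep: str) -> str:
    """Format seconds as HH:MM:SS<sep>mmm (',' for SRT, '.' for VTT)."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}{sep}{ms:03d}"


def to_srt(cues: list[Cue]) -> str:
    """SRT: numbered blocks, comma before milliseconds, blank line between cues."""
    blocks = []
    for index, cue in enumerate(cues, start=1):
        timing = f"{timestamp(cue.start, ',')} --> {timestamp(cue.end, ',')}"
        blocks.append(f"{index}\n{timing}\n{cue.text}\n")
    return "\n".join(blocks)


def to_vtt(cues: list[Cue]) -> str:
    """VTT: 'WEBVTT' header, period before milliseconds, cue numbers optional."""
    blocks = ["WEBVTT\n"]
    for cue in cues:
        timing = f"{timestamp(cue.start, '.')} --> {timestamp(cue.end, '.')}"
        blocks.append(f"{timing}\n{cue.text}\n")
    return "\n".join(blocks)
```

If a player rejects a hand-edited subtitle file, the timestamp separator is a good first thing to check: SRT uses a comma, VTT uses a period.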
Preparing your media
CineScribe uploads your file straight to fal.ai's CDN before transcription. There's no hard size cap in the app, but a little prep makes uploads faster and more reliable — especially for long projects.
Transcription only listens to the audio track. Export a WAV or MP3 and you'll upload a fraction of the size with identical accuracy — the single biggest speed win for any clip over a few minutes.
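If your editing software has no quick audio-only export, ffmpeg can do the same job. The sketch below builds and runs an ffmpeg command from Python. It assumes ffmpeg is installed and on your PATH. The mono, 16 kHz settings are an assumption chosen to keep the file small, not a CineScribe requirement, and the function names are illustrative.

```python
import subprocess
from pathlib import Path


def audio_extract_command(video: Path, sample_rate: int = 16_000) -> list[str]:
    """Build an ffmpeg command that writes a mono WAV next to the source video."""
    output = video.with_suffix(".wav")
    return [
        "ffmpeg",
        "-i", str(video),          # input file
        "-vn",                     # drop the video stream
        "-ac", "1",                # downmix to mono (assumed setting)
        "-ar", str(sample_rate),   # resample (assumed 16 kHz default)
        "-c:a", "pcm_s16le",       # 16-bit PCM, the usual WAV encoding
        str(output),
    ]


def extract_audio(video: Path) -> Path:
    """Run ffmpeg and return the path of the extracted WAV."""
    subprocess.run(audio_extract_command(video), check=True)
    return video.with_suffix(".wav")
```

Calling `extract_audio(Path("scene01.mov"))` would write `scene01.wav` alongside the source. The sketch is meant for video inputs; a source that is already a `.wav` would resolve to the same output path.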
For features, long interviews, or anything past roughly an hour, break the media into reels or scenes and queue them as a batch. Shorter files upload faster, recover cleanly if one fails, and keep the cue editor responsive.
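To plan the split, you can work out reel boundaries from the total running time before you cut anything. This Python sketch divides a duration into equal reels, each no longer than a chosen maximum. The one-hour default follows the "roughly an hour" guidance above, and the function name is hypothetical.

```python
import math


def plan_reels(duration_s: float, max_reel_s: float = 3600.0) -> list[tuple[float, float]]:
    """Split a duration into equal-length reels, each no longer than max_reel_s.

    Returns (start, end) pairs in seconds.
    """
    if duration_s <= 0 or max_reel_s <= 0:
        raise ValueError("duration_s and max_reel_s must be positive")
    count = math.ceil(duration_s / max_reel_s)
    reel_len = duration_s / count
    return [
        (i * reel_len, min((i + 1) * reel_len, duration_s))
        for i in range(count)
    ]
```

A 2.5-hour feature (9000 seconds) comes out as three 50-minute reels. Equal-length cuts can land mid-sentence, so treat the boundaries as a starting point and nudge each one to the nearest scene change or pause.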
Video: MP4, MOV, MKV. Audio: WAV, MP3, M4A, AAC, FLAC, OGG. Drag several in at once — batch import queues them all.
Files over ~90 MB upload in chunks automatically, so a big clip won't fail outright — it simply takes longer on slower connections. If an upload stalls, use Retry Failed rather than re-queuing from scratch.
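As a rough illustration of chunked uploading in general, the Python sketch below splits a file into byte ranges once it passes a size threshold. The 90 MB threshold comes from the text above. The 10 MB chunk size and the function name are assumptions made for the example, since the chunk size actually used by the app and fal.ai is not stated here.

```python
CHUNK_THRESHOLD = 90 * 1024 * 1024  # ~90 MB, per the guidance above
CHUNK_SIZE = 10 * 1024 * 1024       # assumed chunk size, for illustration only


def chunk_ranges(
    file_size: int,
    threshold: int = CHUNK_THRESHOLD,
    chunk_size: int = CHUNK_SIZE,
) -> list[tuple[int, int]]:
    """Return (start, end) byte ranges with end exclusive.

    Files at or under the threshold go up in a single piece.
    """
    if file_size <= threshold:
        return [(0, file_size)]
    return [
        (start, min(start + chunk_size, file_size))
        for start in range(0, file_size, chunk_size)
    ]
```

Under these assumptions a 95 MB file becomes ten ranges, the last one only 5 MB long. That is why a large clip takes longer on a slow connection but does not have to fail as a whole.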
Audio-only + clips kept under an hour = the fastest, most reliable path. Reach for video upload only when you plan to burn subtitles back into the picture.
Methodology
Subtitling isn't just about accuracy; it's about readability. CineScribe continuously monitors your cues against industry-standard metrics.
Every subtitle file is audited in real time, highlighting exactly where viewers might struggle to keep up with the dialogue.
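To make the two readability checks concrete, here is a minimal Python sketch of a reading-speed and line-length audit. The limits of 17 characters per second and 42 characters per line are common subtitling conventions used here as assumptions. CineScribe's actual thresholds are not stated in this guide, and the `Cue` type and flag names are illustrative.

```python
from dataclasses import dataclass

MAX_CPS = 17.0       # assumed reading-speed limit, characters per second
MAX_LINE_CHARS = 42  # assumed line-length limit, characters per line


@dataclass
class Cue:
    """Hypothetical cue record: times in seconds plus the subtitle text."""
    start: float
    end: float
    text: str


def audit_cue(
    cue: Cue,
    max_cps: float = MAX_CPS,
    max_line_chars: int = MAX_LINE_CHARS,
) -> list[str]:
    """Return the readability flags raised by one cue (empty list if clean)."""
    flags = []
    duration = cue.end - cue.start
    chars = len(cue.text.replace("\n", ""))  # line breaks don't count as reading load
    if duration <= 0:
        flags.append("invalid-duration")
    elif chars / duration > max_cps:
        flags.append("reading-speed")
    if any(len(line) > max_line_chars for line in cue.text.split("\n")):
        flags.append("line-length")
    return flags
```

A 12-character cue shown for two seconds reads at 6 characters per second and passes. A 60-character single-line cue shown for one second trips both flags, and the usual fix is to split the line, extend the cue, or trim the wording.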
We use fal.ai's high-speed inference engine to run Whisper. As a result, a 10-minute interview can typically be transcribed in under 30 seconds with up to 99% accuracy, depending on audio quality.
A fal.ai API key is required. Setup takes about 2 minutes and includes free starter credits, which are enough to test the CineScribe pipeline.
Controls
The Quality selector trades speed for care — every tier runs Whisper large-v3 (there's no model-size choice). Draft and Swift use fal's fast wizper engine for the quickest turnaround. Balanced is the recommended default. Precision uses a smaller batch size for steadier chunking on difficult or noisy audio — slowest, but most careful.
Note on cue timing: the fast wizper tiers (Draft / Swift) segment coarsely — they're ideal for a quick, accurate full-text transcript but may group long passages into a single cue. For properly timed subtitle cues, use Balanced or Precision, then fine-tune in the cue editor.
Toggle "Speaker Labels" to have the AI identify different voices. You can rename generic labels like "SPEAKER_01" to actual character names in the editor's Inspector tab.
Enable the "Silence Filter" to automatically suppress non-speech intervals, preventing the AI from hallucinating text during long pauses.
Click any text cell to edit instantly. The timecodes and speaker labels remain locked, allowing you to focus exclusively on linguistic accuracy.
Find and replace is the fastest way to fix proper nouns. One pass can correct a character's name or a technical term across 500+ cues simultaneously.
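The one-pass correction described above behaves like a whole-word replace over every cue. The Python sketch below shows the idea, assuming cues are plain strings. The function name is hypothetical, and the case-insensitive, whole-word matching is a design choice for this example, not a description of CineScribe's matcher.

```python
import re


def replace_in_cues(cues: list[str], wrong: str, right: str) -> tuple[list[str], int]:
    """Replace whole-word matches of `wrong` in every cue.

    Returns the updated cues and the total number of replacements made.
    """
    pattern = re.compile(rf"\b{re.escape(wrong)}\b", flags=re.IGNORECASE)
    total = 0
    updated = []
    for text in cues:
        # A function replacement keeps `right` literal, even if it contains backslashes.
        new_text, count = pattern.subn(lambda _match: right, text)
        updated.append(new_text)
        total += count
    return updated, total
```

Whole-word matching matters for names. Replacing "Kathryn" should not touch "Kathryns", and the returned count gives a quick sanity check on how many cues changed.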
Use Save Session (in the topbar Sessions menu) to store your full queue and edits in browser storage. Pick up right where you left off without re-uploading.
Professional Workflow
Upload WAV or MP3 instead of large video files. Audio can upload roughly 10x faster, and the transcription quality is identical to video processing.
Switch to 'Raw' mode to copy the entire subtitle block into an LLM (like Claude) for a creative rewrite or translation pass, then paste it back.
Export SRT for Premiere or Resolve. If your editor supports it, use the 'Word Timings' export for even more precise subtitle placement.
Never export a file with a 'Red' status chip. Resolving speed and length violations ensures your film remains accessible and professional.
Once you're in the editor, click Export Supertitles in the export bar for a one-click stage-ready deck — one cue per slide, large white text on black, with a blank slide between every line so the screen goes dark in the gaps. (You can also set the format to PPTX in the same bar and use Download Export.) Drop it straight into PowerPoint and let an operator advance each title live. For translated supertitles, transcribe with Output → English first so the slides carry the translation.
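The deck layout described above, one cue per slide with a blank slide in between, amounts to a simple interleave. This Python sketch shows the resulting slide order, with an empty string standing in for a blank slide. The function name is illustrative, and the sketch covers only the ordering, not the PPTX file itself.

```python
def supertitle_slides(cues: list[str]) -> list[str]:
    """Interleave cue text with blank slides: one cue per slide, blank between cues."""
    slides: list[str] = []
    for index, text in enumerate(cues):
        if index > 0:
            slides.append("")  # blank slide so the screen goes dark in the gap
        slides.append(text)
    return slides
```

With this layout the operator advances twice per line: once to clear the screen, and once to show the next title.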