cv-evidence-renderer

The problem

Your detector fires. Now you need a 15-second MP4 with bounding boxes drawn on it, to attach to an alert or archive for compliance. Every CV team writes this code from scratch, ships the bug to prod, then writes it again on the next project.

"How to overlay bbox and save a clip on event?" has been a recurring thread on the NVIDIA DeepStream forum since 2022. The only official answer (Smart Record) is not exposed in the official pyds Python bindings, and Smart Record itself still has open multi-stream crash reports on DeepStream 7.1.

Why nothing else covers this

Feature	cv-evidence-renderer	supervision	DeepStream Smart Record	KeyClipWriter
Python-only install	✓	✓	✗	✓
Event-window trim (offline)	✓	✗	C only	✓
Cross-file event concat	✓	✗	✗	✗
Decode-once batch	✓	✗	✗	✗
Output duration cap (timelapse / framedrop)	✓	✗	✗	✗
Bbox burn-in	✓	✓	✓	✗
supervision interop	✓	—	✗	✗
Ultralytics YOLO adapter	✓	✓	✗	✗
NVENC encode	🚧 v0.2	✗	✓	✗
Live RTSP ring buffer	🚧 v0.2	✗	C only	✓
Multi-stream pool	🚧 v0.3	✗	✓	✗
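
To make the "Output duration cap (framedrop)" row concrete, here is a minimal sketch of the idea: keep every Nth frame so that the clip, played back at the source frame rate, fits under a cap. The function name and parameters are hypothetical and for illustration only; this is not the library's implementation.

```python
import math
from typing import List


def framedrop_indices(n_frames: int, fps: float, max_duration_s: float) -> List[int]:
    """Return the source frame indices to keep so the output fits the cap.

    Hypothetical helper: assumes the output is written at the same fps as
    the source, so capping duration means capping the number of frames.
    """
    # Largest number of frames that fits in the cap (at least one frame).
    max_frames = max(1, int(max_duration_s * fps))
    # Smallest integer stride that brings the frame count under the cap.
    stride = max(1, math.ceil(n_frames / max_frames))
    return list(range(0, n_frames, stride))


# A 30 s event at 30 fps, capped at 15 s of output.
kept = framedrop_indices(n_frames=900, fps=30, max_duration_s=15)
print(len(kept), "frames kept, first few:", kept[:4])
```

A timelapse variant would keep the same indices but could also adjust the output frame rate; that choice is left out of this sketch.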

Quick start

from cv_evidence_renderer import render_from_jsonl

render_from_jsonl(
    video="incidents/raw_001.mp4",
    detections_jsonl="incidents/raw_001.detections.jsonl",
    event_start=12.5,            # seconds
    event_end=22.0,
    output="evidence/event_001.mp4",
    encoder="libx264",           # NVENC ships in v0.2
)

That's the whole API for the most common case. Pair with from_yolo_results or from_supervision if you're routing through Ultralytics or supervision.
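
If you don't have event_start and event_end yet, one way to derive them is to group detection timestamps into padded windows. The sketch below is an assumption about how you might do that upstream of the renderer; event_windows, max_gap, and pad are hypothetical names and not part of the cv_evidence_renderer API.

```python
from typing import Iterable, List, Tuple


def event_windows(
    timestamps: Iterable[float],
    max_gap: float = 1.0,
    pad: float = 2.0,
) -> List[Tuple[float, float]]:
    """Group detection timestamps (seconds) into padded (start, end) windows.

    Hypothetical helper: detections closer together than max_gap are treated
    as one event, then each event is padded by pad seconds on both sides.
    """
    windows: List[Tuple[float, float]] = []
    for t in sorted(timestamps):
        if windows and t - windows[-1][1] <= max_gap:
            # Close enough to the previous detection: extend the current event.
            windows[-1] = (windows[-1][0], t)
        else:
            # Gap too large: start a new event.
            windows.append((t, t))
    # Pad each window, clamping the start at zero.
    return [(max(0.0, start - pad), end + pad) for start, end in windows]


for start, end in event_windows([14.5, 15.0, 15.5, 20.0, 40.0]):
    print(f"event_start={start}, event_end={end}")
```

Each (start, end) pair can then be passed as event_start and event_end in the call above. Note that padded windows may overlap when max_gap is smaller than twice pad; merge them first if you want strictly disjoint clips.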

Benchmark — Apple M4 (CPU libx264 baseline)

5 s source, 30 fps, two detections burned in every frame. Reproduce with python scripts/benchmark.py.

Resolution	Render time	Throughput	× realtime	Output size
480p (854×480)	0.53 s	282 fps	9.4×	0.42 MB
720p (1280×720)	0.89 s	168 fps	5.6×	0.70 MB
1080p (1920×1080)	1.70 s	88 fps	2.95×	1.34 MB

Pure CPU libx264 already runs faster than realtime up to 1080p. NVENC on a discrete GPU will join this table side-by-side once it ships.
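
The table columns are related by simple arithmetic: throughput is frames rendered divided by render time, and the realtime factor is throughput divided by the source frame rate. A small sketch, assuming the 5 s, 30 fps source described above (the function name is hypothetical, and results agree with the table only to within rounding of the reported render times):

```python
from typing import Tuple


def benchmark_columns(
    duration_s: float, source_fps: float, render_time_s: float
) -> Tuple[float, float]:
    """Derive (throughput in fps, realtime factor) from a render time."""
    frames = duration_s * source_fps
    throughput_fps = frames / render_time_s
    # Equivalent to duration_s / render_time_s.
    realtime_factor = throughput_fps / source_fps
    return throughput_fps, realtime_factor


# Render times taken from the table above.
for label, render_time in [("480p", 0.53), ("720p", 0.89), ("1080p", 1.70)]:
    fps, factor = benchmark_columns(5.0, 30.0, render_time)
    print(f"{label}: {fps:.0f} fps, {factor:.2f}x realtime")
```

A realtime factor above 1 means a clip renders faster than it plays back.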

Three things you'd actually use this for

🚨 Production safety alerts

Attach evidence clips to ops alerts. Operators open the MP4, see exactly what tripped the rule. No JSON sidecar they won't read.

📊 Dataset curation

Re-render every detection event from a week of footage as labeled MP4 clips. Hand to annotators or stack into a training set.

🧪 Demo / paper supplement

Render the exact frames your detector flagged for inclusion in a publication, demo reel, or stakeholder review.

FAQ

Why not just use supervision?: We complement it. Pass sv.Detections straight in. We add what supervision doesn't have: event-window trim today, with a ring buffer and NVENC planned for v0.2.
Why not just use DeepStream Smart Record?: NVIDIA confirmed on their developer forum that the official pyds bindings don't expose Smart Record (a community fork exists with a custom-built wheel). Smart Record itself also has open multi-stream crash reports on DS 7.1. When NVIDIA upstreams a clean Python API we'll add a sink so you can use both together.
Why not just write it yourself with FFmpeg subprocess?: Because you already did, twice, and it broke under multi-stream. This is the library version of that script, done right.
What about NVENC?: Planned for v0.2. The encoder backend already has a stub interface; what's missing is the PyAV NVENC wiring and a CUDA CI runner. A tracker issue will be posted on GitHub once it lands.
License?: MIT. Use it commercially, fork it, embed it, ship it.