
Conversation

@JessieeeNotLi

@JessieeeNotLi JessieeeNotLi commented Nov 6, 2025

Summary by CodeRabbit

  • New Features

    • Added a GPU-accelerated video transcoding benchmark with configurable trials, scaling, watermark overlay, warmup/repeat, and JSON/CSV outputs
    • Added runnable launchers for local and containerized execution and automated test-video generation when no input is provided
  • Documentation

    • Added usage and setup instructions for running the benchmark scripts and containerized workflow
  • Chores

    • Added placeholder files to preserve benchmark directories in version control

@coderabbitai

coderabbitai bot commented Nov 6, 2025

Walkthrough

Adds a GPU NVENC benchmarking suite: new Python benchmark(s), shell launchers, Docker assets, directory placeholders, and docs. The tools probe FFmpeg/NVIDIA capabilities, run GPU-accelerated or CPU-fallback transcodes with optional watermark/scale, and emit per-trial JSON/CSV metrics.
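The capability probe described here comes down to parsing FFmpeg's `-encoders` and `-filters` listings rather than attempting a transcode. A minimal, runnable sketch mirroring the `has_encoder` helper quoted later in this review:

```python
import re
import shutil
import subprocess

def has_encoder(enc: str) -> bool:
    """Return True if the local ffmpeg build lists the given encoder.

    Mirrors the has_encoder helper from the PR's gpu_bench.py: it scans
    the output of `ffmpeg -hide_banner -encoders` for the encoder name.
    """
    ffmpeg = shutil.which("ffmpeg")
    if not ffmpeg:
        return False  # no ffmpeg on PATH: treat every encoder as missing
    out = subprocess.run(
        [ffmpeg, "-hide_banner", "-encoders"],
        capture_output=True, text=True,
    ).stdout
    return re.search(rf"\b{re.escape(enc)}\b", out) is not None

# NVENC encoders only appear when ffmpeg was built with NVENC support
print(has_encoder("h264_nvenc"))
```

The same pattern, with `-filters` in place of `-encoders`, covers the `scale_npp`/`scale_cuda`/`overlay_cuda` availability checks.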

Changes

Cohort / File(s) / Summary

  • Placeholders: watermarking_bench/.gitkeep, benchmarks/200.multimedia/225.video-watermarking-gpu/.gitkeep
    Empty files added to preserve directories in version control.
  • Top-level benchmark utilities: watermarking_bench/gpu_bench.py, watermarking_bench/run_nvenc_bench.sh, watermarking_bench/read.me, watermarking_bench/watermarking_readme.md
    New Python benchmark script and launcher: locate ffmpeg, probe encoders/filters, query GPU, build GPU/CPU filter graphs (scale_npp/scale_cuda/overlay_cuda and fallbacks), assemble and run ffmpeg per trial with optional watermark/scale/duration, parse progress, collect metrics, and provide usage docs.
  • Bench Docker & runner: benchmarks/200.multimedia/225.video-watermarking-gpu/Dockerfile, benchmarks/200.multimedia/225.video-watermarking-gpu/Dockerfile.cpu, benchmarks/200.multimedia/225.video-watermarking-gpu/run.sh, benchmarks/200.multimedia/225.video-watermarking-gpu/gpu_bench.py, benchmarks/200.multimedia/225.video-watermarking-gpu/read.me
    New containerized benchmark: Dockerfiles (GPU and CPU), a copy of gpu_bench.py for container use, and a run script that defines trial sets (h264_nvenc/hevc_nvenc and a libx264 fallback), creates a sample input if missing, and invokes the benchmark, producing JSON/CSV outputs.

Sequence Diagram(s)

sequenceDiagram
    autonumber
    participant User
    participant Launcher as run_nvenc_bench.sh / run.sh
    participant Bench as gpu_bench.py
    participant FFmpeg
    participant NV as nvidia-smi

    User->>Launcher: start benchmark (args: INPUT,WATERMARK,DURATION,REPEAT)
    Launcher->>NV: query GPU info
    Launcher->>FFmpeg: probe encoders & filters
    Launcher->>Bench: invoke with trials & inputs
    Bench->>Bench: decide decode path & build filter graph (GPU vs fallback)
    Bench->>FFmpeg: execute transcode command (decode → filters/overlay → encode)
    FFmpeg-->>Bench: emit progress logs
    Bench->>Bench: parse progress & collect metrics
    Bench-->>Launcher: write JSON (+ optional CSV)
    Launcher-->>User: print summary / CSV path

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

  • Focus review on:
    • watermarking_bench/gpu_bench.py and benchmarks/.../gpu_bench.py: filter-graph planning, hwupload/hwdownload bridging, ffmpeg command construction, progress parsing, error handling, and metric correctness.
    • run_nvenc_bench.sh and benchmarks/.../run.sh: environment validation, sample generation, CLI parsing, and robustness.
    • Dockerfile variants: file copy paths, permissions, and runtime entrypoint expectations.
    • Consistency between JSON and CSV outputs and duplicate script copies across locations.

Poem

🐰
I hopped through frames of light and stream,
NVENC hummed soft — a coder’s dream.
I stamped each pixel with a merry tune,
Benchmarks bloom beneath the moon.
Results in rows — a rabbit’s beam.

Pre-merge checks and finishing touches

❌ Failed checks (1 warning, 1 inconclusive)
Check name / Status / Explanation / Resolution

  • Docstring Coverage: ⚠️ Warning. Docstring coverage is 10.00%, below the required threshold of 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
  • Title check: ❓ Inconclusive. The title 'Jessie/video watermarking' is vague: it reflects the branch name rather than describing the main change. Rename the title to clearly describe the primary change, such as 'Add GPU-accelerated video watermarking benchmark with NVENC support'.

✅ Passed checks (1 passed)

  • Description Check: ✅ Passed. Check skipped because CodeRabbit's high-level summary is enabled.
✨ Finishing touches
  • 📝 Generate docstrings
  • 🧪 Generate unit tests (beta)
    • Create PR with unit tests
    • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.


Comment @coderabbitai help to get the list of available commands and usage tips.

@JessieeeNotLi
Author

added video watermarking benchmark


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🧹 Nitpick comments (3)
watermarking_bench/.gitkeep (1)

1-1: .gitkeep may be redundant if the directory contains other files.

Since the PR already adds multiple files to watermarking_bench/ (gpu_bench.py, documentation, etc.), the .gitkeep placeholder is unnecessary—the directory will be preserved by these files alone. You can safely remove this.

watermarking_bench/gpu_bench.py (2)

32-40: Consider splitting multiple statements per line.

Lines 37-39 use semicolons to place multiple statements on single lines, which slightly reduces readability.

Apply this diff for improved clarity:

-        m = re.search(r"fps=\s*([0-9]+(?:\.[0-9]+)?)", last);  fps = float(m.group(1)) if m else None
-        m = re.search(r"speed=\s*([0-9]+(?:\.[0-9]+)?)x", last); speed = float(m.group(1)) if m else None
-        m = re.search(r"frame=\s*([0-9]+)", last);              frames = int(m.group(1)) if m else None
+        m = re.search(r"fps=\s*([0-9]+(?:\.[0-9]+)?)", last)
+        fps = float(m.group(1)) if m else None
+        m = re.search(r"speed=\s*([0-9]+(?:\.[0-9]+)?)x", last)
+        speed = float(m.group(1)) if m else None
+        m = re.search(r"frame=\s*([0-9]+)", last)
+        frames = int(m.group(1)) if m else None

245-293: Consider splitting the cleanup statements for clarity.

Lines 275-276 compress the temporary file cleanup into a single line. While functional, separating these statements would improve readability.

Apply this diff:

             idx += 1
-            try: os.remove(outp)
-            except OSError: pass
+            try:
+                os.remove(outp)
+            except OSError:
+                pass
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b54b46c and aa3483f.

⛔ Files ignored due to path filters (4)
  • watermarking_bench/results.csv is excluded by !**/*.csv
  • watermarking_bench/results_long.csv is excluded by !**/*.csv
  • watermarking_bench/sample.mp4 is excluded by !**/*.mp4
  • watermarking_bench/watermark.png is excluded by !**/*.png
📒 Files selected for processing (5)
  • watermarking_bench/.gitkeep (1 hunks)
  • watermarking_bench/gpu_bench.py (1 hunks)
  • watermarking_bench/read.me (1 hunks)
  • watermarking_bench/run_nvenc_bench.sh (1 hunks)
  • watermarking_bench/watermarking_readme.md (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
watermarking_bench/gpu_bench.py (1)
benchmarks/200.multimedia/220.video-processing/python/function.py (1)
  • watermark (38-47)
🪛 Ruff (0.14.3)
watermarking_bench/gpu_bench.py

1-1: Shebang is present but file is not executable

(EXE001)


14-14: subprocess call: check for execution of untrusted input

(S603)


29-29: Do not catch blind exception: Exception

(BLE001)


37-37: Multiple statements on one line (semicolon)

(E702)


38-38: Multiple statements on one line (semicolon)

(E702)


39-39: Multiple statements on one line (semicolon)

(E702)


149-149: Avoid specifying long messages outside the exception class

(TRY003)


275-275: Multiple statements on one line (colon)

(E701)


276-276: Multiple statements on one line (colon)

(E701)

🔇 Additional comments (10)
watermarking_bench/watermarking_readme.md (1)

1-9: LGTM! Clear and helpful documentation.

The instructions are concise and cover the common use cases effectively.

watermarking_bench/read.me (1)

1-3: LGTM! Concise usage instructions.

The commands are clear and cover the essential use cases.

watermarking_bench/gpu_bench.py (5)

7-22: LGTM! Well-designed helper functions.

The utility functions are clean and appropriate for the task. The subprocess call is safe since command arguments are controlled by the script logic, not untrusted user input.


24-30: LGTM! Appropriate error handling for optional GPU info.

The broad exception catch is acceptable here since nvidia-smi availability is optional, and the function gracefully degrades by returning None values.


44-130: LGTM! Comprehensive filter graph construction.

The function effectively handles multiple scenarios with proper GPU/CPU bridging. The explicit use of hwdownload and hwupload_cuda for format conversion is good practice and aligns with FFmpeg best practices for CUDA filters.


134-198: LGTM! Well-structured transcoding function.

The function properly orchestrates the FFmpeg command construction, execution, and metric collection. The detailed error message on line 149 (flagged by static analysis) is actually helpful for debugging, providing both the failure output and the command arguments.


200-244: LGTM! Comprehensive benchmark orchestration.

The main function effectively handles argument parsing, warmup execution, and trial iteration. The default trials include av1_nvenc, which will fail on systems without AV1 support, but this is caught by the encoder availability check in transcode_once(), so the error handling is appropriate.

watermarking_bench/run_nvenc_bench.sh (3)

1-26: LGTM! Robust script setup and validation.

The script properly validates prerequisites and provides helpful diagnostic output. The use of set -euo pipefail ensures failures are caught early.


269-278: LGTM! Helpful sample generation for testing.

The automatic generation of a 4K test video with audio is convenient for users who want to run the benchmark without providing their own input. The encoding parameters are reasonable for a test sample.
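Such sample generation typically synthesizes video and audio via FFmpeg's lavfi inputs. As an illustrative sketch only (the exact flags run_nvenc_bench.sh uses are not reproduced here; `testsrc2` and `sine` are standard FFmpeg synthetic sources, but the parameters below are assumptions):

```python
def build_sample_cmd(path: str, seconds: int = 10) -> list:
    """Assemble an ffmpeg command that synthesizes a 4K test clip with audio.

    Illustrative guess at the approach, not the script's actual flags:
    testsrc2 supplies synthetic 4K video, sine supplies a test tone.
    """
    return [
        "ffmpeg", "-y",
        "-f", "lavfi", "-i", f"testsrc2=size=3840x2160:rate=30:duration={seconds}",
        "-f", "lavfi", "-i", f"sine=frequency=440:duration={seconds}",
        "-c:v", "libx264", "-preset", "veryfast", "-pix_fmt", "yuv420p",
        "-c:a", "aac", "-shortest",
        path,
    ]

# Inspect the command; execute it with subprocess.run(...) when ffmpeg is available.
print(build_sample_cmd("sample.mp4"))
```

Building the argument list separately from executing it keeps the generation step easy to log and test.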


280-300: LGTM! Clear benchmark execution and results preview.

The benchmark execution is well-structured with timestamped CSV output and a helpful preview of results. The trial selection excludes av1_nvenc, which improves compatibility with systems that lack AV1 encoder support.

@@ -0,0 +1,293 @@
#!/usr/bin/env python3

⚠️ Potential issue | 🟡 Minor

Set executable permission on the file.

The shebang indicates this script should be directly executable, but the file permissions aren't set. Ensure chmod +x is applied to match the usage shown in the documentation.

Apply this fix:

chmod +x watermarking_bench/gpu_bench.py
🧰 Tools
🪛 Ruff (0.14.3)

1-1: Shebang is present but file is not executable

(EXE001)

🤖 Prompt for AI Agents
In watermarking_bench/gpu_bench.py at line 1 the file contains a shebang but
lacks executable permission; make the script executable by setting the execute
bit (e.g., run chmod +x watermarking_bench/gpu_bench.py) so it can be run
directly as indicated in the docs and CI.

Comment on lines +28 to +264
# --- Make or update a working gpu_bench.py (GPU-first, safe bridges, skip missing encoders) ---
cat > gpu_bench.py <<'PY'
#!/usr/bin/env python3
import argparse, datetime, json, os, re, shutil, subprocess, sys, tempfile, csv
from typing import List, Dict, Any, Optional, Tuple

def which_ffmpeg() -> str:
    p = shutil.which("ffmpeg")
    if not p:
        sys.exit("ffmpeg not found on PATH. Use Docker image with NVENC or install FFmpeg with NVENC.")
    return p

def run(cmd: List[str]) -> subprocess.CompletedProcess:
    return subprocess.run(cmd, stdin=subprocess.DEVNULL, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True)

def has_encoder(ffmpeg: str, enc: str) -> bool:
    out = run([ffmpeg, "-hide_banner", "-encoders"]).stdout
    return re.search(rf"\b{re.escape(enc)}\b", out) is not None

def has_filter(ffmpeg: str, name: str) -> bool:
    out = run([ffmpeg, "-hide_banner", "-filters"]).stdout
    return (f" {name} " in out)

def gpu_info() -> Dict[str, Any]:
    try:
        out = run(["nvidia-smi", "--query-gpu=name,memory.total,driver_version", "--format=csv,noheader,nounits"]).stdout.strip()
        name, mem, drv = [x.strip() for x in out.splitlines()[0].split(",")]
        return {"name": name, "memory_total_mb": int(mem), "driver_version": drv}
    except Exception:
        return {"name": None, "memory_total_mb": None, "driver_version": None}

def parse_progress(log: str) -> Dict[str, Any]:
    lines = [ln for ln in log.splitlines() if ("fps=" in ln or "speed=" in ln or "frame=" in ln)]
    fps = speed = frames = None
    if lines:
        last = lines[-1]
        m = re.search(r"fps=\s*([0-9]+(?:\.[0-9]+)?)", last); fps = float(m.group(1)) if m else None
        m = re.search(r"speed=\s*([0-9]+(?:\.[0-9]+)?)x", last); speed = float(m.group(1)) if m else None
        m = re.search(r"frame=\s*([0-9]+)", last); frames = int(m.group(1)) if m else None
    return {"fps": fps, "speed_x": speed, "frames": frames}

def build_vf_or_complex(ffmpeg: str, scale: Optional[str], wm_path: Optional[str], overlay: str, want_gpu_decode: bool) -> Tuple[List[str], str]:
    used = []
    vf_args: List[str] = []
    complex_graph = ""

    have_scale_npp = has_filter(ffmpeg, "scale_npp")
    have_scale_cuda = has_filter(ffmpeg, "scale_cuda")
    have_overlay_cuda = has_filter(ffmpeg, "overlay_cuda")

    if not wm_path:
        if scale:
            if want_gpu_decode and have_scale_npp:
                vf_args = ["-vf", f"scale_npp={scale}"]; used.append("scale_npp")
            elif want_gpu_decode and have_scale_cuda:
                vf_args = ["-vf", f"scale_cuda={scale}"]; used.append("scale_cuda")
            else:
                vf_args = ["-vf", f"hwdownload,format=nv12,scale={scale},hwupload_cuda"]
                used.append("scale(cpu)+hwdownload+hwupload_cuda")
        return (vf_args, "+".join(used))

    # watermark path
    if want_gpu_decode and have_overlay_cuda:
        if scale and have_scale_npp:
            complex_graph = f"[0:v]scale_npp={scale}[v0];[v0][1:v]overlay_cuda={overlay}[vout]"
            used += ["scale_npp","overlay_cuda"]
        elif scale and have_scale_cuda:
            complex_graph = f"[0:v]scale_cuda={scale}[v0];[v0][1:v]overlay_cuda={overlay}[vout]"
            used += ["scale_cuda","overlay_cuda"]
        elif scale:
            complex_graph = f"[0:v]hwdownload,format=nv12,scale={scale},hwupload_cuda[v0];[v0][1:v]overlay_cuda={overlay}[vout]"
            used += ["scale(cpu)+hwdownload+hwupload_cuda","overlay_cuda"]
        else:
            complex_graph = f"[0:v][1:v]overlay_cuda={overlay}[vout]"
            used += ["overlay_cuda"]
        return (["-filter_complex", complex_graph, "-map", "[vout]"], "+".join(used))

    # CPU overlay fallback (explicit bridges)
    if scale and want_gpu_decode and (have_scale_npp or have_scale_cuda):
        scaler = "scale_npp" if have_scale_npp else "scale_cuda"
        complex_graph = (
            f"[0:v]{scaler}={scale}[v0gpu];"
            f"[v0gpu]hwdownload,format=nv12[v0cpu];"
            f"[v0cpu][1:v]overlay={overlay}[mix];"
            f"[mix]hwupload_cuda[vout]"
        )
        used += [scaler, "hwdownload+overlay(cpu)+hwupload_cuda"]
    elif scale:
        complex_graph = (
            f"[0:v]hwdownload,format=nv12,scale={scale}[v0cpu];"
            f"[v0cpu][1:v]overlay={overlay}[mix];"
            f"[mix]hwupload_cuda[vout]"
        )
        used += ["scale(cpu)+overlay(cpu)+hwupload_cuda"]
    else:
        complex_graph = (
            f"[0:v]hwdownload,format=nv12[v0cpu];"
            f"[v0cpu][1:v]overlay={overlay}[mix];"
            f"[mix]hwupload_cuda[vout]"
        )
        used += ["overlay(cpu)+hwupload_cuda"]

    return (["-filter_complex", complex_graph, "-map", "[vout]"], "+".join(used))

def transcode_once(ffmpeg: str, inp: str, outp: str, codec: str, bitrate: str, preset: str,
                   duration: Optional[float], scale: Optional[str], wm_path: Optional[str],
                   overlay_pos: str, decode_mode: str = "gpu") -> Dict[str, Any]:

    if not has_encoder(ffmpeg, codec):
        raise RuntimeError(f"encoder '{codec}' not available; check your ffmpeg build (NVENC/AV1).")

    want_gpu_decode = (decode_mode == "gpu")
    args = [ffmpeg, "-hide_banner", "-y", "-vsync", "0"]

    if want_gpu_decode:
        args += ["-hwaccel", "cuda", "-hwaccel_output_format", "cuda", "-extra_hw_frames", "16",
                 "-init_hw_device", "cuda=cuda", "-filter_hw_device", "cuda"]

    args += ["-i", inp]
    if wm_path:
        args += ["-loop", "1", "-i", wm_path]
    if duration:
        args += ["-t", str(duration)]

    filt_args, filter_used = build_vf_or_complex(ffmpeg, scale, wm_path, overlay_pos, want_gpu_decode)
    args += filt_args

    args += ["-c:v", codec, "-b:v", bitrate, "-preset", preset, "-rc", "vbr", "-movflags", "+faststart"]
    args += ["-c:a", "copy"]
    args += [outp]

    t0 = datetime.datetime.now()
    proc = run(args)
    t1 = datetime.datetime.now()
    if proc.returncode != 0:
        raise RuntimeError("ffmpeg failed:\n" + proc.stdout + f"\n\nARGS:\n{' '.join(args)}")

    parsed = parse_progress(proc.stdout)
    size = os.path.getsize(outp) if os.path.exists(outp) else 0
    return {
        "args": args,
        "filter_used": filter_used,
        "stdout_tail": "\n".join(proc.stdout.splitlines()[-15:]),
        "compute_time_us": (t1 - t0) / datetime.timedelta(microseconds=1),
        "fps": parsed["fps"],
        "speed_x": parsed["speed_x"],
        "frames": parsed["frames"],
        "output_size_bytes": size
    }

def main():
    ap = argparse.ArgumentParser(description="GPU NVENC benchmark.")
    ap.add_argument("--input", required=True)
    ap.add_argument("--duration", type=float, default=None)
    ap.add_argument("--repeat", type=int, default=1)
    ap.add_argument("--warmup", action="store_true")
    ap.add_argument("--csv", default=None)
    ap.add_argument("--watermark", default=None)
    ap.add_argument("--overlay", default="main_w/2-overlay_w/2:main_h/2-overlay_h/2")
    ap.add_argument("--decode", choices=["gpu","cpu"], default="gpu")
    ap.add_argument("--trials", nargs="+", default=[
        "codec=h264_nvenc,bitrate=5M,preset=p5",
        "codec=h264_nvenc,bitrate=12M,preset=p1,scale=1920:1080",
        "codec=hevc_nvenc,bitrate=6M,preset=p4",
        # "codec=av1_nvenc,bitrate=3M,preset=p5", # only if available
    ])
    args = ap.parse_args()

    ffmpeg = which_ffmpeg()
    gi = gpu_info()

    def parse_trial(s: str) -> Dict[str, str]:
        d: Dict[str, str] = {}
        for kv in s.split(","):
            k, v = kv.split("=", 1); d[k.strip()] = v.strip()
        return d

    trial_specs = [parse_trial(s) for s in args.trials]

    # Warmup with first available encoder
    if args.warmup:
        warm = next((t for t in trial_specs if has_encoder(ffmpeg, t.get("codec","h264_nvenc"))), None)
        if warm:
            with tempfile.NamedTemporaryFile(suffix=".mp4", delete=True) as tmp:
                _ = transcode_once(ffmpeg, args.input, tmp.name,
                                   warm.get("codec","h264_nvenc"),
                                   warm.get("bitrate","5M"),
                                   warm.get("preset","p5"),
                                   args.duration,
                                   warm.get("scale"),
                                   args.watermark,
                                   args.overlay,
                                   args.decode)

    results = []; idx = 0
    for spec in trial_specs:
        for _ in range(args.repeat):
            if not has_encoder(ffmpeg, spec.get("codec","h264_nvenc")):
                results.append({
                    "trial_index": idx, "codec": spec.get("codec"), "bitrate": spec.get("bitrate"),
                    "preset": spec.get("preset"), "scale_filter": "", "fps": None, "speed_x": None,
                    "frames": None, "compute_time_us": 0, "output_size_bytes": 0,
                    "stdout_tail": "SKIPPED: encoder not available", "argv": "", "status": "skipped"
                }); idx += 1; continue
            with tempfile.NamedTemporaryFile(suffix=".mp4", delete=False) as tmp:
                outp = tmp.name
            res = transcode_once(ffmpeg, args.input, outp,
                                 spec.get("codec","h264_nvenc"),
                                 spec.get("bitrate","5M"),
                                 spec.get("preset","p5"),
                                 args.duration,
                                 spec.get("scale"),
                                 args.watermark,
                                 args.overlay,
                                 args.decode)
            results.append({
                "trial_index": idx, "codec": spec.get("codec"), "bitrate": spec.get("bitrate"),
                "preset": spec.get("preset"), "scale_filter": res["filter_used"], "fps": res["fps"],
                "speed_x": res["speed_x"], "frames": res["frames"],
                "compute_time_us": res["compute_time_us"], "output_size_bytes": res["output_size_bytes"],
                "stdout_tail": res["stdout_tail"], "argv": " ".join(res["args"]), "status": "ok"
            })
            idx += 1
            try: os.remove(outp)
            except OSError: pass

    report = {"gpu": gi, "ffmpeg_path": ffmpeg, "trial_count": len(results), "results": results}
    print(json.dumps(report, indent=2))

    if args.csv and results:
        with open(args.csv, "w", newline="") as f:
            w = csv.DictWriter(f, fieldnames=list(results[0].keys()))
            w.writeheader(); w.writerows(results)

if __name__ == "__main__":
    main()
PY

🛠️ Refactor suggestion | 🟠 Major

Eliminate code duplication between embedded and standalone scripts.

The Python benchmarking code is duplicated here (237 lines embedded) and in the standalone gpu_bench.py file. This creates a maintenance burden and increases the risk of the two versions diverging, which has already begun: the embedded version has av1_nvenc commented out on line 192 and includes different skipping logic on lines 225-231.

Consider one of these approaches:

Option 1 (recommended): Ship gpu_bench.py as a standalone script and invoke it:

-cat > gpu_bench.py <<'PY'
-#!/usr/bin/env python3
-[... entire script ...]
-PY
-chmod +x gpu_bench.py
+# Ensure gpu_bench.py is in the same directory
+SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+GPU_BENCH="${SCRIPT_DIR}/gpu_bench.py"
+
+if [[ ! -f "$GPU_BENCH" ]]; then
+  echo "Error: gpu_bench.py not found in ${SCRIPT_DIR}"
+  exit 1
+fi
+chmod +x "$GPU_BENCH"

Then update the execution line:

-./gpu_bench.py \
+"$GPU_BENCH" \

Option 2: If embedding is required for distribution, generate it from the canonical source during build/release rather than maintaining two copies manually.

Committable suggestion skipped: line range outside the PR's diff.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (2)
benchmarks/200.multimedia/225.video-watermarking-gpu/run.sh (2)

14-19: Add validation and error handling for gpu_bench.py invocation.

Line 14 invokes /app/gpu_bench.py without verifying it exists or has execute permissions. Additionally, there is no error handling if the script fails (the set -euo pipefail will exit, but provides no context).

Consider adding:

  • Explicit file existence check before invocation
  • Better error messaging on failure
# Verify gpu_bench.py exists and is executable
if [[ ! -x /app/gpu_bench.py ]]; then
  echo "ERROR: /app/gpu_bench.py not found or not executable" >&2
  exit 1
fi

# Run with better error handling
if ! /app/gpu_bench.py \
  --input "$INPUT" \
  --duration "$DURATION" \
  --repeat "$REPEAT" \
  --decode "$DECODE" \
  --csv "$CSV" \
  --trials \
    "codec=h264_nvenc,bitrate=5M,preset=p5" \
    "codec=h264_nvenc,bitrate=12M,preset=p1,scale=1920:1080" \
    "codec=hevc_nvenc,bitrate=6M,preset=p4"; then
  echo "ERROR: gpu_bench.py execution failed" >&2
  exit 1
fi

5-9: Document expected input data and GPU availability handling.

The script assumes:

  • Input data exists at /data/sample.mp4 (line 5) – requires a volume mount that is not documented
  • GPU is available (DECODE=gpu, line 9) – no fallback to CPU if GPU is unavailable
  • Output directory /out is writable (line 8)

These assumptions should be either:

  1. Documented clearly in a README or comments
  2. Made more resilient (e.g., check GPU availability, synthesize input if missing, provide CPU fallback)

Consider adding a comment block documenting the expected environment:

# Expected environment:
# - /data/ volume with sample.mp4 (or provide INPUT env var with custom path)
# - /out/ volume for output results
# - GPU device available for DECODE=gpu (defaults to gpu; set DECODE=cpu for CPU-only)

Alternatively, enhance the script to handle missing input gracefully (synthesize test video) or GPU unavailability (fall back to CPU).

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between aa3483f and e0cfbdc.

📒 Files selected for processing (3)
  • benchmarks/200.multimedia/225.video-watermarking-gpu/.gitkeep (1 hunks)
  • benchmarks/200.multimedia/225.video-watermarking-gpu/Dockerfile (1 hunks)
  • benchmarks/200.multimedia/225.video-watermarking-gpu/run.sh (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • benchmarks/200.multimedia/225.video-watermarking-gpu/.gitkeep
🔇 Additional comments (3)
benchmarks/200.multimedia/225.video-watermarking-gpu/Dockerfile (2)

5-7: Verify Python dependencies are installed.

The Dockerfile installs Python3 and pip but doesn't install any runtime dependencies that gpu_bench.py might require. If gpu_bench.py imports packages like numpy, scipy, or others, they should be installed during the Docker build to avoid runtime failures.

Verify what packages gpu_bench.py depends on and confirm whether they need to be added to the Dockerfile pip install step.


10-10: Python shebang verified—no action needed.

The Python shebang is present in gpu_bench.py, confirming it can be executed directly with chmod +x and invoked from run.sh. The code is correct.

benchmarks/200.multimedia/225.video-watermarking-gpu/run.sh (1)

20-23: Verify trial configuration format matches gpu_bench.py CLI interface.

The trials are passed as comma-separated key=value pairs (lines 20-23). Without visibility into gpu_bench.py, it is not possible to confirm this format is correct. If the actual CLI expects a different format, the script will fail at runtime.

Confirm that gpu_bench.py accepts --trials with the format:

--trials "codec=h264_nvenc,bitrate=5M,preset=p5" "codec=h264_nvenc,bitrate=12M,preset=p1,scale=1920:1080" "codec=hevc_nvenc,bitrate=6M,preset=p4"

Specifically verify:

  • Does --trials accept multiple arguments or a single argument?
  • Is the comma-separated key=value syntax correct?
  • Are codec, bitrate, preset, and scale the correct parameter names?
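These questions can be answered from the script quoted earlier in this page: `--trials` is declared with `nargs="+"` (so it accepts multiple arguments), and each entry is split on commas into key=value pairs. A runnable sketch mirroring that `parse_trial` helper:

```python
def parse_trial(s: str) -> dict:
    """Split a trial spec like "codec=h264_nvenc,bitrate=5M" into a dict.

    Mirrors the parse_trial helper inside gpu_bench.py's main().
    """
    d = {}
    for kv in s.split(","):
        k, v = kv.split("=", 1)
        d[k.strip()] = v.strip()
    return d

spec = parse_trial("codec=h264_nvenc,bitrate=12M,preset=p1,scale=1920:1080")
print(spec)
# → {'codec': 'h264_nvenc', 'bitrate': '12M', 'preset': 'p1', 'scale': '1920:1080'}
```

So the format used in run.sh (codec, bitrate, preset, optional scale, one string per trial) matches the parser; unknown keys are accepted but silently ignored by the trial loop.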

Comment on lines +10 to +11
COPY gpu_bench.py /app/gpu_bench.py
COPY run.sh /app/run.sh

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

Verify COPY source paths match actual file locations.

The Dockerfile copies gpu_bench.py and run.sh from the build context root (lines 10-11), but based on the PR summary, gpu_bench.py is located in watermarking_bench/ directory. Confirm whether:

  • The build context is correctly structured and paths should be watermarking_bench/gpu_bench.py
  • These files are actually at the repository root
  • Or the docker build is invoked from a non-standard location

🏁 Script executed:

#!/bin/bash
# Description: Verify locations of gpu_bench.py and run.sh relative to Dockerfile

# Find gpu_bench.py
echo "=== Finding gpu_bench.py ==="
find . -name "gpu_bench.py" -type f 2>/dev/null | head -20

# Find run.sh  
echo "=== Finding run.sh in video-watermarking-gpu context ==="
find . -path "*/225.video-watermarking-gpu/run.sh" -type f 2>/dev/null

# Check build context assumptions
echo "=== Checking Dockerfile location ==="
find . -path "*225.video-watermarking-gpu/Dockerfile" -type f 2>/dev/null

Length of output: 284


Now let me verify the actual content of the files and directory structure to understand the full context:


🏁 Script executed:

#!/bin/bash
# Description: Verify file locations and check gpu_bench.py details

echo "=== Checking gpu_bench.py shebang and location ==="
head -3 ./watermarking_bench/gpu_bench.py

echo ""
echo "=== Checking if watermarking_bench is at repo root ==="
ls -la ./watermarking_bench/ | head -10

echo ""
echo "=== Directory structure around Dockerfile ==="
ls -la ./benchmarks/200.multimedia/225.video-watermarking-gpu/

echo ""
echo "=== Checking if watermarking_bench is inside benchmarks ==="
find ./benchmarks -name "gpu_bench.py" -type f 2>/dev/null

echo ""
echo "=== Checking for any requirements or build configuration ==="
find ./watermarking_bench -name "requirements*.txt" -o -name "setup.py" -o -name "*.md" 2>/dev/null | head -10

Length of output: 1338


Now let me check the build documentation and run.sh to determine the intended build pattern:


🏁 Script executed:

#!/bin/bash
# Description: Check build documentation and run.sh

echo "=== Contents of run.sh ==="
cat ./benchmarks/200.multimedia/225.video-watermarking-gpu/run.sh

echo ""
echo "=== Contents of watermarking_readme.md ==="
cat ./watermarking_bench/watermarking_readme.md

Length of output: 935


Fix COPY source paths to match actual file locations.

The Dockerfile's COPY commands (lines 10-11) assume both files are in the build context root, but they are located elsewhere:

  • gpu_bench.py is at watermarking_bench/gpu_bench.py (not in context root)
  • run.sh is at benchmarks/200.multimedia/225.video-watermarking-gpu/run.sh (not in context root)

Update to:

COPY watermarking_bench/gpu_bench.py /app/gpu_bench.py
COPY benchmarks/200.multimedia/225.video-watermarking-gpu/run.sh /app/run.sh

This assumes docker build is invoked from the repository root with the standard context (.).

🤖 Prompt for AI Agents
In benchmarks/200.multimedia/225.video-watermarking-gpu/Dockerfile around lines
10 to 11, the COPY source paths are incorrect (they assume files live in the
build context root); update the COPY commands to use the actual locations: copy
watermarking_bench/gpu_bench.py to /app/gpu_bench.py and copy
benchmarks/200.multimedia/225.video-watermarking-gpu/run.sh to /app/run.sh,
assuming docker build is run from the repository root (context .).


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e0cfbdc and f534a53.

📒 Files selected for processing (1)
  • benchmarks/200.multimedia/225.video-watermarking-gpu/gpu_bench.py (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
benchmarks/200.multimedia/225.video-watermarking-gpu/gpu_bench.py (1)
benchmarks/200.multimedia/220.video-processing/python/function.py (1)
  • watermark (38-47)
🪛 Ruff (0.14.4)
benchmarks/200.multimedia/225.video-watermarking-gpu/gpu_bench.py

1-1: Shebang is present but file is not executable

(EXE001)


14-14: subprocess call: check for execution of untrusted input

(S603)


29-29: Do not catch blind exception: Exception

(BLE001)


37-37: Multiple statements on one line (semicolon)

(E702)


38-38: Multiple statements on one line (semicolon)

(E702)


39-39: Multiple statements on one line (semicolon)

(E702)


149-149: Avoid specifying long messages outside the exception class

(TRY003)


275-275: Multiple statements on one line (colon)

(E701)


276-276: Multiple statements on one line (colon)

(E701)


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f534a53 and e53cfde.

⛔ Files ignored due to path filters (1)
  • benchmarks/200.multimedia/225.video-watermarking-gpu/out_cpu/results.csv is excluded by !**/*.csv
📒 Files selected for processing (4)
  • benchmarks/200.multimedia/225.video-watermarking-gpu/Dockerfile.cpu (1 hunks)
  • benchmarks/200.multimedia/225.video-watermarking-gpu/gpu_bench.py (1 hunks)
  • benchmarks/200.multimedia/225.video-watermarking-gpu/read.me (1 hunks)
  • benchmarks/200.multimedia/225.video-watermarking-gpu/run.sh (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • benchmarks/200.multimedia/225.video-watermarking-gpu/read.me
🧰 Additional context used
🧬 Code graph analysis (1)
benchmarks/200.multimedia/225.video-watermarking-gpu/gpu_bench.py (1)
benchmarks/200.multimedia/220.video-processing/python/function.py (1)
  • watermark (38-47)
🪛 Ruff (0.14.4)
benchmarks/200.multimedia/225.video-watermarking-gpu/gpu_bench.py

1-1: Shebang is present but file is not executable

(EXE001)


14-14: subprocess call: check for execution of untrusted input

(S603)


29-29: Do not catch blind exception: Exception

(BLE001)


37-37: Multiple statements on one line (semicolon)

(E702)


38-38: Multiple statements on one line (semicolon)

(E702)


39-39: Multiple statements on one line (semicolon)

(E702)


85-85: Multiple statements on one line (semicolon)

(E702)


87-87: Multiple statements on one line (semicolon)

(E702)


158-158: Avoid specifying long messages outside the exception class

(TRY003)


289-289: Multiple statements on one line (colon)

(E701)


290-290: Multiple statements on one line (colon)

(E701)

Comment on lines +189 to +191
# audio: copy if present
args += ["-c:a", "copy"]


⚠️ Potential issue | 🔴 Critical

Audio copy crashes when watermark is enabled

Whenever build_vf_or_complex needs watermarking it returns -filter_complex … -map [vout] (e.g., Line 73). That mapping drops all audio streams, yet we still force -c:a copy here. ffmpeg then aborts with “Stream specifier ':a' matches no streams,” so every watermark trial fails. Please either keep audio mapped (e.g., append -map 0:a? alongside [vout]) or skip -c:a copy when no audio output is present.

A minimal fix is to extend each filter-complex return to include audio (and keep -c:a copy working):

-        return (["-filter_complex", complex_graph, "-map", "[vout]"], "+".join(used))
+        return ([
+            "-filter_complex", complex_graph,
+            "-map", "[vout]",
+            "-map", "0:a?"
+        ], "+".join(used))

Make the corresponding change for the other watermark branches as well.

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In benchmarks/200.multimedia/225.video-watermarking-gpu/gpu_bench.py around
lines 189-191, we always append "-c:a copy" even when build_vf_or_complex
returns a filter_complex that maps only video (e.g., "-map [vout]"), which drops
audio and makes ffmpeg fail with "Stream specifier ':a' matches no streams." Fix
by either (1) modifying the watermarking branches in build_vf_or_complex to
include audio mapping (append "-map 0:a?" alongside the video map so audio is
preserved and "-c:a copy" remains valid) or (2) only append "-c:a copy" when an
audio output map exists (detect if any "-map 0:a" or "-map 0:a?" is present and
skip adding "-c:a" otherwise); apply the same change to all other watermark
branches that return a filter_complex mapping video only.
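
Option (2) above can be sketched as follows. This is a minimal illustration under stated assumptions, not the PR's code: the helper name `append_audio_codec` and the flat argument-list shape are hypothetical.

```python
def append_audio_codec(args):
    """Append '-c:a copy' only when an audio stream is actually mapped.

    A '-filter_complex ... -map [vout]' graph maps video only; forcing
    '-c:a copy' there makes ffmpeg abort with "Stream specifier ':a'
    matches no streams". Checking for an explicit '-map 0:a' or
    '-map 0:a?' entry avoids that.
    """
    has_audio_map = any(
        flag == "-map" and args[i + 1].startswith("0:a")
        for i, flag in enumerate(args[:-1])
    )
    if has_audio_map:
        args += ["-c:a", "copy"]
    return args
```

With this guard, watermark branches that map only [vout] simply skip the audio codec flag, while trials that keep an audio map still copy audio through unchanged.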

Comment on lines +13 to +14
DECODE="${DECODE:-gpu}" # 'gpu' or 'cpu'


⚠️ Potential issue | 🟠 Major

Default decode mode breaks CPU image startup

Line 13 forces DECODE to gpu, so transcode_once always adds -hwaccel cuda (Line 165 in gpu_bench.py). On the CPU image—which by definition has no CUDA devices—ffmpeg fails immediately (No NVDEC device found), making the CPU benchmark unusable unless the caller manually overrides DECODE. Please default to CPU (or auto-detect with nvidia-smi) so the CPU image works out of the box.

Apply this diff to auto-detect the decode mode:

- DECODE="${DECODE:-gpu}"
+if [ -z "${DECODE:-}" ]; then
+  if command -v nvidia-smi >/dev/null 2>&1 && nvidia-smi -L >/dev/null 2>&1; then
+    DECODE="gpu"
+  else
+    DECODE="cpu"
+  fi
+fi
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

-DECODE="${DECODE:-gpu}" # 'gpu' or 'cpu'
+if [ -z "${DECODE:-}" ]; then
+  if command -v nvidia-smi >/dev/null 2>&1 && nvidia-smi -L >/dev/null 2>&1; then
+    DECODE="gpu"
+  else
+    DECODE="cpu"
+  fi
+fi
🤖 Prompt for AI Agents
In benchmarks/200.multimedia/225.video-watermarking-gpu/run.sh around lines
13-14, the script currently forces DECODE="${DECODE:-gpu}" which breaks the CPU
image; change it to default to cpu or auto-detect GPUs: if DECODE is not set,
run a quick check for nvidia-smi (and that it reports at least one GPU) and set
DECODE=gpu only in that case, otherwise set DECODE=cpu; ensure the detection is
robust (check command exists and exit code/output) and leaves an explicit
caller-provided DECODE unchanged.
