FFmpeg Error Handling How to Find the Key Points in a Pile of Nonsense | pyVideoTrans-Open Source Video Translation Tool -pyvideotrans.com github.com/jianchang512/pyvideotrans

FFmpeg Error Handling: How to Find the Key Points in a Pile of Nonsense

When you call external tools such as ffmpeg through Python's subprocess module, a common headache is that when a command fails, the resulting subprocess.CalledProcessError carries the tool's entire standard error output (stderr). That output is often intimidatingly long and padded with version numbers, build information, and configuration parameters. The useful clue may be just one or two lines buried somewhere in the middle or at the end.

Problem: FFmpeg Errors, Logs Full of "Nonsense"

For example, suppose you want to use ffmpeg to convert a file that doesn't exist at all:

python

import subprocess
import logging

logger = logging.getLogger("FFmpegRunner")

cmd = ["ffmpeg", "-hide_banner", "-i", "no_such_file.mp4", "output.mp4"]

try:
    subprocess.run(cmd, check=True, capture_output=True, text=True, encoding="utf-8")
except subprocess.CalledProcessError as e:
    logger.error(f"FFmpeg encountered an error!\nCommand: {' '.join(cmd)}\nError output:\n{e.stderr}")

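When this fails, e.stderr can hold far more than the error itself. Without -hide_banner, FFmpeg first prints its version, build configuration, and library versions, and depending on the command more diagnostic lines may follow. The one line that matters, something like no_such_file.mp4: No such file or directory, sits at the very end. In a production environment or a complex workflow, locating the problem in a log like that is slow and frustrating. Here is the raw output of a different failing command, run without -hide_banner: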

bash

C:\Users\c1\Videos>ffmpeg -c:v h264_amf -i 480.mp4  -c:v 152.mp4
ffmpeg version N-112170-gb61733f61f-20230924 Copyright (c) 2000-2023 the FFmpeg developers
  built with gcc 13.2.0 (crosstool-NG 1.25.0.232_c175b21)
  configuration: --prefix=/ffbuild/prefix --pkg-config-flags=--static --pkg-config=pkg-config --cross-prefix=x86_64-w64-mingw32- --arch=x86_64 --target-os=mingw32 --enable-gpl --enable-version3 --disable-debug --disable-w32threads --enable-pthreads --enable-iconv --enable-libxml2 --enable-zlib --enable-libfreetype --enable-libfribidi --enable-gmp --enable-lzma --enable-fontconfig --enable-libharfbuzz --enable-libvorbis --enable-opencl --disable-libpulse --enable-libvmaf --disable-libxcb --disable-xlib --enable-amf --enable-libaom --enable-libaribb24 --enable-avisynth --enable-chromaprint --enable-libdav1d --enable-libdavs2 --disable-libfdk-aac --enable-ffnvcodec --enable-cuda-llvm --enable-frei0r --enable-libgme --enable-libkvazaar --enable-libass --enable-libbluray --enable-libjxl --enable-libmp3lame --enable-libopus --enable-librist --enable-libssh --enable-libtheora --enable-libvpx --enable-libwebp --enable-lv2 --enable-libvpl --enable-openal --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenh264 --enable-libopenjpeg --enable-libopenmpt --enable-librav1e --enable-librubberband --enable-schannel --enable-sdl2 --enable-libsoxr --enable-libsrt --enable-libsvtav1 --enable-libtwolame --enable-libuavs3d --disable-libdrm --enable-vaapi --enable-libvidstab --enable-vulkan --enable-libshaderc --enable-libplacebo --enable-libx264 --enable-libx265 --enable-libxavs2 --enable-libxvid --enable-libzimg --enable-libzvbi --extra-cflags=-DLIBTWOLAME_STATIC --extra-cxxflags= --extra-ldflags=-pthread --extra-ldexeflags= --extra-libs=-lgomp --extra-version=20230924
  libavutil      58. 25.100 / 58. 25.100
  libavcodec     60. 27.100 / 60. 27.100
  libavformat    60. 13.100 / 60. 13.100
  libavdevice    60.  2.101 / 60.  2.101
  libavfilter     9. 11.100 /  9. 11.100
  libswscale      7.  3.100 /  7.  3.100
  libswresample   4. 11.100 /  4. 11.100
  libpostproc    57.  2.100 / 57.  2.100
Trailing option(s) found in the command: may be ignored.
Unknown decoder 'h264_amf'
Error opening input file 480.mp4.
Error opening input files: Decoder not found

We need a way to extract the truly critical error information and prevent it from being drowned in "nonsense."

Solution: Smartly Extract Key Information

Directly printing the entire e.stderr is definitely not ideal; it's too messy. A better approach is to pick out the few sentences that best explain the problem from this pile of output.

ffmpeg's error messages usually follow a few patterns:

Key information is often in the last few lines, such as prompts for file not found or unsupported format.
The relevant lines contain obvious keywords, such as "Error," "Invalid," "No such file," "Permission denied," etc.
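To check these two observations against the sample output shown earlier, here is a minimal sketch. The keyword tuple and the helper name keyword_hits are illustrative assumptions, and the sample is abbreviated from that log. It lists each keyword-bearing line together with its distance from the end:

```python
# Abbreviated copy of the failing ffmpeg run shown earlier in this article.
SAMPLE_STDERR = """\
ffmpeg version N-112170-gb61733f61f-20230924 Copyright (c) 2000-2023 the FFmpeg developers
  built with gcc 13.2.0 (crosstool-NG 1.25.0.232_c175b21)
  libavutil      58. 25.100 / 58. 25.100
Trailing option(s) found in the command: may be ignored.
Unknown decoder 'h264_amf'
Error opening input file 480.mp4.
Error opening input files: Decoder not found
"""

# Illustrative keyword list (an assumption, not an official FFmpeg list).
KEYWORDS = ("error", "invalid", "no such", "denied")


def keyword_hits(stderr_text: str) -> list[tuple[int, str]]:
    """Return (distance_from_last_line, line) for every line containing a keyword."""
    lines = stderr_text.strip().splitlines()
    last = len(lines) - 1
    return [
        (last - i, line.strip())
        for i, line in enumerate(lines)
        if any(k in line.lower() for k in KEYWORDS)
    ]


for distance, line in keyword_hits(SAMPLE_STDERR):
    print(f"{distance} line(s) from the end: {line}")
```

Both matches fall within the last two lines. Note that the line naming the actual cause, Unknown decoder 'h264_amf', matches none of these keywords, which is one reason to keep a line of context around a match.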

Based on these characteristics, we can write a function to specifically dig out the useful parts from stderr:

python

def extract_concise_error(stderr_text: str, max_lines: int = 3, max_length: int = 250) -> str:
    """Extract concise error information from stderr, usually the last few lines with keywords."""
    if not stderr_text:
        return "Unknown error (stderr is empty)"

    # Split stderr into lines
    lines = stderr_text.strip().splitlines()
    if not lines:
        return "Unknown error (stderr has no content)"

    # Common error keywords
    error_keywords = ["error", "invalid", "fail", "could not", "no such",
                      "denied", "unsupported", "unable", "can't open", "conversion failed"]

    # Only look at the last few lines (default is 3 lines)
    start = max(0, len(lines) - max_lines)
    for i in range(len(lines) - 1, start - 1, -1):  # Find from back to front
        line = lines[i].strip()
        if not line:  # Skip empty lines
            continue

        # If this line contains a keyword, it's basically what we're looking for
        if any(keyword in line.lower() for keyword in error_keywords):
            # Add the previous line for context, which may be more useful
            if i > 0 and lines[i-1].strip():
                line = f"{lines[i-1].strip()}\n{line}"
            # Truncate based on the full returned text, not just the matched line
            return line[:max_length] + ("..." if len(line) > max_length else "")

    # If no keywords are found, take the last non-empty line as a compromise
    for line in reversed(lines):
        line = line.strip()  # Drop indentation so it doesn't count toward max_length
        if line:
            return line[:max_length] + ("..." if len(line) > max_length else "")
    
    return "Unknown error (no specific problem found)"

# Use it like this:
try:
    subprocess.run(cmd, check=True, capture_output=True, text=True, encoding="utf-8")
except subprocess.CalledProcessError as e:
    short_error = extract_concise_error(e.stderr)
    logger.error(f"FFmpeg failed (exit code: {e.returncode})! Command: {' '.join(cmd)}, Error: {short_error}")
    # If you need the full output, record it at DEBUG level
    # logger.debug(f"Full error output:\n{e.stderr}")

Some Thoughts and Pitfalls When Writing This Function

Keywords May Not Be Complete: The error_keywords list is based on experience and may miss some of FFmpeg's less common error messages. In practice, expect to add keywords as you encounter new failures.
Context Is Important: Sometimes the error line alone is not enough. A message like "File cannot be opened" only makes sense together with the previous line, which names the file. That is why the function tries to include the previous line as well.
What If No Keywords Are Found? If nothing matches, the function falls back to the last non-empty line. That beats dumping the entire log, but it is not always accurate.
Character Encoding Issues: FFmpeg's output sometimes contains bytes that are not valid UTF-8. Passing encoding="utf-8" to subprocess.run makes decoding independent of the system locale, but on its own it still raises UnicodeDecodeError on such bytes. Adding errors="replace" substitutes a placeholder character instead, so the program does not crash.
How Should Logs Be Recorded? My approach is to log the short message at the ERROR level so the problem is visible at a glance, and the full output at the DEBUG level for when details are needed. The log stays readable and no information is lost.
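The encoding and logging points can be sketched together as follows. This is only an illustration under stated assumptions: run_logged and DEMO_CHILD are hypothetical names, the last-line summary is a stand-in for extract_concise_error, and a small child Python process plays the role of a failing ffmpeg call so the example runs without ffmpeg installed.

```python
import logging
import subprocess
import sys

logging.basicConfig(level=logging.DEBUG)
logger = logging.getLogger("FFmpegRunner")


def run_logged(cmd: list[str]) -> bool:
    """Run cmd; on failure log a short ERROR line and the full stderr at DEBUG."""
    try:
        subprocess.run(
            cmd,
            check=True,
            capture_output=True,
            text=True,
            encoding="utf-8",
            errors="replace",  # undecodable bytes become U+FFFD instead of raising
        )
        return True
    except subprocess.CalledProcessError as e:
        # Stand-in for extract_concise_error: just the last non-empty line.
        tail = [ln.strip() for ln in (e.stderr or "").splitlines() if ln.strip()]
        short_error = tail[-1] if tail else "Unknown error (stderr is empty)"
        logger.error("Command failed (exit code %s): %s", e.returncode, short_error)
        logger.debug("Full error output:\n%s", e.stderr)
        return False


# Hypothetical stand-in for a failing ffmpeg call: a child Python process that
# writes an invalid UTF-8 byte (0xff) to stderr and exits with status 1.
DEMO_CHILD = (
    "import sys; "
    "sys.stderr.buffer.write(b'bad byte: \\xff\\nError: demo failure\\n'); "
    "sys.exit(1)"
)

print(run_logged([sys.executable, "-c", DEMO_CHILD]))
```

With the logger set to ERROR in production, only the one-line summary appears; switching it to DEBUG brings back the full output without any code change.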

With this method, we can quickly pull the key information out of the stderr attached to subprocess.CalledProcessError, which makes logs more readable and troubleshooting faster. The same idea applies beyond ffmpeg, to any command-line tool with long, messy output.
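One way to carry the idea over to other tools is to make the keyword list a parameter. The sketch below is a simplified variant, not the function above: summarize_stderr, DEFAULT_KEYWORDS, and the extra_keywords and tail parameters are all hypothetical names chosen for illustration, and the input is the tail of the ffmpeg log shown earlier.

```python
from typing import Iterable

# Baseline keywords; illustrative assumptions rather than an official list.
DEFAULT_KEYWORDS = ("error", "invalid", "fail", "no such", "denied")


def summarize_stderr(
    stderr_text: str,
    extra_keywords: Iterable[str] = (),
    tail: int = 5,
) -> list[str]:
    """Return keyword-bearing lines among the last `tail` non-empty lines.

    Falls back to the final line when nothing matches, and to an empty
    list when stderr has no content.
    """
    keywords = DEFAULT_KEYWORDS + tuple(k.lower() for k in extra_keywords)
    lines = [ln.strip() for ln in stderr_text.splitlines() if ln.strip()]
    recent = lines[-tail:]
    hits = [ln for ln in recent if any(k in ln.lower() for k in keywords)]
    return hits or recent[-1:]


# Tail of the failing ffmpeg run shown earlier in this article.
ffmpeg_tail = """\
Trailing option(s) found in the command: may be ignored.
Unknown decoder 'h264_amf'
Error opening input file 480.mp4.
Error opening input files: Decoder not found
"""

print(summarize_stderr(ffmpeg_tail))
print(summarize_stderr(ffmpeg_tail, extra_keywords=["unknown"]))
```

The second call also picks up the Unknown decoder line, which shows how a tool-specific keyword can be added at the call site instead of editing a shared list.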

The core is to understand the patterns of the tool's error output and then "streamline" it.

Although it's not guaranteed to be perfect every time, it can at least help you flip through fewer pages of logs and scratch your head less.