8000 Add nvidia cuda hardware accelerated decoding/encoding support by Rob-Otman · Pull Request #480 · C4illin/ConvertX · GitHub


Conversation

@Rob-Otman
@Rob-Otman Rob-Otman commented Dec 17, 2025

🚀 Enhanced Hardware Acceleration & Smart Conversion Filtering

This PR adds comprehensive hardware acceleration support and intelligent conversion filtering to ConvertX, significantly improving performance and user experience.

🎯 Key Features Added

🔧 Hardware Acceleration

  • NVENC Support: Automatic H.264/H.265 hardware encoding using NVIDIA NVENC when GPU available
  • CUDA Decoding: Smart CUDA hardware acceleration for video input decoding (skips images)
  • GPU Detection: Real-time NVIDIA GPU availability checking using nvidia-smi
  • Codec Detection: Accurate video codec detection using ffprobe instead of file extensions

🎛️ Environment Variables

  • FFMPEG_PREFER_HARDWARE=true: Enables hardware acceleration when available
  • Automatically falls back to software encoding/decoding if hardware unavailable
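A minimal sketch of how this gate could work (hypothetical helper names, not the PR's actual code), assuming the encoder is chosen only when both the flag is set and a GPU was detected:

```typescript
// Hypothetical sketch: choose the encoder from the FFMPEG_PREFER_HARDWARE
// env var and the detected GPU availability, falling back to software.
function pickEncoder(target: "h264" | "h265", gpuAvailable: boolean): string {
  const preferHardware = process.env.FFMPEG_PREFER_HARDWARE === "true";
  if (preferHardware && gpuAvailable) {
    return target === "h264" ? "h264_nvenc" : "hevc_nvenc";
  }
  // Software fallback: flag unset, or no NVIDIA GPU detected.
  return target === "h264" ? "libx264" : "libx265";
}
```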

🧠 Smart UI Filtering

  • Dynamic Options: Conversion targets filtered based on input file types
  • Multi-file Support: Intersection of compatible formats across multiple selected files
  • Real-time Updates: Options update immediately when files are added/removed
  • No Invalid Conversions: Prevents selection of incompatible output formats

📊 Comprehensive Logging

  • Hardware acceleration decisions and GPU status
  • Codec detection results and encoder selection
  • Fallback behavior when hardware unavailable
  • Performance optimization indicators

🛠️ Technical Implementation

FFmpeg Integration

  • Selective Hardware: Only applies CUDA/NVENC when both GPU and codec support available
  • Safe Fallbacks: Graceful degradation to software processing
  • Environment Respect: Honors existing FFMPEG_ARGS while adding intelligence
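The "honors existing FFMPEG_ARGS" behavior can be sketched roughly like this (hypothetical helper, not the PR's actual code): CUDA decode flags are prepended only when the user has not already set `-hwaccel` themselves.

```typescript
// Hypothetical sketch: add CUDA decode flags only when the user's FFMPEG_ARGS
// do not already contain a -hwaccel option, so existing configs keep working.
function buildInputArgs(userArgs: string[], useCuda: boolean): string[] {
  const userSetHwaccel = userArgs.includes("-hwaccel");
  if (useCuda && !userSetHwaccel) {
    return ["-hwaccel", "cuda", ...userArgs];
  }
  return [...userArgs]; // leave the user's choice untouched
}
```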

UI Enhancements

  • Client-side Filtering: JavaScript handles dynamic option filtering
  • Server-side Validation: Backend validates format compatibility
  • Progressive Enhancement: Works with or without JavaScript

Testing Coverage

  • 15 test cases covering all hardware acceleration scenarios
  • Mock implementations for GPU detection and codec probing
  • Edge case handling for missing hardware/drivers

🔍 Architecture Decisions

Hardware Detection Strategy

  • One-time Check: GPU availability cached to avoid repeated nvidia-smi calls
  • Fail-safe: Any detection failure results in software-only mode
  • Container-aware: Works correctly in Docker environments with GPU passthrough
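The one-time, fail-safe detection described above could look like this sketch (hypothetical names mirroring the PR's `checkNvidiaGpuAvailable()`/`resetNvidiaGpuCache()`, not its actual implementation):

```typescript
import { execFileSync } from "node:child_process";

// Hypothetical sketch: cache the nvidia-smi probe so it runs at most once,
// and treat any failure as "no GPU" (software-only mode).
let gpuAvailableCache: boolean | null = null;

function checkNvidiaGpuAvailable(): boolean {
  if (gpuAvailableCache !== null) return gpuAvailableCache; // cached result
  try {
    execFileSync("nvidia-smi", ["-L"], { stdio: "ignore" });
    gpuAvailableCache = true;
  } catch {
    // Binary missing, no driver, or no device: fail safe to software.
    gpuAvailableCache = false;
  }
  return gpuAvailableCache;
}

// Test hook, so unit tests can re-run detection with different mocks.
function resetNvidiaGpuCache(): void {
  gpuAvailableCache = null;
}
```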

Codec Compatibility

  • CUDA Supported: H.264, H.265, VP8, VP9, MPEG-2, MPEG-4, AV1
  • Image Exclusion: JPG, PNG, etc. automatically skip hardware acceleration
  • Format-aware: Container format (MP4, MKV) vs actual codec distinction
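A sketch of the codec-versus-container distinction (hypothetical helper; the codec names are the `codec_name` values ffprobe reports, matching the list above):

```typescript
// Hypothetical sketch: CUDA decode support is decided from the codec name
// ffprobe reports (codec_name), never from the container/file extension.
const CUDA_DECODE_CODECS = new Set([
  "h264", "hevc", "vp8", "vp9", "mpeg2video", "mpeg4", "av1",
]);

function isCudaDecodableCodec(ffprobeCodecName: string): boolean {
  return CUDA_DECODE_CODECS.has(ffprobeCodecName.toLowerCase());
}
```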

UI/UX Philosophy

  • Progressive Disclosure: Show all options initially, filter as files selected
  • No Breaking Changes: Existing functionality preserved
  • Smart Defaults: Hardware enabled when available, software otherwise

📈 Performance Benefits

  • Faster Encoding: Up to 10x faster H.264/H.265 encoding with NVENC
  • Efficient Decoding: CUDA acceleration for compatible video inputs
  • Reduced CPU Usage: GPU offloading for supported operations
  • Smart Optimization: Only uses hardware when beneficial

🧪 Testing Results

  • 15/15 tests passing
  • Hardware acceleration works when GPU available
  • Software fallback when GPU unavailable
  • Codec detection accurately identifies supported formats
  • UI filtering prevents incompatible conversions
  • Error handling for missing dependencies

🔄 Backward Compatibility

  • Existing configs continue to work unchanged
  • FFMPEG_ARGS respected and enhanced with intelligence
  • No breaking changes to API or user interface
  • Optional features - all hardware acceleration is opt-in

🎨 User Experience

  • No more failed conversions due to incompatible format selection
  • Automatic performance optimization when hardware available
  • Clear visibility into acceleration decisions via logging
  • Seamless operation whether hardware is available or not

This enhancement transforms ConvertX from a basic file converter into an intelligent, hardware-accelerated media processing platform while maintaining simplicity and reliability.


Summary by cubic

Adds GPU-accelerated FFmpeg encoding and decoding. When enabled, we use NVENC for H.264/H.265 and CUDA for input decoding to speed up conversions and reduce CPU usage.

  • New Features
    • Opt-in via FFMPEG_PREFER_HARDWARE=true; falls back to software if no GPU.
    • Detects NVIDIA GPU with nvidia-smi and caches the result.
    • Uses ffprobe to detect codecs; adds -hwaccel cuda only for supported video codecs; skips images.
    • Respects FFMPEG_ARGS (does not override existing -hwaccel).
    • Logs GPU status and encoder/decoder choices; adds tests for both paths.

Written for commit f71d555.

Add FFMPEG_PREFER_HARDWARE env var to enable hardware acceleration
Use h264_nvenc and hevc_nvenc encoders when hardware preferred
Fall back to software encoders (libx264/libx265) when disabled
Add comprehensive tests for hardware/software encoding modes
Update README with new environment variable documentation
This enables GPU-accelerated video encoding for better performance on systems with NVIDIA GPUs.
- Detect video codec using ffprobe instead of file extensions
- Auto-enable CUDA hwaccel for supported codecs when FFMPEG_PREFER_HARDWARE=true
- Skip probing for image formats to optimize performance
- Add comprehensive tests for hardware acceleration logic
…logging

- Add `checkNvidiaGpuAvailable()` function using nvidia-smi to detect GPU presence
- Add comprehensive logging for hardware acceleration decisions:
  - GPU detection status
  - Hardware vs software encoding/decoding choices
  - CUDA codec support detection
  - Encoder selection decisions
- Cache GPU availability results to avoid repeated checks
- Add `resetNvidiaGpuCache()` for testing
- Update tests to handle GPU availability mocking

This provides visibility into hardware acceleration decisions and ensures CUDA is only attempted when NVIDIA GPU hardware is actually available.
@cubic-dev-ai cubic-dev-ai bot left a comment


4 issues found across 3 files



<file name="src/converters/ffmpeg.ts">

<violation number="1" location="src/converters/ffmpeg.ts:870">
P1: Hardware encoder selection doesn't check GPU availability. If `FFMPEG_PREFER_HARDWARE=true` but no GPU is detected, this will use `h264_nvenc` anyway, causing ffmpeg to fail. Should check `preferHardware && gpuAvailable`.</violation>

<violation number="2" location="src/converters/ffmpeg.ts:896">
P1: CUDA hardware acceleration is added without checking GPU availability. This will cause ffmpeg to fail with `-hwaccel cuda` when no NVIDIA GPU is present, even though the codec detection passed. Should check `gpuAvailable` as well.</violation>
</file>

<file name="tests/converters/ffmpeg.test.ts">

<violation number="1" location="tests/converters/ffmpeg.test.ts:256">
P1: Test assertion checks wrong call index. The comment states `calls[0]` is `nvidia-smi` and `calls[1]` is `ffmpeg`, but this assertion checks `calls[0]` which would never contain `-hwaccel cuda`. Should check `calls[1]` (the ffmpeg call) to verify CUDA is not added.</violation>

<violation number="2" location="tests/converters/ffmpeg.test.ts:256">
P1: Test assertion checks wrong call index. When hardware is preferred, `calls[0]` is the `nvidia-smi` args (as documented in other tests), not the `ffmpeg` call. This assertion will always pass trivially since `nvidia-smi` args never contain `-hwaccel cuda`. Should check `calls[2]` for the ffmpeg arguments.</violation>
</file>


…ation

- Check both preferHardware AND gpuAvailable before using NVENC codecs (h264_nvenc, hevc_nvenc)
- Check gpuAvailable before adding CUDA hwaccel flag for input decoding
- Fix test assertion to check correct call index for CUDA hwaccel verification

This prevents hardware acceleration attempts when GPU is unavailable, addressing code review feedback.
- Adjusted formatting of callback responses in the ffmpeg test file for better readability.
- Ensured consistent indentation and line breaks for JSON strings in mock responses.

This enhances code clarity and maintainability in the test suite.
@manolol1
manolol1 commented Jan 9, 2026

I successfully got this PR working on my T1000 after some troubleshooting.
Now, it uses the GPU for almost all FFmpeg conversions, which greatly sped up the processing times and lowered CPU-usage.

I had to mount the nvidia-smi binary from the host into the container. Maybe this could be mentioned somewhere in the README (or included in the sample compose file).

This is my entire docker-compose.yml:

services:
  convertx:
    image: convertx:gpu # use the custom-built image
    group_add: # ADDED (required): Ensure the docker process has permission to access the graphics device
      - 226 # The render group GID on the host
    container_name: convertx
    restart: unless-stopped
    ports:
      - "3000:3000"
    runtime: nvidia
    environment:
      - TZ=Europe/Vienna
      - JWT_SECRET=[...] # will use randomUUID() if unset
      - HTTP_ALLOWED=true # uncomment this if accessing it over a non-https connection
      - ACCOUNT_REGISTRATION=false
      - ALLOW_UNAUTHENTICATED=true
      - AUTO_DELETE_EVERY_N_HOURS=168
      - HIDE_HISTORY=false
      - LANGUAGE=en
      - UNAUTHENTICATED_USER_SHARING=false
      - MAX_CONVERT_PROCESS=10

      # required for nvidia support
      - FFMPEG_PREFER_HARDWARE=true
      - NVIDIA_VISIBLE_DEVICES=all
      - NVIDIA_DRIVER_CAPABILITIES=compute,video,utility
    volumes:
      - ./data:/app/data
      - /usr/bin/nvidia-smi:/usr/bin/nvidia-smi:ro # nvidia-smi binary from host

Thanks for this useful contribution, I hope this gets merged soon! :-)

* Returns false for image formats without probing (performance optimization).
* Falls back to false if probing fails (safe default).
*/
async function isCudaSupportedCodec(
Owner


Couldn't we have a list of all formats that potentially support it instead? It should only be the video container formats, right?

Author


I tried that at first, but file types like .mkv can contain virtually any codec, and .mp4 can contain H.264, H.265, MPEG-2, AV1, etc., so for many formats we cannot know without ffprobe whether the file has a CUDA-supported codec.

Owner


Yes, and then instead of skipping all image formats, we skip everything that isn't a container, and use ffprobe on mkv, mp4, etc.

Author
@Rob-Otman Rob-Otman Jan 11, 2026


I think my current approach is preferable because it only skips definitively non-video formats (images) and lets ffprobe handle codec detection for all video files. File extensions can be changed or wrong, but ffprobe reading the file headers gives very accurate assurance that the file is supported.

I'm open to change to your approach if you want, this is just my thinking.

Owner


But FFmpeg supports almost 500 file extensions, and of them only a handful are container formats. I don't mean that we should remove ffprobe, just change it from an image blacklist to a container whitelist :)

…ment variable details

- Added detailed instructions for enabling NVIDIA GPU hardware acceleration in the deployment section.
- Updated environment variable documentation to clarify usage and options for hardware acceleration.
- Improved overall formatting and structure for better readability.

These changes enhance the clarity of the README, making it easier for users to configure and deploy the application effectively.
README.md Outdated
# - NVIDIA_DRIVER_CAPABILITIES=all # Optional: Comma-separated list (e.g., 'compute,video,utility'), 'all' for everything
volumes:
- ./data:/app/data
# - /usr/bin/nvidia-smi:/usr/bin/nvidia-smi:ro # May be needed: Mount nvidia-smi for GPU detection (not required in Unraid)
Owner


Maybe it would be better to add this under a separate header? I want to keep the example compose as general as possible to make it look easy. What do you think?

…nstructions

- Added a new section for NVIDIA GPU hardware acceleration, including a sample docker-compose configuration.
- Clarified requirements and notes for using NVIDIA drivers and runtime.
- Improved formatting for better readability and user guidance.

These updates aim to assist users in optimizing performance with NVIDIA GPUs.
- Replaced the previous imageFormats set with a new containerFormats set to optimize ffprobe usage.
- Updated the isCudaSupportedCodec function to only probe container formats that can contain video streams, improving performance and clarity in codec detection.

These changes enhance the efficiency of the codec detection process for CUDA hardware acceleration.
@webysther

Just my two cents: an iGPU will by far help more people, and using /dev/dri is a no-brainer; just use the linuxserver base as a sample.

@webysther

related to #518

@Rob-Otman
Author

Just my two cents: an iGPU will by far help more people, and using /dev/dri is a no-brainer; just use the linuxserver base as a sample.

It seems like you have covered your ask with #523.

I don't think container changes make sense to include on this PR

@webysther
webysther commented Jan 27, 2026

Most of this checking for CUDA isn't necessary, even for iGPU in general:

From https://trac.ffmpeg.org/wiki/HWAccelIntro#Usewiththeffmpegcommand-linetool

Internal hwaccel decoders are enabled via the -hwaccel option (now supported in ffplay). The software decoder starts normally, but if it detects a stream which is decodable in hardware then it will attempt to delegate all significant processing to that hardware. If the stream is not decodable in hardware (for example, it is an unsupported codec or profile) then it will still be decoded in software automatically. If the hardware requires a particular device to function (or needs to distinguish between multiple devices, say if several graphics cards are available) then one can be selected using -hwaccel_device.

Maybe it is better to switch to jellyfin-ffmpeg and just copy the compiled, optimized version into the container.

About your opinion on adding container changes in this PR: I think this PR is not even necessary.

@webysther

Now, talking about software encoding/decoding: some people don't know this, but software encoding/decoding will always be better than hardware acceleration. If quality over time is the preference, it's better to stick with software and use:

services:
  convertx:
    image: ghcr.io/c4illin/convertx:v0.17.0
    deploy:
      resources:
        limits:
          cpus: '2.00'
          memory: 2G
    ...

@Rob-Otman
Author

I think you're glossing over the nuance of how hardware decoding/encoding is actually enabled and used in ffmpeg.

The -hwaccel option is used to enable and specify a hardware acceleration method for video decoding (e.g., -hwaccel cuda for NVIDIA GPUs, -hwaccel vaapi for Intel hardware, -hwaccel videotoolbox for Apple devices). When researching for this change I did notice an auto option in the documentation, but I personally had issues with it incorrectly selecting methods and have found comments from others saying they have experienced the same. Even if the auto option were to work reliably, it still does not automatically handle or enable GPU acceleration for encoding, for which you must explicitly select a hardware-accelerated encoder via options like -c:v (or -vcodec), such as:

  • -c:v h264_nvenc (NVIDIA encoding)

  • -c:v h264_vaapi (VAAPI encoding)

  • -c:v hevc_videotoolbox (Apple encoding)

I don't know about jellyfin-ffmpeg and whether or not it has any added functionality, but as far as I can tell it won't help in this regard.
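The decode/encode distinction described above can be made concrete with a small sketch (hypothetical helper in TypeScript; the encoder names are real ffmpeg encoders, the table and function are illustration only):

```typescript
// Hypothetical sketch: -hwaccel only selects the decode path; a hardware
// encoder must still be picked explicitly via -c:v for encoding.
const hwEncoders: Record<string, Record<string, string>> = {
  cuda: { h264: "h264_nvenc", hevc: "hevc_nvenc" },
  vaapi: { h264: "h264_vaapi", hevc: "hevc_vaapi" },
  videotoolbox: { h264: "h264_videotoolbox", hevc: "hevc_videotoolbox" },
};

function buildTranscodeArgs(method: string, codec: string): string[] {
  const encoder = hwEncoders[method]?.[codec];
  if (!encoder) throw new Error(`no hardware encoder for ${method}/${codec}`);
  // Both halves are needed: hardware decode AND an explicit hardware encoder.
  return ["-hwaccel", method, "-c:v", encoder];
}
```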

@Rob-Otman
Author

Now, talking about software encoding/decoding: some people don't know this, but software encoding/decoding will always be better than hardware acceleration. If quality over time is the preference, it's better to stick with software and use:

Now talking about environment variable configuration options, some people don't know, but these are entirely optional to use and will always have a sensible default value. If the quality over time is the preference, the end user can opt to leave the default value, which is software decoding/encoding.
😉

@webysther

Yeap, I see that here: https://github.com/C4illin/ConvertX/pull/480/changes#diff-b335630551682c19a781afebcf4d07bf978fb1f8ac04c6bf87428ed5106870f5R101

@webysther
webysther commented Jan 27, 2026

Let's go for a better solution.

Let's use the same logic from here: https://docs.linuxserver.io/images/docker-jellyfin/#nvidia

which works for millions of docker pulls every year.

Finally, and not less important: adding this to the Dockerfile isn't code, which is more sustainable. Adding code and not maintaining it leads to chaos; it's better to reuse what is maintained by giants.

PS(1): The format alone is by far not enough for HA; there are also the filters, the audio codec, the container, and the format within the container.
PS(2): One thing that would be much better after all is going from using copy to using muxing, which is what you did, but I would change it a bit to use the long parameter to help others.

@Rob-Otman
Author

Sounds like you've got it all figured out. Why don't you submit a PR with your changes and once they have been tested working I will happily close this PR.

@webysther
webysther commented Jan 27, 2026

I don't need this feature, except maybe the muxing part of it. But a broader set of people would be served if Intel QSV (iGPU) were adopted instead of this: no encoding limitations, vendor-neutral drivers, and better quality per watt.

Maybe split this PR into smaller ones; it targets multiple problems. Also, the title starts it wrong; it should be: Add nvidia only hardware encoding support for ffmpeg.

Plain software conversion that works is better than a broken HA that at best "works on my machine"; if the maintainer enjoys suffering, okay.

@Rob-Otman Rob-Otman changed the title Add hardware encoding support for ffmpeg Add nvidia cuda hardware accelerated decoding/encoding support Jan 27, 2026
@Rob-Otman
Author

Just my cents: igpu by far will help more people and using /dev/dri is no brainer, just use the linuxserver base as a sample.

It seems like you have covered your ask with 523
I don't think container changes make sense to include on this PR

Mostly of this check for cuda it's not necessary, even for igpu in general:
From https://trac.ffmpeg.org/wiki/HWAccelIntro#Usewiththeffmpegcommand-linetool

Internal hwaccel decoders are enabled via the -hwaccel option (now supported in ffplay). The software decoder starts normally, but if it detects a stream which is decodable in hardware then it will attempt to delegate all significant processing to that hardware. If the stream is not decodable in hardware (for example, it is an unsupported codec or profile) then it will still be decoded in software automatically. If the hardware requires a particular device to function (or needs to distinguish between multiple devices, say if several graphics cards are available) then one can be selected using -hwaccel_device.

Maybe is better to switch for jellyfin-ffmpeg and just copy the compiled optmized version inside the container.
About your option about add container changes in this PR, I think this PR is not even necessary.

I think you're glossing over the nuance of how hardware decoding/encoding is actually enabled and used in ffmpeg.
The -hwaccel option is used to enable and specify a hardware acceleration method for video decoding (e.g., -hwaccel cuda for NVIDIA GPUs, -hwaccel vaapi for Intel hardware, -hwaccel videotoolbox for Apple devices). When researching for this change I did notice an auto option in the documentation, but I personally had issues with it incorrectly selecting methods and have found comments from others saying they have experienced the same. Even if the auto option were to work reliably, it still does not automatically handle or enable GPU acceleration for encoding, for which you must explicitly select a hardware-accelerated encoder via options like -c:v (or -vcodec), such as:

  • -c:v h264_nvenc (NVIDIA encoding)
  • -c:v h264_vaapi (VAAPI encoding)
  • -c:v hevc_videotoolbox (Apple encoding)
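The explicit encoder selection described above could be sketched like this. The helper and table names are hypothetical; the encoder strings `h264_nvenc`, `h264_vaapi`, and `hevc_videotoolbox` come from the list above, and the rest (`hevc_nvenc`, `hevc_vaapi`, `h264_videotoolbox`, plus the `libx264`/`libx265` software fallbacks) are standard ffmpeg encoder names:

```typescript
// Sketch: pick a hardware encoder for the target codec, falling back to
// ffmpeg's software encoders when no matching hardware encoder is known.
const hwEncoders: Record<string, Record<string, string>> = {
  nvidia: { h264: "h264_nvenc", hevc: "hevc_nvenc" },
  intel: { h264: "h264_vaapi", hevc: "hevc_vaapi" },
  apple: { h264: "h264_videotoolbox", hevc: "hevc_videotoolbox" },
};
const swEncoders: Record<string, string> = { h264: "libx264", hevc: "libx265" };

function pickEncoder(codec: string, vendor: string): string[] {
  const enc = hwEncoders[vendor]?.[codec] ?? swEncoders[codec];
  return enc ? ["-c:v", enc] : [];
}
```

Unlike decoding, there is no automatic fallback here: if the hardware encoder is requested but unavailable, ffmpeg errors out, which is why the availability check must happen before building the command.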

I don't know about jellyfin-ffmpeg and whether or not it has any added functionality, but as far as I can tell it won't help in this regard.

Let's go for a better solution.
Let's use the same logic from here: https://docs.linuxserver.io/images/docker-jellyfin/#nvidia
Which works for millions of docker pulls every year:

Finally, and no less important: adding to the Dockerfile is not code, which is more sustainable. Adding code and not maintaining it will lead to chaos; it is better to reuse what is maintained by giants.
PS(1): For HA, handling the format alone is far from enough; the filters, the audio codec, the container, and the format inside the container all matter. PS(2): One thing that will be much better after all is using copy for muxing, which is what you did, but I would change it a bit to use the long parameter names to help others.
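The stream-copy remux with long option names that PS(2) suggests would look roughly like this. The helper is an illustrative sketch; `-codec:v`/`-codec:a` are ffmpeg's long forms of `-c:v`/`-c:a`:

```typescript
// Sketch: remux without re-encoding, spelled with long option names
// ("-codec" is ffmpeg's long form of "-c") so the command is
// self-documenting for readers who don't know the abbreviations.
function buildRemuxArgs(input: string, output: string): string[] {
  return [
    "-i", input,
    "-codec:v", "copy", // copy the video stream as-is (no re-encode)
    "-codec:a", "copy", // copy the audio stream as-is
    output,
  ];
}
```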

Sounds like you've got it all figured out. Why don't you submit a PR with your changes, and once they have been tested and shown working I will happily close this PR.

I don't need this feature, except maybe the muxing part of it. But a broader set of people would have their use case covered if Intel QSV (iGPU) were adopted instead of this: no encoding limitations, vendor-neutral drivers, and better quality per watt.

Maybe split this PR into smaller ones; it targets multiple problems. The title also starts out wrong; it needs to be: Add nvidia only hardware encoding support for ffmpeg.

Plain software conversion that works is better than broken HA that at best "works on my machine". If the maintainer enjoys suffering, okay.

I've changed the title of the PR, which is the extent of what I've found to be actionable from your comments.

I suggest you attempt to make the changes you've suggested, and if you end up with something more than a broken HA that at best "works on my machine", I will tip my hat to you.

@webysther
I'll trade you that hat: you finish the docs and testing, and I'll deliver a docker image with all GPUs working in almost 3x less code than yours. Do we have a deal?
