Add TP CLI argument to multimodal inference examples by faaany · Pull Request #29301 · vllm-project/vllm · GitHub

faaany (Contributor) commented Nov 24, 2025

Purpose

Currently, the example scripts hardcode a tensor_parallel_size value for each model. Users running these examples on different hardware configurations (e.g., varying GPU memory) often hit OOM errors and have to edit the code by hand to adjust tensor parallelism. This change adds a --tensor-parallel-size (-tp) command-line argument so the value can be overridden without modifying the scripts.

Test Plan

# Override tensor parallel size for vision language model
python examples/offline_inference/vision_language_multi_image.py -m aria --tensor-parallel-size 2

# Use shorthand notation
python examples/offline_inference/audio_language.py -m qwen2_audio -tp 2

# If not specified, uses the model's default configuration
python examples/offline_inference/vision_language.py -m llama4

Test Result

  • Tested that the argument correctly overrides default settings
  • Verified backward compatibility (scripts work without the argument)
  • Confirmed help message displays correctly

Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

Signed-off-by: Lin, Fanli <fanli.lin@intel.com>
faaany changed the title from "add tp as argument" to "Add TP CLI argument to multimodal inference examples" Nov 24, 2025
mergify bot commented Nov 24, 2025

Documentation preview: https://vllm--29301.org.readthedocs.build/en/29301/

mergify bot added the documentation (Improvements or additions to documentation) label Nov 24, 2025
gemini-code-assist bot (Contributor) left a comment

Code Review

This pull request adds a --tensor-parallel-size (-tp) command-line argument to the audio_language.py, vision_language.py, and vision_language_multi_image.py example scripts. This allows users to override the default tensor parallel size for the models. The changes are implemented correctly by updating the engine_args before initializing the LLM engine.
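
The override pattern described here can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `ExampleEngineArgs` is a hypothetical stand-in for the per-model engine arguments the example scripts build, and `apply_tp_override` is an assumed helper name. Only the `--tensor-parallel-size` / `-tp` flag names come from the PR itself.

```python
import argparse
from dataclasses import dataclass, replace
from typing import Optional


@dataclass
class ExampleEngineArgs:
    """Hypothetical stand-in for the engine arguments an example script builds."""

    model: str
    tensor_parallel_size: int = 1


def parse_args(argv=None) -> argparse.Namespace:
    parser = argparse.ArgumentParser(description="Multimodal inference example")
    parser.add_argument(
        "--tensor-parallel-size",
        "-tp",
        type=int,
        default=None,  # None means "keep the model's default configuration"
        help="Override the model's default tensor parallel size.",
    )
    return parser.parse_args(argv)


def apply_tp_override(
    engine_args: ExampleEngineArgs, tp_size: Optional[int]
) -> ExampleEngineArgs:
    """Return engine args with the CLI value applied, if one was given."""
    if tp_size is None:
        return engine_args
    return replace(engine_args, tensor_parallel_size=tp_size)
```

Defaulting the argument to `None` rather than a number is what keeps the scripts backward compatible: when the flag is omitted, the per-model value is left untouched.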

I've identified a potential issue regarding input validation. The newly added tensor-parallel-size argument is not checked for positivity. Providing a non-positive value could lead to a crash. I've added comments with suggestions to add validation for this argument in all three modified files to make the scripts more robust.
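
One way such validation could look is a custom argparse `type` callable that rejects non-positive values before the engine is created. This is a sketch under assumptions, not the suggestion actually posted on the PR: `positive_int` and `build_parser` are hypothetical names.

```python
import argparse


def positive_int(value: str) -> int:
    """Parse a CLI string as an integer and require it to be at least 1."""
    try:
        parsed = int(value)
    except ValueError:
        raise argparse.ArgumentTypeError(f"{value!r} is not an integer")
    if parsed < 1:
        raise argparse.ArgumentTypeError(
            f"tensor parallel size must be a positive integer, got {parsed}"
        )
    return parsed


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--tensor-parallel-size",
        "-tp",
        type=positive_int,  # argparse reports the error and exits on bad input
        default=None,
        help="Override the model's default tensor parallel size.",
    )
    return parser
```

With this in place, an invalid value such as `-tp 0` makes argparse print a usage error and exit, instead of failing later during engine initialization.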

Signed-off-by: Lin, Fanli <fanli.lin@intel.com>
faaany (Contributor, Author) commented Nov 24, 2025

cc @jikunshang @yma11


Labels

documentation Improvements or additions to documentation
