High-level API for multimodality · Issue #928 · abetlen/llama-cpp-python · GitHub

Closed

@remixer-dec

Description

Is your feature request related to a problem? Please describe.

The current high-level implementation of multimodality relies on a specific, pre-defined prompt format.
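
For context, this is roughly how multimodal inference is wired up today: the LLaVA-specific chat handler both loads the CLIP projector and fixes the prompt format, so other models' formats cannot be expressed through it. The file paths and URL below are placeholders.

```python
from llama_cpp import Llama
from llama_cpp.llama_chat_format import Llava15ChatHandler

# Placeholder paths: point these at your own model and projector files.
chat_handler = Llava15ChatHandler(clip_model_path="path/to/mmproj.bin")
llm = Llama(
    model_path="path/to/llava-model.gguf",
    chat_handler=chat_handler,
    n_ctx=2048,       # larger context to leave room for the image embedding
    logits_all=True,  # needed by the LLaVA handler
)

response = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are an assistant who describes images."},
        {
            "role": "user",
            "content": [
                {"type": "image_url", "image_url": {"url": "https://example.com/image.png"}},
                {"type": "text", "text": "What is in this image?"},
            ],
        },
    ]
)
```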

Describe the solution you'd like

Models like Obsidian already work with the llama.cpp server but use a different prompt format. It would be nice to have a high-level API for multimodality in llama-cpp-python that makes it possible to pass an image (or several images) as an argument after initializing Llama() with the paths to all required extra models, without relying on a pre-defined prompt format such as the one hard-coded in Llava15ChatHandler.
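
A minimal sketch of what such an API could look like. The clip_model_path and images arguments (and the file names) below are hypothetical and are not part of the current library; the point is that the prompt format stays entirely up to the caller.

```python
from llama_cpp import Llama

# Hypothetical API sketch, not the current llama-cpp-python interface:
# the vision/projector model is given once at construction time and
# images are passed per call.
llm = Llama(
    model_path="obsidian-3b-q4_k.gguf",         # placeholder path
    clip_model_path="obsidian-mmproj-f16.gguf",  # hypothetical argument
)

out = llm.create_completion(
    # Model-specific prompt, written by the caller rather than a chat handler.
    prompt="<|im_start|>user\n<image>\nWhat is in this picture?<|im_end|>\n<|im_start|>assistant\n",
    images=["photo.jpg"],  # hypothetical argument: image(s) to embed where the prompt expects them
    max_tokens=128,
)
print(out["choices"][0]["text"])
```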

Describe alternatives you've considered

Alternatively, a custom prompt-format class that supports images could be implemented, with the prompt string passed in as an argument.
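
A rough sketch of that alternative, assuming a handler could accept the template from the caller instead of hardcoding the LLaVA-1.5 format. The template argument and build_prompt helper are illustrative only and would still need to be hooked into the handler's actual image-evaluation logic.

```python
from llama_cpp.llama_chat_format import Llava15ChatHandler

# Illustrative sketch only: a handler whose prompt template is supplied
# by the caller rather than fixed by the class.
class TemplatedMultimodalHandler(Llava15ChatHandler):
    def __init__(self, clip_model_path: str, template: str, verbose: bool = False):
        super().__init__(clip_model_path=clip_model_path, verbose=verbose)
        # e.g. a ChatML-style template containing an <image> placeholder
        self.template = template

    def build_prompt(self, user_text: str) -> str:
        # Hypothetical helper: splice the user's text into the caller-supplied
        # template; the image embedding would be evaluated wherever the
        # template places its image placeholder.
        return self.template.format(prompt=user_text)
```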

Metadata

Assignees: no one assigned
Labels: enhancement (New feature or request)
Projects: none
Milestone: none
