Is there a way to get a response to image-only input?

Hi there,

I am currently working on [Bidi Streaming](https://google.github.io/adk-docs/streaming/custom-streaming-ws/) using WebSockets, and I am trying to add an image streaming feature.
When I send images together with audio (or text), I get a response. But I couldn't find a way to get a response to image data alone. Is there a way to do that?

In main.py, I handle image data like this:

```
elif mime_type.startswith("image"):
    # Send image data (video frames)
    decoded_data = base64.b64decode(data)
    live_request_queue.send_realtime(Blob(data=decoded_data, mime_type=mime_type))
    print(f"[CLIENT TO AGENT]: {mime_type}: {len(decoded_data)} bytes")
```
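For context, here is a minimal, self-contained sketch of the routing that this branch sits in. It assumes the websocket messages are JSON objects with a `mime_type` field and a `data` field (base64 for binary payloads), as the app.js snippet suggests. `RecordingQueue`, the local `Blob`, and `route_message` are hypothetical stand-ins so the sketch runs without ADK installed; they are not the library's API, and the `text/plain` and `audio/pcm` branches are assumptions about the rest of the handler.

```python
import base64
import json
from dataclasses import dataclass, field


@dataclass
class Blob:
    """Stand-in for the real Blob type, so the sketch runs without ADK."""
    data: bytes
    mime_type: str


@dataclass
class RecordingQueue:
    """Hypothetical stand-in for the live request queue; it only records calls."""
    realtime: list = field(default_factory=list)
    content: list = field(default_factory=list)

    def send_realtime(self, blob):
        self.realtime.append(blob)

    def send_content(self, text):
        # The real queue expects a content object; a plain string is used here.
        self.content.append(text)


def route_message(message_json, queue):
    """Route one websocket message by MIME type and return the payload size in bytes."""
    message = json.loads(message_json)
    mime_type = message["mime_type"]
    data = message["data"]

    if mime_type == "text/plain":
        # Text is sent as a conversational turn (assumed branch).
        queue.send_content(data)
        return len(data.encode("utf-8"))
    elif mime_type == "audio/pcm" or mime_type.startswith("image"):
        # Audio chunks and image frames are both forwarded as realtime blobs.
        decoded_data = base64.b64decode(data)
        queue.send_realtime(Blob(data=decoded_data, mime_type=mime_type))
        return len(decoded_data)
    raise ValueError(f"Mime type not supported: {mime_type}")


# Usage: an image-only message is forwarded as a realtime blob, with no text turn.
queue = RecordingQueue()
frame = json.dumps({
    "mime_type": "image/jpeg",
    "data": base64.b64encode(b"\xff\xd8\xff\xe0").decode("ascii"),
})
route_message(frame, queue)
```

In this sketch, an image frame never produces a text turn on the queue, which matches the behavior described in this issue: frames are delivered, but nothing in the handler asks the agent to respond to them.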
In app.js, I added:

```

const startCamButton = document.getElementById('startCamButton')
startCamButton.addEventListener('click', async () => {
  try {
    startCamButton.disabled = true
    startAudioButton.disabled = true
    startAudio()
    is_audio = true

    connectWebsocket()
    // Start video capture at fps FPS
    await startVideoCapture(videoFrameHandler, fps)
  } catch (error) {
    console.error('Failed to start camera:', error)
    startCamButton.disabled = false
    alert(
      'Failed to access camera. Please make sure you have granted camera permissions.'
    )
  }
})

// Video frame handler
function videoFrameHandler(frameData) {
  // Send the frame data as base64
  sendMessage({
    mime_type: 'image/jpeg',
    data: arrayBufferToBase64(frameData),
  })
  console.log(`[CLIENT TO AGENT] sent video frame: ${frameData.byteLength} bytes`)
}
```

With these additions, I don't get a response to image data alone. For example, when I say "Continuously tell me the number of fingers you see," I only get a response when I talk. I'd like to send webcam frames and get responses like "You are showing three fingers" without having to speak or type anything.
