Audio sample in Gemma-cookbook fails due to get_placeholder_mask() missing image_features argument #234

@Manas-Nanivadekar

Description of the bug:

When running the provided audio sample in the Gemma-cookbook Colab notebook (link to my run), the execution fails with the following error:

TypeError: Gemma3nModel.get_placeholder_mask() missing 1 required positional argument: 'image_features'

The error originates from the forward method in modeling_gemma3n.py at this section:

_, special_audio_mask = self.get_placeholder_mask(
    input_ids, inputs_embeds=inputs_embeds, audio_features=audio_features
)

It appears that get_placeholder_mask() now requires an additional image_features argument, but the current audio processing flow does not supply it, resulting in the failure.
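To illustrate the suspected mechanism without loading the model, here is a minimal sketch using toy classes. `ToyModel`, `PatchedToyModel`, and `call_audio_only` are hypothetical stand-ins, and the parameter lists are assumptions made for illustration, not the actual signature in modeling_gemma3n.py. The sketch only shows that a keyword-style call which omits a required parameter fails with the same kind of TypeError, and that giving the parameter a default would let an audio-only call through.

```python
class ToyModel:
    """Hypothetical stand-in; NOT the real Gemma3nModel."""

    def get_placeholder_mask(self, input_ids, inputs_embeds, image_features, audio_features=None):
        # Assumed signature: image_features has no default, so every caller must pass it.
        return None, None


class PatchedToyModel(ToyModel):
    """Hypothetical variant where image_features is optional."""

    def get_placeholder_mask(self, input_ids, inputs_embeds=None, image_features=None, audio_features=None):
        # With a default value, audio-only callers may omit image_features.
        return None, None


def call_audio_only(model):
    # Mirrors the keyword arguments used in the quoted forward() call.
    return model.get_placeholder_mask([1, 2, 3], inputs_embeds=[0.0], audio_features=[0.0])


try:
    call_audio_only(ToyModel())
    error_message = None
except TypeError as exc:
    error_message = str(exc)

patched_result = call_audio_only(PatchedToyModel())
print(error_message)
```

If the real cause matches this sketch, the fix would belong in the library (either the method signature or the call site in forward), not in the notebook.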

This issue is reproducible even when following the official audio example from the documentation:
https://ai.google.dev/gemma/docs/capabilities/audio

Actual vs expected behavior:

Actual behavior:

Running the sample code for audio input results in a TypeError due to a missing required argument in get_placeholder_mask().

Expected behavior:

The audio input example should run successfully and return the model’s output for the provided audio, without requiring any manual modification to the core library or notebook.

Any other information you'd like to share?

NA
