Eval bug: crash when pooling_type == LLAMA_POOLING_TYPE_MEAN

### Name and Version

Revision 9b169a4d4e01af7bc07a6981b53b27c18c9470d8


### Operating systems

Linux

### GGML backends

CPU

### Hardware

ARM Ampere

### Models

Qwen2.5-14B-Instruct-1M-Q5_K_M

### Problem description & steps to reproduce

Setting `pooling_type = LLAMA_POOLING_TYPE_MEAN` in the context parameters and then calling `llama_init_from_model()` crashes with this assertion failure:

```
/build/source/ggml/src/ggml.c:2738: GGML_ASSERT(ggml_can_mul_mat(a, b)) failed
```

Setting `pooling_type` to `LLAMA_POOLING_TYPE_LAST` instead, with no other changes, works correctly.
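
For context, here is a minimal, self-contained sketch of the kind of shape rule that `ggml_can_mul_mat(a, b)` appears to enforce: both operands share their first dimension, and the higher dimensions of `b` are multiples of those of `a`. This is a paraphrase written for illustration, not code copied from ggml. The `shape4` type and the `can_mul_mat_sketch` function are hypothetical names, and the real check may differ in detail at this revision.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a tensor: only the four dimension sizes,
 * innermost dimension first. */
typedef struct {
    int64_t ne[4];
} shape4;

/* Assumed paraphrase of the matmul compatibility rule:
 *  - the innermost (contracted) dimension of both operands must match
 *  - the two outer dimensions of b must be multiples of those of a,
 *    so that a can be broadcast across b
 * Every dimension size is expected to be at least 1. */
static bool can_mul_mat_sketch(const shape4 *a, const shape4 *b) {
    return a->ne[0] == b->ne[0] &&
           b->ne[2] % a->ne[2] == 0 &&
           b->ne[3] % a->ne[3] == 0;
}
```

Under this rule, shapes `{8, 4, 1, 1}` and `{8, 3, 1, 1}` are compatible, while `{8, 4, 1, 1}` and `{5, 3, 1, 1}` are not, because the first dimensions differ. If the real check works this way, the crash would point to the mean-pooling path building a multiplication whose operands disagree in one of these dimensions, but that has not been verified here.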

### First Bad Commit

_No response_

### Relevant log output

```shell
/build/source/ggml/src/ggml.c:2738: GGML_ASSERT(ggml_can_mul_mat(a, b)) failed
```
