[RVV] add qemu cpu option for rvv fp16#9642

Open
ken-unger wants to merge 1 commit into google:master from ken-unger:qemu-riscv64

Conversation

@ken-unger
Contributor

Add the cpu options 'zfh=true,x-zvfh=true' to enable fp16 scalar and vector support in the qemu rv64 cpu model. Note that this works correctly with the stock qemu-riscv64 for Ubuntu 24.04, which I am using, but I'm unclear which version of qemu-riscv64 is used in the XNNPACK CI. In more recent versions, 'x-zvfh=true' may need to be replaced with 'zvfh=true'.

> qemu-riscv64 --version
qemu-riscv64 version 8.2.2 (Debian 1:8.2.2+ds-0ubuntu1.13)
Copyright (c) 2003-2023 Fabrice Bellard and the QEMU Project developers
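The flag spelling could be selected from the reported QEMU version rather than hardcoded. A minimal sketch of that idea; the helper name and the major-version threshold of 9 are assumptions for illustration, not a confirmed QEMU release boundary:

```shell
# Pick the zvfh property spelling for -cpu from a QEMU version string.
# QEMU 8.2 accepts the experimental 'x-zvfh'; the threshold below for the
# renamed 'zvfh' is an assumption, not a verified release boundary.
zvfh_flag() {
  ver="$1"                  # e.g. "8.2.2" as printed by `qemu-riscv64 --version`
  major="${ver%%.*}"        # strip everything after the first dot
  if [ "$major" -ge 9 ]; then
    echo "zvfh=true"
  else
    echo "x-zvfh=true"
  fi
}

zvfh_flag "8.2.2"
```

The selected spelling can then be appended to the `-cpu rv64,...` option list shown below.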

Example usage with fp16.

 qemu-riscv64 -cpu rv64,zba=true,zbb=true,zbc=true,zbs=true,v=true,vlen=512,elen=64,vext_spec=v1.0,zfh=true,x-zvfh=true -L <path to toolchain>/sysroot ./build/linux/riscv64/test/f16-vabs-test
Running main() from <>/XNNPACK/build/linux/riscv64/googletest-source/googletest/src/gtest_main.cc
[==========] Running 20 tests from 4 test suites.
[----------] Global test environment set-up.
[----------] 5 tests from xnn_f16_vabs_ukernel__rvvfp16arith_u1v
[ RUN      ] xnn_f16_vabs_ukernel__rvvfp16arith_u1v.batch_eq
[       OK ] xnn_f16_vabs_ukernel__rvvfp16arith_u1v.batch_eq (4 ms)
[ RUN      ] xnn_f16_vabs_ukernel__rvvfp16arith_u1v.batch_div
[       OK ] xnn_f16_vabs_ukernel__rvvfp16arith_u1v.batch_div (0 ms)
[ RUN      ] xnn_f16_vabs_ukernel__rvvfp16arith_u1v.batch_lt
[       OK ] xnn_f16_vabs_ukernel__rvvfp16arith_u1v.batch_lt (0 ms)
[ RUN      ] xnn_f16_vabs_ukernel__rvvfp16arith_u1v.batch_gt
[       OK ] xnn_f16_vabs_ukernel__rvvfp16arith_u1v.batch_gt (0 ms)
[ RUN      ] xnn_f16_vabs_ukernel__rvvfp16arith_u1v.inplace
[       OK ] xnn_f16_vabs_ukernel__rvvfp16arith_u1v.inplace (5 ms)
[----------] 5 tests from xnn_f16_vabs_ukernel__rvvfp16arith_u1v (13 ms total)

[----------] 5 tests from xnn_f16_vabs_ukernel__rvvfp16arith_u2v
[ RUN      ] xnn_f16_vabs_ukernel__rvvfp16arith_u2v.batch_eq
[       OK ] xnn_f16_vabs_ukernel__rvvfp16arith_u2v.batch_eq (1 ms)
[ RUN      ] xnn_f16_vabs_ukernel__rvvfp16arith_u2v.batch_div
[       OK ] xnn_f16_vabs_ukernel__rvvfp16arith_u2v.batch_div (7 ms)
[ RUN      ] xnn_f16_vabs_ukernel__rvvfp16arith_u2v.batch_lt
[       OK ] xnn_f16_vabs_ukernel__rvvfp16arith_u2v.batch_lt (7 ms)
[ RUN      ] xnn_f16_vabs_ukernel__rvvfp16arith_u2v.batch_gt
[       OK ] xnn_f16_vabs_ukernel__rvvfp16arith_u2v.batch_gt (13 ms)
[ RUN      ] xnn_f16_vabs_ukernel__rvvfp16arith_u2v.inplace
[       OK ] xnn_f16_vabs_ukernel__rvvfp16arith_u2v.inplace (6 ms)
[----------] 5 tests from xnn_f16_vabs_ukernel__rvvfp16arith_u2v (41 ms total)

[----------] 5 tests from xnn_f16_vabs_ukernel__rvvfp16arith_u4v
[ RUN      ] xnn_f16_vabs_ukernel__rvvfp16arith_u4v.batch_eq
[       OK ] xnn_f16_vabs_ukernel__rvvfp16arith_u4v.batch_eq (1 ms)
[ RUN      ] xnn_f16_vabs_ukernel__rvvfp16arith_u4v.batch_div
[       OK ] xnn_f16_vabs_ukernel__rvvfp16arith_u4v.batch_div (13 ms)
[ RUN      ] xnn_f16_vabs_ukernel__rvvfp16arith_u4v.batch_lt
[       OK ] xnn_f16_vabs_ukernel__rvvfp16arith_u4v.batch_lt (19 ms)
[ RUN      ] xnn_f16_vabs_ukernel__rvvfp16arith_u4v.batch_gt
[       OK ] xnn_f16_vabs_ukernel__rvvfp16arith_u4v.batch_gt (55 ms)
[ RUN      ] xnn_f16_vabs_ukernel__rvvfp16arith_u4v.inplace
[       OK ] xnn_f16_vabs_ukernel__rvvfp16arith_u4v.inplace (7 ms)
[----------] 5 tests from xnn_f16_vabs_ukernel__rvvfp16arith_u4v (104 ms total)

[----------] 5 tests from xnn_f16_vabs_ukernel__rvvfp16arith_u8v
[ RUN      ] xnn_f16_vabs_ukernel__rvvfp16arith_u8v.batch_eq
[       OK ] xnn_f16_vabs_ukernel__rvvfp16arith_u8v.batch_eq (1 ms)
[ RUN      ] xnn_f16_vabs_ukernel__rvvfp16arith_u8v.batch_div
[       OK ] xnn_f16_vabs_ukernel__rvvfp16arith_u8v.batch_div (25 ms)
[ RUN      ] xnn_f16_vabs_ukernel__rvvfp16arith_u8v.batch_lt
[       OK ] xnn_f16_vabs_ukernel__rvvfp16arith_u8v.batch_lt (77 ms)
[ RUN      ] xnn_f16_vabs_ukernel__rvvfp16arith_u8v.batch_gt
[       OK ] xnn_f16_vabs_ukernel__rvvfp16arith_u8v.batch_gt (231 ms)
[ RUN      ] xnn_f16_vabs_ukernel__rvvfp16arith_u8v.inplace
[       OK ] xnn_f16_vabs_ukernel__rvvfp16arith_u8v.inplace (13 ms)
[----------] 5 tests from xnn_f16_vabs_ukernel__rvvfp16arith_u8v (353 ms total)

[----------] Global test environment tear-down
[==========] 20 tests from 4 test suites ran. (518 ms total)
[  PASSED  ] 20 tests.

@ken-unger
Contributor Author

Note that there are a few segfaults in test cases when building/running with XNN_ENABLE_RISCV_FP16_VECTOR, since many of the f16 configs don't return NULL when there is no implementation (and a NULL return is not always checked by callers). However, I hope to complete all of the remaining rvv fp16 kernels this month.

copybara-service bot pushed a commit that referenced this pull request Mar 12, 2026
--
7d64160 by Ken Unger <ken.j.unger@gmail.com>:

add qemu cpu option for rvv fp16

FUTURE_COPYBARA_INTEGRATE_REVIEW=#9642 from ken-unger:qemu-riscv64 7d64160
PiperOrigin-RevId: 880698451