[RVV] add qemu cpu option for rvv fp16#9642

Open
ken-unger wants to merge 1 commit into google:master from ken-unger:qemu-riscv64

Conversation

@ken-unger
Contributor

Add the cpu options 'zfh=true,x-zvfh=true' to enable fp16 scalar and vector support in the qemu rv64 cpu model. Note that this works correctly with the stock qemu-riscv64 for Ubuntu 24.04, which I am using, but I'm unclear which version of qemu-riscv64 is used in the XNNPACK CI. In more recent versions, 'x-zvfh=true' may need to be replaced with 'zvfh=true'.

> qemu-riscv64 --version
qemu-riscv64 version 8.2.2 (Debian 1:8.2.2+ds-0ubuntu1.13)
Copyright (c) 2003-2023 Fabrice Bellard and the QEMU Project developers
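The flag spelling could be selected from the reported QEMU version rather than hardcoded. A minimal sketch of that idea; the helper name and the major-version threshold of 9 are assumptions for illustration, not a confirmed QEMU release boundary:

```shell
# Pick the zvfh property spelling for -cpu from a QEMU version string.
# QEMU 8.2 accepts the experimental 'x-zvfh'; the threshold below for the
# renamed 'zvfh' is an assumption, not a verified release boundary.
zvfh_flag() {
  ver="$1"                  # e.g. "8.2.2" as printed by `qemu-riscv64 --version`
  major="${ver%%.*}"        # strip everything after the first dot
  if [ "$major" -ge 9 ]; then
    echo "zvfh=true"
  else
    echo "x-zvfh=true"
  fi
}

zvfh_flag "8.2.2"
```

The selected spelling can then be appended to the `-cpu rv64,...` option list shown below.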

Example usage with fp16.

 qemu-riscv64 -cpu rv64,zba=true,zbb=true,zbc=true,zbs=true,v=true,vlen=512,elen=64,vext_spec=v1.0,zfh=true,x-zvfh=true -L <path to toolchain>/sysroot ./build/linux/riscv64/test/f16-vabs-test
Running main() from <>/XNNPACK/build/linux/riscv64/googletest-source/googletest/src/gtest_main.cc
[==========] Running 20 tests from 4 test suites.
[----------] Global test environment set-up.
[----------] 5 tests from xnn_f16_vabs_ukernel__rvvfp16arith_u1v
[ RUN      ] xnn_f16_vabs_ukernel__rvvfp16arith_u1v.batch_eq
[       OK ] xnn_f16_vabs_ukernel__rvvfp16arith_u1v.batch_eq (4 ms)
[ RUN      ] xnn_f16_vabs_ukernel__rvvfp16arith_u1v.batch_div
[       OK ] xnn_f16_vabs_ukernel__rvvfp16arith_u1v.batch_div (0 ms)
[ RUN      ] xnn_f16_vabs_ukernel__rvvfp16arith_u1v.batch_lt
[       OK ] xnn_f16_vabs_ukernel__rvvfp16arith_u1v.batch_lt (0 ms)
[ RUN      ] xnn_f16_vabs_ukernel__rvvfp16arith_u1v.batch_gt
[       OK ] xnn_f16_vabs_ukernel__rvvfp16arith_u1v.batch_gt (0 ms)
[ RUN      ] xnn_f16_vabs_ukernel__rvvfp16arith_u1v.inplace
[       OK ] xnn_f16_vabs_ukernel__rvvfp16arith_u1v.inplace (5 ms)
[----------] 5 tests from xnn_f16_vabs_ukernel__rvvfp16arith_u1v (13 ms total)

[----------] 5 tests from xnn_f16_vabs_ukernel__rvvfp16arith_u2v
[ RUN      ] xnn_f16_vabs_ukernel__rvvfp16arith_u2v.batch_eq
[       OK ] xnn_f16_vabs_ukernel__rvvfp16arith_u2v.batch_eq (1 ms)
[ RUN      ] xnn_f16_vabs_ukernel__rvvfp16arith_u2v.batch_div
[       OK ] xnn_f16_vabs_ukernel__rvvfp16arith_u2v.batch_div (7 ms)
[ RUN      ] xnn_f16_vabs_ukernel__rvvfp16arith_u2v.batch_lt
[       OK ] xnn_f16_vabs_ukernel__rvvfp16arith_u2v.batch_lt (7 ms)
[ RUN      ] xnn_f16_vabs_ukernel__rvvfp16arith_u2v.batch_gt
[       OK ] xnn_f16_vabs_ukernel__rvvfp16arith_u2v.batch_gt (13 ms)
[ RUN      ] xnn_f16_vabs_ukernel__rvvfp16arith_u2v.inplace
[       OK ] xnn_f16_vabs_ukernel__rvvfp16arith_u2v.inplace (6 ms)
[----------] 5 tests from xnn_f16_vabs_ukernel__rvvfp16arith_u2v (41 ms total)

[----------] 5 tests from xnn_f16_vabs_ukernel__rvvfp16arith_u4v
[ RUN      ] xnn_f16_vabs_ukernel__rvvfp16arith_u4v.batch_eq
[       OK ] xnn_f16_vabs_ukernel__rvvfp16arith_u4v.batch_eq (1 ms)
[ RUN      ] xnn_f16_vabs_ukernel__rvvfp16arith_u4v.batch_div
[       OK ] xnn_f16_vabs_ukernel__rvvfp16arith_u4v.batch_div (13 ms)
[ RUN      ] xnn_f16_vabs_ukernel__rvvfp16arith_u4v.batch_lt
[       OK ] xnn_f16_vabs_ukernel__rvvfp16arith_u4v.batch_lt (19 ms)
[ RUN      ] xnn_f16_vabs_ukernel__rvvfp16arith_u4v.batch_gt
[       OK ] xnn_f16_vabs_ukernel__rvvfp16arith_u4v.batch_gt (55 ms)
[ RUN      ] xnn_f16_vabs_ukernel__rvvfp16arith_u4v.inplace
[       OK ] xnn_f16_vabs_ukernel__rvvfp16arith_u4v.inplace (7 ms)
[----------] 5 tests from xnn_f16_vabs_ukernel__rvvfp16arith_u4v (104 ms total)

[----------] 5 tests from xnn_f16_vabs_ukernel__rvvfp16arith_u8v
[ RUN      ] xnn_f16_vabs_ukernel__rvvfp16arith_u8v.batch_eq
[       OK ] xnn_f16_vabs_ukernel__rvvfp16arith_u8v.batch_eq (1 ms)
[ RUN      ] xnn_f16_vabs_ukernel__rvvfp16arith_u8v.batch_div
[       OK ] xnn_f16_vabs_ukernel__rvvfp16arith_u8v.batch_div (25 ms)
[ RUN      ] xnn_f16_vabs_ukernel__rvvfp16arith_u8v.batch_lt
[       OK ] xnn_f16_vabs_ukernel__rvvfp16arith_u8v.batch_lt (77 ms)
[ RUN      ] xnn_f16_vabs_ukernel__rvvfp16arith_u8v.batch_gt
[       OK ] xnn_f16_vabs_ukernel__rvvfp16arith_u8v.batch_gt (231 ms)
[ RUN      ] xnn_f16_vabs_ukernel__rvvfp16arith_u8v.inplace
[       OK ] xnn_f16_vabs_ukernel__rvvfp16arith_u8v.inplace (13 ms)
[----------] 5 tests from xnn_f16_vabs_ukernel__rvvfp16arith_u8v (353 ms total)

[----------] Global test environment tear-down
[==========] 20 tests from 4 test suites ran. (518 ms total)
[  PASSED  ] 20 tests.

@ken-unger
Contributor Author

Note that there are a few segfaults in test cases when building/running with XNN_ENABLE_RISCV_FP16_VECTOR, since many of the f16 configs don't return NULL when there is no implementation (and a NULL return is not always checked by callers). However, I hope to complete all of the remaining rvv fp16 kernels this month.

copybara-service bot pushed a commit that referenced this pull request Mar 12, 2026
--
7d64160 by Ken Unger <ken.j.unger@gmail.com>:

add qemu cpu option for rvv fp16

FUTURE_COPYBARA_INTEGRATE_REVIEW=#9642 from ken-unger:qemu-riscv64 7d64160
PiperOrigin-RevId: 880698451