Qwen/Qwen3-0.6B

Primitive: /generate · Generate · Qwen3

Streaming

Overview

Hardware: none selected. The hardware choice drives latency, throughput, and cost.

Cost is approximate: it is computed from list GPU prices, and your actual price depends on the provider you deploy SIE with.
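As a rough illustration of how a cost figure can be derived from a list GPU price and a measured throughput, the sketch below converts an hourly price into a cost per million generated tokens. The page does not state the formula SIE uses, so this is an assumption, not the site's method. The function name `cost_per_million_tokens` and the hourly price are hypothetical placeholders; only the throughput values are taken from the results on this page.

```python
def cost_per_million_tokens(gpu_price_per_hour: float, throughput_tok_per_s: float) -> float:
    """Approximate cost of generating one million tokens on a single GPU.

    Assumes the GPU is billed by the hour and runs at the given
    sustained throughput for the whole billing period.
    """
    if throughput_tok_per_s <= 0:
        raise ValueError("throughput must be positive")
    tokens_per_hour = throughput_tok_per_s * 3600
    return gpu_price_per_hour / tokens_per_hour * 1_000_000


# Hypothetical list price in USD per GPU-hour (placeholder, not a quoted figure).
HYPOTHETICAL_PRICE_PER_HOUR = 2.00

# Throughput in tok/s as reported on this page for each domain.
reported_throughput = {
    "legal": 621,
    "scientific": 598,
    "medical": 593,
    "general": 573,
}

for domain, tok_per_s in reported_throughput.items():
    cost = cost_per_million_tokens(HYPOTHETICAL_PRICE_PER_HOUR, tok_per_s)
    print(f"{domain}: ${cost:.2f} per 1M tokens")
```

Substitute your provider's actual hourly price for the placeholder. Because cost is inversely proportional to throughput here, the small throughput differences between domains translate into similarly small cost differences.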

Results by domain · generation · en
Performance measured on RTX-PRO-6000 b1 c4

Domain        Quality (accuracy)    Throughput    p50 latency
legal         0.4600                621 tok/s     1.7s
scientific    0.2475                598 tok/s     508.2ms
medical       0.2533                593 tok/s     317.4ms
general       0.2367                573 tok/s     216.5ms