qwen3.5:4b

qwen35 · 4.7B · Q4_K_M

THINKING MODEL · ECO MODE

Gigabyte Technology Co., Ltd. H170M-D3H (Intel Core™ i5-6500)

16 GB · Ubuntu 24.04.4 LTS

Tested on March 5, 2026

Top 69%

Global Score: 61/100 (Good)
Hardware Fit: 69/100
Quality: 58/100

Get this model

Ollama: ollama pull qwen3.5:4b
Ollama Library: ollama.com/library/qwen3.5
LM Studio: search for and download models directly from the app
HuggingFace: GGUF versions and quantizations

Hardware

Machine: Gigabyte Technology Co., Ltd. H170M-D3H
CPU: Intel Core™ i5-6500
Cores: 4 total (4 perf)
Frequency: 3.2 GHz
RAM: 16 GB
GPU: GP104 [GeForce GTX 1070]
OS: Ubuntu 24.04.4 LTS
Arch: x64
Power Mode: low-power

Performance

Tokens/sec: 35.7 (tokens generated per second; higher is better)
Standard deviation: ±0.3 (variation in speed across benchmark runs; lower means more consistent)
First chunk latency: 438 ms (delay before the first streamed chunk arrives from the runtime)
Time to first token: 5.1 s (how fast the model starts responding; lower is better)
Load time: 40.1 s (time to load the model into memory before inference)
Memory usage: 5.5 GB, 35% of total system memory (RAM consumed during inference)
Total tokens: 1429 (total tokens generated across all benchmark prompts)
Thinking tokens (est.): ~735 (estimated internal reasoning tokens; "thinking" models generate more tokens)
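
A few derived figures follow from the numbers above. The sketch below is only illustrative arithmetic: it assumes the reported average tokens/sec held for every generated token, which the page does not state, and it relies on the thinking-token count that the page itself marks as an estimate.

```python
# Values copied from the Performance section.
tokens_per_sec = 35.7
total_tokens = 1429
thinking_tokens_est = 735  # marked as an estimate on the page

# Share of all generated tokens spent on internal reasoning.
thinking_share = thinking_tokens_est / total_tokens

# Tokens left over for the visible answers.
answer_tokens_est = total_tokens - thinking_tokens_est

# Rough pure-generation time. ASSUMPTION: the average rate applied to every
# token; load time and time to first token are not included.
approx_generation_s = total_tokens / tokens_per_sec

print(f"thinking share: {thinking_share:.0%}")
print(f"answer tokens (est.): {answer_tokens_est}")
print(f"approx. generation time: {approx_generation_s:.1f} s")
```

Under these assumptions, roughly half of the generated tokens went to reasoning rather than to the visible answers.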

Score breakdown

Hardware fit

Speed: 50/50
Time to first token: 11/20
Memory: 8/30

Quality

Reasoning: 18/20
Coding: 1/20
Instruction following: 8/20
Structured output: 8/15
Math: 13/15
Multilingual: 10/10
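
The sub-scores can be checked against the headline figures with simple addition: speed, time to first token, and memory add up to the Hardware Fit score, and the six quality categories add up to the Quality score. The sketch below does that check. The final step is a guess: the page does not say how the Global Score is derived, and the 30/70 weighting used here is a hypothetical choice that happens to reproduce 61.

```python
# Sub-scores copied from the score breakdown: (points earned, points available).
hardware_fit = {
    "Speed": (50, 50),
    "Time to first token": (11, 20),
    "Memory": (8, 30),
}
quality = {
    "Reasoning": (18, 20),
    "Coding": (1, 20),
    "Instruction following": (8, 20),
    "Structured output": (8, 15),
    "Math": (13, 15),
    "Multilingual": (10, 10),
}


def total(parts):
    """Sum earned and available points across a group of sub-scores."""
    earned = sum(e for e, _ in parts.values())
    available = sum(a for _, a in parts.values())
    return earned, available


hw_earned, hw_max = total(hardware_fit)
q_earned, q_max = total(quality)
print(f"Hardware fit: {hw_earned}/{hw_max}")
print(f"Quality: {q_earned}/{q_max}")

# ASSUMPTION: this 30/70 weighting is hypothetical, not taken from the page.
HW_WEIGHT, QUALITY_WEIGHT = 0.3, 0.7
global_guess = round(HW_WEIGHT * hw_earned + QUALITY_WEIGHT * q_earned)
print(f"Global (under the assumed weighting): {global_guess}/100")
```

The sums also show where the points are lost: Memory (8/30) accounts for most of the hardware-fit deficit, and Coding (1/20) for the largest share of the quality deficit.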

Category levels

Reasoning: Strong
Coding: Poor
Instruction Following: Weak
Structured Output: Adequate
Math: Strong
Multilingual: Strong

Metadata

Spec version: 0.2.1
Runtime: Ollama 0.17.6
Model format: GGUF
Hardware profile: ENTRY
Result hash: 1c55813a6fb4082a58b2ea2d6ccb21af43202153048b43d596d2ee2f3596c02c

Interpretation

Hardware fit: 69/100. Overall suitability: GOOD (Global 61/100). Category profile: Reasoning: Strong, Coding: Poor, Instruction Following: Weak, Structured Output: Adequate, Math: Strong, Multilingual: Strong.

Warnings

System was in low-power mode during this benchmark.
Significant swap activity during the benchmark (+0.9 GB). The model may exceed available RAM; results are severely degraded.

Bench Environment

Thermal: nominal
Swap delta: +0.9 GB
CPU load: avg 38% (peak 45%)

Run yours now

$ npm install -g metrillm@latest

$ metrillm

Requires Node 20+ and Ollama or LM Studio running

Or run without installing: npx metrillm@latest