qwen3.5:2b

qwen35 · 2.3B · Q8_0

THINKING MODEL · ECO MODE

LENOVO 20L7CTO1WW (Intel Core™ i5-8350U)

23 GB · Fedora Linux 43

Tested on April 29, 2026

Top 100%

Global Score: 13/100 (Not Recommended)
Hardware Fit: 41/100
Quality: 1/100

Get this model

Ollama: ollama pull qwen3.5:2b
Ollama Library: ollama.com/library/qwen3.5
LM Studio: search and download models directly from the app
HuggingFace: GGUF versions and quantizations

Hardware

Machine: LENOVO 20L7CTO1WW
CPU: Intel Core™ i5-8350U
Cores: 8 threads (4 cores)
Frequency: 1.7 GHz
RAM: 23 GB
GPU: Kaby Lake-R GT2 [UHD Graphics 620]
OS: Fedora Linux 43
Arch: x64
Power Mode: low-power

Performance

Tokens/sec: 4.5 (tokens generated per second; higher is better)
Standard deviation: ±0.2 (variation of speed across benchmark runs; lower means more consistent)
First chunk latency: 1.4 s (delay before the first streamed chunk arrives from the runtime)
Time to first token: 31.1 s (how fast the model starts responding; lower is better)
Load time: 7.1 s (time to load the model into memory before inference)
Memory usage: 3.8 GB (16% of total system memory; RAM consumed during inference)
Total tokens: 1429 (total number of tokens generated across all benchmark prompts)
Thinking tokens (est.): ~663 (estimated internal reasoning tokens; models with "thinking" generate more tokens)
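
The figures above can be cross-checked with a little arithmetic. The following is a minimal sketch, not part of the benchmark tool: the variable names are hypothetical, and the generation-time estimate assumes that the reported tokens/sec is a mean rate over all generated tokens, which the report does not state.

```python
# Figures copied from the Performance section of this report.
total_tokens = 1429
thinking_tokens_est = 663   # marked as an estimate in the report
tokens_per_sec = 4.5
memory_used_gb = 3.8
total_ram_gb = 23

# Share of the output spent on internal reasoning rather than visible text.
thinking_share = thinking_tokens_est / total_tokens
visible_tokens = total_tokens - thinking_tokens_est

# Memory share recomputed from the rounded GB figures shown in the report.
memory_share = memory_used_gb / total_ram_gb

# Assumption: tokens/sec is the mean rate over all generated tokens.
approx_generation_s = total_tokens / tokens_per_sec

print(f"thinking share:      {thinking_share:.1%}")
print(f"visible tokens:      {visible_tokens}")
print(f"memory share:        {memory_share:.1%}")
print(f"approx. generation:  {approx_generation_s:.1f} s")
```

On these numbers, a little under half of the generated tokens are estimated to be reasoning tokens. The recomputed memory share comes out slightly above the 16% shown, which is expected when working from rounded GB values.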

Score breakdown

Speed: 11/50
Time to first token: 0/20
Memory: 30/30

Quality

Reasoning: 0/20
Coding: 0/20
Instruction following: 1/20
Structured output: 0/15
Math: 0/15
Multilingual: 0/10
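
The sub-scores add up to the two headline component scores: the first three sum to the Hardware Fit score and the six quality categories sum to the Quality score. The sketch below checks that, using hypothetical dictionary names. The formula that combines these into the Global Score of 13/100 is not given in this report, so it is not reproduced here.

```python
# (score, maximum) pairs copied from the Score breakdown section.
hardware_fit_parts = {
    "Speed": (11, 50),
    "Time to first token": (0, 20),
    "Memory": (30, 30),
}
quality_parts = {
    "Reasoning": (0, 20),
    "Coding": (0, 20),
    "Instruction following": (1, 20),
    "Structured output": (0, 15),
    "Math": (0, 15),
    "Multilingual": (0, 10),
}


def total(parts):
    """Return (points earned, points available) for a group of sub-scores."""
    earned = sum(score for score, _ in parts.values())
    available = sum(maximum for _, maximum in parts.values())
    return earned, available


hardware_fit = total(hardware_fit_parts)
quality = total(quality_parts)

print(f"Hardware Fit: {hardware_fit[0]}/{hardware_fit[1]}")
print(f"Quality:      {quality[0]}/{quality[1]}")
```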

Category levels

Reasoning: Poor
Coding: Poor
Instruction Following: Poor
Structured Output: Poor
Math: Poor
Multilingual: Poor

Metadata

Spec version: 0.2.1
Runtime: Ollama 0.21.2
Model format: GGUF
Hardware profile: ENTRY
Result hash: d233c25d18b7913b32fb8fec28760fa6e8173fcaf6a6f5ad1f08facde12dbff3

Interpretation

Hardware fit: 41/100. Overall suitability: NOT RECOMMENDED (Global 13/100). Category profile: Reasoning: Poor, Coding: Poor, Instruction Following: Poor, Structured Output: Poor, Math: Poor, Multilingual: Poor. Warning: model produced very low accuracy on quality tasks — results may be unusable despite good hardware performance.

Warnings

Model produced very low accuracy on quality tasks — results may be unusable despite good hardware performance.
System was in low-power mode during this benchmark.

Disqualifiers

Time to first token too high: 31145 ms (maximum: 22196 ms for ENTRY profile)
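
The disqualifier is a simple threshold comparison. A minimal sketch follows, assuming the rule is "measured value strictly greater than the profile maximum"; the function name is hypothetical and the tool's actual implementation may differ.

```python
# Values copied from the disqualifier line of this report.
ttft_ms = 31145
max_ttft_ms_entry = 22196   # maximum stated for the ENTRY hardware profile


def exceeds_limit(measured_ms: int, limit_ms: int) -> bool:
    """Assumed rule: disqualify when the measurement is above the limit."""
    return measured_ms > limit_ms


disqualified = exceeds_limit(ttft_ms, max_ttft_ms_entry)
overshoot_ms = ttft_ms - max_ttft_ms_entry
overshoot_ratio = ttft_ms / max_ttft_ms_entry

print(f"disqualified: {disqualified}")
print(f"overshoot:    {overshoot_ms} ms ({overshoot_ratio:.2f}x the limit)")
```

On these figures, the measured time to first token is roughly 1.4 times the ENTRY maximum, so the result would be disqualified on this criterion regardless of the other scores.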

Bench Environment

Thermal: nominal
Power: AC
Swap delta: +0.2 GB
CPU load: avg 64% (peak 79%)

Run yours now

$ npm install -g metrillm@latest

$ metrillm

Requires Node 20+ and Ollama or LM Studio running

Or run without installing: npx metrillm@latest