Understanding Model Benchmarks

A simple guide to the metrics used to evaluate abliterated AI models on Abliz.

UGI

How well a model handles sensitive topics without refusing

W/10

Willingness score - how far the model can be pushed before it refuses

NatInt

Natural Intelligence - general knowledge and reasoning

Writing

Creative writing ability, style, and output quality

UGI - Uncensored General Intelligence

Measures how well a model handles sensitive or controversial topics

What it measures:

UGI tests a model's knowledge and willingness to engage with topics that many AI models refuse to discuss. A higher UGI score means the model is more capable of providing information without unnecessary restrictions.

Hazardous

Knowledge of sensitive topics that typical AI models avoid discussing

Entertainment

Knowledge of adult or controversial entertainment and media content

SocPol

Knowledge of sensitive socio-political topics and current events

W/10 - Willingness Score

Measures how far a model can be pushed before it refuses to answer. Scores run from 0 to 10, where 10 means the model almost never refuses.

W/10-Direct: Does the model outright refuse to respond?

W/10-Adherence: Does the model follow instructions without deviating?

Quick Reference

Metric	What it means	Higher is better?
UGI	Handles sensitive topics without refusing	Yes
W/10	How far the model can be pushed before it refuses	Yes
NatInt	General intelligence and reasoning	Yes
Writing	Creative writing quality	Yes
Political Lean	Political alignment (-100% to 100%)	Neutral
Error metrics	Cooking, GeoGuesser, Weight, etc.	No (lower is better)
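
The "Higher is better?" column matters when comparing models: most scores are sorted from high to low, while error metrics are sorted from low to high. Below is a minimal Python sketch of that idea. The function name rank_models, the model names, and every score are made up for illustration; they are not real Abliz data or an Abliz API.

```python
# Direction of each metric, taken from the Quick Reference table.
# True = higher is better, False = lower is better, None = neutral.
HIGHER_IS_BETTER = {
    "UGI": True,
    "W/10": True,
    "NatInt": True,
    "Writing": True,
    "Political Lean": None,  # neutral: no better or worse direction
    "Cooking": False,        # one of the error metrics
}


def rank_models(models, metric):
    """Return model names ordered best-first for the given metric."""
    direction = HIGHER_IS_BETTER[metric]
    if direction is None:
        raise ValueError(f"{metric} has no better/worse direction")
    # Sort descending when higher is better, ascending otherwise.
    ordered = sorted(models, key=lambda m: m[metric], reverse=direction)
    return [m["name"] for m in ordered]


# Hypothetical scores, invented for illustration only.
models = [
    {"name": "model-a", "UGI": 41.0, "W/10": 9.5, "Cooking": 12.0},
    {"name": "model-b", "UGI": 55.5, "W/10": 7.0, "Cooking": 4.0},
    {"name": "model-c", "UGI": 48.2, "W/10": 10.0, "Cooking": 8.5},
]

print(rank_models(models, "UGI"))      # best UGI first
print(rank_models(models, "Cooking"))  # lowest error first
```

Political Lean is deliberately left unranked here, since the table marks it as neutral rather than better or worse in either direction.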
