How we assess a product
Engineering over marketing. The score rewards the facts that actually differentiate good from bad in a category — and it always travels with how confident we are in each fact.
What "good" means
Every category has a handful of fields that decide whether a product is well-made and worth its price. We score those, weight the rest by whether the data is even present, and roll it into a single 0–100 quality score per product. The judgement that matters to a buyer is fitness for purpose and cost of use over time, not the sticker price. A few examples of the kind of thing that counts:
- Raw denim / jeans: fabric weight and weave, fabric composition, country/mill of origin. Heavier, honestly woven, single-origin denim ages differently than a stretchy fast-fashion five-pocket.
- Welted footwear: construction (Goodyear-welted vs Blake vs cemented), leather grade, sole. Construction decides whether a shoe can be resoled, which drives its real cost over years.
- Fragrance: concentration (parfum → EDP → EDT → EDC), longevity, projection. A cheaper parfum can outlast a pricier EDT.
Confidence, not vibes
Every value is backed by evidence and corroboration across independent sources, and surfaced in a band:
- Solid (confidence ≥ 0.4): well-evidenced; stated as a fact, with the number.
- Low-confidence lead (0.1 ≤ confidence < 0.4): one thin source; a starting point, not a claim.
- Hidden (confidence < 0.1): not surfaced.
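As a rough illustration, the banding above can be sketched in a few lines of Python. The function name and the band labels as strings are assumptions made for this example, not part of any published API; only the 0.4 and 0.1 cut-offs come from the list above, and a value of exactly 0.4 is treated as Solid.

```python
def confidence_band(confidence: float) -> str:
    """Map a confidence value to the band it would be surfaced in.

    Hypothetical helper: the name and return labels are illustrative only.
    """
    if confidence >= 0.4:
        return "solid"                # stated as a fact, with the number
    if confidence >= 0.1:
        return "low-confidence lead"  # a starting point, not a claim
    return "hidden"                   # not surfaced


if __name__ == "__main__":
    for c in (0.75, 0.4, 0.25, 0.05):
        print(c, confidence_band(c))
```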
When independent sources disagree, both values stay visible — a disagreement is a buying signal, not noise to smooth over. A value seen by only one independent source is flagged as needing corroboration. Confidence comes from verifiable facts, not from any one author's say-so.
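One way to picture the disagreement and corroboration rules is the minimal Python sketch below. It assumes each observation is a (source, value) pair; the function name, the result keys, and the sample sources are all hypothetical, and it says nothing about how confidence itself is computed.

```python
from collections import defaultdict


def summarize_field(observations):
    """Group observed values by the independent sources that reported them.

    `observations` is an iterable of (source, value) pairs (assumed shape).
    Every distinct value is kept, so a disagreement stays visible.
    """
    sources_by_value = defaultdict(set)
    for source, value in observations:
        sources_by_value[value].add(source)

    return {
        # every value stays visible, with the sources backing it
        "values": {v: sorted(s) for v, s in sources_by_value.items()},
        # more than one distinct value means the sources disagree
        "disagreement": len(sources_by_value) > 1,
        # values reported by a single source still need corroboration
        "needs_corroboration": sorted(
            v for v, s in sources_by_value.items() if len(s) == 1
        ),
    }


if __name__ == "__main__":
    # Hypothetical fabric-weight observations for one pair of jeans.
    obs = [
        ("mill_site", "14 oz"),
        ("retailer_a", "14 oz"),
        ("retailer_b", "13.5 oz"),
    ]
    print(summarize_field(obs))
```

Here both "14 oz" and "13.5 oz" would remain visible, and only "13.5 oz" would be flagged as needing corroboration.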
What we don't publish
We publish the qualitative method: which fields matter in a category and why. We do not publish the scoring formula, the per-field weights, or the scoring thresholds. That weighting is the moat; trust is meant to come from the facts and their confidence, which you can verify, not from us exposing the algorithm. The API returns a product's total_score and the list of fields that are missing, never the per-field breakdown.
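For a client reading that response, a minimal Python sketch might look like the following. Only total_score is named above; the missing_fields key, the JSON encoding, the class and function names, and the sample payload are assumptions made for illustration, so check the real response shape before relying on any of them.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class ProductScore:
    total_score: float         # the single 0-100 quality score
    missing_fields: list[str]  # assumed key name for the fields with no data


def parse_product_score(payload: str) -> ProductScore:
    """Parse a (hypothetical) JSON response into a ProductScore."""
    data = json.loads(payload)
    return ProductScore(
        total_score=float(data["total_score"]),
        missing_fields=list(data.get("missing_fields", [])),
    )


if __name__ == "__main__":
    # Made-up payload; note there is no per-field breakdown to read.
    sample = '{"total_score": 72, "missing_fields": ["leather_grade"]}'
    print(parse_product_score(sample))
```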
Want your agent to use this method? Install the skill.