game-scoring

Use when working with candidate scoring, confidence calculation, softmax aggregation, or guess decision logic. Load for understanding how candidates are ranked, when the system decides to guess, and how semantic + geographic scores combine. Covers temperature tuning, entropy thresholds, and margin logic.

$ Install

git clone https://github.com/discountedcookie/10x-mapmaster /tmp/10x-mapmaster && cp -r /tmp/10x-mapmaster/.opencode/skills/game-scoring ~/.claude/skills/10x-mapmaster

// tip: Run this command in your terminal to install the skill


name: game-scoring
description: >-
  Use when working with candidate scoring, confidence calculation, softmax
  aggregation, or guess decision logic. Load for understanding how candidates
  are ranked, when the system decides to guess, and how semantic + geographic
  scores combine. Covers temperature tuning, entropy thresholds, and margin logic.

Game Scoring

Scoring and confidence calculation patterns specific to this game.

Announce: "I'm using game-scoring to understand scoring logic correctly."

Scoring Pipeline Overview

Player Description
       ↓
   Embedding
       ↓
Semantic Similarity (per place)
       ↓
Geographic Filtering (include/exclude regions)
       ↓
Combined Score + Softmax
       ↓
Confidence Metrics (max_prob, margin, entropy)
       ↓
Decision: Ask Question or Guess?

Semantic Similarity

Traits are matched via embedding similarity:

-- For each place, calculate trait similarity
WITH trait_similarities AS (
  SELECT 
    pt.place_id,
    1 - (te.embedding <=> v_description_embedding) AS similarity
  FROM place_traits pt
  JOIN embeddings te ON te.id = pt.embedding_id
)
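
To make the similarity expression concrete: assuming the embedding column is a pgvector vector, `<=>` is cosine distance, so `1 - distance` is cosine similarity. The sketch below reproduces that arithmetic in plain Python. The three vectors are made-up illustrations, not real embeddings from this project.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity, i.e. 1 - cosine distance."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical 3-dimensional embeddings (real ones are much larger).
description = [0.2, 0.9, 0.1]
trait_match = [0.25, 0.85, 0.05]   # points in nearly the same direction
trait_other = [0.9, -0.1, 0.3]     # points elsewhere

sim_match = cosine_similarity(description, trait_match)
sim_other = cosine_similarity(description, trait_other)
print(sim_match, sim_other)  # the matching trait scores much closer to 1
```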

Softmax Aggregation

Trait similarities are NOT combined with a simple average. A softmax-weighted average lets each place's best-matching traits dominate its score:

-- Softmax-weighted average
WITH softmax_weights AS (
  SELECT 
    place_id,
    similarity,
    exp(similarity / v_temperature) AS exp_sim,
    SUM(exp(similarity / v_temperature)) OVER (PARTITION BY place_id) AS sum_exp
  FROM trait_similarities
)
SELECT 
  place_id,
  SUM((exp_sim / sum_exp) * similarity) AS aggregated_score
FROM softmax_weights
GROUP BY place_id;

Temperature effect:

  • Low (0.1): Top traits dominate strongly
  • High (1.0): All traits contribute more equally

Confidence Metrics

Three metrics determine when to guess:

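-- Calculate from candidate probabilities
WITH ranked AS (
  SELECT probability,
         ROW_NUMBER() OVER (ORDER BY probability DESC) AS rnk
  FROM candidates
)
SELECT
  MAX(probability) AS max_prob,                                                         -- Top candidate confidence
  MAX(probability) - COALESCE(MAX(probability) FILTER (WHERE rnk = 2), 0) AS margin,    -- Gap to #2
  -SUM(probability * ln(probability)) FILTER (WHERE probability > 0) AS entropy         -- Spread of distribution
FROM ranked;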
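
| Metric   | High Value Means       | When to Guess           |
| -------- | ---------------------- | ----------------------- |
| max_prob | Strong #1 candidate    | > threshold (e.g., 0.7) |
| margin   | Clear separation       | > threshold (e.g., 0.3) |
| entropy  | Spread out (uncertain) | < threshold (e.g., 1.5) |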
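
As a cross-check on the three metrics, here is a small Python sketch that computes them from a list of candidate probabilities. The function name and the two sample distributions are assumptions made for illustration.

```python
import math

def confidence_metrics(probabilities):
    """Return (max_prob, margin, entropy) for a candidate distribution."""
    ranked = sorted(probabilities, reverse=True)
    max_prob = ranked[0]
    # With a single candidate there is no runner-up, so the gap is max_prob.
    second = ranked[1] if len(ranked) > 1 else 0.0
    margin = max_prob - second
    # Skip zero probabilities: p * ln(p) tends to 0 as p tends to 0.
    entropy = -sum(p * math.log(p) for p in ranked if p > 0)
    return max_prob, margin, entropy

peaked = confidence_metrics([0.8, 0.1, 0.05, 0.05])   # one clear leader
flat = confidence_metrics([0.25, 0.25, 0.25, 0.25])   # no leader at all
print(peaked)
print(flat)
```

The flat distribution has zero margin and the maximum possible entropy for four candidates, ln(4); the peaked one has a large margin and much lower entropy.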

Guess Decision Logic

-- System guesses when confident
IF v_max_prob >= get_config_float('confidence.top_prob_threshold')
   AND v_margin >= get_config_float('confidence.margin_threshold')
   AND v_entropy <= get_config_float('confidence.entropy_threshold')
THEN
  -- Make a guess
  RETURN create_guess_turn(v_top_candidate);
ELSE
  -- Ask a question
  RETURN create_question_turn(v_best_question);
END IF;
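
The same decision rule, sketched in Python for quick experimentation. The `Thresholds` class is a hypothetical stand-in for the config lookups; its defaults mirror the defaults listed under Configuration Parameters.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Thresholds:
    # Hypothetical container; values mirror the documented config defaults.
    top_prob: float = 0.7
    margin: float = 0.3
    entropy: float = 1.5

def should_guess(max_prob, margin, entropy, t=Thresholds()):
    """Guess only when all three confidence checks pass."""
    return (
        max_prob >= t.top_prob
        and margin >= t.margin
        and entropy <= t.entropy
    )

print(should_guess(0.80, 0.70, 0.71))  # strong leader, clear gap, low entropy
print(should_guess(0.80, 0.10, 0.60))  # runner-up too close
print(should_guess(0.25, 0.00, 1.39))  # flat distribution
```

Only the first call returns True; the other two fail the margin check or the top-probability check, so the system would ask another question.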

Score Combination

Semantic and geographic scores combine:

-- Final score = semantic * (1 + geographic_bonus)
SELECT
  place_id,
  semantic_score,
  geographic_bonus,  -- From region matching
  semantic_score * (1 + geographic_bonus) AS combined_score
FROM scored_candidates
ORDER BY combined_score DESC;
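
A short Python sketch of the combination step. `combined_score` follows the formula above directly. `to_probabilities` is an assumption: the pipeline overview lists "Combined Score + Softmax" before the confidence metrics, so the sketch applies a softmax across candidates, reusing the 0.5 temperature default as a guess. The candidate values are invented.

```python
import math

def combined_score(semantic_score, geographic_bonus):
    # Same formula as the SQL above: the bonus scales the semantic score.
    return semantic_score * (1 + geographic_bonus)

def to_probabilities(scores, temperature=0.5):
    # ASSUMPTION: probabilities come from a softmax over combined scores.
    top = max(scores)
    exps = [math.exp((s - top) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical candidates: place_id -> (semantic_score, geographic_bonus)
candidates = {"a": (0.80, 0.0), "b": (0.70, 0.2)}

scores = {pid: combined_score(sem, bonus) for pid, (sem, bonus) in candidates.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
probs = to_probabilities([scores[pid] for pid in ranked])
print(ranked, probs)
```

Place "b" has the lower semantic score but overtakes "a" once its geographic bonus is applied. Because the bonus is multiplicative, a place with a semantic score of zero stays at zero regardless of the bonus.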

Configuration Parameters

All thresholds come from game_logic.config:

-- Scoring
get_config_float('scoring.temperature', 0.5)
get_config_float('scoring.initial_candidate_threshold', 0.3)

-- Confidence
get_config_float('confidence.top_prob_threshold', 0.7)
get_config_float('confidence.margin_threshold', 0.3)
get_config_float('confidence.entropy_threshold', 1.5)
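
The two-argument calls above suggest a lookup with a fallback default. The Python sketch below illustrates that assumed behavior only; it is not the project's `get_config_float`, and the `config` dict stands in for whatever `game_logic.config` actually stores.

```python
def get_config_float(config, key, default=None):
    """Return the stored value as a float, else the default (assumed behavior)."""
    if key in config:
        return float(config[key])
    if default is None:
        raise KeyError(f"missing config key: {key}")
    return default

# Hypothetical stored overrides; values are kept as strings here on purpose.
config = {"scoring.temperature": "0.3"}

temperature = get_config_float(config, "scoring.temperature", 0.5)
entropy_max = get_config_float(config, "confidence.entropy_threshold", 1.5)
print(temperature, entropy_max)  # stored override wins; missing key falls back
```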

-- Question selection
get_config_float('questions.min_split_quality', 0.3)

Question Selection

Questions are ranked by split quality:

-- Perfect split = 0.5 yes, 0.5 no → quality = 1.0
-- All yes or all no → quality = 0.5
split_quality = 1.0 - ABS(0.5 - yes_ratio)

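The best question is the one with the highest split quality, used here as a proxy for information gain: the closer the split is to even, the more candidates either answer eliminates.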
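
A minimal Python sketch of the ranking, using the split-quality formula exactly as written above. The question ids and yes-ratios are hypothetical.

```python
def split_quality(yes_ratio):
    """1.0 for a perfect 50/50 split, 0.5 when every candidate answers the same."""
    return 1.0 - abs(0.5 - yes_ratio)

def best_question(yes_ratios):
    """Pick the question id whose yes-ratio is closest to an even split."""
    return max(yes_ratios, key=lambda q: split_quality(yes_ratios[q]))

# Hypothetical questions: id -> fraction of active candidates answering "yes"
yes_ratios = {"in_europe": 0.90, "is_coastal": 0.45, "in_asia": 0.10}

choice = best_question(yes_ratios)
print(choice)
```

Here "is_coastal" wins because its yes-ratio is nearest 0.5, so either answer removes roughly half of the remaining candidates.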

Answer Processing

Answers update the candidate set: 'yes' and 'no' filter candidates by region, while 'not_sure' applies a score penalty:

-- 'yes' answer for geographic question
-- Keep only candidates in the region
-- (AND active: candidates eliminated earlier are not reactivated)
UPDATE candidates SET
  active = ST_Intersects(geom, region_geom)
WHERE session_id = v_session_id
  AND active;

-- 'no' answer
-- Keep only candidates NOT in the region
UPDATE candidates SET
  active = NOT ST_Intersects(geom, region_geom)
WHERE session_id = v_session_id
  AND active;

-- 'not_sure' answer
-- Apply uncertainty penalty
UPDATE candidates SET
  score = score * get_config_float('scoring.unsure_penalty', 0.9)
WHERE session_id = v_session_id;
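
The three answer branches can be summarized in one Python sketch. It makes two assumptions: the boolean `in_region` stands in for the `ST_Intersects` result, and a candidate eliminated by an earlier answer stays eliminated. The sample candidates are invented.

```python
def apply_answer(candidates, answer, unsure_penalty=0.9):
    """Return updated copies of the candidates for one geographic answer."""
    updated = []
    for c in candidates:
        c = dict(c)  # copy so the caller's data is not mutated
        if answer == "yes":
            # ASSUMPTION: previously eliminated candidates stay inactive.
            c["active"] = c["active"] and c["in_region"]
        elif answer == "no":
            c["active"] = c["active"] and not c["in_region"]
        elif answer == "not_sure":
            c["score"] = c["score"] * unsure_penalty
        else:
            raise ValueError(f"unknown answer: {answer}")
        updated.append(c)
    return updated

candidates = [
    {"place_id": "a", "active": True, "in_region": True, "score": 1.0},
    {"place_id": "b", "active": True, "in_region": False, "score": 0.8},
    {"place_id": "c", "active": False, "in_region": True, "score": 0.6},
]

after_yes = apply_answer(candidates, "yes")
after_no = apply_answer(candidates, "no")
after_unsure = apply_answer(candidates, "not_sure")
print([c["place_id"] for c in after_yes if c["active"]])
```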

Anti-Patterns

DON'T: Use Simple Average

-- WRONG: All traits equal weight
SELECT place_id, AVG(similarity)
FROM trait_similarities
GROUP BY place_id;

-- CORRECT: Softmax-weighted for categorical matching
SELECT place_id, SUM((exp_sim / sum_exp) * similarity)
FROM softmax_weights
GROUP BY place_id;

DON'T: Hardcode Thresholds

-- WRONG: Magic numbers
IF max_prob >= 0.7 AND margin >= 0.3 AND entropy <= 1.5 THEN

-- CORRECT: From config
IF max_prob >= get_config_float('confidence.top_prob_threshold')
   AND margin >= get_config_float('confidence.margin_threshold')
   AND entropy <= get_config_float('confidence.entropy_threshold') THEN

DON'T: Skip Entropy

-- WRONG: Only check max_prob
IF max_prob > 0.7 THEN guess()

-- CORRECT: Check all three metrics
-- High max_prob with high entropy = false confidence
IF max_prob > threshold 
   AND margin > threshold 
   AND entropy < threshold THEN guess()

Debugging Scores

-- View current candidates with scores
SELECT 
  c.place_id,
  p.name,
  c.semantic_score,
  c.geographic_bonus,
  c.combined_score,
  c.probability
FROM session_candidates c
JOIN places p ON p.id = c.place_id
WHERE c.session_id = 'xxx'
ORDER BY c.probability DESC
LIMIT 10;

References

See references/scoring-queries.md for debugging queries.