API Documentation

Score, label, and evaluate any dataset with AI agents. One API call, structured results.

Ready to try it? Get an API key from the Dashboard, then run the curl command below. Takes 30 seconds.

Quickstart

Score a batch of items in one API call. Here's a complete example:

curl
curl -X POST https://scorehive.polsia.app/api/evaluate \
  -H "Content-Type: application/json" \
  -H "X-API-Key: sh_live_YOUR_KEY_HERE" \
  -d '{
    "name": "Search Quality Check",
    "context": "best pizza restaurants in NYC",
    "items": [
      {
        "title": "Best Pizza in NYC - Ultimate Guide",
        "url": "example.com/pizza-guide",
        "snippet": "Top 10 pizza places in New York City, rated by locals."
      },
      {
        "title": "NYC Weather Forecast",
        "url": "weather.com/nyc",
        "snippet": "Tomorrow will be sunny with highs near 72F."
      }
    ]
  }'

Response:

json — 200 OK
{
  "success": true,
  "evaluation": {
    "id": 42,
    "name": "Search Quality Check",
    "status": "completed",
    "total_items": 2,
    "avg_score": 0.62
  },
  "results": [
    {
      "input": { "title": "Best Pizza in NYC - Ultimate Guide", "..." },
      "scores": {
        "relevance": 0.95,
        "accuracy": 0.88,
        "intent_alignment": 0.92
      },
      "overall_score": 0.92,
      "confidence": 0.95,
      "reasoning": "Directly relevant guide to NYC pizza restaurants.",
      "flags": []
    },
    {
      "input": { "title": "NYC Weather Forecast", "..." },
      "scores": {
        "relevance": 0.10,
        "accuracy": 0.75,
        "intent_alignment": 0.05
      },
      "overall_score": 0.31,
      "confidence": 0.92,
      "reasoning": "Weather forecast is unrelated to pizza restaurants.",
      "flags": ["off_topic"]
    }
  ]
}

That's it. Two items scored against three criteria, with confidence and reasoning for each.
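
If you prefer Python to curl, the same request can be sketched with the standard library alone. This is a minimal sketch, not an official client: the function names `build_evaluate_request` and `evaluate` are hypothetical, and only the URL, headers, and body fields come from the example above.

```python
import json
import urllib.request

BASE_URL = "https://scorehive.polsia.app"  # base URL used in the curl example


def build_evaluate_request(api_key, items, context=None, name=None, rubric=None):
    """Build (but do not send) a POST /api/evaluate request. Hypothetical helper."""
    payload = {"items": items}
    # Optional fields are only included when the caller provides them.
    if context is not None:
        payload["context"] = context
    if name is not None:
        payload["name"] = name
    if rubric is not None:
        payload["rubric"] = rubric
    return urllib.request.Request(
        BASE_URL + "/api/evaluate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "X-API-Key": api_key},
        method="POST",
    )


def evaluate(api_key, items, **optional_fields):
    """Send the request and return the decoded JSON response body."""
    request = build_evaluate_request(api_key, items, **optional_fields)
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.load(response)
```

Calling `evaluate("sh_live_YOUR_KEY_HERE", items, context="best pizza restaurants in NYC")` should return a body shaped like the response above. The 60-second timeout is an arbitrary choice, not a documented value.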

Authentication

All API requests require an API key. Pass it in one of two ways:

Option 1: X-API-Key header (recommended)
X-API-Key: sh_live_YOUR_KEY_HERE

Option 2: Authorization header (Bearer token)
Authorization: Bearer sh_live_YOUR_KEY_HERE
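
The two header styles can be produced by one small helper. This is a sketch under assumptions: `auth_headers` is a hypothetical name, and the prefix check only reflects the documented `sh_live_` key format, not any server-side validation.

```python
def auth_headers(api_key, use_bearer=False):
    """Return request headers for either documented authentication style."""
    if not api_key.startswith("sh_live_"):
        # Keys are documented to start with sh_live_; anything else is likely a typo.
        raise ValueError("API key should start with 'sh_live_'")
    if use_bearer:
        # Option 2: Bearer token in the Authorization header.
        return {"Authorization": "Bearer " + api_key}
    # Option 1 (recommended): dedicated X-API-Key header.
    return {"X-API-Key": api_key}
```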

Getting an API Key

  1. Go to the Dashboard and scroll to API Key Management
  2. Enter the admin secret and click Authenticate
  3. Fill in a name and click Generate Key
  4. Copy the key immediately; it is shown only once

API keys start with sh_live_ and are hashed on our side. If you lose your key, you'll need to generate a new one.

Base URL

https://scorehive.polsia.app

All endpoint paths below are relative to this base URL.

Endpoints

POST /api/evaluate (auth required): Score a batch of items

The core endpoint. Send an array of items and get back structured scores against a rubric.

Request Body

| Field | Type | Required | Description |
|---|---|---|---|
| items | array | Required | Array of items to score. Each item can be a string or an object. Max 50 items per request. |
| name | string | Optional | Name for this evaluation batch. Defaults to "Evaluation {date}". |
| context | string | Optional | Context or query that items should be evaluated against (e.g., a search query). |
| rubric | object | Optional | Custom scoring rubric. See Rubric Configuration. Uses default rubric if omitted. |

Item Formats

Items can be plain strings or objects with any structure:

String items
"items": [
  "The Earth revolves around the Sun",
  "Water boils at 50 degrees Celsius"
]
Object items (search results, content, etc.)
"items": [
  {
    "title": "Best Pizza in NYC",
    "url": "example.com/pizza",
    "snippet": "Top rated pizza places..."
  }
]

Response

| Field | Type | Description |
|---|---|---|
| success | boolean | Always true on success |
| evaluation | object | Evaluation metadata: id, name, status, total_items, avg_score |
| results | array | Scored items. Each has scores, overall_score, confidence, reasoning, flags |
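
Because a request accepts at most 50 items, larger datasets have to be split before sending. A minimal sketch of that splitting, assuming you send one request per batch (the `batches` helper is hypothetical):

```python
MAX_ITEMS_PER_REQUEST = 50  # limit stated for the items field


def batches(items, size=MAX_ITEMS_PER_REQUEST):
    """Yield consecutive slices of `items`, each holding at most `size` entries."""
    for start in range(0, len(items), size):
        yield items[start:start + size]
```

Each yielded batch can then be passed as the `items` field of its own POST /api/evaluate call.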

GET /api/evaluations (auth required): List evaluation history

Returns a paginated list of your past evaluations, newest first.

Query Parameters

| Param | Type | Default | Description |
|---|---|---|---|
| limit | integer | 20 | Number of evaluations to return. Max 100. |
| offset | integer | 0 | Offset for pagination. |

Response
json
{
  "success": true,
  "evaluations": [
    {
      "id": 42,
      "name": "Search Quality Check",
      "status": "completed",
      "total_items": 10,
      "scored_items": 10,
      "avg_score": 0.74,
      "created_at": "2026-04-01T12:00:00Z"
    }
  ],
  "total": 156,
  "limit": 20,
  "offset": 0
}
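
To walk the full history, advance `offset` until it reaches `total`. The sketch below assumes a caller-supplied `fetch_page(limit, offset)` function (hypothetical) that performs the GET request and returns the decoded JSON body shown above. Keeping the HTTP call outside the loop lets the pagination logic be tested on its own.

```python
def iter_evaluations(fetch_page, limit=20):
    """Yield every evaluation by following limit/offset pagination.

    fetch_page(limit, offset) is a hypothetical callable that returns the
    decoded JSON body of GET /api/evaluations.
    """
    offset = 0
    while True:
        page = fetch_page(limit, offset)
        evaluations = page["evaluations"]
        yield from evaluations
        offset += len(evaluations)
        # Stop on an empty page or once `total` rows have been seen.
        if not evaluations or offset >= page["total"]:
            break
```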

GET /api/evaluations/:id (auth required): Get evaluation details with all scored items

Returns a single evaluation with its full list of scored items and metadata.

Path Parameters

| Param | Type | Description |
|---|---|---|
| id | integer | Evaluation ID |

Response
json
{
  "success": true,
  "evaluation": {
    "id": 42,
    "name": "Search Quality Check",
    "rubric": { /* rubric config */ },
    "status": "completed",
    "total_items": 2,
    "avg_score": 0.62,
    "created_at": "2026-04-01T12:00:00Z"
  },
  "items": [
    {
      "id": 101,
      "input_data": { "title": "Best Pizza..." },
      "scores": { "relevance": 0.95, "accuracy": 0.88 },
      "overall_score": 0.92,
      "confidence": 0.95,
      "reasoning": "Directly relevant...",
      "flags": []
    }
  ]
}

GET /api/stats (auth required): Aggregate stats and score distribution

Returns aggregate statistics for your API key: totals, averages, score distribution, and a 7-day trend.

Response
json
{
  "success": true,
  "stats": {
    "total_evaluations": 156,
    "total_items_scored": 2340,
    "global_avg_score": 0.72
  },
  "score_distribution": [
    { "bucket": "0.9-1.0", "count": 120 },
    { "bucket": "0.8-0.9", "count": 340 },
    // ... 10 buckets from 0.0-0.1 to 0.9-1.0
  ],
  "trend": [
    { "date": "2026-04-01", "evaluations": 12, "avg_score": 0.74 },
    // ... last 7 days
  ]
}

Rubric Configuration

A rubric defines what criteria items are scored against. Each criterion has a weight (how much it contributes to the overall score) and a description (what the AI evaluates).

Default Rubric

If you don't pass a rubric field, ScoreHive uses this default:

json — Default rubric
{
  "relevance": {
    "weight": 0.4,
    "description": "How relevant is this item to the query or context?"
  },
  "accuracy": {
    "weight": 0.3,
    "description": "How factually accurate is the content?"
  },
  "intent_alignment": {
    "weight": 0.3,
    "description": "How well does this align with the user intent?"
  }
}
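
The overall score is described as a weighted average of the per-criterion scores. A minimal sketch of that arithmetic with the default weights follows. The `weighted_score` function is illustrative only; the service may round or adjust scores differently.

```python
DEFAULT_RUBRIC = {
    "relevance": {"weight": 0.4},
    "accuracy": {"weight": 0.3},
    "intent_alignment": {"weight": 0.3},
}


def weighted_score(scores, rubric):
    """Weighted average of per-criterion scores, normalized by the total weight."""
    total_weight = sum(criterion["weight"] for criterion in rubric.values())
    weighted_sum = sum(scores[name] * criterion["weight"]
                       for name, criterion in rubric.items())
    return weighted_sum / total_weight
```

For the first Quickstart item (relevance 0.95, accuracy 0.88, intent_alignment 0.92), this gives 0.4 × 0.95 + 0.3 × 0.88 + 0.3 × 0.92 = 0.92, which matches the `overall_score` in that response.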

Custom Rubric

Pass your own rubric to score items on whatever criteria matter to your use case. Weights should sum to 1.0.

json — Custom rubric example
{
  "rubric": {
    "content_quality": {
      "weight": 0.35,
      "description": "Is the content well-written and informative?"
    },
    "brand_safety": {
      "weight": 0.30,
      "description": "Is the content safe for brand placement?"
    },
    "audience_fit": {
      "weight": 0.20,
      "description": "Does this match the target audience?"
    },
    "freshness": {
      "weight": 0.15,
      "description": "Is the information current and up-to-date?"
    }
  }
}

Tip: You can use any criterion names. The AI understands natural language descriptions. Be specific about what "good" looks like for each criterion.
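
Since weights should sum to 1.0, it can help to check a custom rubric locally before sending it. This is a client-side sketch only: `validate_rubric` is a hypothetical helper, and the server's own validation rules are not documented here.

```python
import math


def validate_rubric(rubric):
    """Return a list of problems found in a rubric; an empty list means it looks fine."""
    problems = []
    for name, criterion in rubric.items():
        if not isinstance(criterion.get("weight"), (int, float)):
            problems.append(f"{name}: weight must be a number")
        if not criterion.get("description"):
            problems.append(f"{name}: description is missing")
    if not problems:
        total = sum(criterion["weight"] for criterion in rubric.values())
        # Compare with a tolerance because weights are floating-point values.
        if not math.isclose(total, 1.0, abs_tol=1e-9):
            problems.append(f"weights sum to {total:.2f}, expected 1.0")
    return problems
```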

Response Format

Every scored item in the results array has the same structure:

| Field | Type | Range | Description |
|---|---|---|---|
| overall_score | float | 0.0 – 1.0 | Weighted average across all rubric criteria |
| scores | object | 0.0 – 1.0 each | Individual score per rubric criterion |
| confidence | float | 0.0 – 1.0 | AI confidence in the scoring. Lower values mean the item may need human review. |
| reasoning | string | | 1-2 sentence explanation of the score |
| flags | string[] | | Array of concern flags (e.g., off_topic, low_quality). Empty if no issues. |
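
One way to use `confidence` and `flags` is to route uncertain or flagged items to a human reviewer. The sketch below assumes a cutoff of 0.8, which is an arbitrary illustration and not a documented threshold. `needs_review` is a hypothetical helper.

```python
def needs_review(result, min_confidence=0.8):
    """True if a scored item has low confidence or carries any concern flag."""
    # min_confidence is an assumed cutoff; tune it for your own data.
    return result["confidence"] < min_confidence or bool(result["flags"])
```

Applied to the Quickstart response, the pizza guide (confidence 0.95, no flags) passes, while the weather forecast is held for review because of its off_topic flag.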

Score Interpretation

| Range | Label | Meaning |
|---|---|---|
| 0.7 – 1.0 | High | Strong match. Production-ready. |
| 0.4 – 0.69 | Medium | Partial match. May need review. |
| 0.0 – 0.39 | Low | Poor match. Likely irrelevant or incorrect. |
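
The ranges above can be turned into a small lookup. The table leaves gaps between 0.39 and 0.4 and between 0.69 and 0.7, so this sketch assumes the lower bound of each band is inclusive. `score_label` is a hypothetical helper.

```python
def score_label(score):
    """Map a 0.0-1.0 score to the High / Medium / Low label from the table."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0.0 and 1.0")
    if score >= 0.7:   # assumed inclusive lower bound for High
        return "High"
    if score >= 0.4:   # assumed inclusive lower bound for Medium
        return "Medium"
    return "Low"
```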

Rate Limits & Pricing

| Limit | Value |
|---|---|
| Items per request | 50 |
| Concurrent scoring | 5 items in parallel |

Note: See the Pricing page for plan details. Free tier includes 100 evaluations/month. Pro ($49/mo) includes 10,000 evaluations/month plus custom rubrics and webhooks.

Error Handling

All errors follow a consistent format:

json — Error response
{
  "success": false,
  "message": "Description of what went wrong"
}

HTTP Status Codes

| Status | Meaning | Common Cause |
|---|---|---|
| 200 | Success | Request processed successfully |
| 400 | Bad Request | Missing items array, more than 50 items, invalid rubric structure |
| 401 | Unauthorized | Missing or invalid API key |
| 403 | Forbidden | API key has been deactivated |
| 404 | Not Found | Evaluation ID doesn't exist or belongs to another key |
| 500 | Server Error | Internal error during evaluation |
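
Because every error shares the same `success` and `message` shape, a client can funnel all responses through one check. This is a sketch rather than an official client: `ScoreHiveError` and `check_response` are hypothetical names, and the fallback message for a non-JSON body is an assumption.

```python
import json

STATUS_MEANINGS = {
    400: "Bad Request",
    401: "Unauthorized",
    403: "Forbidden",
    404: "Not Found",
    500: "Server Error",
}


class ScoreHiveError(Exception):
    """Raised when the API reports a failure. Hypothetical client-side class."""

    def __init__(self, status, message):
        super().__init__(f"{status} {STATUS_MEANINGS.get(status, 'Error')}: {message}")
        self.status = status
        self.message = message


def check_response(status, body_text):
    """Return the decoded body on success; raise ScoreHiveError otherwise."""
    try:
        body = json.loads(body_text)
    except ValueError:
        body = {}  # not JSON, e.g. an error page from a proxy
    if not isinstance(body, dict):
        body = {}
    if status == 200 and body.get("success"):
        return body
    raise ScoreHiveError(status, body.get("message", "no message in response"))
```

Pass in the HTTP status and raw response text from whichever HTTP library you use; callers can then branch on `err.status`, for example to prompt for a new key on 401 or 403.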
ScoreHive Built by Polsia