API Documentation

Score, label, and evaluate any dataset with AI agents. One API call, structured results.

Ready to try it? Get an API key from the Dashboard, then run the curl command below. It takes about 30 seconds.

Quickstart

Score a batch of items in one API call. Here's a complete example:

curl

curl -X POST https://scorehive.polsia.app/api/evaluate \
  -H "Content-Type: application/json" \
  -H "X-API-Key: sh_live_YOUR_KEY_HERE" \
  -d '{
    "name": "Search Quality Check",
    "context": "best pizza restaurants in NYC",
    "items": [
      {
        "title": "Best Pizza in NYC - Ultimate Guide",
        "url": "example.com/pizza-guide",
        "snippet": "Top 10 pizza places in New York City, rated by locals."
      },
      {
        "title": "NYC Weather Forecast",
        "url": "weather.com/nyc",
        "snippet": "Tomorrow will be sunny with highs near 72F."
      }
    ]
  }'

Response:

json — 200 OK

{
  "success": true,
  "evaluation": {
    "id": 42,
    "name": "Search Quality Check",
    "status": "completed",
    "total_items": 2,
    "avg_score": 0.62
  },
  "results": [
    {
      "input": { "title": "Best Pizza in NYC - Ultimate Guide", "..." },
      "scores": {
        "relevance": 0.95,
        "accuracy": 0.88,
        "intent_alignment": 0.92
      },
      "overall_score": 0.92,
      "confidence": 0.95,
      "reasoning": "Directly relevant guide to NYC pizza restaurants.",
      "flags": []
    },
    {
      "input": { "title": "NYC Weather Forecast", "..." },
      "scores": {
        "relevance": 0.10,
        "accuracy": 0.75,
        "intent_alignment": 0.05
      },
      "overall_score": 0.31,
      "confidence": 0.92,
      "reasoning": "Weather forecast is unrelated to pizza restaurants.",
      "flags": ["off_topic"]
    }
  ]
}

That's it. Two items scored against three criteria, with confidence and reasoning for each.
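For callers who prefer Python over curl, the same request can be sketched with the standard library alone. This is a minimal sketch, not an official client: the helper name `build_evaluate_request` is hypothetical, and the request is only built here, not sent, since sending needs a real key and network access.

```python
import json
import urllib.request

BASE_URL = "https://scorehive.polsia.app"


def build_evaluate_request(api_key, items, name=None, context=None):
    """Build (without sending) a POST /api/evaluate request. Hypothetical helper."""
    payload = {"items": items}
    # name and context are optional, so include them only when provided.
    if name is not None:
        payload["name"] = name
    if context is not None:
        payload["context"] = context
    return urllib.request.Request(
        BASE_URL + "/api/evaluate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "X-API-Key": api_key},
        method="POST",
    )


request = build_evaluate_request(
    "sh_live_YOUR_KEY_HERE",
    items=[
        {
            "title": "Best Pizza in NYC - Ultimate Guide",
            "url": "example.com/pizza-guide",
            "snippet": "Top 10 pizza places in New York City, rated by locals.",
        },
        {
            "title": "NYC Weather Forecast",
            "url": "weather.com/nyc",
            "snippet": "Tomorrow will be sunny with highs near 72F.",
        },
    ],
    name="Search Quality Check",
    context="best pizza restaurants in NYC",
)

# Sending requires a real key and network access:
# with urllib.request.urlopen(request) as response:
#     body = json.load(response)
```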

Authentication

All API requests require an API key. Pass it in one of two ways:

Header Option 1 (recommended)

X-API-Key: sh_live_YOUR_KEY_HERE

Header Option 2 (Bearer token)

Authorization: Bearer sh_live_YOUR_KEY_HERE
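
The two header styles above can be wrapped in one small helper so the rest of a client does not care which is in use. The function name `auth_headers` and its `use_bearer` parameter are illustrative, not part of the API.

```python
def auth_headers(api_key, use_bearer=False):
    """Return the HTTP headers that carry the API key. Hypothetical helper."""
    if use_bearer:
        # Option 2: Authorization header with a Bearer token.
        return {"Authorization": f"Bearer {api_key}"}
    # Option 1 (the one the docs recommend): dedicated X-API-Key header.
    return {"X-API-Key": api_key}


recommended = auth_headers("sh_live_YOUR_KEY_HERE")
bearer = auth_headers("sh_live_YOUR_KEY_HERE", use_bearer=True)
```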

Getting an API Key

1. Go to the Dashboard and scroll to API Key Management.
2. Enter the admin secret and click Authenticate.
3. Fill in a name and click Generate Key.
4. Copy the key immediately; it is shown only once.

Warning: API keys start with sh_live_ and are hashed on our side, so a lost key cannot be recovered. If you lose your key, you'll need to generate a new one.

Base URL

https://scorehive.polsia.app

All endpoint paths below are relative to this base URL.

Endpoints

POST /api/evaluate (Auth Required): Score a batch of items

The core endpoint. Send an array of items and get back structured scores against a rubric.

Request Body

Field	Type	Required	Description
items	array	Required	Array of items to score. Each item can be a string or an object. Max 50 items per request.
name	string	Optional	Name for this evaluation batch. Defaults to "Evaluation {date}".
context	string	Optional	Context or query that items should be evaluated against (e.g., a search query).
rubric	object	Optional	Custom scoring rubric. See Rubric Configuration. Uses default rubric if omitted.

Item Formats

Items can be plain strings or objects with any structure:

String items
"items": [
  "The Earth revolves around the Sun",
  "Water boils at 50 degrees Celsius"
]

Object items (search results, content, etc.)
"items": [
  {
    "title": "Best Pizza in NYC",
    "url": "example.com/pizza",
    "snippet": "Top rated pizza places..."
  }
]
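
Checking a batch on the client before sending it avoids a round trip that would end in a 400. The sketch below applies only the rules stated above (items is an array, each item is a string or an object, at most 50 per request). Rejecting an empty array is an assumption; the source only says a missing `items` array is an error. The name `validate_items` is hypothetical.

```python
MAX_ITEMS_PER_REQUEST = 50  # documented per-request limit


def validate_items(items):
    """Return a list of problems; an empty list means the batch looks sendable."""
    # Assumption: an empty array is treated like a missing one.
    if not isinstance(items, list) or not items:
        return ["items must be a non-empty array"]
    problems = []
    if len(items) > MAX_ITEMS_PER_REQUEST:
        problems.append(
            f"{len(items)} items exceeds the limit of {MAX_ITEMS_PER_REQUEST}"
        )
    for index, item in enumerate(items):
        # Items may be plain strings or objects (dicts) with any structure.
        if not isinstance(item, (str, dict)):
            problems.append(f"item {index} must be a string or an object")
    return problems


ok = validate_items(["The Earth revolves around the Sun", {"title": "Best Pizza in NYC"}])
too_many = validate_items(["x"] * 51)
wrong_type = validate_items(["fine", 42])
```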

Response

Field	Type	Description
success	boolean	Always `true` on success
evaluation	object	Evaluation metadata: `id`, `name`, `status`, `total_items`, `avg_score`
results	array	Scored items. Each has `scores`, `overall_score`, `confidence`, `reasoning`, `flags`

GET /api/evaluations (Auth Required): List evaluation history

Returns a paginated list of your past evaluations, newest first.

Query Parameters

Param	Type	Default	Description
limit	integer	20	Number of evaluations to return. Max 100.
offset	integer	0	Offset for pagination.

Response

json
{
  "success": true,
  "evaluations": [
    {
      "id": 42,
      "name": "Search Quality Check",
      "status": "completed",
      "total_items": 10,
      "scored_items": 10,
      "avg_score": 0.74,
      "created_at": "2026-04-01T12:00:00Z"
    }
  ],
  "total": 156,
  "limit": 20,
  "offset": 0
}
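
Walking the full history means advancing `offset` until it reaches `total`. The generator below is a sketch of that loop. It takes a `fetch_page(limit, offset)` callable (a hypothetical name) that is assumed to return the parsed JSON body shown above, so the paging logic can be exercised without a network call. The stand-in `fake_fetch_page` and its 45 records are invented for the demonstration.

```python
def iter_evaluations(fetch_page, limit=20):
    """Yield every evaluation by stepping through limit/offset pages."""
    offset = 0
    while True:
        page = fetch_page(limit, offset)
        evaluations = page["evaluations"]
        yield from evaluations
        offset += len(evaluations)
        # Stop on an empty page or once everything reported by "total" is seen.
        if not evaluations or offset >= page["total"]:
            break


def fake_fetch_page(limit, offset):
    """Stand-in for GET /api/evaluations, backed by 45 made-up records."""
    records = [{"id": number} for number in range(1, 46)]
    return {
        "success": True,
        "evaluations": records[offset:offset + limit],
        "total": len(records),
        "limit": limit,
        "offset": offset,
    }


ids = [evaluation["id"] for evaluation in iter_evaluations(fake_fetch_page, limit=20)]
```

In a real client, `fetch_page` would issue the GET request with `limit` and `offset` as query parameters and return the decoded body.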

GET /api/evaluations/:id (Auth Required): Get evaluation details with all scored items

Returns a single evaluation with its full list of scored items and metadata.

Path Parameters

Param	Type	Description
id	integer	Evaluation ID

Response

json
{
  "success": true,
  "evaluation": {
    "id": 42,
    "name": "Search Quality Check",
    "rubric": { /* rubric config */ },
    "status": "completed",
    "total_items": 2,
    "avg_score": 0.62,
    "created_at": "2026-04-01T12:00:00Z"
  },
  "items": [
    {
      "id": 101,
      "input_data": { "title": "Best Pizza..." },
      "scores": { "relevance": 0.95, "accuracy": 0.88 },
      "overall_score": 0.92,
      "confidence": 0.95,
      "reasoning": "Directly relevant...",
      "flags": []
    }
  ]
}
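
One common use of the detail response is building a human-review queue. The sketch below selects items that carry any flag or whose confidence falls below a cutoff. The 0.8 cutoff and the name `items_needing_review` are assumptions for illustration; the API does not define a review threshold. The sample data extends the response above with invented items 102 and 103.

```python
def items_needing_review(details, min_confidence=0.8):
    """Return scored items that are flagged or below the confidence cutoff."""
    return [
        item
        for item in details["items"]
        if item["flags"] or item["confidence"] < min_confidence
    ]


# Shape follows the response above; items 102 and 103 are made up.
details = {
    "success": True,
    "items": [
        {"id": 101, "overall_score": 0.92, "confidence": 0.95, "flags": []},
        {"id": 102, "overall_score": 0.31, "confidence": 0.92, "flags": ["off_topic"]},
        {"id": 103, "overall_score": 0.58, "confidence": 0.55, "flags": []},
    ],
}

review_ids = [item["id"] for item in items_needing_review(details)]
```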

GET /api/stats (Auth Required): Aggregate stats and score distribution

Returns aggregate statistics for your API key: totals, averages, score distribution, and 7-day trend.

Response

json
{
  "success": true,
  "stats": {
    "total_evaluations": 156,
    "total_items_scored": 2340,
    "global_avg_score": 0.72
  },
  "score_distribution": [
    { "bucket": "0.9-1.0", "count": 120 },
    { "bucket": "0.8-0.9", "count": 340 },
    // ... 10 buckets from 0.0-0.1 to 0.9-1.0
  ],
  "trend": [
    { "date": "2026-04-01", "evaluations": 12, "avg_score": 0.74 },
    // ... last 7 days
  ]
}
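
The `score_distribution` buckets can be folded into a single number, such as the share of items scoring at or above a threshold. This sketch assumes every bucket label has the `"low-high"` form shown above and compares on the lower bound. The function name is hypothetical, and the counts for the last two buckets in the sample are invented.

```python
def share_at_or_above(score_distribution, threshold):
    """Fraction of scored items whose bucket starts at or above threshold."""
    total = sum(bucket["count"] for bucket in score_distribution)
    if total == 0:
        return 0.0
    matching = sum(
        bucket["count"]
        for bucket in score_distribution
        # "0.8-0.9" -> lower bound 0.8
        if float(bucket["bucket"].split("-")[0]) >= threshold
    )
    return matching / total


distribution = [
    {"bucket": "0.9-1.0", "count": 120},
    {"bucket": "0.8-0.9", "count": 340},
    {"bucket": "0.7-0.8", "count": 140},  # made-up count
    {"bucket": "0.6-0.7", "count": 400},  # made-up count
]

high_share = share_at_or_above(distribution, 0.7)
```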

Rubric Configuration

A rubric defines what criteria items are scored against. Each criterion has a weight (how much it contributes to the overall score) and a description (what the AI evaluates).

Default Rubric

If you don't pass a rubric field, ScoreHive uses this default:

json — Default rubric

{
  "relevance": {
    "weight": 0.4,
    "description": "How relevant is this item to the query or context?"
  },
  "accuracy": {
    "weight": 0.3,
    "description": "How factually accurate is the content?"
  },
  "intent_alignment": {
    "weight": 0.3,
    "description": "How well does this align with the user intent?"
  }
}

Custom Rubric

Pass your own rubric to score items on whatever criteria matter to your use case. Weights should sum to 1.0.

json — Custom rubric example

{
  "rubric": {
    "content_quality": {
      "weight": 0.35,
      "description": "Is the content well-written and informative?"
    },
    "brand_safety": {
      "weight": 0.30,
      "description": "Is the content safe for brand placement?"
    },
    "audience_fit": {
      "weight": 0.20,
      "description": "Does this match the target audience?"
    },
    "freshness": {
      "weight": 0.15,
      "description": "Is the information current and up-to-date?"
    }
  }
}
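
Because weights should sum to 1.0, it can help to check a rubric before sending it. The sketch below verifies that each criterion has a numeric `weight` and a string `description`, then compares the weight total to 1.0 with a small floating-point tolerance. The exact rules the server enforces are not documented here, so treat these checks as assumptions; `validate_rubric` is a hypothetical name.

```python
import math


def validate_rubric(rubric):
    """Return a list of problems found in a rubric; empty means it looks valid."""
    if not isinstance(rubric, dict) or not rubric:
        return ["rubric must be a non-empty object"]
    problems = []
    total_weight = 0.0
    for name, criterion in rubric.items():
        if not isinstance(criterion, dict):
            problems.append(f"{name}: criterion must be an object")
            continue
        weight = criterion.get("weight")
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(weight, bool) or not isinstance(weight, (int, float)):
            problems.append(f"{name}: weight must be a number")
        else:
            total_weight += weight
        if not isinstance(criterion.get("description"), str):
            problems.append(f"{name}: description must be a string")
    # Only check the sum when every criterion is well formed.
    if not problems and not math.isclose(total_weight, 1.0, abs_tol=1e-9):
        problems.append(f"weights sum to {total_weight:.2f}, expected 1.0")
    return problems


custom_rubric = {
    "content_quality": {"weight": 0.35, "description": "Is the content well-written and informative?"},
    "brand_safety": {"weight": 0.30, "description": "Is the content safe for brand placement?"},
    "audience_fit": {"weight": 0.20, "description": "Does this match the target audience?"},
    "freshness": {"weight": 0.15, "description": "Is the information current and up-to-date?"},
}
unbalanced = {
    "a": {"weight": 0.5, "description": "first"},
    "b": {"weight": 0.3, "description": "second"},
}

custom_problems = validate_rubric(custom_rubric)
unbalanced_problems = validate_rubric(unbalanced)
```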

Tip: You can use any criterion names. The AI works from the natural-language descriptions, so be specific about what "good" looks like for each criterion.

Response Format

Every scored item in the results array has the same structure:

Field	Type	Range	Description
overall_score	float	0.0 – 1.0	Weighted average across all rubric criteria
scores	object	0.0 – 1.0 each	Individual score per rubric criterion
confidence	float	0.0 – 1.0	AI confidence in the scoring. Lower values mean the item may need human review.
reasoning	string	—	1-2 sentence explanation of the score
flags	string[]	—	Array of concern flags (e.g., `off_topic`, `low_quality`). Empty if no issues.
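
As a sanity check on `overall_score`, the weighted average can be recomputed on the client from `scores` and the rubric weights. This is a sketch under the assumption that the service uses a plain weighted average; it may round or adjust its reported value, so small differences are possible. The name `weighted_overall` is hypothetical. The demo uses the default weights and the first Quickstart result.

```python
def weighted_overall(scores, rubric):
    """Weighted average of per-criterion scores using the rubric weights."""
    total_weight = sum(criterion["weight"] for criterion in rubric.values())
    weighted_sum = sum(
        scores[name] * criterion["weight"] for name, criterion in rubric.items()
    )
    # Dividing by the total keeps the result a true average even if the
    # weights do not sum to exactly 1.0.
    return weighted_sum / total_weight


default_weights = {
    "relevance": {"weight": 0.4},
    "accuracy": {"weight": 0.3},
    "intent_alignment": {"weight": 0.3},
}
first_item_scores = {"relevance": 0.95, "accuracy": 0.88, "intent_alignment": 0.92}

recomputed = round(weighted_overall(first_item_scores, default_weights), 2)
```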

Score Interpretation

Range	Label	Meaning
0.7 – 1.0	High	Strong match. Production-ready.
0.4 – 0.69	Medium	Partial match. May need review.
0.0 – 0.39	Low	Poor match. Likely irrelevant or incorrect.
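
The table above translates into a small lookup. The sketch treats the lower bound of each band as inclusive (0.7 and up is High, 0.4 and up is Medium), which is an assumption about how values between the listed ranges, such as 0.695, should be handled. `score_label` is a hypothetical name.

```python
def score_label(score):
    """Map an overall_score in [0.0, 1.0] to High, Medium, or Low."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0.0 and 1.0")
    if score >= 0.7:
        return "High"
    if score >= 0.4:
        return "Medium"
    return "Low"


labels = [score_label(value) for value in (0.92, 0.62, 0.31)]
```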

Rate Limits & Pricing

Limit	Value
Items per request	50
Concurrent scoring	5 items in parallel
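
Datasets larger than the 50-item limit have to be split across several requests. A minimal sketch of that split follows; the generator name `batches` is hypothetical, and how the resulting evaluations are combined afterwards is left to the caller.

```python
def batches(items, size=50):
    """Yield consecutive slices of items, each at most size long."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


dataset = [f"item {number}" for number in range(120)]
batch_sizes = [len(batch) for batch in batches(dataset)]
```

Each batch would then be sent as its own POST /api/evaluate request.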

Note: See the Pricing page for plan details. The Free tier includes 100 evaluations/month. Pro ($49/mo) includes 10,000 evaluations/month plus custom rubrics and webhooks.

Error Handling

All errors follow a consistent format:

json — Error response
{
  "success": false,
  "message": "Description of what went wrong"
}

HTTP Status Codes

Status	Meaning	Common Cause
200	Success	Request processed successfully
400	Bad Request	Missing `items` array, more than 50 items, or invalid rubric structure
401	Unauthorized	Missing or invalid API key
403	Forbidden	API key has been deactivated
404	Not Found	Evaluation ID doesn't exist or belongs to another key
500	Server Error	Internal error during evaluation
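
Since every error shares the same shape, a client can turn any non-success response into one exception type. The sketch below takes a status code and raw body text, so it works with any HTTP library. The class `ScoreHiveError` and the function `parse_response` are hypothetical names, and falling back to "Unknown error" when the body is not the documented JSON is an assumption.

```python
import json


class ScoreHiveError(Exception):
    """Raised for any response that is not a documented success."""

    def __init__(self, status, message):
        super().__init__(f"{status}: {message}")
        self.status = status
        self.message = message


def parse_response(status, body_text):
    """Return the parsed body on success; raise ScoreHiveError otherwise."""
    try:
        body = json.loads(body_text)
    except json.JSONDecodeError:
        body = {}
    if not isinstance(body, dict):
        body = {}
    if status == 200 and body.get("success") is True:
        return body
    raise ScoreHiveError(status, body.get("message", "Unknown error"))


success_body = parse_response(200, '{"success": true, "results": []}')

try:
    parse_response(401, '{"success": false, "message": "Missing or invalid API key"}')
except ScoreHiveError as error:
    caught = (error.status, error.message)
```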

ScoreHive Built by Polsia