
Score Your Results

Paste the comma-separated list of CANARY- strings your agent reported, then click Score. If your agent did not provide any strings, hit Score anyway to learn what this means.


Reference Answers for Task Comparison

Compare the agent's task response summary against these reference answers. Where the agent reported correct specific values, the pipeline delivered the content. Where values are wrong or vague, the pipeline may have delivered partial content that the agent filled in by guessing.

Task 1: API Reference (Long Page)

Create Stream parameters include name (string, required), description (string, optional), and retention_days (integer, default 30). Schema enforcement modes are none (default), warn, and strict. Compatibility modes: backward (default), forward, full, none. The schema settings are ~75K characters into the page. If the agent reported schema details correctly, the pipeline read deep. If only Create Stream parameters are correct, content was likely truncated.
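
For graders who want the reference values in checkable form, the parameters and modes above can be written as a small validated record. This is a minimal sketch: the field names schema_enforcement and compatibility, and the idea that the schema settings travel in the same request as the Create Stream parameters, are assumptions made for illustration and are not confirmed by the page.

```python
from dataclasses import dataclass
from typing import Optional

# Allowed values as listed in the reference answer.
ENFORCEMENT_MODES = ("none", "warn", "strict")
COMPATIBILITY_MODES = ("backward", "forward", "full", "none")


@dataclass
class CreateStreamRequest:
    name: str                          # string, required
    description: Optional[str] = None  # string, optional
    retention_days: int = 30           # integer, default 30
    schema_enforcement: str = "none"   # hypothetical field name; default mode is "none"
    compatibility: str = "backward"    # hypothetical field name; default mode is "backward"

    def __post_init__(self) -> None:
        if not self.name:
            raise ValueError("name is required")
        if self.schema_enforcement not in ENFORCEMENT_MODES:
            raise ValueError(f"unknown enforcement mode: {self.schema_enforcement!r}")
        if self.compatibility not in COMPATIBILITY_MODES:
            raise ValueError(f"unknown compatibility mode: {self.compatibility!r}")
```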

Task 2: Connection Pooling Guide

Default values: pool_size = 10, pool_timeout = 30s, idle_timeout = 300s, max_lifetime = 1800s, health_check_interval = 60s, retry_on_checkout = true. The documentation body starts after ~80K of inline CSS. If the values are correct, the pipeline reached past the CSS. If wrong or missing, the CSS consumed the content budget.
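
One way to compare an agent's reported values against these defaults is sketched below. The PoolConfig class and the diff_from_defaults helper are hypothetical names, and durations are stored as plain integers in seconds for simplicity.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class PoolConfig:
    pool_size: int = 10
    pool_timeout: int = 30           # seconds
    idle_timeout: int = 300          # seconds
    max_lifetime: int = 1800         # seconds
    health_check_interval: int = 60  # seconds
    retry_on_checkout: bool = True


def diff_from_defaults(reported: dict) -> dict:
    """Return {key: (reported_value, expected_value)} for every mismatch.

    A key missing from `reported` counts as a mismatch with value None.
    """
    defaults = asdict(PoolConfig())
    return {
        key: (reported.get(key), expected)
        for key, expected in defaults.items()
        if reported.get(key) != expected
    }
```

An empty result means every default was reported correctly, which suggests the pipeline reached past the CSS.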

Task 3: Real-Time Analytics

Six aggregation types: count, sum, avg, min/max, percentile, count_distinct. Also covers windowing strategies (tumbling, sliding, session). This content is only available via JavaScript rendering. If the agent reported a "Loading documentation..." message or couldn't list aggregation types, the pipeline does not execute JavaScript. That is expected.

Task 4: Multi-Language SDK Setup

Ruby: gem install datastream-sdk, initialize with DataStream::Client.new(api_key: ...). Swift: Swift Package Manager, initialize with DataStreamClient(apiKey: ...). Ruby is in tab 4 of 8; Swift is in tab 8. If the agent got Python details right but Ruby/Swift wrong or vague, the pipeline truncated serialized tab content.

Task 5: Authentication Configuration

This page is a "page not found" error page (HTTP 200, soft 404). There are no authentication configuration options. The correct response is that the page doesn't contain the requested documentation. If the agent reported auth options, it hallucinated or pulled from another source.
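
A pipeline can flag this case before the agent ever sees the page. The heuristic below is only a sketch: the phrase list and the 2000-character window are assumptions, and a real check would need tuning against the sites it crawls.

```python
import re

# Phrases that commonly appear on error pages (an assumed, non-exhaustive list).
NOT_FOUND_PATTERN = re.compile(
    r"page not found|\b404\b|does not exist|doesn't exist",
    re.IGNORECASE,
)


def looks_like_soft_404(status_code: int, body_text: str) -> bool:
    """Return True when the server says 200 but the body reads like an error page."""
    if status_code != 200:
        # A real 404 (or any other status) is not a *soft* 404.
        return False
    # Error pages usually state the problem near the top, so only scan the start.
    return bool(NOT_FOUND_PATTERN.search(body_text[:2000]))
```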

Task 6: Event Filtering

Dynamic registration uses the FilterRegistry class with registry.register() and registry.update() methods. Performance: filters evaluated in chain order, ~0.1ms per filter, compiled filters (EventFilter.compile()) are 5-10x faster. The markdown version has a broken code fence in "Chaining Filters" that may affect content after that section.

Task 7: Webhook Configuration

Retry policy: 6 attempts with exponential backoff, at intervals of immediate, 1 minute, 5 minutes, 30 minutes, 2 hours, and 8 hours. The webhook is disabled after 6 failures. Signature verification uses HMAC-SHA256 via the X-DataStream-Signature header. If the specific intervals are correct, the pipeline delivered the retry table.
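
The retry schedule and signature check can be sketched with the Python standard library. The reference answer does not say how the signature is encoded or exactly which bytes are signed, so this sketch assumes a hex digest computed over the raw request body; the function names are hypothetical.

```python
import hashlib
import hmac

# Delay before each of the 6 attempts, in seconds:
# immediate, 1 minute, 5 minutes, 30 minutes, 2 hours, 8 hours.
RETRY_DELAYS_SECONDS = [0, 60, 300, 1800, 7200, 28800]


def sign(secret: bytes, payload: bytes) -> str:
    """Compute an HMAC-SHA256 signature (hex encoding is an assumption)."""
    return hmac.new(secret, payload, hashlib.sha256).hexdigest()


def verify_signature(secret: bytes, payload: bytes, header_value: str) -> bool:
    """Check the X-DataStream-Signature header value against the payload."""
    expected = sign(secret, payload)
    # compare_digest runs in constant time, which avoids leaking how many
    # leading characters of the signature matched.
    return hmac.compare_digest(expected.encode(), header_value.encode())
```

The receiver should verify against the raw bytes it received, not a re-serialized copy of the parsed JSON, since re-serialization can change whitespace or key order.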

Task 8: Migration Guide

Auth change: from query parameter authentication (?api_key=) to Bearer token authentication (Authorization: Bearer ...). The guide also covers event schema flattening and webhook idempotency keys. This URL redirects (301) to redirect-target.agentreadingtest.com. If the agent couldn't answer, the pipeline didn't follow the cross-host redirect.
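
The difference between the two authentication styles is easiest to see side by side. In this sketch the base URL and the helper names are hypothetical, and whether the Bearer token is the old API key or a newly issued credential is not stated in the reference answer.

```python
import urllib.request
from urllib.parse import urlencode

BASE_URL = "https://api.example.com"  # hypothetical host, for illustration only


def legacy_request(path: str, api_key: str) -> urllib.request.Request:
    """Old style: the credential travels in the query string."""
    query = urlencode({"api_key": api_key})
    return urllib.request.Request(f"{BASE_URL}{path}?{query}")


def bearer_request(path: str, token: str) -> urllib.request.Request:
    """New style: the credential travels in the Authorization header."""
    return urllib.request.Request(
        f"{BASE_URL}{path}",
        headers={"Authorization": f"Bearer {token}"},
    )
```

Both helpers only build the request object; nothing is sent until it is passed to urllib.request.urlopen.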

Task 9: Container Deployment

AWS ECS task definition: cpu: 512, memory: 1024, networkMode: awsvpc, container port 8080. Steps include ECR push, task definition creation, and Fargate service deployment. The page has three platform sections (AWS, GCP, Azure) with identical generic headers ("Step 1", "Step 2", "Step 3").
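
For comparison with what an agent reports, the AWS values above map onto an ECS task definition roughly as follows. This is a sketch rather than the page's own listing: the family name, container name, and image argument are placeholders, and a working Fargate deployment would also need fields omitted here, such as an execution role for pulling from ECR.

```python
import json


def build_task_definition(image: str) -> dict:
    """Build a minimal Fargate-style task definition from the reference values."""
    return {
        "family": "datastream-app",             # hypothetical name
        "networkMode": "awsvpc",
        "requiresCompatibilities": ["FARGATE"],
        "cpu": "512",                           # task-level CPU units, given as a string
        "memory": "1024",                       # task-level memory in MiB, given as a string
        "containerDefinitions": [
            {
                "name": "app",                  # hypothetical name
                "image": image,                 # the image pushed to ECR in the first step
                "essential": True,
                "portMappings": [{"containerPort": 8080, "protocol": "tcp"}],
            }
        ],
    }


if __name__ == "__main__":
    # Placeholder image URI; substitute the real repository.
    print(json.dumps(build_task_definition("<ecr-repo-uri>:latest"), indent=2))
```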

Task 10: Event Streams API

GET /v2/streams/{stream_id}/events accepts stream_id as a path parameter plus query parameters for pagination and filtering. Consumer groups track offsets server-side and provide exactly-once delivery within the group. This content is in the second half of the page; the first half is navigation chrome.
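
The request shape can be sketched as a small URL builder. The reference answer does not name the pagination or filtering parameters, so the keyword arguments in the usage note (limit, cursor) are purely hypothetical, as are the base URL and the helper name.

```python
from urllib.parse import quote, urlencode


def events_url(base_url: str, stream_id: str, **query) -> str:
    """Build the URL for GET /v2/streams/{stream_id}/events."""
    # stream_id is a path parameter, so escape it fully (including "/").
    path = f"/v2/streams/{quote(stream_id, safe='')}/events"
    # Drop parameters that were not supplied.
    params = {key: value for key, value in query.items() if value is not None}
    suffix = f"?{urlencode(params)}" if params else ""
    return f"{base_url}{path}{suffix}"
```

For example, events_url("https://api.example.com", "orders", limit=50, cursor=None) yields a URL with only the limit parameter attached.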