Technology
This page reflects the engineering reference behind VerifiedSignal: presigned uploads, worker orchestration, deterministic LLM settings, Postgres as truth, and OpenSearch as a derived plane.
Stack overview
Frontend delivery, identity, object storage, OCR, and LLM inference map to managed services so teams spend integration effort on scoring quality and review UX.
- Frontend hosting and global edge delivery.
- Object storage with presigned URLs for direct uploads, reducing unnecessary compute on the hot path.
- Authentication, including paths that align with AWS Marketplace–native identity flows.
- High-accuracy OCR for tables, forms, and document geometry.
- LLM reasoning (for example Claude 3.5 / Llama 3) for structured scoring and agentic sanity checks.
Visual overview
End-to-end view of major components, flows, and integrations.

Data plane
PostgreSQL is the system of record: users, permissions, final scores, billing, and authoritative outcomes. Amazon OpenSearch holds derived search and analytics state: full-text search, kNN vectors, and dashboard aggregations—treated as expendable relative to Postgres.
Pipeline
From queued intake through Bedrock scoring, canonical persistence, OpenSearch indexing, and SSE completion events.
1. Client submits file or URL; API creates a queued record in Postgres.
2. Worker fetches bytes, computes content hashes, and deduplicates.
3. Textract/worker extracts plain text and structure; progress published to Redis.
4. Worker derives metadata, topical tags, and initial quality flags.
5. Worker calls Bedrock for structured analysis via the Converse API.
6. Validated scores written to the authoritative PostgreSQL layer.
7. Document indexed into OpenSearch for retrieval and analytics.
8. Final status pushed to the frontend via SSE.
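The steps above can be sketched as a checkpointed worker loop: each stage records completion on the document row, so a restarted worker resumes from the last saved state. Stage names and the in-memory checkpoint are illustrative; a real worker would persist `completed_stages` to Postgres.

```python
# Sketch of idempotent stage checkpoints for the eight-step pipeline.
STAGES = [
    "queued", "fetch_dedup", "extract_text", "enrich",
    "bedrock_score", "persist_scores", "index_search", "notify_sse",
]

def run_pipeline(doc: dict) -> dict:
    """Resume from the last checkpointed stage and run to completion."""
    done = set(doc.get("completed_stages", []))
    for stage in STAGES:
        if stage in done:
            continue                            # idempotent skip on resume
        # ... real stage work would happen here ...
        done.add(stage)
        doc["completed_stages"] = sorted(done)  # checkpoint after each stage
    doc["status"] = "complete"
    return doc
```

A crashed worker that restarts mid-pipeline simply re-enters the loop and skips everything already checkpointed.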
LLM layer
Temperature at zero, schema discipline, and few-shot examples that include graceful failure reduce hallucinated fields.
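A deterministic Converse request might be assembled as below. The model ID and output schema are illustrative assumptions; the returned dict would be passed to a bedrock-runtime client's `converse` call.

```python
# Sketch: build a Bedrock Converse request with temperature 0 and a
# system prompt that pins the output schema and allows graceful failure.
import json

# Illustrative schema; the real scoring schema would be richer.
SCHEMA = {"score": "integer 0-100", "flags": "list of strings", "rationale": "string"}

def build_converse_request(
    document_text: str,
    model_id: str = "anthropic.claude-3-5-sonnet-20240620-v1:0",  # illustrative
) -> dict:
    system = (
        "Return ONLY a JSON object matching this schema: "
        + json.dumps(SCHEMA)
        + ". If a field cannot be determined, use null rather than guessing."
    )
    return {
        "modelId": model_id,
        "system": [{"text": system}],
        "messages": [{"role": "user", "content": [{"text": document_text}]}],
        "inferenceConfig": {"temperature": 0, "maxTokens": 1024},
    }
```

Pinning temperature to 0 and instructing the model to emit null for unknowable fields is what keeps repeated runs stable and reduces hallucinated values.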
Streaming UX
Nginx should disable proxy buffering for SSE (for example `X-Accel-Buffering: no`) so streams reach clients reliably.
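A minimal location block for the SSE endpoint might look like the following; the path and upstream name are assumptions.

```nginx
# Illustrative nginx config for an SSE endpoint; upstream name is assumed.
location /api/events {
    proxy_pass http://app_upstream;
    proxy_http_version 1.1;
    proxy_set_header Connection "";
    proxy_buffering off;       # do not buffer the event stream
    proxy_cache off;
    proxy_read_timeout 1h;     # keep long-lived streams open
}
```

Alternatively, the application can send `X-Accel-Buffering: no` on the SSE response and nginx will disable buffering for that response only.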
Partial JSON token parsing streams fields to the UI as they complete; field state progresses through unseen → in progress → complete → validated → persisted.
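The partial-parse idea can be sketched with a deliberately simple extractor that pulls completed fields out of the token buffer as it grows; it assumes a flat object with string or number values, which is enough to drive the field-state UI.

```python
# Sketch: extract completed top-level fields from a partial JSON buffer
# so the UI can mark a field "complete" before the full object closes.
import re

# Matches "key": "string" or "key": number, only when terminated by , or }
FIELD = re.compile(r'"(\w+)"\s*:\s*(".*?(?<!\\)"|-?\d+(?:\.\d+)?)\s*[,}]')

def completed_fields(buffer: str) -> dict:
    out = {}
    for key, raw in FIELD.findall(buffer):
        if raw.startswith('"'):
            out[key] = raw[1:-1]
        else:
            out[key] = float(raw) if "." in raw else int(raw)
    return out
```

A field still mid-stream (no closing quote, no trailing comma) simply does not match yet, so it stays "in progress" until the next token completes it.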
```json
{
  "event": "stage",
  "document_id": "doc_123",
  "stage": "extract_text",
  "status": "running",
  "timestamp": "2026-03-26T22:11:10Z",
  "progress": 35
}
```
Marketplace & metering
Patterns for AWS Marketplace tokens, customer resolution, and scheduled metering that align billable usage with Postgres-grounded counts.
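Building metering records from Postgres-grounded counts might look like the sketch below. Dimension names and the hourly rollup are assumptions; the resulting list is shaped for the Marketplace Metering Service's BatchMeterUsage operation (boto3 `meteringmarketplace` client).

```python
# Sketch: turn Postgres-grounded usage counts into Marketplace usage records.
from datetime import datetime, timezone

def build_usage_records(counts: dict, customer_id: str) -> list:
    """counts maps a billing dimension (e.g. 'documents_scored') to a quantity."""
    # Roll up to the top of the hour, matching a scheduled hourly metering job.
    ts = datetime.now(timezone.utc).replace(minute=0, second=0, microsecond=0)
    return [
        {
            "Timestamp": ts,
            "CustomerIdentifier": customer_id,
            "Dimension": dim,
            "Quantity": qty,
        }
        for dim, qty in counts.items()
        if qty > 0  # never meter zero or negative usage
    ]
```

Deriving quantities from Postgres rather than from Redis or OpenSearch keeps billed usage tied to the system of record.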
Deployment
The reference explicitly supports three deployment shapes: bare-metal-style development, containerized Compose, and a Fargate/RDS/OpenSearch Service production topology.
API-first posture
Higher tiers expose CSV/JSON export and API access in the reference packaging—ideal for analysts wiring scores into notebooks, GRC tools, or newsroom CMS hooks.
Security & governance
Operational resilience is specified: malformed JSON, search outages, worker crashes, and Redis loss each have a playbook.
| Dimension | Mitigation strategy |
|---|---|
| LLM returns malformed JSON | Incremental parser with field-level validation; fallback to a final pass parse. |
| Search unavailable | Persist to Postgres first; retry indexing later; UI shows search as pending. |
| Worker crash | Idempotent stage checkpoints so work resumes from the last saved state. |
| Redis outage | UI falls back to polling; event logs persist in Postgres for replay. |
MVP scope
Clarity on exclusions keeps delivery honest: no custom training, no complex multi-tenant RBAC for MVP, no native mobile shell.
Architecture reviews, threat modeling, and integration design for your AWS estate.