Search

Cortex Search

Search the Web, Get Answers

Cortex Search is a search API platform that combines web scraping, hybrid retrieval (BM25 + vector), neural reranking, and RAG into a single API. Built with a Rust core for sub-50ms latency, a Python sidecar for anti-bot bypass, and OpenAI-compatible endpoints.

  • Rust core for sub-50ms latency on retrieval
  • OpenAI-compatible API endpoints
  • CDP network interception bypasses HTML fragility
  • ONNX neural reranker runs locally on CPU
Hybrid Retrieval
Network Interception
OpenAI Compatible
Deep Research

Key Features

Everything you need to transform your search operations

Hybrid Retrieval

BM25 (Tantivy) + vector search (Qdrant) fused with Reciprocal Rank Fusion, then reranked by ONNX DeBERTa-v3 cross-encoder.

Network Interception

CDP-based XHR/GraphQL interception bypasses DOM parsing entirely — 10x faster for JavaScript-heavy sites.

OpenAI Compatible

Drop-in /v1/chat/completions endpoint. Works with any OpenAI SDK — just change the base URL.

Deep Research

Multi-step sub-question generation and synthesis for complex queries that need multiple search rounds.

Rust Performance

All retrieval, fusion, chunking, and API routing in Rust. Python only where unavoidable (browser automation).

MCP Server

Built-in Model Context Protocol server for integrating search capabilities directly into AI agents and LLM workflows.

SDKs & Dashboard

Python and TypeScript SDKs, plus a Next.js dashboard with API playground, usage tracking, and interactive docs.

Modules

Powerful Components

Specialized modules designed for specific business needs

Fetch Router

Intelligent URL classification with auto-escalation from HTTP to stealth browsing

URL ClassifierHTTP FetchStealth BrowserAuto-Escalation

Retrieval Engine

Hybrid BM25 + vector retrieval with RRF fusion and neural reranking

BM25 (Tantivy)Vector (Qdrant)RRF FusionONNX Reranker

LLM Orchestrator

Multi-model routing with citation injection and SSE streaming

Multi-Model RoutingCitation ExtractionSSE StreamingOpenAI Compat

Web Crawler

BFS crawler with freshness scheduling and proxy rotation

BFS CrawlingFreshness SchedulerProxy Rotationrobots.txt

Python Sidecar

gRPC sidecar for anti-bot bypass and smart content extraction

Scrapling BypassCDP InterceptCloudflare SolverSmart Extract

Impact by Numbers

Measurable results that matter

50ms

Retrieval Latency

7

Rust Crates

22MB

Reranker Size

6

gRPC Services

Architecture

How It Works

Enterprise-grade architecture designed for scale and reliability

01

API Layer

Axum RESTOpenAI CompatMCP ServerWebSocket
02

Fetch Layer

HTTP ClientStealth BrowserCDP InterceptProxy Pool
03

Retrieval

BM25 (Tantivy)Vector (Qdrant)RRF FusionONNX Reranker
04

RAG Pipeline

ChunkerEmbedderGeneratorCitation Engine
05

Infrastructure

Redis CachePostgreSQLKubernetesOTel/Jaeger
Technology

Built With the Best

Modern, battle-tested technologies powering Cortex Search

Rust
Language
Axum
Backend
Python
Language
Playwright
Browser
Tantivy
Search
Qdrant
Vector Store
ONNX Runtime
ML
Redis
Cache
PostgreSQL
Database
gRPC
Communication
Next.js
Frontend
Docker
Container
Kubernetes
Orchestration

Ready to Transform with Cortex Search?

Schedule a personalized demo and discover how we can help your organization.