Elasticsearch
Elasticsearch - The Search and Vector Analytics Engine Behind the World's Largest Platforms
Elasticsearch development reached version 9.4.1 (May 2026), a year after the 9.0 major release (April 15, 2025). Better Binary Quantization (BBQ), the default for dense_vector indices since 9.1, cuts vector memory requirements by 95%+ versus float32; with DiskBBQ reading quantized vectors from disk, Elastic reports sub-20ms search latency in a memory footprint of roughly 100MB. Elastic N.V. posted $1.483 billion FY2025 revenue (+17% YoY). ELSER beats BM25 by 10–20% recall; ES|QL offers a piped, SQL-like alternative to the query DSL for analytics. AGPL licensing (added in 2024) gives the core an OSI-approved license option. Wikipedia, Netflix, Uber, GitHub code search, and Shopify run Elasticsearch in production.
Who Should Use Elasticsearch?
Elasticsearch is the right search infrastructure when PostgreSQL's LIKE queries are too slow, when you need full-text relevance ranking, when your vectors number in the millions and a pgvector index would consume too much memory, or when you need to aggregate and analyze log data at petabyte scale. If your search requirements are a `WHERE title ILIKE '%query%'`, you don't need Elasticsearch. If you need relevance, facets, and vector search at scale, you do.
E-Commerce and Product Search
E-commerce search requires relevance scoring (not just keyword matching), faceted filtering (brand, price range, size, colour), typo tolerance, synonym expansion, and boosting rules (in-stock items rank higher). Elasticsearch's BM25 scoring, custom analyzers, and function_score queries handle all of these. We build Elasticsearch-powered product search for catalogues from 10,000 to 100 million SKUs with sub-100ms query latency.
Log Analytics and Observability (ELK Stack)
The Elastic Stack (Elasticsearch + Logstash + Kibana + Beats) is the dominant open-source observability platform — ingesting, indexing, and visualizing logs, metrics, and traces from any application or infrastructure. ES|QL in 9.x makes log analysis queries readable and fast. For teams running their own observability infrastructure rather than paying for Datadog or New Relic, the ELK stack is the standard.
Semantic and Vector Search Applications
Elasticsearch 9.x with BBQ and ELSER is the production vector search solution for applications with millions to billions of embeddings. ELSER provides semantic search without fine-tuned models — point it at your text corpus and it retrieves semantically relevant results immediately. Combined with E5 multilingual models for cross-lingual search and RRF for hybrid scoring, ESRE builds production RAG pipelines from one platform.
Application Search with Complex Relevance
SaaS applications, content platforms, and knowledge bases need search that ranks results by recency, popularity, user context, and text relevance simultaneously. Elasticsearch's function_score and script_score queries combine multiple signals. We build application search for SaaS platforms where a simple ILIKE query produces irrelevant results but users expect Google-quality ranking.
Geospatial Search
Elasticsearch's geo_point and geo_shape field types support geo_distance queries (find restaurants within 2km), geo_bounding_box filters, and polygon filters through the geo_shape query (the older geo_polygon query is deprecated). Sorting by distance, filtering by shape, and aggregating by geographic region are first-class operations. We build location-aware search for delivery platforms, real estate applications, and logistics systems using Elasticsearch's geospatial capabilities.
Security Analytics and SIEM
Elastic Security (SIEM built on Elasticsearch) ingests logs from every infrastructure component, applies detection rules in near-real-time, and correlates events across sources for threat detection. The same Elasticsearch platform that powers application search handles petabyte-scale security telemetry. For organizations building their own SIEM rather than paying for enterprise security platforms, Elastic Security provides a mature foundation.
When Elasticsearch Might Not Be the Best Choice
We believe in honest communication. Here are scenarios where alternative solutions might be more appropriate:
Primary application database — Elasticsearch is eventually consistent, does not support ACID transactions, and optimizes for search over data integrity; use PostgreSQL or MongoDB as the primary database and sync to Elasticsearch for search
Simple search on small datasets (under 100K records) — PostgreSQL's pg_trgm extension or a FULLTEXT INDEX in MySQL handles search adequately on small datasets without the operational overhead of running an Elasticsearch cluster
Applications requiring ACID transactions — Elasticsearch does not support multi-document transactions; writes are eventually consistent; applications requiring transaction guarantees need a relational or document database as the primary store
Teams without infrastructure capacity — Elasticsearch requires JVM tuning, heap sizing, shard allocation planning, and index lifecycle management; production Elasticsearch clusters have meaningful operational overhead; managed Elastic Cloud reduces this but adds cost
Still Not Sure?
We're here to help you find the right solution. Let's have an honest conversation about your specific needs and determine if Elasticsearch is the right fit for your business.
Why Elasticsearch 9 Is the Enterprise Search and Vector Analytics Standard in 2026
Elasticsearch 9.x ships Better Binary Quantization as the default for dense_vector indices (since 9.1): 95%+ memory reduction versus float32, with sub-20ms search latency in roughly 100MB of memory when DiskBBQ serves quantized vectors from disk. ELSER (Elastic Learned Sparse EncodeR) beats BM25 by 10–20% recall without fine-tuning. ES|QL gives analytics a piped, SQL-like alternative to the query DSL. Elastic N.V. posted $1.483B in FY2025 revenue, and the elastic/elasticsearch repository has 76,700 GitHub stars. Wikipedia, Netflix, Uber, GitHub code search, and Shopify trust Elasticsearch for workloads where a LIKE query simply doesn't work.
9.4.1: Latest Version (May 2026). Source: Elastic release notes, May 2026
$1.483 billion: FY2025 Revenue. Source: Elastic N.V. FY2025 10-K, +17% YoY
76,700: GitHub Stars. Source: elastic/elasticsearch GitHub, May 2026
95%+: BBQ Heap Reduction vs Float32. Source: Elastic Labs, Elasticsearch 9.1 release
Sub-10ms full-text search on billion-document indices via inverted indexes and BM25 relevance scoring: the architecture powering Wikipedia, GitHub code search, and every major e-commerce site's search box
Better Binary Quantization (BBQ) in Elasticsearch 9.x: default for dense_vector indices since 9.1; 95%+ memory reduction versus float32; DiskBBQ reads quantized vectors from disk for 100M+ vector workloads, which is where the sub-20ms latency in a roughly 100MB memory footprint applies
ELSER (Elastic Learned Sparse EncodeR): sparse semantic retrieval model that beats BM25 by 10–20% recall without fine-tuning; works on out-of-domain text without labelled data; combined with E5 multilingual dense models via ESRE for single-call RAG pipelines
ESRE (Elasticsearch Relevance Engine): bundles ELSER, E5 multilingual models, and BBQ; provides single-call RAG pipelines combining knn, RRF (Reciprocal Rank Fusion), text similarity reranking, result diversification, and rule-based pinning
ES|QL (Elasticsearch Query Language): SQL-like, pipe-based analytics language introduced in 8.11 and expanded in 9.x; an alternative to the JSON query DSL for analytical workloads; runs on its own execution engine rather than being translated to query DSL
Near-real-time indexing: documents are searchable within 1 second of ingestion by default; refresh intervals are configurable per index; bulk indexing at tens of thousands of documents per second on standard hardware
76,700 GitHub stars, 25,900 forks: 7x the code commit volume of the Amazon OpenSearch fork; the reference implementation has a development velocity that OpenSearch cannot match with its community size
AGPL license added in 2024: alongside SSPL and Elastic License 2.0, the core stack now has an OSI-approved license option; greater compatibility with open-source projects and legal teams requiring OSI approval
Elasticsearch in Practice
E-Commerce Product Search
Elasticsearch powers product search with relevance scoring, faceted filtering, typo tolerance via fuzziness, synonym expansion, and boosting rules that elevate in-stock and high-margin items. We build e-commerce search with custom analyzers for product attributes, multi-match queries weighting title over description, and function_score combining text relevance with popularity and recency signals. Elasticsearch's near-real-time indexing keeps search current within 1 second of inventory changes.
Example: Fashion retailer with 2M SKUs — Elasticsearch search with faceted filtering (brand, size, colour, price), typo tolerance, and personalized ranking boosts returning results in under 80ms
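As a rough illustration of the query shape described above, the sketch below builds a search request body in Python. The field names (title, description, in_stock, sales_30d, brand, price) and the boost values are assumptions chosen for the example, not a recommended configuration; the body would be sent to an index's _search endpoint.

```python
import json


def product_search_body(text, size=20):
    """Build a product search request: text relevance, boosts, and facets.

    All field names and weights here are hypothetical placeholders.
    """
    return {
        "size": size,
        "query": {
            "function_score": {
                "query": {
                    "multi_match": {
                        "query": text,
                        # title^3 weights title matches three times as heavily
                        "fields": ["title^3", "description"],
                        # AUTO picks an edit distance based on term length
                        "fuzziness": "AUTO",
                    }
                },
                "functions": [
                    # Multiply the score of in-stock items by 1.5
                    {"filter": {"term": {"in_stock": True}}, "weight": 1.5},
                    # Dampened popularity: log10(sales_30d + 2), always > 0
                    {
                        "field_value_factor": {
                            "field": "sales_30d",
                            "modifier": "log2p",
                            "missing": 0,
                        }
                    },
                ],
                "score_mode": "multiply",  # combine the functions together
                "boost_mode": "multiply",  # then scale the text score
            }
        },
        # Facet counts returned alongside the hits
        "aggs": {
            "brands": {"terms": {"field": "brand", "size": 20}},
            "price_ranges": {
                "range": {
                    "field": "price",
                    "ranges": [{"to": 50}, {"from": 50, "to": 200}, {"from": 200}],
                }
            },
        },
    }


if __name__ == "__main__":
    print(json.dumps(product_search_body("red running shoes"), indent=2))
```

Multiplying the functions keeps out-of-stock or unsold items in the results with a lower score instead of zeroing them out, which is one reasonable choice among several.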
Log Aggregation and Observability (ELK Stack)
The ELK stack — Elasticsearch, Logstash (or Fluent Bit), and Kibana — ingests structured and unstructured logs from applications, servers, containers, and cloud services. We deploy production ELK stacks with index lifecycle management (hot-warm-cold tiers), automated shard optimization, role-based Kibana access, and alerting on log patterns. ES|QL in 9.x makes ad-hoc log analysis queries readable for developers without Elasticsearch query DSL knowledge.
Example: 50GB/day log pipeline — ELK stack with hot-warm-cold ILM policy; 30-day searchable retention on hot tier, 90-day on warm, 365-day on cold; Kibana dashboards for engineering and security teams
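A hot-warm-cold policy like the one in this example could be expressed roughly as follows. This sketch assumes the three retention figures are consecutive durations (30 days hot, then 90 warm, then 365 cold) and that indices roll over daily; both are assumptions, as are the specific actions chosen for each phase.

```python
import json


def ilm_policy(hot_days=30, warm_days=90, cold_days=365):
    """Build an ILM policy body for PUT _ilm/policy/<name>.

    min_age is measured from rollover, so each phase start is the sum of
    the durations of the phases before it (an assumed interpretation).
    """
    warm_at = hot_days
    cold_at = hot_days + warm_days
    delete_at = hot_days + warm_days + cold_days
    return {
        "policy": {
            "phases": {
                "hot": {
                    "min_age": "0ms",
                    "actions": {
                        # Roll over daily, or earlier if a primary shard grows large
                        "rollover": {
                            "max_age": "1d",
                            "max_primary_shard_size": "50gb",
                        }
                    },
                },
                "warm": {
                    "min_age": f"{warm_at}d",
                    # Merge to one segment once the index is read-only in practice
                    "actions": {"forcemerge": {"max_num_segments": 1}},
                },
                "cold": {
                    "min_age": f"{cold_at}d",
                    # Recover cold indices last after a node restart
                    "actions": {"set_priority": {"priority": 0}},
                },
                "delete": {
                    "min_age": f"{delete_at}d",
                    "actions": {"delete": {}},
                },
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(ilm_policy(), indent=2))
```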
Semantic Vector Search with ELSER
Elasticsearch 9.x ELSER (Elastic Learned Sparse EncodeR) provides semantic search without training custom models: deploy ELSER, index your text corpus, and queries return semantically relevant results immediately. We build knowledge base search, document retrieval, and support ticket matching using ELSER for semantic recall plus BM25 for keyword precision, combined via RRF for hybrid scoring that beats either approach alone.
Example: Technical documentation search using ELSER — semantic queries return relevant articles even without keyword overlap; RRF hybrid scoring improves mean reciprocal rank by 35% over BM25 alone
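Reciprocal Rank Fusion itself is simple enough to sketch in a few lines. The version below is a standalone Python illustration of the formula (each document scores the sum of 1 / (rank_constant + rank) over the lists it appears in), not Elasticsearch's internal implementation; the document IDs are made up, and the constant of 60 is the commonly used default.

```python
def rrf_fuse(rankings, rank_constant=60):
    """Fuse several ranked lists of document IDs with Reciprocal Rank Fusion.

    rankings: list of lists, each ordered best-first.
    Returns document IDs ordered by fused score, best first.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (rank_constant + rank)
    # Sort by score descending; break ties by ID for a stable result
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


if __name__ == "__main__":
    bm25_hits = ["a", "b", "c"]    # keyword ranking (hypothetical)
    elser_hits = ["c", "a", "d"]   # semantic ranking (hypothetical)
    print(rrf_fuse([bm25_hits, elser_hits]))
```

Because RRF only uses rank positions, it needs no score normalization between the keyword and semantic lists, which is the main reason it is a convenient default for hybrid search.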
RAG Pipeline with ESRE
ESRE (Elasticsearch Relevance Engine) bundles ELSER, E5 multilingual dense models, and BBQ vector compression into a single-call RAG pipeline. We build AI assistant backends on Elasticsearch: index document embeddings with BBQ, retrieve with ELSER sparse + E5 dense + RRF fusion, rerank with semantic similarity, apply rule-based pinning for controlled content, and pass the retrieved context to an LLM. One Elasticsearch cluster handles the entire retrieval layer.
Example: Enterprise AI assistant — ESRE retrieves relevant policy documents from a 10M-document corpus in under 50ms; retrieved context passed to Claude for answer generation with citation
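The retrieval step of such a pipeline might look roughly like the request body below, which fuses a lexical leg, a sparse (ELSER-style) leg, and a dense kNN leg with an rrf retriever. This is a sketch under stated assumptions: the field names (body, body_sparse, body_embedding) and the inference endpoint IDs are hypothetical, the reranking and pinning stages are omitted, and the exact retriever options should be checked against the Elasticsearch version in use.

```python
def hybrid_retriever_body(question, size=10):
    """Build a search body that fuses three retrievers with RRF.

    Field names and inference/model IDs are placeholders for illustration.
    """
    return {
        "size": size,
        "retriever": {
            "rrf": {
                "retrievers": [
                    # Lexical leg: BM25 match on the text field
                    {"standard": {"query": {"match": {"body": question}}}},
                    # Sparse semantic leg: query expanded by an ELSER endpoint
                    {
                        "standard": {
                            "query": {
                                "sparse_vector": {
                                    "field": "body_sparse",
                                    "inference_id": "my-elser-endpoint",
                                    "query": question,
                                }
                            }
                        }
                    },
                    # Dense leg: kNN over embeddings built from the question
                    {
                        "knn": {
                            "field": "body_embedding",
                            "query_vector_builder": {
                                "text_embedding": {
                                    "model_id": "my-e5-model",
                                    "model_text": question,
                                }
                            },
                            "k": 50,
                            "num_candidates": 200,
                        }
                    },
                ],
                "rank_window_size": 50,
                "rank_constant": 60,
            }
        },
    }


def format_context(hits, max_chars=2000):
    """Join retrieved passages into one context string for an LLM prompt."""
    parts = [f"[{hit['_id']}] {hit['_source']['body']}" for hit in hits]
    return "\n\n".join(parts)[:max_chars]


if __name__ == "__main__":
    print(hybrid_retriever_body("What is the refund policy?"))
```

Prefixing each passage with its document ID is one simple way to let the LLM cite its sources in the generated answer.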
Application Search for SaaS Platform
SaaS platforms with complex search requirements (searching across users, projects, documents, and activity simultaneously) benefit from Elasticsearch's cross-index search, field boosting, and aggregation framework. We build SaaS application search with per-tenant isolation (an index per tenant when the tenant count is small, a shared index filtered by tenant ID otherwise), field-level access control, real-time indexing on database writes, and search analytics via Kibana to track what users search for and what they don't find.
Example: Project management SaaS — global search across tasks, comments, files, and users; per-tenant Elasticsearch index isolation; real-time indexing; search analytics showing top failing queries
Geospatial Delivery and Logistics Search
Elasticsearch's native geospatial field types and query operators enable 'find nearest' searches. PostgreSQL's PostGIS handles these too; Elasticsearch adds horizontal scale and its aggregation framework in the same query. We build geospatial search for delivery platforms (nearest available driver), real estate (properties in polygon, within radius), and logistics (depot coverage areas) using geo_distance sorting, geo_bounding_box filters, and geohash_grid or geotile_grid aggregations for heatmaps.
Example: On-demand delivery platform — Elasticsearch geo_distance query returns 5 nearest available drivers sorted by distance in under 20ms; a geotile_grid aggregation powers the demand heatmap in the operations dashboard
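A nearest-driver lookup of this kind could be sketched as the request bodies below. The field names (location, available) and the precision value are assumptions for illustration; location is assumed to be mapped as geo_point.

```python
def nearest_drivers_body(lat, lon, radius="2km", size=5):
    """Build a search body: available drivers within radius, nearest first."""
    origin = {"lat": lat, "lon": lon}
    return {
        "size": size,
        "query": {
            "bool": {
                # Filters do not affect scoring and can be cached
                "filter": [
                    {"term": {"available": True}},
                    {"geo_distance": {"distance": radius, "location": origin}},
                ]
            }
        },
        # Sort hits by distance from the pickup point, closest first
        "sort": [
            {"_geo_distance": {"location": origin, "order": "asc", "unit": "km"}}
        ],
    }


def demand_heatmap_body(precision=12):
    """Build an aggregation-only body that buckets points into map tiles."""
    return {
        "size": 0,  # no hits needed, only bucket counts
        "aggs": {
            "demand": {"geotile_grid": {"field": "location", "precision": precision}}
        },
    }


if __name__ == "__main__":
    print(nearest_drivers_body(51.5074, -0.1278))
    print(demand_heatmap_body())
```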
Elasticsearch Pros and Cons
Every technology has its strengths and limitations. Here's an honest assessment to help you make an informed decision.
Advantages
Best-in-Class Full-Text Search Performance
Elasticsearch's inverted index architecture delivers sub-10ms full-text search on billion-document indices. BM25 relevance scoring, custom analyzers (language-specific tokenization, synonyms, stemming), and the full query DSL provide relevance control that no SQL LIKE query or pg_trgm extension can match at scale.
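To make the relevance scoring concrete, here is a minimal single-term BM25 calculation in Python. It follows the BM25 form used by recent Lucene versions as we understand it (an IDF term multiplied by a saturating term-frequency term, with k1 = 1.2 and b = 0.75 as defaults); it is an illustrative sketch, and all the input numbers in the usage example are made up.

```python
import math


def bm25_term_score(freq, doc_len, avg_doc_len, doc_count, docs_with_term,
                    k1=1.2, b=0.75):
    """Score one query term against one document with BM25.

    freq:           occurrences of the term in the document
    doc_len:        length of the document field in tokens
    avg_doc_len:    average field length across the index
    doc_count:      number of documents in the index
    docs_with_term: number of documents containing the term
    """
    # Rare terms get a higher inverse document frequency
    idf = math.log(1 + (doc_count - docs_with_term + 0.5) / (docs_with_term + 0.5))
    # Longer-than-average documents are penalized, controlled by b
    length_norm = k1 * (1 - b + b * doc_len / avg_doc_len)
    # Term frequency saturates: each extra occurrence adds less
    return idf * freq / (freq + length_norm)


if __name__ == "__main__":
    rare = bm25_term_score(2, 100, 100, 1000, 10)
    common = bm25_term_score(2, 100, 100, 1000, 500)
    print(f"rare term: {rare:.3f}, common term: {common:.3f}")
```

A multi-term query sums this score across its terms, which is why a match on one rare word can outrank several matches on common ones.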
Production-Ready Vector Search at 95% Less Memory
BBQ in Elasticsearch 9.x is a step-change in vector search economics: a 95%+ memory reduction versus float32 means workloads that previously required 64GB nodes run on 8GB nodes. With DiskBBQ reading quantized vectors from disk, sub-20ms latency in a footprint of roughly 100MB makes vector search economically viable for applications that previously couldn't justify the infrastructure cost.
Semantic Search Without Model Training
ELSER provides production-quality semantic search out of the box — no custom model training, no labelled data, no ML engineering required. Deploy ELSER, index your corpus, and semantic retrieval improves recall by 10–20% over BM25 immediately. For most applications, ELSER's out-of-domain performance is sufficient without fine-tuning.
Horizontal Scalability
Elasticsearch scales horizontally by adding nodes: index more data, handle more queries, and add nodes to the cluster without downtime. Shard-based distribution automatically rebalances when nodes join or leave. The same architecture handles a startup's 100K-document index and Wikipedia's multilingual, multi-billion-document search.
Mature Ecosystem and Enterprise Support
76,700 GitHub stars, $1.483B FY2025 revenue, and 17% YoY growth reflect an enterprise-grade platform with a long-term support trajectory. Elastic offers commercial licenses with dedicated support, Elastic Cloud managed hosting, and enterprise features (RBAC, encrypted snapshots, audit logging). The OpenSearch fork confirms Elasticsearch's architecture is sound — Amazon built an entire product on it.
ES|QL: SQL-Like Analytics Without DSL
ES|QL (introduced in 8.11 and expanded in 9.x) offers a readable, pipe-based alternative to the verbose JSON query DSL for analytical workloads. Any SQL-familiar developer can write filters, aggregations, and transformations without Elasticsearch DSL expertise. ES|QL runs on its own execution engine, separate from the query DSL search path and optimized specifically for analytical query patterns.
Limitations
Operational Complexity
Running a production Elasticsearch cluster requires JVM heap sizing (typically 50% of available RAM, kept below roughly 32GB per node so the JVM can still use compressed OOPs), shard allocation planning, index lifecycle management, snapshot scheduling, and capacity planning. Misconfigured clusters produce split-brain scenarios, OOM crashes, and query timeouts.
We manage Elasticsearch clusters with Terraform-provisioned infrastructure, ILM policies for index lifecycle, automated snapshot testing, and monitoring dashboards for heap usage, GC pressure, and query latency. For teams that want managed infrastructure, Elastic Cloud eliminates cluster management entirely at a cost premium.
Eventual Consistency
Elasticsearch is eventually consistent — documents indexed are not immediately available for search (default 1-second refresh interval) and updates are not atomic across multiple documents. Applications requiring read-your-writes consistency or multi-document transactions cannot rely on Elasticsearch as their primary data store.
We architect Elasticsearch as a search index, not a primary database — the source of truth is PostgreSQL or MongoDB; Elasticsearch is a derived read model synced via change events. The 1-second default refresh interval is configurable per index for applications requiring faster or slower consistency.
Cost at Scale
Elasticsearch clusters require dedicated infrastructure: a minimum of 3 nodes for production HA, each requiring meaningful RAM and SSD storage. For workloads that don't justify dedicated infrastructure, the cost of running Elasticsearch exceeds the cost of alternatives. Elastic Cloud removes the cluster management work but comes at a cost premium over self-hosted infrastructure.
We right-size Elasticsearch deployments for the actual workload — single-node deployments for development and small production workloads; hot-warm-cold tiered storage for log retention workloads that need long retention without keeping everything on expensive hot storage. BBQ in 9.x reduces node memory requirements by 95% for vector workloads specifically.
Shard Overhead
Elasticsearch's shard model requires careful initial planning — over-sharding (too many small shards) increases cluster overhead; under-sharding (too few large shards) limits parallelism. Reindexing to change shard count is an expensive operational task that requires duplicating data.
We follow Elastic's guidance: target 10–50GB per shard for most indices; use data streams for time-series data with automatic rollover; avoid per-tenant shards for multi-tenant SaaS (use field-based isolation instead). Proper initial shard planning eliminates most reindexing needs.
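The sizing arithmetic behind that planning is straightforward. The helper below is a hypothetical sketch: it takes an expected index size and a target shard size (30GB is used here purely as an example value) and returns a primary shard count.

```python
import math


def primary_shard_count(index_size_gb, target_shard_gb=30.0):
    """Estimate primary shards so each lands near the target size.

    Both inputs are planning estimates; the 30GB default is an example only.
    """
    if index_size_gb <= 0 or target_shard_gb <= 0:
        raise ValueError("sizes must be positive")
    # Round up so no shard exceeds the target; always at least one shard
    return max(1, math.ceil(index_size_gb / target_shard_gb))


if __name__ == "__main__":
    for size_gb in (5, 31, 600):
        print(size_gb, "GB ->", primary_shard_count(size_gb), "primary shard(s)")
```

For time-series data the same calculation applies per rollover period rather than to the whole dataset, since each backing index is sized independently.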
Elasticsearch Alternatives & Comparisons
We use all of these in production — the right choice depends on your project's constraints, team familiarity, and scale requirements.
Elasticsearch vs OpenSearch
OpenSearch Advantages
- Apache 2.0 licensed: fully open source with no SSPL or Elastic License restrictions
- AWS OpenSearch Service provides managed hosting with native AWS integration
- Compatible API with Elasticsearch 7.10, so migration from Elasticsearch is feasible
- Amazon's commercial backing ensures long-term maintenance for AWS-committed teams
OpenSearch Limitations
- 7x less code commit activity vs Elasticsearch, so significantly slower feature development velocity
- Missing Elasticsearch 9.x features: BBQ, ELSER semantic search, ESRE, ES|QL
- Smaller community: 5,800 commits vs Elasticsearch's 41,000; fewer Stack Overflow answers
- OpenSearch's 2.47% market share vs Elasticsearch's dominance means fewer experienced operators available
OpenSearch is Best For:
- Teams requiring strict Apache 2.0 licensing for legal or compliance reasons
- AWS-native architectures where OpenSearch Service integration is a priority
- Teams on Elasticsearch 7.x who cannot or won't upgrade and want a maintenance alternative
When to Choose OpenSearch
Choose OpenSearch when Apache 2.0 licensing is a hard requirement or you're deeply committed to AWS OpenSearch Service. Choose Elasticsearch when you need BBQ vector search, ELSER semantic search, ES|QL analytics, or the fastest-evolving search engine feature set in 2026.
Elasticsearch vs PostgreSQL
PostgreSQL Advantages
- Full ACID transactions: consistent with your application database, no sync required
- Built-in full-text search (tsvector/tsquery) plus the pg_trgm extension for trigram-based fuzzy matching, adequate for smaller datasets
- pgvector for AI embeddings colocated with relational data, so no separate vector database is needed
- No additional infrastructure: your existing PostgreSQL handles search at small-to-medium scale
PostgreSQL Limitations
- LIKE and ILIKE queries with leading wildcards scan every row unless a trigram index exists, and degrade noticeably beyond 100K–1M rows
- No BM25 relevance scoring: full-text search ranks results with ts_rank, which does not use corpus-wide term statistics
- No built-in faceted search or spell correction; synonym handling requires custom dictionaries, and pg_trgm fuzzy matching becomes expensive at scale
- pgvector's HNSW/IVFFlat indexes consume significant memory at 10M+ vector scale vs Elasticsearch BBQ
PostgreSQL is Best For:
- Applications with under 1 million searchable records where PostgreSQL full-text is adequate
- Simple keyword search without complex relevance tuning requirements
- Teams avoiding infrastructure complexity who can accept PostgreSQL's search limitations
When to Choose PostgreSQL
Choose PostgreSQL full-text search for small-to-medium datasets where relevance tuning is minimal. Choose Elasticsearch when you need BM25 relevance scoring, faceted filtering, synonym expansion, sub-10ms search on millions of documents, or semantic vector search at scale.
Elasticsearch vs Typesense
Typesense Advantages
- Dramatically simpler to operate: single binary, zero JVM configuration, sub-minute setup
- Typo-tolerant search out of the box without custom analyzers
- Typesense Cloud managed hosting at significantly lower cost than Elastic Cloud
- REST API that is simpler to learn than Elasticsearch's query DSL
Typesense Limitations
- Vector and hybrid search are supported, but there is no learned sparse retrieval model comparable to ELSER
- Limited aggregation and analytics capabilities vs Elasticsearch's aggregation framework and ES|QL
- Not suitable for log analytics, observability (ELK), or security SIEM use cases
- Smaller ecosystem and community: fewer operators, fewer integrations, less Stack Overflow coverage
Typesense is Best For:
- E-commerce and application search where typo tolerance and speed matter more than semantic relevance
- Startups wanting great search without Elasticsearch's operational complexity
- Catalogue search with faceted filtering at under 10M document scale
When to Choose Typesense
Choose Typesense for application and product search where simplicity and fast setup matter more than advanced semantic retrieval. Choose Elasticsearch for ELSER semantic search, BBQ-quantized vector search at scale, log analytics (ELK), or any workload requiring the full power of the relevance engine at scale.
Questions from Developers and Teams
Elasticsearch is a distributed search and analytics engine built on Apache Lucene. It stores JSON documents, indexes them for near-real-time search, and supports: full-text search with relevance scoring (BM25), vector and semantic search (BBQ-quantized dense vectors and ELSER), aggregations and analytics (ES|QL), geospatial search, log analysis (ELK stack), and real-time data indexing. It powers search at Wikipedia, GitHub (code search), Netflix, Uber, Shopify, Adobe, Goldman Sachs, and Slack. Elasticsearch 9.4.1 is the latest stable version (May 2026). Elastic N.V. reported $1.483 billion in FY2025 revenue, a mature enterprise platform with 17% YoY growth.
Better Binary Quantization (BBQ) is Elasticsearch 9.x's default quantization format for dense_vector indices (since 9.1). It quantizes float32 vectors (typically 1536 or 3072 dimensions from OpenAI models) into binary representations, reducing memory requirements by 95%+ versus storing raw float32 vectors. A 10 million vector index at 1536 dimensions needs 60GB+ as raw float32 and roughly 2GB as 1-bit codes. DiskBBQ extends this further by reading quantized vectors from disk rather than holding them in memory, which is what allows a footprint of around 100MB at sub-20ms query latency and enables 100M+ vector workloads on standard hardware. BBQ makes large-scale vector search economically viable for applications that previously couldn't afford the infrastructure.
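The memory figures are easy to check with back-of-the-envelope arithmetic. The sketch below counts only the raw vector payload; it ignores the small per-vector correction terms BBQ stores and any index graph overhead, so real numbers will be somewhat higher.

```python
def vector_memory_bytes(num_vectors, dims, bits_per_dim):
    """Raw storage for the vector payload only (no index or metadata overhead)."""
    return num_vectors * dims * bits_per_dim // 8


NUM_VECTORS = 10_000_000
DIMS = 1536  # e.g. a 1536-dimension embedding model

float32_bytes = vector_memory_bytes(NUM_VECTORS, DIMS, 32)
binary_bytes = vector_memory_bytes(NUM_VECTORS, DIMS, 1)
reduction = 1 - binary_bytes / float32_bytes

if __name__ == "__main__":
    print(f"float32: {float32_bytes / 1e9:.2f} GB")
    print(f"1-bit:   {binary_bytes / 1e9:.2f} GB")
    print(f"reduction: {reduction:.1%}")
```

One bit per dimension instead of 32 is a 32x reduction, which is where the "95%+" figure comes from. The 1-bit payload for this example is still about 1.9GB, so a footprint far below that implies the vectors are being read from disk rather than held in memory.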
ELSER (Elastic Learned Sparse EncodeR) is a sparse retrieval model trained by Elastic that captures semantic meaning without requiring custom model training. Unlike BM25 (keyword matching), ELSER understands that a query for 'car' should retrieve documents mentioning 'automobile' or 'vehicle'. Unlike dense embedding models, which often need domain-specific fine-tuning to perform well, ELSER works effectively on out-of-domain text immediately. In benchmark testing, ELSER beats BM25 by 10–20% recall on BEIR benchmarks. Combined with BM25 via Reciprocal Rank Fusion (RRF), hybrid retrieval consistently outperforms either approach alone. ELSER is available on all Elastic deployment types and does not require separate model hosting.
ES|QL (Elasticsearch Query Language) is a SQL-inspired, pipe-based query language introduced in Elasticsearch 8.11 and central to the 9.x roadmap. Compare the verbose JSON query DSL with: FROM logs-* | WHERE status == 500 | STATS count = COUNT(*) BY service | SORT count DESC | LIMIT 10. That query is readable by any SQL-familiar developer. ES|QL has its own execution engine, separate from the query DSL search path and designed specifically for analytical queries over large datasets. It supports STATS aggregations, EVAL computed fields, DISSECT and GROK for log parsing, ENRICH for data enrichment, and full WHERE filtering. The DSL still works; ES|QL is the preferred interface for analytics and ad-hoc queries.
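To show the difference in verbosity, the sketch below assembles the ES|QL query above from its pipeline stages and places it next to a roughly equivalent query DSL body. The esql_query helper is a hypothetical convenience, and the status and service field names come from the example query; the ES|QL body is the shape accepted by the _query endpoint.

```python
def esql_query(source, *stages):
    """Join a FROM clause and processing stages into one piped ES|QL string."""
    return " | ".join([f"FROM {source}", *stages])


query = esql_query(
    "logs-*",
    "WHERE status == 500",
    "STATS count = COUNT(*) BY service",
    "SORT count DESC",
    "LIMIT 10",
)

# Request body for the ES|QL endpoint (POST /_query)
esql_body = {"query": query}

# Roughly equivalent query DSL body (POST /logs-*/_search)
dsl_body = {
    "size": 0,  # aggregation only, no hits
    "query": {"term": {"status": 500}},
    "aggs": {
        # terms buckets are ordered by document count, descending, by default
        "by_service": {"terms": {"field": "service", "size": 10}}
    },
}

if __name__ == "__main__":
    print(esql_body)
    print(dsl_body)
```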
OpenSearch is Amazon's 2021 fork of Elasticsearch 7.10 (when Elastic changed its license). Key differences in 2026: (1) Development velocity: elastic/elasticsearch has 76,700 GitHub stars and ~41,000 commits; OpenSearch has ~5,800 commits; Elasticsearch has 7x the code activity. (2) Features: Elasticsearch 9.x has BBQ, ELSER, ESRE, and ES|QL; OpenSearch has had slower feature development. (3) License: OpenSearch is Apache 2.0; Elasticsearch offers AGPL, Elastic License 2.0, and SSPL. (4) Managed services: AWS OpenSearch Service vs Elastic Cloud. Choose Elasticsearch for: latest vector search capabilities, ELSER semantic search, ES|QL analytics, and Elastic Cloud. Choose OpenSearch for: AWS-native deployments where OpenSearch Service integration matters or where licensing requires a permissive Apache 2.0 license.
Common sync patterns: (1) Logstash JDBC input — polls your database on a schedule and indexes new/updated records; simple to configure, has latency equal to the polling interval. (2) Debezium CDC (Change Data Capture) — connects to PostgreSQL's logical replication slot or MySQL's binlog; streams every row change to Kafka; near-real-time with sub-second indexing latency. (3) Application-level sync — your application writes to both the database and Elasticsearch in the same request; simple but risks consistency if one write fails. (4) Meilisearch/Typesense for simpler needs — if full Elasticsearch capability isn't required, lighter search engines with simpler sync are worth evaluating. We use Debezium + Kafka for production systems requiring real-time search freshness, and Logstash JDBC for batch indexing workflows.
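Whichever pattern feeds the index, the final step usually turns row changes into Bulk API actions. The sketch below converts a list of change events into the newline-delimited JSON body the _bulk endpoint expects. The event shape (op, id, row) is a simplified assumption for illustration and is not Debezium's actual envelope format.

```python
import json


def to_bulk_ndjson(events, index):
    """Convert change events into a Bulk API request body (NDJSON).

    events: dicts like {"op": "upsert" | "delete", "id": ..., "row": {...}}
            (a hypothetical, simplified event shape).
    """
    lines = []
    for event in events:
        meta = {"_index": index, "_id": str(event["id"])}
        if event["op"] == "delete":
            # Delete actions have no source line
            lines.append(json.dumps({"delete": meta}))
        else:
            # "index" creates or fully replaces the document with this ID
            lines.append(json.dumps({"index": meta}))
            lines.append(json.dumps(event["row"]))
    # The Bulk API requires a trailing newline after the last line
    return "\n".join(lines) + "\n" if lines else ""


if __name__ == "__main__":
    changes = [
        {"op": "upsert", "id": 1, "row": {"title": "Blue jacket", "in_stock": True}},
        {"op": "delete", "id": 2},
    ]
    print(to_bulk_ndjson(changes, "products"), end="")
```

Using the database primary key as the document ID makes replays idempotent: re-sending the same event overwrites the same document instead of creating a duplicate.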
Multi-tenant Elasticsearch architectures: (1) Index per tenant: each tenant gets their own index; strong isolation, simple queries, but doesn't scale beyond hundreds of tenants (every index adds shard and cluster-state overhead). (2) Tenant field filtering: all tenants share an index; queries filter by tenant_id field; simpler cluster management, but tenant isolation depends on application-level filtering. (3) Index patterns with ILM: time-based indices per tenant with lifecycle management; good for log and event data. We recommend tenant field filtering for most SaaS applications (scalable to millions of tenants) with Elasticsearch's document-level security (Elastic license required) to enforce tenant isolation at the query level, preventing cross-tenant data leakage even if application filtering is misconfigured.
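Tenant field filtering at the application level can be as small as a wrapper that every query passes through. The sketch below assumes a keyword field named tenant_id (as in the text above); the function name and the sample query are hypothetical.

```python
def tenant_scoped(query, tenant_id):
    """Wrap any query so it can only match one tenant's documents."""
    return {
        "bool": {
            # The user's query still drives relevance scoring
            "must": [query],
            # The tenant filter restricts matches without affecting scores
            "filter": [{"term": {"tenant_id": tenant_id}}],
        }
    }


if __name__ == "__main__":
    user_query = {"match": {"title": "quarterly report"}}
    print({"query": tenant_scoped(user_query, "acme")})
```

Routing every search through one function like this keeps the filter from being forgotten in individual call sites, though it remains an application-level control rather than an enforced one.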
ELK (Elasticsearch, Logstash, Kibana) is the dominant open-source observability stack: Elasticsearch stores and indexes logs; Logstash (or the lighter Beats / Fluent Bit agents) collects and ships logs from applications, servers, and containers; Kibana provides dashboards, visualizations, alerting, and query interfaces. Modern deployments often use Beats (Filebeat, Metricbeat, Packetbeat) or Fluent Bit as lightweight agents instead of Logstash for ingestion. Elastic APM adds application performance monitoring — distributed tracing, transaction performance, and error tracking — integrated into the same Kibana interface. ES|QL in 9.x makes ad-hoc log analysis queries far more accessible to developers without Elasticsearch DSL expertise.
ESRE (Elasticsearch Relevance Engine) is Elastic's framework for building production AI retrieval pipelines from a single Elasticsearch deployment. It bundles: ELSER for sparse semantic retrieval, E5 multilingual dense embedding models for vector similarity, BBQ for memory-efficient vector storage, Reciprocal Rank Fusion (RRF) for combining sparse + dense scores, text similarity reranking for final ordering, result diversification to avoid near-duplicate results, and rule-based pinning for controlled content promotion. A single ESRE query returns a ranked list of relevant documents in under 50ms that can be passed directly to an LLM as RAG context. No separate vector database, no additional retrieval service, no orchestration framework required for the retrieval layer.
Production Elasticsearch architecture: (1) Minimum 3 nodes for HA (avoids split-brain with master election quorum). (2) Dedicated master nodes for large clusters (5+ data nodes), which prevents master instability during data node load. (3) JVM heap at no more than 50% of available RAM and below roughly 32GB per node (above that threshold the JVM can no longer use compressed OOPs). (4) Fast NVMe SSD storage: Elasticsearch is I/O-intensive, and spinning disks raise indexing and query latency. (5) Index Lifecycle Management (ILM): automate index rollover, phase transitions (hot → warm → cold → delete), and snapshot scheduling. (6) Dedicated monitoring cluster: don't monitor Elasticsearch with the same Elasticsearch you're monitoring. (7) Snapshot repository (S3/GCS): daily snapshots to object storage; test restore quarterly. We provision production Elasticsearch clusters with Terraform and deploy monitoring via Metricbeat to a separate monitoring cluster.
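The heap rule in point (3) can be written down as a small helper. This is a sketch of the rule of thumb only: the 31GB cap is an assumption chosen to stay just under the compressed-OOPs threshold, and the function names are hypothetical.

```python
def recommended_heap_gb(ram_gb, cap_gb=31):
    """Half of the node's RAM, capped below the compressed-OOPs threshold.

    cap_gb=31 is an assumed safety margin under the ~32GB limit.
    """
    if ram_gb < 2:
        raise ValueError("node needs at least 2GB of RAM")
    return min(ram_gb // 2, cap_gb)


def jvm_options(heap_gb):
    """Render heap flags for a file under config/jvm.options.d/.

    Minimum and maximum heap are set equal to avoid resize pauses.
    """
    return f"-Xms{heap_gb}g\n-Xmx{heap_gb}g\n"


if __name__ == "__main__":
    for ram in (16, 64, 128):
        print(f"{ram}GB RAM -> {recommended_heap_gb(ram)}GB heap")
    print(jvm_options(recommended_heap_gb(64)), end="")
```

The RAM left over after the heap is not wasted: the operating system uses it as filesystem cache for index segments, which is why the heap is kept to about half.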