From 1adab6e13661bb380dd703203cdd044329494626 Mon Sep 17 00:00:00 2001
From: Ryan Malloy
Date: Wed, 18 Feb 2026 12:23:47 -0700
Subject: [PATCH] Update docs with 66k benchmark results, honest SP-GiST framing
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

- benchmarks.mdx: Add GiST conjunction screening and KNN sections, update all numbers to 66,440-object catalog, PG 17→18, show SP-GiST slower than seqscan at this scale with explanation of why
- operators-gist.mdx: Real 66k performance tables for GiST and SP-GiST, rewrite KNN example with scalar subquery pattern, add CTE warning
- conjunction-screening.mdx: Update catalog size, candidate counts, add KNN scalar subquery note, verified performance numbers
---
 .../docs/guides/conjunction-screening.mdx | 24 +--
 .../content/docs/performance/benchmarks.mdx | 150 ++++++++++++++----
 .../content/docs/reference/operators-gist.mdx | 55 +++++--
 3 files changed, 171 insertions(+), 58 deletions(-)

diff --git a/docs/src/content/docs/guides/conjunction-screening.mdx b/docs/src/content/docs/guides/conjunction-screening.mdx
index d57a832..267584d 100644
--- a/docs/src/content/docs/guides/conjunction-screening.mdx
+++ b/docs/src/content/docs/guides/conjunction-screening.mdx
@@ -17,7 +17,7 @@ Operational conjunction screening uses several established tools and data source
 - **CelesTrak SOCRATES**: Dr. Kelso's web-based close-approach listing. Updated regularly, covers the full public catalog. Not queryable; you read reports.
 - **Python scripts**: Propagate the catalog in a loop, compute pairwise distances, filter by threshold. Works for small catalogs. Does not scale.
 
-The fundamental challenge: a catalog of 25,000+ tracked objects produces over 300 million unique pairs. Even checking each pair at a single epoch takes significant time. Checking over a 7-day window at 1-minute resolution is computationally prohibitive without pre-filtering.
+The fundamental challenge: a catalog of 66,000+ tracked objects produces over 2 billion unique pairs. Even checking each pair at a single epoch takes significant time. Checking over a 7-day window at 1-minute resolution is computationally prohibitive without pre-filtering.
 
 ## What changes with pg_orrery
 
@@ -89,7 +89,7 @@ INSERT INTO catalog VALUES (99901, 'Equatorial-LEO',
 CREATE INDEX catalog_orbit_gist ON catalog USING gist (tle);
 ```
 
-The index builds in milliseconds for a small table. For a full 25,000-object catalog, expect about 200ms.
+The index builds in milliseconds for a small table. For a full 66,440-object catalog, build time is 93 ms (15 MB index).
 
 ### Check orbital parameters
 
@@ -157,19 +157,21 @@ This should return only ISS itself (and not Equatorial-LEO, which has a differen
 Find the 3 closest objects to the ISS by altitude band separation, ordered by distance:
 
 ```sql
-SET enable_seqscan = off;
-
+-- Scalar subquery probe enables GiST index-ordered scan
 SELECT name,
-       round((tle <-> (SELECT tle FROM catalog WHERE norad_id = 25544))::numeric, 0) AS alt_dist_km
+       round((tle <-> (SELECT tle FROM catalog WHERE norad_id = 25544 LIMIT 1))::numeric, 0)
+         AS alt_dist_km
 FROM catalog
 WHERE norad_id != 25544
-ORDER BY tle <-> (SELECT tle FROM catalog WHERE norad_id = 25544)
+ORDER BY tle <-> (SELECT tle FROM catalog WHERE norad_id = 25544 LIMIT 1)
 LIMIT 3;
-
-RESET enable_seqscan;
 ```
 
-This uses the GiST distance operator for efficient ordering. PostgreSQL's KNN-GiST infrastructure handles this without computing all distances upfront.
+This uses the GiST distance operator for efficient ordering. PostgreSQL's KNN-GiST infrastructure traverses the tree by increasing distance without computing all distances upfront. On a 66,440-object catalog, this completes in 2.1 ms for 10 neighbors.
+
+
 
 ### Self-overlap is always true
 
@@ -206,7 +208,7 @@ The complete two-stage workflow for a larger catalog:
      AND c.norad_id != 25544;
    ```
 
-   For the ISS in a 25,000-object catalog, this typically returns a few hundred candidates.
+   For the ISS in a 66,440-object catalog, this returns 9 candidates (all co-orbital vehicles: visiting spacecraft, modules, and debris). The GiST index scan completes in 4.6 ms vs. 63.3 ms for a sequential scan.
 
 3. **Stage 2: Time-resolved distance computation:**
 
@@ -253,7 +255,7 @@ ORDER BY actual_dist_km;
 ```
 
 ### Monitoring over time
 
diff --git a/docs/src/content/docs/performance/benchmarks.mdx b/docs/src/content/docs/performance/benchmarks.mdx
index 4f38d2a..ac44970 100644
--- a/docs/src/content/docs/performance/benchmarks.mdx
+++ b/docs/src/content/docs/performance/benchmarks.mdx
@@ -6,7 +6,7 @@ sidebar:
 
 import { Aside, Tabs, TabItem } from "@astrojs/starlight/components";
 
-Measured performance numbers for pg_orrery's core operations. Every number on this page was produced by running the listed SQL query against a live PostgreSQL 17 instance with a single backend, no parallel workers, and no connection pooling overhead.
+Measured performance numbers for pg_orrery's core operations. Every number on this page was produced by running the listed SQL query against a live PostgreSQL 18 instance with a single backend, no parallel workers, and no connection pooling overhead.