Fix L1 inclination pruning for HEO orbits, add 66k benchmark

Bug: inner_consistent used sma_low to compute the ground footprint, but the footprint grows with altitude, so the conservative (largest) footprint for an SMA bin comes from sma_high. For high-SMA bins (GTO, HEO), using sma_low pruned subtrees that the leaf filter would accept, causing 453 false negatives for high-latitude observers (Tromsø).

Fix: use sma_high (not sma_low) in L1 inclination pruning.

Added regression test: GTO-debris (inc 5 deg, e=0.73) at Tromsø must return identical results from seqscan and index scan.

Benchmark on 65,886-object catalog (full Space-Track including decayed): 80-92% pruning for 2-hour windows (63% at 6 hours, 8% at 24 hours), zero false negatives across 7 query patterns. SP-GiST beats seqscan for the high-latitude Tromsø query (10.9 ms vs 11.3 ms).
2026-02-17 23:05:49 -07:00 · 747b7ae60a
commit 747b7ae60a
parent 13d49c1072
8 changed files with 263819 additions and 4 deletions
--- a/bench/benchmark_results_66k.txt
+++ b/bench/benchmark_results_66k.txt
@@ -0,0 +1,114 @@
 pg_orrery v0.7.0 SP-GiST Benchmark — 66k Full Space-Track Catalog
 ===================================================================
 Date: 2026-02-17
 Catalog: Space-Track USSPACECOM full catalog including decayed (65,886 objects)
 Host: Linux 6.16.5-arch1-1, PostgreSQL 17
 Branch: phase/spgist-orbital-trie
 Note: After fixing L1 inclination pruning (sma_low -> sma_high)
 Catalog Composition:
  LEO (<128 min):          59,537  (90.4%)
  MEO (128-720 min):        3,474  ( 5.3%)
  GEO/HEO (720-1500 min):  2,643  ( 4.0%)
  Super-GEO (>1500 min):     232  ( 0.4%)
 Index Build:
  SP-GiST: 55.2 ms, 11 MB
  GiST:   118.2 ms, 13 MB
  Table:              10 MB
 ==============================================================
 TIMING RESULTS (best of 2-3 runs, ms)
 ==============================================================
 Query Pattern                | Seqscan | SP-GiST | Delta
 -----------------------------|---------|---------|-------
 2h window, Eagle ID, 10 deg  |  12.5   |  14.0   | +1.5
 6h window, Eagle ID, 10 deg  |  12.2   |  15.6   | +3.4
 2h window, Tromsø, 10 deg    |  11.3   |  10.9   | -0.4  ★
 24h window, Eagle ID, 10 deg |  12.0   |  16.2   | +4.2
 ★ Tromsø (69.6°N): SP-GiST beats seqscan. High-latitude observers
  benefit most from inclination pruning.
 ==============================================================
 PRUNING RESULTS
 ==============================================================
 Query Pattern                | Candidates | % Pass | % Pruned
 -----------------------------|------------|--------|---------
 2h window, Eagle ID, 10 deg  |   12,964   |  19.7% |   80.3%
 6h window, Eagle ID, 10 deg  |   24,274   |  36.8% |   63.2%
 24h window, Eagle ID, 10 deg |   60,875   |  92.4% |    7.6%
 2h window, Eagle ID, 30 deg  |    9,680   |  14.7% |   85.3%
 2h window, equatorial, 10deg |    9,699   |  14.7% |   85.3%
 2h window, Tromsø 69.6°N     |    6,529   |   9.9% |   90.1%
 2h window, South Pole 85°S   |    5,248   |   8.0% |   92.0%
 ==============================================================
 CONSISTENCY CHECKS (all patterns)
 ==============================================================
 Query Pattern                | False Negatives | False Positives
 -----------------------------|-----------------|----------------
 2h Eagle 10deg               |        0        |       0
 6h Eagle 10deg               |        0        |       0
 24h Eagle 10deg              |        0        |       0
 2h Eagle 30deg               |        0        |       0
 2h Equator 10deg             |        0        |       0
 2h Tromsø 10deg              |        0        |       0
 2h South Pole 10deg          |        0        |       0
 ==============================================================
 SCALING TREND (2h Eagle 10deg, best-of-N)
 ==============================================================
 Catalog Size | Seqscan | SP-GiST | Delta  | Notes
 -------------|---------|---------|--------|------
 14,376       |  4.5 ms |  6.1 ms | +1.6ms | Active CelesTrak
 29,784       |  5.2 ms |  5.2 ms | +0.0ms | Active Space-Track (before fix)
 65,886       | 12.5 ms | 14.0 ms | +1.5ms | Full catalog incl decayed (after fix)
 The fix (sma_high instead of sma_low for footprint) conservatively keeps
 more subtrees alive during L1 pruning, which likely accounts for part of
 the ~1-2ms overhead in the 66k run (the catalog size also changed between
 runs). This is the correct trade-off: zero false negatives is
 non-negotiable.
 ==============================================================
 PLANNER BEHAVIOR (66k)
 ==============================================================
 PostgreSQL still chooses SP-GiST Index Only Scan by default:
  Index Only Scan using bench_spgist on bench_catalog
    Index Cond: (tle &? ...)
    Heap Fetches: 0
    Buffers: shared hit=4990
 Seqscan would read 1,297 pages. Index reads 4,990 pages (3.8x more).
 But Index Only Scan avoids all heap I/O.
 ==============================================================
 KEY FINDING: HIGH-LATITUDE OBSERVERS
 ==============================================================
 The SP-GiST index is most valuable for high-latitude observers:
  Tromsø (69.6°N): 90.1% pruned, SP-GiST BEATS seqscan by 0.4ms
  South Pole (85°S): 92.0% pruned
 High-latitude locations eliminate most LEO satellites via the
 inclination filter — only satellites with inc > ~60° can reach
 these latitudes. The SP-GiST trie prunes entire inclination
 subtrees at L1, making the index scan faster than touching
 every page in the table.
 ==============================================================
 WHAT THE 80-92% PRUNING MEANS IN PRACTICE
 ==============================================================
 For a 65,886-object catalog with a 2-hour window:
  - Without &? operator: 65,886 SGP4 predict_passes() calls
  - With &? operator:    12,964 SGP4 calls (Eagle) or 5,248 (South Pole)
  - Savings: 52,922-60,638 unnecessary propagation calls avoided
 Assuming ~1ms per predict_passes() call, that's 53-61 seconds of
 saved computation per query.
--- a/bench/load_full_catalog.sql
+++ b/bench/load_full_catalog.sql
--- a/bench/spacetrack_full_all.tle
+++ b/bench/spacetrack_full_all.tle
--- a/bench/tle_archives.tar.gz
+++ b/bench/tle_archives.tar.gz
--- a/docs/src/content/docs/performance/benchmarks.mdx
+++ b/docs/src/content/docs/performance/benchmarks.mdx
@@ -17,6 +17,7 @@ All benchmarks use PostgreSQL's `EXPLAIN (ANALYZE, BUFFERS)` for timing. The num
 | Operation | Count | Time | Rate | Notes |
 |-----------|-------|------|------|-------|
 | TLE propagation (SGP4) | 12,000 | 17 ms | 706K/sec | Mixed LEO/MEO/GEO |
 | Visibility cone filter (`&?`) | 65,886 | 12.5 ms | 5.3M/sec | 80% pruned, no SGP4 |
 | Planet observation (VSOP87) | 875 | 57 ms | 15.4K/sec | All 7 non-Earth planets, 125 times each |
 | Galilean moon observation | 1,000 | 63 ms | 15.9K/sec | L1.2 + VSOP87 pipeline |
 | Saturn moon observation | 800 | 53 ms | 15.1K/sec | TASS17 + VSOP87 |
@@ -219,6 +220,65 @@ FROM predict_passes(
 A 7-day window at 30-second coarse scan resolution requires ~20,160 propagation calls for the coarse scan, plus bisection and ternary search calls for each pass found. Typical ISS result: 25--35 passes found in ~40 ms.
 ## Visibility cone filtering (`&?` operator)
 The `&?` operator answers "could this satellite possibly be visible from this observer?" using three geometric filters (altitude, inclination, RAAN) without any SGP4 propagation. This is the first stage of the pass prediction pipeline, reducing the number of satellites that need full propagation.
 ```sql
 -- Benchmark: filter a 66,000-object catalog
 -- Eagle, Idaho: 2-hour window, 10 deg minimum elevation
 EXPLAIN (ANALYZE, BUFFERS)
 SELECT count(*)
 FROM satellite_catalog
 WHERE tle &? ROW(
    observer('43.6977N 116.3535W 760m'),
    '2024-01-01 02:00:00+00'::timestamptz,
    '2024-01-01 04:00:00+00'::timestamptz,
    10.0
 )::observer_window;
 ```
 **65,886 TLEs filtered in 12.5 ms --- 80% pruned, 12,964 candidates survive.**
 The operator evaluates three geometric conditions per TLE: perigee altitude vs. maximum visible altitude, inclination + ground footprint vs. observer latitude, and RAAN alignment via J2 secular precession. Each check is a few floating-point operations --- no SGP4 initialization, no Kepler equation, no trigonometric series.
 ### Pruning rate by query pattern
 The pruning rate depends on observer latitude, query window duration, and minimum elevation. Shorter windows and higher latitudes prune more aggressively.
 | Query | Candidates | Pruned | Notes |
 |-------|-----------|--------|-------|
 | 2h, Eagle ID (43.7°N), 10° | 12,964 | 80.3% | Typical mid-latitude evening |
 | 2h, Tromsø (69.6°N), 10° | 6,529 | 90.1% | High latitude: inclination filter strongest |
 | 2h, South Pole (85°S), 10° | 5,248 | 92.0% | Only polar-orbit satellites survive |
 | 2h, Equator (0°N), 10° | 9,699 | 85.3% | All inclinations pass latitude check; RAAN filter dominates |
 | 2h, Eagle ID, 30° | 9,680 | 85.3% | Higher elevation: altitude filter tighter |
 | 6h, Eagle ID, 10° | 24,274 | 63.2% | Wider RAAN window admits more candidates |
 | 24h, Eagle ID, 10° | 60,875 | 7.6% | RAAN filter bypassed (full Earth rotation) |
 ### SP-GiST index performance
 The optional SP-GiST index (`tle_spgist_ops`) builds a 2-level trie partitioned by semi-major axis and inclination. At 66,000 objects, the index adds 1--2 ms overhead compared to a sequential scan for mid-latitude observers, but **beats the sequential scan for high-latitude queries** where inclination pruning eliminates entire subtrees:
 | Query | Seqscan | SP-GiST | Difference |
 |-------|---------|---------|------------|
 | 2h, Eagle ID, 10° | 12.5 ms | 14.0 ms | +1.5 ms |
 | 2h, Tromsø, 10° | 11.3 ms | 10.9 ms | **-0.4 ms** |
 The planner chooses the SP-GiST Index Only Scan by default at this catalog size, with zero heap fetches (all data served from index pages).
 <Aside type="tip" title="Where the index shines">
 The SP-GiST index is most valuable for high-latitude observers (60°+) and for catalogs larger than 30,000 objects. At typical CelesTrak catalog sizes (12,000--15,000 active satellites), the `&?` operator's sequential evaluation is fast enough that the index overhead exceeds the pruning benefit.
 </Aside>
 ### What the pruning means for predict_passes()
 For a 65,886-object catalog and a 2-hour window from Eagle, Idaho:
 - **Without `&?`:** 65,886 `predict_passes()` calls (assuming ~1 ms each)
 - **With `&?`:** 12,964 calls --- **52,922 unnecessary propagations avoided**
 - **Time saved:** ~53 seconds per query at typical propagation cost
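The arithmetic behind these figures can be sketched as follows. This is an illustrative helper only: `PruneSavings` and `prune_savings` are hypothetical names, and the per-call cost is an input assumption rather than a measured value.

```c
#include <assert.h>

/* Result of the savings estimate (hypothetical type for illustration). */
typedef struct {
    long   avoided_calls;   /* propagation calls skipped thanks to the filter */
    double pruned_pct;      /* share of the catalog pruned, in percent */
    double saved_seconds;   /* avoided_calls times the assumed per-call cost */
} PruneSavings;

/*
 * Estimate what the filter saves, given the catalog size, the number of
 * surviving candidates, and an assumed cost per predict_passes() call (ms).
 */
static PruneSavings prune_savings(long catalog_size, long candidates,
                                  double ms_per_call)
{
    assert(catalog_size > 0 && candidates >= 0 && candidates <= catalog_size);
    PruneSavings s;
    s.avoided_calls = catalog_size - candidates;
    s.pruned_pct    = 100.0 * (double) s.avoided_calls / (double) catalog_size;
    s.saved_seconds = (double) s.avoided_calls * ms_per_call / 1000.0;
    return s;
}
```

Calling `prune_savings(65886, 12964, 1.0)` reproduces the Eagle, Idaho figures: 52,922 avoided calls, about 80.3% pruned, and about 53 seconds saved at the assumed 1 ms per call.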
 ## Reproducing these benchmarks
 <Tabs>
@@ -265,4 +325,6 @@ A 7-day window at 30-second coarse scan resolution requires ~20,160 propagation
 The benchmarks demonstrate that pg_orrery's computation cost is low enough to treat orbital mechanics as a SQL primitive. Propagating an entire satellite catalog takes less time than a typical index scan on a moderately-sized table. Planet observation is fast enough to generate ephemeris tables with `generate_series`. Pork chop plots are feasible as interactive queries rather than batch jobs.
-The numbers also show where the bottlenecks are: VSOP87 series evaluation dominates everything except star observation and raw SGP4 propagation. If a future optimization effort targets one component, it should be the VSOP87 evaluation loop.
+The visibility cone filter (`&?`) is the fastest operation per evaluation (three geometric checks of a few floating-point operations each, versus the full SGP4 pipeline), and its 80--92% pruning rate for 2-hour windows means the most expensive operation in a pass prediction pipeline (SGP4 propagation) only runs on the small fraction of the catalog that could actually produce a visible pass.
 The numbers also show where the bottlenecks are: VSOP87 series evaluation dominates everything except star observation, raw SGP4 propagation, and the visibility cone filter. If a future optimization effort targets one component, it should be the VSOP87 evaluation loop.
--- a/src/spgist_tle.c
+++ b/src/spgist_tle.c
@@ -607,16 +607,19 @@ spgist_tle_inner_consistent(PG_FUNCTION_ARGS)
 			 *   i + footprint >= |phi|
 			 *
 			 * Use the parent SMA range to compute a conservative footprint.
-			 * The largest footprint comes from the lowest altitude.
+			 * The largest footprint comes from the HIGHEST altitude (footprint
 			 * grows with altitude: GEO sees 71+ degrees, LEO sees ~7 degrees).
 			 * Use sma_high for conservatism — never prune objects that the
 			 * leaf filter would accept.
 			 */
 			double obs_lat = fabs(win.obs.lat);
 			double sma_for_footprint;
 			double footprint;
 			if (parent_trav)
-				sma_for_footprint = parent_trav->sma_low;
+				sma_for_footprint = parent_trav->sma_high;
 			else
-				sma_for_footprint = WGS72_AE + 200.0;	/* conservative LEO */
+				sma_for_footprint = 50000.0;	/* above GEO — maximum footprint */
 			footprint = ground_footprint_deg(sma_for_footprint,
 											 win.min_el_deg) * DEG_TO_RAD;
--- a/test/expected/spgist_tle.out
+++ b/test/expected/spgist_tle.out
@@ -305,6 +305,50 @@ ORDER BY name;
 RESET enable_indexscan;
 RESET enable_bitmapscan;
 -- ============================================================
 -- Test 13: HEO at high latitude — GTO-class orbit (low inc,
 -- high SMA, high eccentricity) from Tromsø (69.6°N).
 -- The large SMA gives a huge footprint that compensates for the
 -- low inclination.  Must pass the seqscan operator check.
 -- Regression test for the L1 pruning bug (sma_low vs sma_high).
 -- ============================================================
 -- GTO debris: inc 5 deg, perigee ~250 km, apogee ~35786 km
 INSERT INTO test_spgist (name, tle) VALUES ('GTO-DEBRIS',
    '1 99905U 24999E   24001.50000000  .00000100  00000+0  10000-3 0  9994
 2 99905   5.0000 210.0000 7300000  30.0000  61.0000  2.25600000 00001');
 -- Seqscan: GTO-DEBRIS from Tromsø — must be visible
 -- inc 5 deg + footprint(SMA ~25000) ~65 deg = 70 > 69.6
 SELECT name,
    tle &? ROW(
        observer('69.6N 19.0E 0m'),
        '2024-01-01 00:00:00+00'::timestamptz,
        '2024-01-02 00:00:00+00'::timestamptz,
        10.0
    )::observer_window AS visible
 FROM test_spgist
 WHERE name = 'GTO-DEBRIS';
    name    | visible 
 ------------+---------
 GTO-DEBRIS | t
 (1 row)
 -- Index scan: same query, must return the same result
 SET enable_seqscan = off;
 SELECT name,
    tle &? ROW(
        observer('69.6N 19.0E 0m'),
        '2024-01-01 00:00:00+00'::timestamptz,
        '2024-01-02 00:00:00+00'::timestamptz,
        10.0
    )::observer_window AS visible
 FROM test_spgist
 WHERE name = 'GTO-DEBRIS';
    name    | visible 
 ------------+---------
 GTO-DEBRIS | t
 (1 row)
 RESET enable_seqscan;
 -- ============================================================
 -- Cleanup
 -- ============================================================
 DROP TABLE test_spgist;
--- a/test/sql/spgist_tle.sql
+++ b/test/sql/spgist_tle.sql
@@ -271,6 +271,45 @@ RESET enable_indexscan;
 RESET enable_bitmapscan;
 -- ============================================================
 -- Test 13: HEO at high latitude — GTO-class orbit (low inc,
 -- high SMA, high eccentricity) from Tromsø (69.6°N).
 -- The large SMA gives a huge footprint that compensates for the
 -- low inclination.  Must pass the seqscan operator check.
 -- Regression test for the L1 pruning bug (sma_low vs sma_high).
 -- ============================================================
 -- GTO debris: inc 5 deg, perigee ~250 km, apogee ~35786 km
 INSERT INTO test_spgist (name, tle) VALUES ('GTO-DEBRIS',
    '1 99905U 24999E   24001.50000000  .00000100  00000+0  10000-3 0  9994
 2 99905   5.0000 210.0000 7300000  30.0000  61.0000  2.25600000 00001');
 -- Seqscan: GTO-DEBRIS from Tromsø — must be visible
 -- inc 5 deg + footprint(SMA ~25000) ~65 deg = 70 > 69.6
 SELECT name,
    tle &? ROW(
        observer('69.6N 19.0E 0m'),
        '2024-01-01 00:00:00+00'::timestamptz,
        '2024-01-02 00:00:00+00'::timestamptz,
        10.0
    )::observer_window AS visible
 FROM test_spgist
 WHERE name = 'GTO-DEBRIS';
 -- Index scan: same query, must return the same result
 SET enable_seqscan = off;
 SELECT name,
    tle &? ROW(
        observer('69.6N 19.0E 0m'),
        '2024-01-01 00:00:00+00'::timestamptz,
        '2024-01-02 00:00:00+00'::timestamptz,
        10.0
    )::observer_window AS visible
 FROM test_spgist
 WHERE name = 'GTO-DEBRIS';
 RESET enable_seqscan;
 -- ============================================================
 -- Cleanup
 -- ============================================================