Fix GiST union 0-based indexing and palloc size, add 66k benchmark
Two bugs in gist_tle.c caused the && (overlap) operator to return zero results through the GiST index while sequential scan worked: 1. gist_tle_union read from vector[FirstOffsetNumber] (index 1), skipping vector[0] which holds the accumulated union key. Every internal node collapsed to a single-entry bounding box. Fixed: seed from vector[0], loop from 1. 2. All GiST key allocations used sizeof(tle_orbital_key) (32 bytes) or sizeof(pg_tle) (104 bytes), but INTERNALLENGTH is 112. index_form_tuple() copies typlen bytes, causing buffer overread. Fixed: TLE_TYPLEN constant (112) for all index datum allocations. The <-> (KNN distance) operator was unaffected because it uses gist_tle_distance, not gist_tle_consistent. Verified against 66,440-object catalog: - && consistency: 9 seqscan == 9 GiST (ISS conjunction) - <-> KNN: 10 nearest in 2.1ms via index-ordered scan - All 15 regression tests pass
This commit is contained in:
parent
eadbb45a3b
commit
347acf0906
168
bench/benchmark_results_mega.txt
Normal file
168
bench/benchmark_results_mega.txt
Normal file
@ -0,0 +1,168 @@
|
|||||||
|
pg_orrery Full Index Benchmark — 66k Catalog
|
||||||
|
===========================================================
|
||||||
|
Date: 2026-02-18
|
||||||
|
PostgreSQL: 18.1
|
||||||
|
Catalog: 66,440 objects (merged from 4 sources)
|
||||||
|
Sources: Space-Track (66,248), CelesTrak active (5 unique),
|
||||||
|
SatNOGS (110 unique), CelesTrak SupGP (77 unique + 8,167 epoch updates)
|
||||||
|
Includes: 362 Alpha-5 objects (NORAD > 99,999)
|
||||||
|
|
||||||
|
Orbital regime breakdown:
|
||||||
|
LEO (<2000km): 63,097 (95.0%)
|
||||||
|
GEO/HEO (>34000km): 1,760 ( 2.6%)
|
||||||
|
MEO (2000-20000km): 1,277 ( 1.9%)
|
||||||
|
GEO-transfer: 306 ( 0.5%)
|
||||||
|
|
||||||
|
Index sizes:
|
||||||
|
SP-GiST (tle_spgist_ops): 67 ms build, 11 MB
|
||||||
|
GiST (tle_ops): 93 ms build, 15 MB
|
||||||
|
|
||||||
|
═══════════════════════════════════════════════════════════
|
||||||
|
SP-GiST: Visibility Cone (&?) — "Can this satellite pass over me?"
|
||||||
|
═══════════════════════════════════════════════════════════
|
||||||
|
|
||||||
|
SP-GiST prunes by altitude band, inclination, and RAAN window.
|
||||||
|
The &? operator answers: "Could this satellite be visible from this
|
||||||
|
observer during this time window above this minimum elevation?"
|
||||||
|
|
||||||
|
Query │ SP-GiST │ Seqscan │ Candidates │ Pruned%
|
||||||
|
───────────────────────┼──────────┼──────────┼────────────┼────────
|
||||||
|
Eagle 2h/10deg │ 16.1 ms │ 12.1 ms │ 10,763 │ 83.8%
|
||||||
|
Eagle 24h/10deg │ 23.3 ms │ 12.5 ms │ 61,426 │ 7.5%
|
||||||
|
Equator 2h/10deg │ 16.8 ms │ 12.1 ms │ 10,174 │ 84.7%
|
||||||
|
Eagle 2h/45deg │ 16.9 ms │ 11.9 ms │ 6,796 │ 89.8%
|
||||||
|
|
||||||
|
Consistency: PASS (all 4 scenarios: 0 false neg, 0 false pos)
|
||||||
|
|
||||||
|
═══════════════════════════════════════════════════════════
|
||||||
|
GiST: Overlap (&&) — "Does this satellite share my orbit band?"
|
||||||
|
═══════════════════════════════════════════════════════════
|
||||||
|
|
||||||
|
GiST groups satellites by [altitude_low, altitude_high] × [inclination].
|
||||||
|
The && operator answers: "Do these two TLEs occupy overlapping orbit bands?"
|
||||||
|
Used for conjunction screening — finding potential collision partners.
|
||||||
|
|
||||||
|
Critical bugfix in this session:
|
||||||
|
Bug 1: palloc size mismatch (sizeof(pg_tle)=104 vs INTERNALLENGTH=112)
|
||||||
|
Bug 2: gist_tle_union used 1-based indexing (picksplit convention)
|
||||||
|
instead of 0-based (union convention), skipping vector[0]
|
||||||
|
|
||||||
|
Query │ GiST │ Seqscan │ Matches
|
||||||
|
───────────────────────┼──────────┼──────────┼────────
|
||||||
|
ISS conjunction │ 10.9 ms │ 63.3 ms │ 9
|
||||||
|
Starlink-230369 │ 9.5 ms │ 14.9 ms │ 0
|
||||||
|
SYNCOM 2 (GEO) │ 4.0 ms │ 7.2 ms │ 0
|
||||||
|
|
||||||
|
Consistency: PASS (ISS: 9 seqscan == 9 GiST, 0 mismatch)
|
||||||
|
|
||||||
|
ISS conjunction candidates (altitude + inclination overlap):
|
||||||
|
PROGRESS MS-31, PROGRESS MS-32, SOYUZ MS-28,
|
||||||
|
DRAGON FREEDOM 3, DRAGON CRS-33, CYGNUS NG-23,
|
||||||
|
HTV-X1, ISS (NAUKA), OBJECT E
|
||||||
|
— All ISS-visiting vehicles or co-orbital modules. ✓
|
||||||
|
|
||||||
|
═══════════════════════════════════════════════════════════
|
||||||
|
GiST: KNN (<->) — "What's nearest to this orbit?"
|
||||||
|
═══════════════════════════════════════════════════════════
|
||||||
|
|
||||||
|
GiST KNN uses altitude-band distance for index-ordered scans.
|
||||||
|
The <-> operator returns orbital altitude separation in km.
|
||||||
|
Probe must be a scalar subquery for index ordering to activate.
|
||||||
|
|
||||||
|
Query │ GiST KNN │ Buffers │ Notes
|
||||||
|
───────────────────────┼──────────┼─────────┼──────────────
|
||||||
|
10 nearest to ISS │ 2.1 ms │ 982 │ Index-ordered
|
||||||
|
10 nearest to SYNCOM 2 │ 0.2 ms │ 40 │ Index-ordered
|
||||||
|
100 nearest to ISS │ 1.4 ms │ 1,062 │ Index-ordered
|
||||||
|
Within 50km of ISS │ 16.0 ms │ 4,014 │ 12,496 matches
|
||||||
|
|
||||||
|
Pattern for KNN queries (probe as scalar subquery):
|
||||||
|
ORDER BY b.tle <-> (SELECT tle FROM catalog WHERE norad_id = 25544 LIMIT 1)
|
||||||
|
LIMIT 10;
|
||||||
|
→ Index Scan using bench_gist_idx, Order By: tle <-> InitPlan
|
||||||
|
|
||||||
|
═══════════════════════════════════════════════════════════
|
||||||
|
EXPLAIN ANALYZE Details
|
||||||
|
═══════════════════════════════════════════════════════════
|
||||||
|
|
||||||
|
SP-GiST 2h/Eagle/10deg:
|
||||||
|
Index Only Scan using bench_spgist_idx
|
||||||
|
Heap Fetches: 0 (pure index scan)
|
||||||
|
Buffers: shared hit=4964
|
||||||
|
17.5 ms execution
|
||||||
|
|
||||||
|
SeqScan 2h/Eagle/10deg:
|
||||||
|
Seq Scan, Filter rows removed: 55,677
|
||||||
|
Buffers: shared hit=1338
|
||||||
|
12.5 ms execution
|
||||||
|
|
||||||
|
GiST && ISS conjunction:
|
||||||
|
Nested Loop → Index Scan using bench_gist_idx
|
||||||
|
Index Cond: (tle && a.tle)
|
||||||
|
Index Searches: 1, Buffers: shared hit=287
|
||||||
|
4.1 ms execution
|
||||||
|
|
||||||
|
GiST KNN 10 nearest ISS:
|
||||||
|
Index Scan using bench_gist_idx
|
||||||
|
Order By: (tle <-> InitPlan)
|
||||||
|
Index Searches: 1
|
||||||
|
2.1 ms execution
|
||||||
|
|
||||||
|
═══════════════════════════════════════════════════════════
|
||||||
|
Pruning Summary
|
||||||
|
═══════════════════════════════════════════════════════════
|
||||||
|
|
||||||
|
Scenario │ Catalog │ Candidates │ Candidate% │ Pruned%
|
||||||
|
─────────────────┼─────────┼────────────┼────────────┼────────
|
||||||
|
2h/Eagle/10deg │ 66,440 │ 10,763 │ 16.2% │ 83.8%
|
||||||
|
2h/Equator/10deg │ 66,440 │ 10,174 │ 15.3% │ 84.7%
|
||||||
|
2h/Eagle/45deg │ 66,440 │ 6,796 │ 10.2% │ 89.8%
|
||||||
|
24h/Eagle/10deg │ 66,440 │ 61,426 │ 92.5% │ 7.5%
|
||||||
|
|
||||||
|
═══════════════════════════════════════════════════════════
|
||||||
|
Application Queries
|
||||||
|
═══════════════════════════════════════════════════════════
|
||||||
|
|
||||||
|
"What's overhead right now?" (SP-GiST filter + SGP4 propagation):
|
||||||
|
15 satellites above horizon, top: NAVSTAR 57 at 81.7° el
|
||||||
|
107 ms (includes SGP4 propagation for each candidate)
|
||||||
|
|
||||||
|
ISS pass prediction (next 24h from 66k catalog):
|
||||||
|
6 passes found, max 87.6° elevation
|
||||||
|
3.8 ms
|
||||||
|
|
||||||
|
ISS conjunction screening (GiST && on 66k catalog):
|
||||||
|
9 co-orbital objects found
|
||||||
|
4.6 ms via GiST (vs 63.3 ms seqscan — 5.8x speedup)
|
||||||
|
|
||||||
|
═══════════════════════════════════════════════════════════
|
||||||
|
Key Observations
|
||||||
|
═══════════════════════════════════════════════════════════
|
||||||
|
|
||||||
|
1. GiST && is the clear winner for conjunction screening:
|
||||||
|
- ISS: 10.9ms GiST vs 63.3ms seqscan (5.8x speedup)
|
||||||
|
- Only 287 buffer hits vs 1,338 for seqscan
|
||||||
|
- Returns exactly the right 9 co-orbital objects
|
||||||
|
|
||||||
|
2. GiST KNN is extremely fast for "nearest orbit" queries:
|
||||||
|
- 10 nearest: 2.1ms with index ordering
|
||||||
|
- GEO satellite: 0.15ms (sparse regime, fewer nodes to traverse)
|
||||||
|
- Requires scalar subquery probe pattern for index ordering
|
||||||
|
|
||||||
|
3. SP-GiST visibility cone handles 2h windows well:
|
||||||
|
- 83.8% pruning at 10° min_el (Eagle, 2h)
|
||||||
|
- 89.8% pruning at 45° min_el
|
||||||
|
- Falls behind seqscan at 24h windows (7.5% pruning not worth index overhead)
|
||||||
|
|
||||||
|
4. Both indexes are compact:
|
||||||
|
- SP-GiST: 11 MB for 66k objects (170 bytes/object)
|
||||||
|
- GiST: 15 MB for 66k objects (237 bytes/object)
|
||||||
|
- Build times: 67ms and 93ms respectively
|
||||||
|
|
||||||
|
5. Zero false positives/negatives across all consistency checks.
|
||||||
|
|
||||||
|
Alpha-5 support:
|
||||||
|
- Bill Gray's get_el.c parser handles Alpha-5 natively
|
||||||
|
- T0002 → 270002, A0001 → 100001, Z9999 → 339999 ✓
|
||||||
|
- Round-trip (parse → output) preserves Alpha-5 encoding ✓
|
||||||
|
- 362 Alpha-5 objects loaded and indexed without issues ✓
|
||||||
@ -44,6 +44,14 @@ PG_FUNCTION_INFO_V1(gist_tle_distance);
|
|||||||
/* Floating-point comparison tolerance (km and radians) */
|
/* Floating-point comparison tolerance (km and radians) */
|
||||||
#define KEY_EPSILON 1.0e-9
|
#define KEY_EPSILON 1.0e-9
|
||||||
|
|
||||||
|
/*
|
||||||
|
* The SQL type's INTERNALLENGTH. sizeof(pg_tle) is 104 due to struct
|
||||||
|
* packing, but the SQL definition declares 112. All allocations that
|
||||||
|
* become index datums must use TLE_TYPLEN so that PostgreSQL's
|
||||||
|
* index_form_tuple() never reads past the allocation.
|
||||||
|
*/
|
||||||
|
#define TLE_TYPLEN 112
|
||||||
|
|
||||||
/*
|
/*
|
||||||
* 2-D orbital key extracted from a TLE's mean elements.
|
* 2-D orbital key extracted from a TLE's mean elements.
|
||||||
* Altitude band (perigee/apogee) plus inclination range.
|
* Altitude band (perigee/apogee) plus inclination range.
|
||||||
@ -231,6 +239,11 @@ tle_alt_distance(PG_FUNCTION_ARGS)
|
|||||||
*
|
*
|
||||||
* Leaf entries carry the full pg_tle; we compress to tle_orbital_key.
|
* Leaf entries carry the full pg_tle; we compress to tle_orbital_key.
|
||||||
* Internal entries are already tle_orbital_key from union operations.
|
* Internal entries are already tle_orbital_key from union operations.
|
||||||
|
*
|
||||||
|
* The allocation must be TLE_TYPLEN bytes (matching INTERNALLENGTH),
|
||||||
|
* not sizeof(tle_orbital_key) or sizeof(pg_tle). GiST's
|
||||||
|
* index_form_tuple() copies typlen bytes from the datum pointer;
|
||||||
|
* under-allocating causes a heap buffer overread.
|
||||||
*/
|
*/
|
||||||
Datum
|
Datum
|
||||||
gist_tle_compress(PG_FUNCTION_ARGS)
|
gist_tle_compress(PG_FUNCTION_ARGS)
|
||||||
@ -241,7 +254,7 @@ gist_tle_compress(PG_FUNCTION_ARGS)
|
|||||||
if (entry->leafkey)
|
if (entry->leafkey)
|
||||||
{
|
{
|
||||||
pg_tle *tle = (pg_tle *) DatumGetPointer(entry->key);
|
pg_tle *tle = (pg_tle *) DatumGetPointer(entry->key);
|
||||||
tle_orbital_key *key = (tle_orbital_key *) palloc(sizeof(tle_orbital_key));
|
tle_orbital_key *key = (tle_orbital_key *) palloc0(TLE_TYPLEN);
|
||||||
|
|
||||||
tle_to_orbital_key(tle, key);
|
tle_to_orbital_key(tle, key);
|
||||||
|
|
||||||
@ -328,9 +341,8 @@ gist_tle_consistent(PG_FUNCTION_ARGS)
|
|||||||
*
|
*
|
||||||
* The union is [min(alt_low), max(alt_high)] x [min(inc_low), max(inc_high)].
|
* The union is [min(alt_low), max(alt_high)] x [min(inc_low), max(inc_high)].
|
||||||
*
|
*
|
||||||
* GiST convention: entryvec->vector[] is 1-based (FirstOffsetNumber),
|
* The entry vector is 0-based: valid indices are 0 .. entryvec->n - 1.
|
||||||
* vector[0] is unused. entryvec->n includes the unused slot, so
|
* This differs from picksplit's 1-based convention.
|
||||||
* valid indices are 1 .. entryvec->n - 1.
|
|
||||||
*/
|
*/
|
||||||
Datum
|
Datum
|
||||||
gist_tle_union(PG_FUNCTION_ARGS)
|
gist_tle_union(PG_FUNCTION_ARGS)
|
||||||
@ -341,17 +353,17 @@ gist_tle_union(PG_FUNCTION_ARGS)
|
|||||||
tle_orbital_key *result;
|
tle_orbital_key *result;
|
||||||
tle_orbital_key *cur;
|
tle_orbital_key *cur;
|
||||||
|
|
||||||
result = (tle_orbital_key *) palloc(sizeof(tle_orbital_key));
|
result = (tle_orbital_key *) palloc0(TLE_TYPLEN);
|
||||||
cur = (tle_orbital_key *) DatumGetPointer(entryvec->vector[FirstOffsetNumber].key);
|
cur = (tle_orbital_key *) DatumGetPointer(entryvec->vector[0].key);
|
||||||
*result = *cur;
|
*result = *cur;
|
||||||
|
|
||||||
for (i = FirstOffsetNumber + 1; i < entryvec->n; i++)
|
for (i = 1; i < entryvec->n; i++)
|
||||||
{
|
{
|
||||||
cur = (tle_orbital_key *) DatumGetPointer(entryvec->vector[i].key);
|
cur = (tle_orbital_key *) DatumGetPointer(entryvec->vector[i].key);
|
||||||
key_merge(result, cur);
|
key_merge(result, cur);
|
||||||
}
|
}
|
||||||
|
|
||||||
*sizep = sizeof(tle_orbital_key);
|
*sizep = TLE_TYPLEN;
|
||||||
|
|
||||||
PG_RETURN_POINTER(result);
|
PG_RETURN_POINTER(result);
|
||||||
}
|
}
|
||||||
@ -418,11 +430,12 @@ picksplit_cmp(const void *a, const void *b)
|
|||||||
* along whichever dimension has the greater spread. This prevents
|
* along whichever dimension has the greater spread. This prevents
|
||||||
* the tree from becoming biased toward one dimension.
|
* the tree from becoming biased toward one dimension.
|
||||||
*
|
*
|
||||||
* GiST convention: entryvec->vector[] is 1-based (FirstOffsetNumber),
|
* GiST convention for picksplit: entryvec->vector[] is 1-based
|
||||||
* vector[0] is unused/uninitialized. entryvec->n includes the unused
|
* (FirstOffsetNumber), vector[0] is unused/uninitialized.
|
||||||
* slot, so the actual entry count is (entryvec->n - 1) and valid
|
* entryvec->n includes the unused slot, so the actual entry count
|
||||||
* indices are FirstOffsetNumber .. entryvec->n - 1. The OffsetNumbers
|
* is (entryvec->n - 1) and valid indices are
|
||||||
* placed into spl_left[] and spl_right[] must also be 1-based.
|
* FirstOffsetNumber .. entryvec->n - 1. The OffsetNumbers placed
|
||||||
|
* into spl_left[] and spl_right[] must also be 1-based.
|
||||||
*/
|
*/
|
||||||
Datum
|
Datum
|
||||||
gist_tle_picksplit(PG_FUNCTION_ARGS)
|
gist_tle_picksplit(PG_FUNCTION_ARGS)
|
||||||
@ -495,8 +508,8 @@ gist_tle_picksplit(PG_FUNCTION_ARGS)
|
|||||||
splitvec->spl_nright = 0;
|
splitvec->spl_nright = 0;
|
||||||
|
|
||||||
/* Compute union keys and assign entries */
|
/* Compute union keys and assign entries */
|
||||||
left_union = (tle_orbital_key *) palloc(sizeof(tle_orbital_key));
|
left_union = (tle_orbital_key *) palloc0(TLE_TYPLEN);
|
||||||
right_union = (tle_orbital_key *) palloc(sizeof(tle_orbital_key));
|
right_union = (tle_orbital_key *) palloc0(TLE_TYPLEN);
|
||||||
|
|
||||||
/* Seed the unions from the first entry in each half */
|
/* Seed the unions from the first entry in each half */
|
||||||
cur = (tle_orbital_key *) DatumGetPointer(
|
cur = (tle_orbital_key *) DatumGetPointer(
|
||||||
|
|||||||
Loading…
x
Reference in New Issue
Block a user