Update docs with 66k benchmark results, honest SP-GiST framing
- benchmarks.mdx: Add GiST conjunction screening and KNN sections, update all numbers to 66,440-object catalog, PG 17→18, show SP-GiST slower than seqscan at this scale with explanation of why
- operators-gist.mdx: Real 66k performance tables for GiST and SP-GiST, rewrite KNN example with scalar subquery pattern, add CTE warning
- conjunction-screening.mdx: Update catalog size, candidate counts, add KNN scalar subquery note, verified performance numbers
parent de742fc3aa
commit 1adab6e136
@ -17,7 +17,7 @@ Operational conjunction screening uses several established tools and data source
|
|||||||
- **CelesTrak SOCRATES**: Dr. Kelso's web-based close-approach listing. Updated regularly, covers the full public catalog. Not queryable; you read reports.
|
- **CelesTrak SOCRATES**: Dr. Kelso's web-based close-approach listing. Updated regularly, covers the full public catalog. Not queryable; you read reports.
|
||||||
- **Python scripts**: Propagate the catalog in a loop, compute pairwise distances, filter by threshold. Works for small catalogs. Does not scale.
|
- **Python scripts**: Propagate the catalog in a loop, compute pairwise distances, filter by threshold. Works for small catalogs. Does not scale.
|
||||||
|
|
||||||
The fundamental challenge: a catalog of 25,000+ tracked objects produces over 300 million unique pairs. Even checking each pair at a single epoch takes significant time. Checking over a 7-day window at 1-minute resolution is computationally prohibitive without pre-filtering.
|
The fundamental challenge: a catalog of 66,000+ tracked objects produces over 2 billion unique pairs. Even checking each pair at a single epoch takes significant time. Checking over a 7-day window at 1-minute resolution is computationally prohibitive without pre-filtering.
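
The pair counts quoted above follow from n(n-1)/2. As a quick sanity check, a minimal sketch in plain PostgreSQL (no extension needed; the `sizes` alias and column names are illustrative only):

```sql
-- Unique pairs among n objects: n * (n - 1) / 2.
-- Cast to bigint first: the intermediate product overflows a 32-bit integer.
SELECT n,
       n::bigint * (n - 1) / 2                 AS unique_pairs,
       7 * 24 * 60                             AS epochs_7d_at_1min,
       n::bigint * (n - 1) / 2 * (7 * 24 * 60) AS pair_epoch_checks
FROM (VALUES (25000), (66440)) AS sizes(n);
```

This gives 312,487,500 pairs for 25,000 objects and 2,207,103,580 pairs for 66,440 objects. A 7-day window at 1-minute resolution has 10,080 epochs, so brute force on the larger catalog would need roughly 2.2 × 10^13 distance evaluations, which is why pre-filtering matters.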
|
||||||
|
|
||||||
## What changes with pg_orrery
|
## What changes with pg_orrery
|
||||||
|
|
||||||
@ -89,7 +89,7 @@ INSERT INTO catalog VALUES (99901, 'Equatorial-LEO',
|
|||||||
CREATE INDEX catalog_orbit_gist ON catalog USING gist (tle);
|
CREATE INDEX catalog_orbit_gist ON catalog USING gist (tle);
|
||||||
```
|
```
|
||||||
|
|
||||||
The index builds in milliseconds for a small table. For a full 25,000-object catalog, expect about 200ms.
|
The index builds in milliseconds for a small table. For a full 66,440-object catalog, build time is 93 ms (15 MB index).
|
||||||
|
|
||||||
### Check orbital parameters
|
### Check orbital parameters
|
||||||
|
|
||||||
@ -157,19 +157,21 @@ This should return only ISS itself (and not Equatorial-LEO, which has a differen
|
|||||||
Find the 3 closest objects to the ISS by altitude band separation, ordered by distance:
|
Find the 3 closest objects to the ISS by altitude band separation, ordered by distance:
|
||||||
|
|
||||||
```sql
|
```sql
|
||||||
SET enable_seqscan = off;
|
-- Scalar subquery probe enables GiST index-ordered scan
|
||||||
|
|
||||||
SELECT name,
|
SELECT name,
|
||||||
round((tle <-> (SELECT tle FROM catalog WHERE norad_id = 25544))::numeric, 0) AS alt_dist_km
|
round((tle <-> (SELECT tle FROM catalog WHERE norad_id = 25544 LIMIT 1))::numeric, 0)
|
||||||
|
AS alt_dist_km
|
||||||
FROM catalog
|
FROM catalog
|
||||||
WHERE norad_id != 25544
|
WHERE norad_id != 25544
|
||||||
ORDER BY tle <-> (SELECT tle FROM catalog WHERE norad_id = 25544)
|
ORDER BY tle <-> (SELECT tle FROM catalog WHERE norad_id = 25544 LIMIT 1)
|
||||||
LIMIT 3;
|
LIMIT 3;
|
||||||
|
|
||||||
RESET enable_seqscan;
|
|
||||||
```
|
```
|
||||||
|
|
||||||
This uses the GiST distance operator for efficient ordering. PostgreSQL's KNN-GiST infrastructure handles this without computing all distances upfront.
|
This uses the GiST distance operator for efficient ordering. PostgreSQL's KNN-GiST infrastructure traverses the tree by increasing distance without computing all distances upfront. On a 66,440-object catalog, this completes in 2.1 ms for 10 neighbors.
|
||||||
|
|
||||||
|
<Aside type="caution" title="Use scalar subqueries, not CTEs">
|
||||||
|
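GiST index-ordered scans require the probe value to be a single value that does not vary per row, such as a constant or an uncorrelated scalar subquery (which PostgreSQL evaluates once before the scan). Joining against a `WITH iss AS (...)` CTE turns the probe into a column of another relation, so the ordering expression no longer matches the index and the planner falls back to a full sequential scan and sort. Always use `(SELECT tle FROM ... LIMIT 1)` as the probe argument for KNN queries on large catalogs.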
|
||||||
|
</Aside>
|
||||||
|
|
||||||
### Self-overlap is always true
|
### Self-overlap is always true
|
||||||
|
|
||||||
@ -206,7 +208,7 @@ The complete two-stage workflow for a larger catalog:
|
|||||||
AND c.norad_id != 25544;
|
AND c.norad_id != 25544;
|
||||||
```
|
```
|
||||||
|
|
||||||
For the ISS in a 25,000-object catalog, this typically returns a few hundred candidates.
|
For the ISS in a 66,440-object catalog, this returns 9 candidates (all co-orbital vehicles: visiting spacecraft, modules, and debris). The GiST index scan completes in 4.6 ms vs. 63.3 ms for a sequential scan.
|
||||||
|
|
||||||
3. **Stage 2: Time-resolved distance computation:**
|
3. **Stage 2: Time-resolved distance computation:**
|
||||||
|
|
||||||
@ -253,7 +255,7 @@ ORDER BY actual_dist_km;
|
|||||||
```
|
```
|
||||||
|
|
||||||
<Aside type="tip" title="Performance scaling">
|
<Aside type="tip" title="Performance scaling">
|
||||||
The GiST index is the key to scaling. Without it, screening a 25,000-object catalog for all-vs-all conjunctions means 300 million pair evaluations. With GiST, the `&&` operator reduces this to tens of thousands of candidate pairs. The `tle_distance()` computation on candidates is then feasible even at fine time resolution.
|
The GiST index is the key to scaling. Without it, screening a 66,440-object catalog for all-vs-all conjunctions means over 2 billion pair evaluations. With GiST, the `&&` operator reduces single-probe screening from 63 ms (sequential) to 4.6 ms (indexed), a 5.8x speedup. For the ISS, only 9 candidates survive from 66,440 objects. The `tle_distance()` computation on these survivors is then feasible even at 1-minute time resolution over multi-day windows.
|
||||||
</Aside>
|
</Aside>
|
||||||
|
|
||||||
### Monitoring over time
|
### Monitoring over time
|
||||||
|
|||||||
@ -6,7 +6,7 @@ sidebar:
|
|||||||
|
|
||||||
import { Aside, Tabs, TabItem } from "@astrojs/starlight/components";
|
import { Aside, Tabs, TabItem } from "@astrojs/starlight/components";
|
||||||
|
|
||||||
Measured performance numbers for pg_orrery's core operations. Every number on this page was produced by running the listed SQL query against a live PostgreSQL 17 instance with a single backend, no parallel workers, and no connection pooling overhead.
|
Measured performance numbers for pg_orrery's core operations. Every number on this page was produced by running the listed SQL query against a live PostgreSQL 18 instance with a single backend, no parallel workers, and no connection pooling overhead.
|
||||||
|
|
||||||
<Aside type="note" title="Methodology">
|
<Aside type="note" title="Methodology">
|
||||||
All benchmarks use PostgreSQL's `EXPLAIN (ANALYZE, BUFFERS)` for timing. The numbers are wall-clock execution time for the query, not per-function overhead. Each benchmark was run three times; the reported value is the median. Cold start was avoided by running each query once before measurement.
|
All benchmarks use PostgreSQL's `EXPLAIN (ANALYZE, BUFFERS)` for timing. The numbers are wall-clock execution time for the query, not per-function overhead. Each benchmark was run three times; the reported value is the median. Cold start was avoided by running each query once before measurement.
|
||||||
@ -17,7 +17,9 @@ All benchmarks use PostgreSQL's `EXPLAIN (ANALYZE, BUFFERS)` for timing. The num
|
|||||||
| Operation | Count | Time | Rate | Notes |
|
| Operation | Count | Time | Rate | Notes |
|
||||||
|-----------|-------|------|------|-------|
|
|-----------|-------|------|------|-------|
|
||||||
| TLE propagation (SGP4) | 12,000 | 17 ms | 706K/sec | Mixed LEO/MEO/GEO |
|
| TLE propagation (SGP4) | 12,000 | 17 ms | 706K/sec | Mixed LEO/MEO/GEO |
|
||||||
| Visibility cone filter (`&?`) | 65,886 | 12.5 ms | 5.3M/sec | 80% pruned, no SGP4 |
|
| Visibility cone filter (`&?`) | 66,440 | 12.1 ms | 5.5M/sec | 84% pruned (2h, 10°), no SGP4 |
|
||||||
|
| Conjunction screening (`&&`) | 66,440 | 4.6 ms | — | ISS: 9 co-orbital objects found |
|
||||||
|
| KNN altitude ordering (`<->`) | 66,440 | 2.1 ms | — | 10 nearest to ISS, index-ordered |
|
||||||
| Planet observation (VSOP87) | 875 | 57 ms | 15.4K/sec | All 7 non-Earth planets, 125 times each |
|
| Planet observation (VSOP87) | 875 | 57 ms | 15.4K/sec | All 7 non-Earth planets, 125 times each |
|
||||||
| Galilean moon observation | 1,000 | 63 ms | 15.9K/sec | L1.2 + VSOP87 pipeline |
|
| Galilean moon observation | 1,000 | 63 ms | 15.9K/sec | L1.2 + VSOP87 pipeline |
|
||||||
| Saturn moon observation | 800 | 53 ms | 15.1K/sec | TASS17 + VSOP87 |
|
| Saturn moon observation | 800 | 53 ms | 15.1K/sec | TASS17 + VSOP87 |
|
||||||
@ -25,7 +27,7 @@ All benchmarks use PostgreSQL's `EXPLAIN (ANALYZE, BUFFERS)` for timing. The num
|
|||||||
| Lambert transfer solve | 100 | 0.1 ms | 800K/sec | Single-rev prograde |
|
| Lambert transfer solve | 100 | 0.1 ms | 800K/sec | Single-rev prograde |
|
||||||
| Pork chop plot (150 x 150) | 22,500 | 8.3 s | 2.7K/sec | Full VSOP87 + Lambert pipeline |
|
| Pork chop plot (150 x 150) | 22,500 | 8.3 s | 2.7K/sec | Full VSOP87 + Lambert pipeline |
|
||||||
|
|
||||||
**Conditions:** PostgreSQL 17.2, single backend, no parallel workers, Intel Xeon E-2286G @ 4.0 GHz, 64 GB ECC DDR4-2666. Extension compiled with GCC 14.2, `-O2`.
|
**Conditions:** PostgreSQL 18.1, single backend, no parallel workers, Intel Xeon E-2286G @ 4.0 GHz, 64 GB ECC DDR4-2666. Extension compiled with GCC 14.2, `-O2`.
|
||||||
|
|
||||||
## TLE propagation
|
## TLE propagation
|
||||||
|
|
||||||
@ -225,7 +227,7 @@ A 7-day window at 30-second coarse scan resolution requires ~20,160 propagation
|
|||||||
The `&?` operator answers "could this satellite possibly be visible from this observer?" using three geometric filters (altitude, inclination, RAAN) without any SGP4 propagation. This is the first stage of the pass prediction pipeline, reducing the number of satellites that need full propagation.
|
The `&?` operator answers "could this satellite possibly be visible from this observer?" using three geometric filters (altitude, inclination, RAAN) without any SGP4 propagation. This is the first stage of the pass prediction pipeline, reducing the number of satellites that need full propagation.
|
||||||
|
|
||||||
```sql
|
```sql
|
||||||
-- Benchmark: filter a 66,000-object catalog
|
-- Benchmark: filter a 66,440-object catalog
|
||||||
-- Eagle, Idaho: 2-hour window, 10 deg minimum elevation
|
-- Eagle, Idaho: 2-hour window, 10 deg minimum elevation
|
||||||
EXPLAIN (ANALYZE, BUFFERS)
|
EXPLAIN (ANALYZE, BUFFERS)
|
||||||
SELECT count(*)
|
SELECT count(*)
|
||||||
@ -238,53 +240,133 @@ WHERE tle &? ROW(
|
|||||||
)::observer_window;
|
)::observer_window;
|
||||||
```
|
```
|
||||||
|
|
||||||
**65,886 TLEs filtered in 12.5 ms --- 80% pruned, 12,964 candidates survive.**
|
**66,440 TLEs filtered in 12.1 ms --- 83.8% pruned, 10,763 candidates survive.**
|
||||||
|
|
||||||
The operator evaluates three geometric conditions per TLE: perigee altitude vs. maximum visible altitude, inclination + ground footprint vs. observer latitude, and RAAN alignment via J2 secular precession. Each check is a few floating-point operations --- no SGP4 initialization, no Kepler equation, no trigonometric series.
|
The operator evaluates three geometric conditions per TLE: perigee altitude vs. maximum visible altitude, inclination + ground footprint vs. observer latitude, and RAAN alignment via J2 secular precession. Each check is a few floating-point operations --- no SGP4 initialization, no Kepler equation, no trigonometric series.
|
||||||
|
|
||||||
### Pruning rate by query pattern
|
### Pruning rate by query pattern
|
||||||
|
|
||||||
The pruning rate depends on observer latitude, query window duration, and minimum elevation. Shorter windows and higher latitudes prune more aggressively.
|
Measured against a 66,440-object catalog merged from Space-Track, CelesTrak, SatNOGS, and CelesTrak SupGP. The pruning rate depends on observer latitude, query window duration, and minimum elevation. Shorter windows and higher minimum elevations prune more aggressively.
|
||||||
|
|
||||||
| Query | Candidates | Pruned | Notes |
|
| Query | Candidates | Pruned | Notes |
|
||||||
|-------|-----------|--------|-------|
|
|-------|-----------|--------|-------|
|
||||||
| 2h, Eagle ID (43.7°N), 10° | 12,964 | 80.3% | Typical mid-latitude evening |
|
| 2h, Eagle ID (43.7°N), 10° | 10,763 | 83.8% | Typical mid-latitude evening |
|
||||||
| 2h, Tromsoe (69.6°N), 10° | 6,529 | 90.1% | High latitude: inclination filter strongest |
|
| 2h, Equator (0°N), 10° | 10,174 | 84.7% | All inclinations pass latitude check; RAAN filter dominates |
|
||||||
| 2h, South Pole (85°S), 10° | 5,248 | 92.0% | Only polar-orbit satellites survive |
|
| 2h, Eagle ID, 45° | 6,796 | 89.8% | Higher elevation: altitude filter tighter |
|
||||||
| 2h, Equator (0°N), 10° | 9,699 | 85.3% | All inclinations pass latitude check; RAAN filter dominates |
|
| 24h, Eagle ID, 10° | 61,426 | 7.5% | RAAN filter bypassed (full Earth rotation) |
|
||||||
| 2h, Eagle ID, 30° | 9,680 | 85.3% | Higher elevation: altitude filter tighter |
|
|
||||||
| 6h, Eagle ID, 10° | 24,274 | 63.2% | Wider RAAN window admits more candidates |
|
|
||||||
| 24h, Eagle ID, 10° | 60,875 | 7.6% | RAAN filter bypassed (full Earth rotation) |
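
The "Pruned" column is simply one minus the surviving fraction of the catalog. A minimal sketch in plain PostgreSQL that recomputes it from the 66,440-object candidate counts (the `t`, `query`, and `candidates` names are illustrative only):

```sql
-- pruned % = 100 * (1 - candidates / catalog_size)
SELECT query,
       candidates,
       round(100.0 * (1 - candidates / 66440.0), 1) AS pruned_pct
FROM (VALUES
        ('2h, Eagle ID, 10°',  10763),
        ('2h, Equator, 10°',   10174),
        ('2h, Eagle ID, 45°',   6796),
        ('24h, Eagle ID, 10°', 61426)
     ) AS t(query, candidates);
```

The four rows come out as 83.8, 84.7, 89.8, and 7.5, matching the updated table.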
|
|
||||||
|
|
||||||
### SP-GiST index performance
|
### SP-GiST index performance
|
||||||
|
|
||||||
The optional SP-GiST index (`tle_spgist_ops`) builds a 2-level trie partitioned by semi-major axis and inclination. At 66,000 objects, the index adds 1--2 ms overhead compared to a sequential scan for mid-latitude observers, but **beats the sequential scan for high-latitude queries** where inclination pruning eliminates entire subtrees:
|
The optional SP-GiST index (`tle_spgist_ops`) builds a 2-level trie partitioned by semi-major axis and inclination. At 66,440 objects, sequential evaluation of the `&?` operator (12 ms) is faster than the SP-GiST index scan (16--23 ms). The tree traversal overhead exceeds the pruning benefit at this catalog size because the `&?` operator itself is so cheap --- three floating-point comparisons per TLE.
|
||||||
|
|
||||||
| Query | Seqscan | SP-GiST | Difference |
|
| Query | Seqscan | SP-GiST | Candidates | Pruned |
|
||||||
|-------|---------|---------|------------|
|
|-------|---------|---------|------------|--------|
|
||||||
| 2h, Eagle ID, 10° | 12.5 ms | 14.0 ms | +1.5 ms |
|
| 2h, Eagle ID, 10° | 12.1 ms | 16.1 ms | 10,763 | 83.8% |
|
||||||
| 2h, Tromsoe, 10° | 11.3 ms | 10.9 ms | **-0.4 ms** |
|
| 2h, Equator, 10° | 12.1 ms | 16.8 ms | 10,174 | 84.7% |
|
||||||
|
| 2h, Eagle ID, 45° | 11.9 ms | 16.9 ms | 6,796 | 89.8% |
|
||||||
|
| 24h, Eagle ID, 10° | 12.5 ms | 23.3 ms | 61,426 | 7.5% |
|
||||||
|
|
||||||
The planner chooses the SP-GiST Index Only Scan by default at this catalog size, with zero heap fetches (all data served from index pages).
|
The SP-GiST index achieves zero heap fetches (pure Index Only Scan), but page traversal through 11 MB of index data (4,964 buffer hits) exceeds the cost of a 1,338-buffer sequential scan.
|
||||||
|
|
||||||
<Aside type="tip" title="Where the index shines">
|
<Aside type="tip" title="Where the SP-GiST index adds value">
|
||||||
The SP-GiST index is most valuable for high-latitude observers (60°+) and for catalogs larger than 30,000 objects. At typical CelesTrak catalog sizes (12--15,000 active satellites), the `&?` operator's sequential evaluation is fast enough that the index overhead exceeds the pruning benefit.
|
The `&?` operator prunes 84--90% of the catalog regardless of scan method. Its primary value is as a **gating filter** before expensive SGP4 propagation. For a 2-hour window, reducing 66,440 TLEs to ~10,000 candidates saves ~56,000 `predict_passes()` calls (each ~1 ms), a far greater benefit than the 4 ms difference between scan methods.
|
||||||
|
|
||||||
|
At larger catalog sizes (200k+ objects), the SP-GiST tree-level pruning should begin to outperform sequential evaluation. The crossover point depends on hardware, but the operator's pruning ratio is the dominant performance factor, not the scan method.
|
||||||
</Aside>
|
</Aside>
|
||||||
|
|
||||||
### What the pruning means for predict_passes()
|
### What the pruning means for predict_passes()
|
||||||
|
|
||||||
For a 65,886-object catalog and a 2-hour window from Eagle, Idaho:
|
For a 66,440-object catalog and a 2-hour window from Eagle, Idaho:
|
||||||
|
|
||||||
- **Without `&?`:** 65,886 `predict_passes()` calls (each ~1 ms for a 7-day window)
|
- **Without `&?`:** 66,440 `predict_passes()` calls (each ~1 ms for a 7-day window)
|
||||||
- **With `&?`:** 12,964 calls --- **52,922 unnecessary propagations avoided**
|
- **With `&?`:** 10,763 calls --- **55,677 unnecessary propagations avoided**
|
||||||
- **Time saved:** ~53 seconds per query at typical propagation cost
|
- **Time saved:** ~56 seconds per query at typical propagation cost
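
The savings estimate can be reproduced directly from the counts above. This is a sketch only: the 1 ms per-call cost is the approximate figure quoted in the list, not a separately measured value, and the column names are illustrative.

```sql
-- Calls avoided = catalog size - surviving candidates.
-- Time saved    = calls avoided * per-call cost (assumed ~1 ms here).
SELECT 66440 - 10763                              AS calls_avoided,
       round((66440 - 10763) * 1.0 / 1000.0, 1)   AS seconds_saved_at_1ms;
```

This yields 55,677 avoided calls and about 55.7 seconds, which the text rounds to ~56 seconds.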
|
||||||
|
|
||||||
|
## Conjunction screening (`&&` operator)
|
||||||
|
|
||||||
|
The GiST index on the `tle` type enables indexed conjunction screening using the `&&` (overlap) operator. The index stores altitude band and inclination for each TLE, allowing PostgreSQL to skip entire subtrees of non-overlapping orbits.
|
||||||
|
|
||||||
|
```sql
|
||||||
|
-- Benchmark: find ISS conjunction candidates in a 66,440-object catalog
|
||||||
|
EXPLAIN (ANALYZE, BUFFERS)
|
||||||
|
SELECT b.name
|
||||||
|
FROM satellite_catalog a
|
||||||
|
JOIN satellite_catalog b ON a.tle && b.tle AND a.norad_id != b.norad_id
|
||||||
|
WHERE tle_norad_id(a.tle) = 25544;
|
||||||
|
```
|
||||||
|
|
||||||
|
**9 co-orbital objects found in 4.6 ms (vs. 63.3 ms sequential scan --- 5.8x speedup).**
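
One way to obtain an indexed and a sequential timing for the same query is to discourage index paths with the standard planner settings and rerun it. The sketch below assumes the `satellite_catalog` table and GiST index from the reproduction section; whether the published baseline was measured exactly this way is an assumption. It uses `tle_norad_id()` on both sides of the self-exclusion so it also works on a table that has only `name` and `tle` columns.

```sql
-- Indexed run: the planner is free to use the GiST index.
EXPLAIN (ANALYZE, BUFFERS)
SELECT b.name
FROM satellite_catalog a
JOIN satellite_catalog b
  ON a.tle && b.tle
 AND tle_norad_id(a.tle) != tle_norad_id(b.tle)
WHERE tle_norad_id(a.tle) = 25544;

-- Baseline run: discourage index paths for this session only.
SET enable_indexscan = off;
SET enable_indexonlyscan = off;
SET enable_bitmapscan = off;

EXPLAIN (ANALYZE, BUFFERS)
SELECT b.name
FROM satellite_catalog a
JOIN satellite_catalog b
  ON a.tle && b.tle
 AND tle_norad_id(a.tle) != tle_norad_id(b.tle)
WHERE tle_norad_id(a.tle) = 25544;

RESET enable_indexscan;
RESET enable_indexonlyscan;
RESET enable_bitmapscan;
```

Compare the "Execution Time" and "Buffers: shared hit" lines of the two plans, and confirm the plan node types differ before trusting the comparison.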
|
||||||
|
|
||||||
|
The GiST index scan hits 237 buffers compared to 1,338 for a sequential scan. The 9 objects returned are all ISS-visiting vehicles or co-orbital modules: PROGRESS MS-31, PROGRESS MS-32, SOYUZ MS-28, DRAGON FREEDOM 3, DRAGON CRS-33, CYGNUS NG-23, HTV-X1, ISS (NAUKA), and OBJECT E.
|
||||||
|
|
||||||
|
### GiST `&&` performance by orbital regime
|
||||||
|
|
||||||
|
| Probe satellite | GiST | Seqscan | Matches | Notes |
|
||||||
|
|----------------|------|---------|---------|-------|
|
||||||
|
| ISS (LEO, 51.6°) | 4.6 ms | 63.3 ms | 9 | Co-orbital vehicles |
|
||||||
|
| Starlink-230369 (LEO, 53°) | 9.5 ms | 14.9 ms | 0 | Dense LEO shell |
|
||||||
|
| SYNCOM 2 (GEO, 33°) | 4.0 ms | 7.2 ms | 0 | Sparse regime |
|
||||||
|
|
||||||
|
The GiST index provides the largest speedup for queries that return few matches, where the index prunes most of the tree without reading leaf pages. Dense LEO shells produce more candidates and reduce the speedup ratio.
|
||||||
|
|
||||||
|
### Index characteristics
|
||||||
|
|
||||||
|
| Metric | Value |
|
||||||
|
|--------|-------|
|
||||||
|
| Build time | 93 ms (66,440 TLEs) |
|
||||||
|
| Index size | 15 MB (237 bytes/object) |
|
||||||
|
| Consistency | 0 false positives, 0 false negatives (verified against seqscan) |
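
The size and consistency rows can be checked along these lines. This is a hedged sketch, not necessarily the procedure used for the table: it assumes the index is named `idx_tle_gist` as in the reproduction section, and the temp table names are hypothetical.

```sql
-- Index size, total and per object.
SELECT pg_size_pretty(pg_relation_size('idx_tle_gist')) AS index_size,
       pg_relation_size('idx_tle_gist') / count(*)       AS bytes_per_object
FROM satellite_catalog;

-- Result set with the index preferred.
SET enable_seqscan = off;
CREATE TEMP TABLE hits_index AS
SELECT tle_norad_id(b.tle) AS norad_id
FROM satellite_catalog a
JOIN satellite_catalog b ON a.tle && b.tle
WHERE tle_norad_id(a.tle) = 25544;
RESET enable_seqscan;

-- Result set with index paths discouraged.
SET enable_indexscan = off;
SET enable_indexonlyscan = off;
SET enable_bitmapscan = off;
CREATE TEMP TABLE hits_seq AS
SELECT tle_norad_id(b.tle) AS norad_id
FROM satellite_catalog a
JOIN satellite_catalog b ON a.tle && b.tle
WHERE tle_norad_id(a.tle) = 25544;
RESET enable_indexscan;
RESET enable_indexonlyscan;
RESET enable_bitmapscan;

-- Any row returned here is a false positive or false negative.
(SELECT norad_id FROM hits_index EXCEPT SELECT norad_id FROM hits_seq)
UNION ALL
(SELECT norad_id FROM hits_seq EXCEPT SELECT norad_id FROM hits_index);
```

An empty result from the last statement means both scan methods returned the same set for this probe. Repeating it for several probe satellites gives more confidence than a single check.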
|
||||||
|
|
||||||
|
## KNN altitude ordering (`<->` operator)
|
||||||
|
|
||||||
|
The `<->` operator computes altitude-band separation in km. With a GiST index, it supports index-ordered KNN queries --- PostgreSQL traverses the tree by increasing distance without computing all distances upfront.
|
||||||
|
|
||||||
|
```sql
|
||||||
|
-- Benchmark: 10 nearest orbits to the ISS by altitude separation
|
||||||
|
EXPLAIN (ANALYZE, BUFFERS)
|
||||||
|
SELECT name,
|
||||||
|
round((tle <-> (SELECT tle FROM satellite_catalog
|
||||||
|
WHERE tle_norad_id(tle) = 25544 LIMIT 1))::numeric, 1)
|
||||||
|
AS alt_sep_km
|
||||||
|
FROM satellite_catalog
|
||||||
|
ORDER BY tle <-> (SELECT tle FROM satellite_catalog
|
||||||
|
WHERE tle_norad_id(tle) = 25544 LIMIT 1)
|
||||||
|
LIMIT 10;
|
||||||
|
```
|
||||||
|
|
||||||
|
**10 nearest in 2.1 ms, index-ordered (982 buffer hits).**
|
||||||
|
|
||||||
|
### KNN performance by scenario
|
||||||
|
|
||||||
|
| Query | Time | Buffers | Notes |
|
||||||
|
|-------|------|---------|-------|
|
||||||
|
| 10 nearest to ISS (LEO) | 2.1 ms | 982 | Dense regime, more nodes traversed |
|
||||||
|
| 10 nearest to SYNCOM 2 (GEO) | 0.2 ms | 40 | Sparse regime, fewer nodes |
|
||||||
|
| 100 nearest to ISS | 1.4 ms | 1,062 | Marginal cost per additional neighbor |
|
||||||
|
| All within 50 km of ISS | 16.0 ms | 4,014 | 12,496 matches |
|
||||||
|
|
||||||
|
<Aside type="caution" title="KNN requires a scalar subquery probe">
|
||||||
|
GiST index-ordered scans only activate when the probe value is visible to the planner as a constant. Use a **scalar subquery** for the probe TLE:
|
||||||
|
|
||||||
|
```sql
|
||||||
|
-- This uses the index (scalar subquery → constant to planner):
|
||||||
|
ORDER BY tle <-> (SELECT tle FROM catalog WHERE tle_norad_id(tle) = 25544 LIMIT 1)
|
||||||
|
|
||||||
|
-- This does NOT use the index (CTE is opaque to the planner):
|
||||||
|
WITH iss AS (SELECT tle FROM catalog WHERE tle_norad_id(tle) = 25544)
|
||||||
|
SELECT ... ORDER BY tle <-> iss.tle -- falls back to full scan + sort
|
||||||
|
```
|
||||||
|
|
||||||
|
The CTE pattern works correctly but forces PostgreSQL to compute all distances and sort, which is much slower for large catalogs. For small catalogs (< 100 rows), the difference is negligible.
|
||||||
|
</Aside>
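
To confirm which plan a given KNN query actually gets, inspect it with `EXPLAIN`. A minimal sketch, assuming the `satellite_catalog` table and GiST index from the reproduction section:

```sql
-- Plan shape only; add ANALYZE, BUFFERS for timings.
EXPLAIN (COSTS OFF)
SELECT name
FROM satellite_catalog
ORDER BY tle <-> (SELECT tle FROM satellite_catalog
                  WHERE tle_norad_id(tle) = 25544 LIMIT 1)
LIMIT 10;
```

An index-ordered plan shows a `Limit` over an index scan on the GiST index with an `Order By:` line, and the subquery appears as an `InitPlan`. A fallback plan shows a `Sort` node above a sequential scan instead.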
|
||||||
|
|
||||||
## Reproducing these benchmarks
|
## Reproducing these benchmarks
|
||||||
|
|
||||||
<Tabs>
|
<Tabs>
|
||||||
<TabItem label="Requirements">
|
<TabItem label="Requirements">
|
||||||
- PostgreSQL 17 with pg_orrery installed
|
- PostgreSQL 18 with pg_orrery installed
|
||||||
- A satellite catalog table with ~12,000 TLEs (see [Building TLE Catalogs](/guides/catalog-management/) or download directly from CelesTrak)
|
- A satellite catalog table (the numbers on this page use a 66,440-object catalog merged from Space-Track, CelesTrak, SatNOGS, and CelesTrak SupGP; see [Building TLE Catalogs](/guides/catalog-management/))
|
||||||
|
- GiST and SP-GiST indexes on the `tle` column for index benchmarks
|
||||||
- A star catalog table (any subset of Hipparcos or Yale BSC)
|
- A star catalog table (any subset of Hipparcos or Yale BSC)
|
||||||
- No concurrent queries during measurement
|
- No concurrent queries during measurement
|
||||||
- `shared_buffers` and `work_mem` at default or higher
|
- `shared_buffers` and `work_mem` at default or higher
|
||||||
@ -295,12 +377,16 @@ For a 65,886-object catalog and a 2-hour window from Eagle, Idaho:
|
|||||||
|
|
||||||
-- Load a TLE catalog (pg-orrery-catalog handles this)
|
-- Load a TLE catalog (pg-orrery-catalog handles this)
|
||||||
-- pg-orrery-catalog build --table satellite_catalog | psql -d mydb
|
-- pg-orrery-catalog build --table satellite_catalog | psql -d mydb
|
||||||
CREATE TABLE satellite_catalog (tle tle);
|
CREATE TABLE satellite_catalog (name text, tle tle);
|
||||||
-- (or COPY from CelesTrak bulk TLE file)
|
-- (or COPY from CelesTrak bulk TLE file)
|
||||||
|
|
||||||
|
-- Create both indexes for full benchmark coverage
|
||||||
|
CREATE INDEX idx_tle_gist ON satellite_catalog USING gist (tle tle_ops);
|
||||||
|
CREATE INDEX idx_tle_spgist ON satellite_catalog USING spgist (tle tle_spgist_ops);
|
||||||
|
|
||||||
-- Verify catalog size
|
-- Verify catalog size
|
||||||
SELECT count(*) FROM satellite_catalog;
|
SELECT count(*) FROM satellite_catalog;
|
||||||
-- Expected: ~12,000 rows
|
-- The numbers on this page use 66,440 rows
|
||||||
|
|
||||||
-- Disable parallel workers for baseline measurement
|
-- Disable parallel workers for baseline measurement
|
||||||
SET max_parallel_workers_per_gather = 0;
|
SET max_parallel_workers_per_gather = 0;
|
||||||
@ -326,6 +412,8 @@ For a 65,886-object catalog and a 2-hour window from Eagle, Idaho:
|
|||||||
|
|
||||||
The benchmarks demonstrate that pg_orrery's computation cost is low enough to treat orbital mechanics as a SQL primitive. Propagating an entire satellite catalog takes less time than a typical index scan on a moderately-sized table. Planet observation is fast enough to generate ephemeris tables with `generate_series`. Pork chop plots are feasible as interactive queries rather than batch jobs.
|
The benchmarks demonstrate that pg_orrery's computation cost is low enough to treat orbital mechanics as a SQL primitive. Propagating an entire satellite catalog takes less time than a typical index scan on a moderately-sized table. Planet observation is fast enough to generate ephemeris tables with `generate_series`. Pork chop plots are feasible as interactive queries rather than batch jobs.
|
||||||
|
|
||||||
The visibility cone filter (`&?`) is the fastest operation per evaluation --- three floating-point comparisons vs. the full SGP4 pipeline --- and its 80--92% pruning rate means the most expensive operation in a pass prediction pipeline (SGP4 propagation) only runs on the small fraction of the catalog that could actually produce a visible pass.
|
The visibility cone filter (`&?`) is the fastest operation per evaluation --- three floating-point comparisons vs. the full SGP4 pipeline --- and its 84--90% pruning rate means the most expensive operation in a pass prediction pipeline (SGP4 propagation) only runs on the small fraction of the catalog that could actually produce a visible pass.
|
||||||
|
|
||||||
The numbers also show where the bottlenecks are: VSOP87 series evaluation dominates everything except star observation, raw SGP4 propagation, and the visibility cone filter. If a future optimization effort targets one component, it should be the VSOP87 evaluation loop.
|
The GiST index provides the clearest speedup for conjunction screening: 5.8x faster than sequential scan for ISS `&&` queries, with 0 false positives or negatives verified against exhaustive sequential evaluation. KNN queries find the nearest orbits in 2 ms via index-ordered traversal, which would otherwise require computing and sorting all 66,440 distances.
|
||||||
|
|
||||||
|
The numbers also show where the bottlenecks are: VSOP87 series evaluation dominates everything except star observation, raw SGP4 propagation, and the geometric filters. If a future optimization effort targets one component, it should be the VSOP87 evaluation loop.
|
||||||
|
|||||||
@ -134,17 +134,23 @@ WHERE c.tle && iss.tle
|
|||||||
<TabItem label="kNN by altitude">
|
<TabItem label="kNN by altitude">
|
||||||
```sql
|
```sql
|
||||||
-- Find the 10 satellites with the closest altitude bands to the ISS
|
-- Find the 10 satellites with the closest altitude bands to the ISS
|
||||||
-- The <-> operator supports GiST ordering (ORDER BY ... <-> ...)
|
-- The <-> operator supports GiST index ordering (ORDER BY ... <-> ...)
|
||||||
WITH iss AS (
|
-- IMPORTANT: use a scalar subquery for the probe TLE so the planner
|
||||||
SELECT tle FROM satellite_catalog WHERE norad_id = 25544
|
-- can see it as a constant and activate index-ordered scan.
|
||||||
)
|
SELECT c.name,
|
||||||
SELECT c.norad_id, c.name,
|
round((c.tle <-> (SELECT tle FROM satellite_catalog
|
||||||
round((c.tle <-> iss.tle)::numeric, 1) AS alt_sep_km
|
WHERE tle_norad_id(tle) = 25544 LIMIT 1))::numeric, 1)
|
||||||
FROM satellite_catalog c, iss
|
AS alt_sep_km
|
||||||
WHERE c.norad_id != 25544
|
FROM satellite_catalog c
|
||||||
ORDER BY c.tle <-> iss.tle
|
WHERE tle_norad_id(c.tle) != 25544
|
||||||
|
ORDER BY c.tle <-> (SELECT tle FROM satellite_catalog
|
||||||
|
WHERE tle_norad_id(tle) = 25544 LIMIT 1)
|
||||||
LIMIT 10;
|
LIMIT 10;
|
||||||
```
|
```
|
||||||
|
|
||||||
|
<Aside type="caution" title="CTE pattern prevents index ordering">
|
||||||
|
A CTE like `WITH iss AS (SELECT tle ...)` makes the probe value opaque to the planner, forcing a full sequential scan and sort instead of an index-ordered traversal. Always use a scalar subquery `(SELECT tle FROM ... LIMIT 1)` for the probe argument. For small catalogs (< 100 rows) the difference is negligible; for large catalogs it is the difference between 2 ms and a full sort.
|
||||||
|
</Aside>
|
||||||
</TabItem>
|
</TabItem>
|
||||||
<TabItem label="Two-stage screening">
|
<TabItem label="Two-stage screening">
|
||||||
```sql
|
```sql
|
||||||
@ -171,10 +177,18 @@ ORDER BY dist_km;
|
|||||||
|
|
||||||
### Performance
|
### Performance
|
||||||
|
|
||||||
Without the GiST index, the `&&` operator requires a sequential scan of the entire catalog (O(n) per query). With the index, overlap queries run in O(log n) time. For a catalog of 12,000 active TLEs, this reduces conjunction screening from seconds to milliseconds.
|
Benchmarked against a 66,440-object catalog (Space-Track + CelesTrak + SatNOGS):
|
||||||
|
|
||||||
|
| Query | GiST | Seqscan | Matches | Speedup |
|
||||||
|
|-------|------|---------|---------|---------|
|
||||||
|
| ISS conjunction (`&&`) | 4.6 ms | 63.3 ms | 9 | 5.8x |
|
||||||
|
| 10 nearest to ISS (`<->` KNN) | 2.1 ms | — | 10 | Index-ordered |
|
||||||
|
| 10 nearest to GEO sat (`<->` KNN) | 0.2 ms | — | 10 | Sparse regime |
|
||||||
|
|
||||||
|
The GiST index (15 MB, 93 ms build) provides the clearest speedup for conjunction screening. The `&&` operator reduces the search from 1,338 buffer hits (sequential scan) to 237 buffer hits (index scan). KNN queries traverse the tree by increasing distance without computing all distances upfront.
|
||||||
|
|
||||||
<Aside type="tip">
|
<Aside type="tip">
|
||||||
The GiST index is most valuable for large catalogs (thousands of TLEs). For small catalogs (< 100 TLEs), sequential scans may be faster than the index overhead. PostgreSQL's query planner handles this decision automatically.
|
For small catalogs (< 100 TLEs), a sequential scan may be faster than an index scan because of index traversal overhead. PostgreSQL's query planner handles this decision automatically. The GiST index shows the largest relative speedup when the query returns few matches against a large catalog --- exactly the conjunction screening pattern.
|
||||||
</Aside>
|
</Aside>
|
||||||
|
|
||||||
### Index Maintenance
|
### Index Maintenance
|
||||||
@ -288,15 +302,24 @@ During the index scan, inner nodes are pruned by altitude band (level 0) and inc
|
|||||||
|
|
||||||
### Performance
|
### Performance
|
||||||
|
|
||||||
The `&?` operator eliminates 80-90% of a satellite catalog without SGP4 propagation --- this is the primary value, regardless of whether a sequential scan or index scan evaluates it. At typical catalog sizes (10-30k objects), the operator evaluates the full catalog in under 10 ms, and PostgreSQL's query planner may choose a sequential scan over the index.
|
The `&?` operator eliminates 84--90% of a satellite catalog without SGP4 propagation --- this is the primary value, regardless of whether a sequential scan or index scan evaluates it.
|
||||||
|
|
||||||
The SP-GiST index becomes advantageous at larger catalog sizes (100k+ objects) where tree-level pruning avoids examining individual TLEs in entire subtrees. The index is most effective for:
|
Benchmarked against a 66,440-object catalog:
|
||||||
|
|
||||||
|
| Query | Seqscan | SP-GiST | Candidates | Pruned |
|
||||||
|
|-------|---------|---------|------------|--------|
|
||||||
|
| 2h, Eagle ID, 10° | 12.1 ms | 16.1 ms | 10,763 | 83.8% |
|
||||||
|
| 2h, Equator, 10° | 12.1 ms | 16.8 ms | 10,174 | 84.7% |
|
||||||
|
| 2h, Eagle ID, 45° | 11.9 ms | 16.9 ms | 6,796 | 89.8% |
|
||||||
|
| 24h, Eagle ID, 10° | 12.5 ms | 23.3 ms | 61,426 | 7.5% |
|
||||||
|
|
||||||
|
At 66k objects, the sequential scan is faster than the SP-GiST index for all tested scenarios. The `&?` operator is so cheap per evaluation (three floating-point comparisons) that tree traversal overhead exceeds the pruning benefit at this catalog size. Pruning is strongest, and the index is most likely to pay off, for:
|
||||||
|
|
||||||
- **Short query windows** (1-6 hours): The RAAN filter aggressively eliminates satellites whose orbital plane is not currently aligned with the observer
|
- **Short query windows** (1-6 hours): The RAAN filter aggressively eliminates satellites whose orbital plane is not currently aligned with the observer
|
||||||
- **Mid-latitude observers** (30-60 degrees): The inclination filter eliminates equatorial and low-inclination satellites
|
- **Higher minimum elevation** (> 20 degrees): The altitude filter eliminates distant MEO/GEO objects
|
||||||
- **High minimum elevation** (> 20 degrees): The altitude filter eliminates distant MEO/GEO objects
|
- **Larger catalogs** (200k+ objects): Tree-level pruning avoids examining individual TLEs in entire subtrees
|
||||||
|
|
||||||
For 24-hour query windows, the RAAN filter self-disables (full Earth rotation makes it meaningless), and only the altitude and inclination filters apply.
|
For 24-hour query windows, the RAAN filter self-disables (full Earth rotation makes it meaningless), and only the altitude and inclination filters apply. The real value of the `&?` operator is as a gating filter before expensive SGP4 propagation, not the scan method itself.
|
||||||
|
|
||||||
### Index Maintenance
|
### Index Maintenance
|
||||||
|
|
||||||
|
|||||||