pg_orrery/docs/DESIGN.md
Ryan Malloy 15fa553c0e Add optional JPL DE440/441 ephemeris support (v0.3.0)
Clean-room DE binary reader (~400 lines C) with Chebyshev/Clenshaw
evaluation — no GPL dependency on jpl_eph. Per-backend lazy
initialization preserves PARALLEL SAFE. Existing VSOP87/ELP82B
functions stay IMMUTABLE; new _de() variants are STABLE with
automatic fallback to compiled-in ephemerides on any DE failure.

Implementation:
- de_reader.c: header parse, record seek, Clenshaw recurrence
- eph_provider.c: GUC (pg_orbit.ephemeris_path), lazy init,
  ICRS-to-ecliptic frame rotation, on_proc_exit cleanup
- de_funcs.c: 11 new SQL functions (_de variants + diagnostics)
- Constant chain of custody rules 6-8 (frame rotation,
  same-provider, AU consistency)

Extract observe_from_geocentric() to astro_math.h for shared use
by planet_funcs.c, moon_funcs.c, and de_funcs.c.

57 → 68 functions, 11 → 12 regression test suites, all passing.
2026-02-16 19:54:48 -07:00

32 KiB

pg_orbit Design Document

Internal architecture notes. Documents WHY decisions were made, not how to use the extension. Intended audience: future maintainers who need to modify pg_orbit without breaking physical correctness.

1. Constant Chain of Custody

This is the single most critical design constraint in the extension. Get it wrong and positions silently drift by kilometers. There is no runtime check that can detect this class of error after the fact.

The Problem

Two-Line Elements are not raw orbital measurements. They are mean elements produced by a differential correction process that fits observed positions against an SGP4 propagator running with a specific set of geopotential constants (WGS-72). The mean elements absorb geodetic model biases -- the eccentricity, inclination, and mean motion encode corrections that only make physical sense when propagated with the same constants used to generate them.

Substituting WGS-84 constants (or any other set) into the propagator does not "upgrade" accuracy. It breaks the internal consistency of the element set. The resulting position error can exceed the natural prediction error of the TLE by an order of magnitude.

The Rules

  1. SGP4/SDP4 propagation: WGS-72 constants only (mu, ae, J2, J3, J4, ke). These flow through sat_code's norad_in.h defines and are never overridden.

  2. Coordinate output (geodetic lat/lon/alt, topocentric az/el/range): WGS-84 ellipsoid (a = 6378.137 km, f = 1/298.257223563). This is the modern standard for ground-station positioning and GPS receivers.

  3. TEME frame: The SGP4 output frame (True Equator, Mean Equinox) uses only 4 of the 106 IAU-80 nutation terms. Applying the full nutation model would "correct" for effects that SGP4 already accounts for internally, introducing error rather than removing it.

  4. No other combination is valid. WGS-72 for propagation, WGS-84 for output. Perigee and apogee altitudes from tle_perigee() and tle_apogee() use WGS-72 AE because they derive from mean elements. Geodetic altitude from eci_to_geodetic() uses WGS-84 because it converts a physical position.

Constant Inventory

Constant Source Paper Value pg_orbit Location sat_code Location
ae (equatorial radius) Hoots & Roehrich STR#3 6378.135 km types.h:20 (WGS72_AE) norad_in.h:64 (earth_radius_in_km)
J2 Hoots & Roehrich STR#3 0.001082616 types.h:21 (WGS72_J2) norad_in.h:69 (xj2)
J3 Hoots & Roehrich STR#3 -2.53881e-6 types.h:22 (WGS72_J3) norad_in.h:63 (xj3)
J4 Hoots & Roehrich STR#3 -1.65597e-6 types.h:23 (WGS72_J4) norad_in.h:79 (xj4)
ke Hoots & Roehrich STR#3 0.0743669161331734132 min^-1 types.h:24 (WGS72_KE) norad_in.h:83 (xke)
mu Hoots & Roehrich STR#3 398600.8 km^3/s^2 types.h:19 (WGS72_MU) (implicit in xke)
WGS-84 a NIMA TR8350.2 6378.137 km types.h:31 (WGS84_A) --
WGS-84 f NIMA TR8350.2 1/298.257223563 types.h:32 (WGS84_F) --

Note that types.h carries a parallel copy of the WGS-72 constants even though sat_code defines them in norad_in.h. This is intentional: types.h is the single header for all pg_orbit C sources, and norad_in.h is an internal sat_code header not meant for external consumers. The GiST index (gist_tle.c) and TLE accessor functions (tle_type.c) need KE and AE without pulling in sat_code internals. The values MUST be identical.

Why Two Copies of AE?

tle_perigee(), tle_apogee(), and the GiST altitude-band computation all use WGS72_KE and WGS72_AE from types.h. They compute:

a_er = (KE / n)^(2/3)           [earth radii]
perigee_km = a_er * (1 - e) * AE - AE

This MUST use WGS-72 AE (6378.135), not WGS-84 (6378.137), because n is a mean motion fitted against the WGS-72 geopotential. Using the wrong radius shifts every altitude by 2 meters -- small in absolute terms but wrong in principle, and the error compounds in index operations.

2. SGP4 Implementation Choice

pg_orbit wraps Bill Gray's sat_code library (MIT license, Project Pluto).

Why sat_code over alternatives

Vallado's reference implementation (from the STR#3 revision paper) is the canonical source but has two problems: it is written in C++ with heavy use of global state, and its license is unclear for embedding in a PostgreSQL extension.

libsgp4 (various forks on GitHub) is typically a C++ class hierarchy that assumes an object-per-satellite lifecycle. This conflicts with PostgreSQL's per-call execution model where we cannot persist C++ objects across function invocations.

sat_code was chosen because:

  1. Pure C linkage. All public functions are declared extern "C" in norad.h (lines 97-133). The library is compiled as C++ but exposes a flat C function interface: SGP4_init(), SGP4(), SDP4_init(), SDP4(), parse_elements(), select_ephemeris().

  2. No global mutable state. The propagator state lives in a caller- allocated double params[N_SAT_PARAMS] array. This maps directly to PostgreSQL's palloc-based memory model: allocate before propagation, free after.

  3. Includes deep-space SDP4. Many SGP4 implementations only handle near-earth orbits (period < 225 minutes). sat_code includes the full SDP4 with lunar/solar perturbations via deep.cpp, handling GEO, Molniya, and GPS orbits.

  4. MIT license. Compatible with the PostgreSQL License for embedding in a shared library.

  5. Actively maintained. Used in Bill Gray's Find_Orb production astrometry software. Bug fixes reach us through the git submodule.

Build Integration

The Makefile compiles sat_code's .cpp files with g++ and links the resulting .o files into the PostgreSQL shared library alongside our C sources. The -lstdc++ link flag pulls in the C++ runtime. This is the same pattern used by PostGIS for GEOS integration (C extension linking C++ library objects).

src/*.c       --> gcc --> .o  --|
lib/sat_code/*.cpp -> g++ -> .o --|--> pg_orbit.so
                              -lstdc++ -lm

The -I$(SAT_CODE_DIR) flag lets our C files #include "norad.h" directly.

3. Type System Design

Design Principles

Every pg_orbit type is fixed-size, not varlena. This means:

  • No TOAST overhead (no detoasting on access)
  • Direct pointer access via PG_GETARG_POINTER(n) -- no copy
  • Predictable memory layout for binary I/O (tle_recv/tle_send)
  • All types use ALIGNMENT = double to satisfy the strictest member alignment requirement (all structs contain double fields)
  • STORAGE = plain -- the type is never compressed or moved to TOAST

TLE Type (112 bytes)

The TLE type stores parsed mean elements, not raw text.

This is the most important type design decision. Alternatives considered:

  1. Store raw 69+69 character lines, parse on every propagation. Rejected. Parsing is ~10x slower than propagation itself for a pre-initialized model. Every sgp4_propagate() call would pay the parse cost.

  2. Store raw text plus parsed cache. Rejected. Doubles the storage for no benefit. The parsed form round-trips perfectly through write_elements_in_tle_format().

  3. Store parsed mean elements only. Chosen. Input validation happens once at tle_in() time via sat_code's parse_elements(). The text representation is reconstructed on output via write_elements_in_tle_format().

Storage layout:

Offset  Size  Field
  0     88    11 doubles: epoch, inclination, raan, eccentricity,
              arg_perigee, mean_anomaly, mean_motion,
              mean_motion_dot, mean_motion_ddot, bstar
              (one unused slot in the double array for alignment)
 80     12    3 int32: norad_id, elset_num, rev_num
 92     12    classification (1), ephemeris_type (1),
              intl_desig (9), pad (1)
----
112 bytes total

The SQL definition declares INTERNALLENGTH = 112. This is smaller than the raw two-line text (138+ bytes with line separator) and avoids the 4-byte varlena header overhead.

Angular elements are stored in radians (matching sat_code's internal representation). Accessor functions convert to degrees for human consumption. Mean motion is stored in radians/minute; the accessor returns revolutions/day.

ECI Position Type (48 bytes)

Six doubles: x, y, z (km), vx, vy, vz (km/s).

SGP4 outputs velocity in km/min. We convert to km/s at the boundary (sgp4_funcs.c, lines 181-183: vel[i] / 60.0). This conversion happens exactly once, at the point where the pg_eci struct is populated. Internally, all velocity in pg_orbit is km/s.

Geodetic Type (24 bytes)

Three doubles: lat, lon (radians), alt (km above WGS-84).

Radians internally, degrees in text representation. The accessor functions perform the conversion.

Observer Type (24 bytes)

Three doubles: lat, lon (radians), alt_m (meters above WGS-84).

Note the asymmetry with geodetic: observer altitude is in meters (matching GPS receiver output and ground station databases), while geodetic altitude is in km (matching orbital altitude conventions). This prevents unit confusion at the API boundary -- you set up a ground station in meters, you get satellite altitude in kilometers.

Topocentric Type (32 bytes)

Four doubles: azimuth, elevation (radians), range_km, range_rate (km/s).

Pass Event Type (48 bytes)

Three int64 (TimestampTz): aos_time, max_el_time, los_time. Three doubles: max_elevation (degrees), aos_azimuth (degrees), los_azimuth (degrees).

Times are stored as native PostgreSQL TimestampTz values (microseconds since 2000-01-01 00:00:00 UTC). This allows direct comparison with SQL timestamp expressions without conversion.

4. TEME to ECEF Transform

SGP4 outputs position and velocity in the TEME (True Equator, Mean Equinox) frame. Converting to Earth-fixed coordinates for geodetic and topocentric output requires a frame rotation.

The Rotation

TEME to ECEF is a single Z-axis rotation by the negative of the Greenwich Mean Sidereal Time (GMST) angle:

    [x_ecef]   [ cos(g)  sin(g)  0 ] [x_teme]
    [y_ecef] = [-sin(g)  cos(g)  0 ] [y_teme]
    [z_ecef]   [   0       0     1 ] [z_teme]

This is implemented in coord_funcs.c:teme_to_ecef() and pass_funcs.c:teme_to_ecef() (duplicated -- see note below).

GMST Computation

GMST is computed from the Julian date using the IAU 1982 formula, which is the same low-precision model that SGP4 uses internally. From Vallado (2013) Eq. 3-47:

    T_UT1 = (JD - 2451545.0) / 36525.0
    GMST  = 67310.54841
          + (876600*3600 + 8640184.812866) * T_UT1
          + 0.093104 * T_UT1^2
          - 6.2e-6 * T_UT1^3

The result is in seconds of time, converted to radians by multiplying by pi/43200 and normalized to [0, 2*pi).

We deliberately do NOT use a higher-precision GMST model (e.g., IAU 2000A). The SGP4 output is only accurate to the precision of its own GMST model; applying a more precise rotation would not improve the final position and could introduce a systematic offset.

Velocity Transform

The velocity transform includes the Earth angular velocity cross product term, accounting for the rotating reference frame:

    v_ecef = R(-g) * v_teme + omega_E x r_ecef

where omega_E = 7.2921158553e-5 rad/s. In pass_funcs.c, the velocity cross product uses omega_E in rad/min (multiplied by 60.0) because sat_code outputs velocity in km/min.

ECEF to Geodetic

The ECEF-to-geodetic conversion uses Bowring's iterative method, which converges in 2-3 iterations for LEO altitudes:

    phi_0 = atan2(z, p * (1 - e^2))          [initial estimate]
    N     = a / sqrt(1 - e^2 * sin^2(phi))   [prime vertical radius]
    phi   = atan2(z + e^2 * N * sin(phi), p)  [iterate]

Convergence criterion: 1.0e-12 radians (sub-millimeter at Earth's surface). Maximum 10 iterations as a safety bound, though LEO cases converge in 2-3.

We chose Bowring's method over Vermeille's or Heikkinen's closed-form solutions because it is simpler, well-tested, and the iteration cost is negligible compared to the SGP4 propagation that precedes it.

Why Duplicated Helpers

coord_funcs.c and pass_funcs.c each contain their own static copies of pg_tle_to_sat_code(), gmst_from_jd(), teme_to_ecef(), observer_to_ecef(), and ecef_to_topocentric(). All .c files link into a single .so, so we could share these through a common object file. We chose duplication instead because:

  1. Each translation unit is self-contained. No hidden coupling between files through shared internal functions.
  2. The functions are small (5-20 lines each). The binary size increase is negligible.
  3. The compiler can inline them within each translation unit.
  4. If the helpers ever need to diverge (e.g., pass_funcs using km/min velocity while coord_funcs uses km/s), they can do so without affecting each other.

5. Pass Prediction Algorithm

The Core Problem

Given a TLE and a ground observer, find all time intervals where the satellite is above the observer's local horizon. SGP4 has no closed-form inverse for the elevation function -- you cannot algebraically solve "at what time does elevation = 0?" You have to evaluate the function at many points and search for zero crossings.

This is the same fundamental approach used by Skyfield's find_events() and Gpredict's pass finder.

Algorithm

  1. Coarse scan: Step through the search window at 30-second intervals, evaluating elevation at each step. A 30-second step is fine enough for LEO (~90 min period, ~10 min pass duration) that no pass shorter than about 1 minute gets skipped, and coarse enough that a 7-day window requires ~20,000 propagation calls (fast on modern hardware).

  2. AOS detection: When elevation transitions from negative to positive between consecutive coarse steps, a rising edge has been found. Bisect the interval to locate AOS to within 0.1 second tolerance (BISECT_TOL_JD = 0.1/86400).

  3. LOS detection: Continue the coarse scan from AOS until elevation goes negative again. Bisect to refine LOS to the same 0.1 second tolerance.

  4. Peak elevation: Between AOS and LOS, the elevation function is unimodal (one peak). Use ternary search (50 iterations) to locate the maximum. Ternary search is chosen over golden section because the convergence rate is adequate and the implementation is simpler.

  5. Degenerate pass filter: Passes shorter than 10 seconds (MIN_PASS_DURATION_JD) are discarded. These are typically numerical noise from orbits just barely grazing the horizon.

  6. Minimum elevation filter: Passes whose peak elevation falls below the caller-specified threshold are silently skipped. The scan resumes from their LOS.

  7. Post-LOS gap: After finding a complete pass, the scan resumes 1 minute past LOS (POST_LOS_GAP_JD) to avoid re-detecting the same descending arc.

Error Handling in the Scan

If SGP4 returns a hard error (negative semi-major axis, nearly parabolic, convergence failure) during the elevation scan, elevation_at_jd() returns -pi radians. This is well below any real horizon elevation, so the scan treats the point as "below horizon" and continues. The pass finder does NOT abort the query on propagation errors during scanning -- a TLE might be valid for part of the search window and decayed for the rest.

SRF (Set-Returning Function) Pattern

predict_passes() uses PostgreSQL's ValuePerCall SRF protocol:

  • First call: allocate a predict_passes_ctx struct in funcctx->multi_call_memory_ctx, copy the TLE and observer into it, initialize the scan position.
  • Each subsequent call: call find_next_pass() starting from the current scan position. If a pass is found, advance current_jd past LOS and return the pass. If not, return DONE.

The TLE and observer are copied into the multi_call context because the original function arguments may be freed between calls.

6. GiST Index Architecture

The Indexing Problem

The natural query pattern for conjunction screening is: "which TLEs have orbits that could intersect with this TLE?" Full conjunction analysis requires propagating both objects and computing distance at every timestep -- far too expensive for an index scan.

Altitude-Band Approximation

Every orbit has a perigee altitude and an apogee altitude. Two orbits can only come close to each other if their altitude ranges overlap. This is a necessary but not sufficient condition for conjunction.

The GiST index stores [perigee_km, apogee_km] as its internal key (tle_alt_range struct: two doubles, 16 bytes). This is derived from the mean elements via Kepler's third law:

    a = (KE / n)^(2/3)           [earth radii, WGS-72]
    perigee_km = a * (1 - e) * AE - AE
    apogee_km  = a * (1 + e) * AE - AE

The altitude band uses WGS-72 constants because n and e are WGS-72 mean elements.

GiST Support Functions

Function Strategy Description
compress -- Leaf: extract [perigee, apogee] from pg_tle. Internal: pass through (already a range).
decompress -- Identity. We operate on compressed keys directly.
consistent && (3), @> (7), <@ (8) Does the subtree's bounding range overlap/contain the query range?
union -- [min(low), max(high)] over all entries in a node.
penalty -- Increase in altitude span (km) from expanding the node's range to include the new entry.
picksplit -- Sort entries by altitude midpoint, split at the median.
same -- Endpoint comparison within 1e-9 km tolerance.
distance <-> (15) Minimum gap between altitude bands (0 if overlapping). Drives KNN queries.

Always Recheck

gist_tle_consistent() always sets *recheck = true. Altitude band overlap eliminates the vast majority of non-candidates (a LEO satellite cannot conjunct with a GEO satellite), but it cannot confirm conjunction. Two objects at the same altitude but on opposite sides of the earth are not close. After the index scan filters candidates, the query must propagate both TLEs to the reference time and compute actual distance.

Picksplit Strategy

The picksplit function sorts entries by the midpoint of their altitude band ((perigee + apogee) / 2) and splits at the median. This is optimal for a 1-D range index because it minimizes overlap between the two resulting subtrees. Orbits at similar altitudes end up in the same subtree, which matches the access pattern of conjunction screening (you are usually querying within a narrow altitude band).

7. Memory Management

Allocation Strategy

All heap allocation goes through palloc() / pfree(). No malloc(), no new, no static buffers.

  • Single-shot propagation (sgp4_propagate, tle_distance): palloc(sizeof(double) * N_SAT_PARAMS) for the params array, propagate, pfree(params). The params array lives in the current memory context and is freed before the function returns.

  • Set-returning functions (sgp4_propagate_series, ground_track, predict_passes): The SRF context struct (containing tle_t, params[N_SAT_PARAMS], and scan state) is allocated in funcctx->multi_call_memory_ctx. This context survives across calls and is freed automatically when the SRF completes.

  • Type I/O functions (tle_in, eci_in, etc.): The result struct is allocated with palloc() in the current context. PostgreSQL manages its lifecycle.

No Global Mutable State

For v0.1.0/v0.2.0 functions, there are no file-scope variables, no static locals that accumulate state, no caches. Every function computes from its arguments alone.

The v0.3.0 DE ephemeris layer introduces per-backend static state in eph_provider.c (file descriptor, coefficient cache, init flags). This is safe because each backend gets its own copy after fork(). The handle is cleaned up via on_proc_exit(). All pg_orbit functions remain PARALLEL SAFE -- parallel workers each open their own DE handle independently.

sat_code itself has no global mutable state. The propagator state is entirely in the params[] array and the tle_t struct, both of which are caller-provided.

SRF Context Layout

For sgp4_propagate_series and ground_track, the params array is embedded directly in the context struct (not a separate allocation):

typedef struct {
    tle_t       sat;
    double      params[N_SAT_PARAMS];  /* ~92 doubles, embedded */
    int         is_deep;
    double      epoch_jd;
    int64       start_ts;
    int64       step_usec;
} propagate_series_ctx;

This puts everything in a single allocation, reducing palloc overhead and keeping the data contiguous in memory for cache locality during the per-row propagation loop.

8. Error Handling

Error Classification

sat_code returns integer error codes from SGP4() and SDP4():

Code Constant Severity Meaning pg_orbit Response
0 -- OK Normal Return result
-1 SXPX_ERR_NEARLY_PARABOLIC Fatal eccentricity >= 1 ereport(ERROR)
-2 SXPX_ERR_NEGATIVE_MAJOR_AXIS Fatal Decayed orbit ereport(ERROR)
-3 SXPX_WARN_ORBIT_WITHIN_EARTH Warning Entire orbit below surface ereport(NOTICE), return result
-4 SXPX_WARN_PERIGEE_WITHIN_EARTH Warning Perigee below surface ereport(NOTICE), return result
-5 SXPX_ERR_NEGATIVE_XN Fatal Negative mean motion ereport(ERROR)
-6 SXPX_ERR_CONVERGENCE_FAIL Fatal Kepler equation diverged ereport(ERROR)

The distinction between warnings (-3, -4) and errors (-1, -2, -5, -6) is physical. A satellite with perigee within the earth is plausible during reentry or shortly after launch -- the state vector is still mathematically valid. A negative semi-major axis means the orbit has decayed past the point where the model produces meaningful output.

Error Handling in Different Contexts

Direct propagation (sgp4_propagate, tle_distance): Fatal errors abort the query via ereport(ERROR). Warnings emit a NOTICE and return the result.

Pass prediction (elevation_at_jd in pass_funcs.c): Fatal errors return -pi elevation instead of aborting. The scan treats this as "below horizon" and continues. A TLE might be valid for the first 3 days of a 7-day search window and then decay -- the pass finder should return the valid passes, not abort.

SRF propagation (sgp4_propagate_series): Fatal errors abort the entire series. There is no meaningful way to "skip" a bad timestep in a continuous time series.

Input Validation

TLE parsing errors from parse_elements() are caught in tle_in() and reported as ERRCODE_INVALID_TEXT_REPRESENTATION. Invalid TLEs never make it into storage.

select_ephemeris() returning -1 (invalid mean motion or eccentricity range) is caught at propagation time, not at storage time. A TLE with marginal elements might parse correctly but fail when you try to initialize the propagator.

9. Theory-to-Code Mapping

This table maps key equations from the SGP4 theory papers to their implementation in pg_orbit and sat_code.

Theory Paper What Code Location
Mean element recovery Brouwer (1959) Recover original mean motion (xnodp) and semi-major axis (aodp) from input TLE elements, removing secular J2 perturbations sat_code/common.cpp:sxpall_common_init() lines 17-35
Secular perturbations Lane & Cranford (1969), Hoots & Roehrich STR#3 Secular rates of mean anomaly, argument of perigee, and RAAN due to J2, J4 sat_code/common.cpp:sxpx_common_init() lines 86-101
Atmospheric drag Hoots & Roehrich STR#3 B* formulation of drag, C1/C2/C4 coefficients, perigee-dependent s parameter sat_code/common.cpp:sxpx_common_init() lines 47-84; sat_code/sgp4.cpp:SGP4_init()
Short-period perturbations Lane & Cranford (1969), Brouwer (1959) Oscillatory corrections to radius, argument of latitude, node, and inclination sat_code/common.cpp:sxpx_posn_vel() lines 121-229
Kepler equation Classical Newton-Raphson with second-order correction, bounded first step sat_code/common.cpp:sxpx_posn_vel() lines 175-208
Deep-space resonance Hujsak (1979) Lunar and solar gravitational perturbations, geopotential resonance for 12-hour and 24-hour orbits sat_code/deep.cpp:Deep_dpinit(), Deep_dpsec(), Deep_dpper()
Near-earth propagation Hoots & Roehrich STR#3 SGP4 main loop: secular + short-period + drag terms sat_code/sgp4.cpp:SGP4()
Deep-space propagation Hoots & Roehrich STR#3 SDP4: SGP4 core + deep-space secular/periodic corrections sat_code/sdp4.cpp:SDP4()
Semi-major axis from n Kepler's third law a = (KE / n)^(2/3) in earth radii src/tle_type.c:tle_perigee() line 415; src/gist_tle.c:tle_to_alt_range() line 76
GMST Vallado (2013) Eq. 3-47 Greenwich Mean Sidereal Time from Julian date src/coord_funcs.c:gmst_from_jd() lines 59-73; src/pass_funcs.c:gmst_from_jd() lines 129-151
TEME to ECEF Vallado (2013) Z-axis rotation by -GMST, velocity cross-product correction src/coord_funcs.c:teme_to_ecef() lines 83-103; src/pass_funcs.c:teme_to_ecef() lines 157-179
Geodetic from ECEF Bowring (1976) Iterative latitude from ECEF Cartesian coordinates on WGS-84 src/coord_funcs.c:ecef_to_geodetic() lines 109-138
Topocentric transform Standard SEZ ECEF range vector rotated to South-East-Zenith, azimuth from north src/coord_funcs.c:ecef_to_topocentric() lines 163-188
Observer to ECEF Geodesy standard WGS-84 ellipsoid surface point to Cartesian src/coord_funcs.c:observer_to_ecef() lines 143-156
Range rate Dot product Projection of relative velocity onto line-of-sight unit vector src/coord_funcs.c:eci_to_topocentric() line 618
Near/deep selection Hoots & Roehrich STR#3 Period threshold: 225 minutes (n < 2*pi/225 rad/min) sat_code/norad.h:select_ephemeris()

10. JPL DE Ephemeris Architecture (v0.3.0)

v0.3.0 adds optional JPL DE440/441 ephemeris support (~0.1 milliarcsecond accuracy) without modifying any existing VSOP87/ELP82B code path. This section documents the architectural decisions specific to DE integration.

The Fundamental Tension

pg_orbit's core properties (compiled-in coefficients, no file I/O, no mutable state) are precisely what DE441 challenges. A ~3GB binary file introduces file dependency, per-backend state (file descriptor, coefficient cache), and OS-level file descriptor management across PostgreSQL's multi-process model.

The architecture resolves this by treating DE as an additive layer: new _de() function variants alongside existing functions. Existing functions are untouched. VSOP87 is the bulkhead -- always compiled in, always works, always IMMUTABLE.

Why Separate Functions (Not a GUC Switch)

Considered and rejected: a single planet_observe() that dispatches to DE or VSOP87 based on a GUC setting.

The problem is volatility. VSOP87 functions are IMMUTABLE -- their data is compiled into the binary. PostgreSQL can:

  • Cache results in expression indexes (almanac tables, pork chop grids)
  • Constant-fold during planning (bake results into prepared statements)

If planet_observe() dispatched to DE based on a GUC, it would need to be STABLE, losing these optimizations for ALL users, even those without a DE file. Separate _de() variants let VSOP87 users keep IMMUTABLE and DE users get STABLE -- honest volatility for both.

Clean-Room DE Reader Design

The JPL DE binary format (used by DE405, DE430, DE440, DE441) consists of fixed-size records of double values:

  • Record 1 (header): Start/end JD, record interval, number of coefficients, AU value, Earth-Moon mass ratio (EMRAT), coefficient layout table (3 values per body: offset, ncoeff, nsub)
  • Record 2: Constant values
  • Records 3+: Chebyshev polynomial coefficients covering a time interval for all 13 body groups

The reader is implemented in ~250 lines of C (de_reader.c), using:

  • Raw POSIX I/O (open/lseek/read, not fread) to avoid libc buffering issues across fork
  • Clenshaw recurrence for Chebyshev evaluation (~15 lines)
  • Single-record coefficient cache (most queries hit consecutive times)

Why Not jpl_eph

Bill Gray's jpl_eph (GPL-2+) would be the obvious choice, but:

  1. GPL-2+ license constrains pg_orbit's licensing flexibility
  2. Uses global statics (static int init_err_code)
  3. Written in C++ (jpleph.cpp); pg_orbit is pure C
  4. We only need position queries, not velocity or nutation

The format is well-documented and the algorithm is straightforward. A clean-room implementation in ~250 lines avoids all four issues.

Per-Backend Lazy Initialization

PostgreSQL parallel workers are forked from the postmaster, not the leader. They don't inherit the leader's file descriptors or initialized handles.

Each backend (including parallel workers) gets its own independently- opened handle via lazy init on first _de() call:

Backend A calls planet_observe_de() -> opens own fd, own cache
Worker W1 calls planet_observe_de() -> opens own fd, own cache
Worker W2 calls planet_observe_de() -> opens own fd, own cache

Static per-backend variables (de_handle_ptr, de_init_attempted) are safe because after fork(), each process gets its own copy-on-write address space.

Critical rule: Never open the DE file in _PG_init(). That runs in the postmaster, and the file descriptor would be inherited by all children with undefined stream behavior. Always lazy init.

Cleanup is via on_proc_exit(eph_cleanup, 0), registered in _PG_init().

ICRS-to-Ecliptic Frame Rotation

DE ephemerides return positions in the ICRS equatorial frame. The pg_orbit observation pipeline expects ecliptic J2000. The conversion happens at the provider boundary in eph_provider.c:

DE position (ICRS equatorial) -> equatorial_to_ecliptic() -> ecliptic J2000
                                  ^
                                  already in astro_math.h since v0.2.0

This rotation is applied to BOTH the target body and Earth positions before they enter observe_from_geocentric(). The frame conversion happens exactly once, at the provider boundary.

Earth Position from DE

DE ephemerides don't store Earth directly. They store the Earth-Moon Barycenter (EMB, body 3) and the Moon (body 10). Earth's position is derived:

Earth = EMB - Moon / (1 + EMRAT)

where EMRAT (~81.300587) is the Earth-Moon mass ratio from the DE header. This calculation happens in eph_provider.c:de_get_earth_helio_ecliptic().

Constant Chain of Custody Extension

Three new rules for the DE pipeline:

Rule What Why
6 DE positions pass through equatorial_to_ecliptic() at the provider boundary DE returns ICRS equatorial; the observation pipeline expects ecliptic J2000
7 Both target and Earth must come from the same provider Mixing DE Mars with VSOP87 Earth introduces frame inconsistency
8 DE header AU must match compiled-in AU_KM (149597870.7) AU defines the length scale; a mismatch corrupts all distance calculations

Fallback Strategy

Every _de() function follows this pattern:

  1. Try to get position from DE
  2. If DE succeeds, use DE result
  3. If DE fails (no file, JD out of range, I/O error):
    • If DE was explicitly configured (GUC set), emit ereport(NOTICE)
    • Fall back to VSOP87/ELP82B equivalent
    • Return the VSOP87 result

When pg_orbit.ephemeris_path is empty (default), DE functions fall back silently -- no NOTICE, no overhead, identical results to the non-DE variants.

Lauren Bugs (DE-Specific)

Bug What "Can't Happen" How It Happens Mitigation
L1 File won't change mid-query DBA replaces file; new parallel worker opens new version Per-backend handle is stable for backend lifetime
L2 File will always be there Docker restart with ephemeral volume VSOP87 fallback always available
L3 File is always valid Partial download, bit rot, wrong endianness Canary validation: Earth at J2000.0 ~1 AU from Sun; byte-order detection via AU constant
L4 All backends use same version Primary has DE441, replica has DE440 AU consistency check; log ephemeris version at NOTICE
L5 read() always succeeds NFS timeout, disk error, fd limit Every read() return checked; propagate as ereport(ERROR)