pypi-query-mcp/VERSION_SORTING_FIX.md
Ryan Malloy 251ceb4c2d fix: implement semantic version sorting
- Add sort_versions_semantically function using packaging library
- Fix issue where pre-release versions appeared before stable (5.2rc1 vs 5.2.5)
- Handle edge cases: dev, post, invalid versions with graceful fallback
- Add comprehensive test suite covering all scenarios
- Maintain backward compatibility with existing functionality
2025-08-15 11:53:40 -06:00

Semantic Version Sorting Fix

Problem Description

The get_package_versions tool sorted package versions as plain strings instead of comparing them as versions. In the resulting newest-first list, pre-release versions (such as 5.2rc1) appeared ahead of stable versions (such as 5.2.5), even though the stable release should rank higher.

Specific Issue

  • Problem: "5.2rc1" was appearing before "5.2.5" in version lists
  • Root Cause: Using sorted(releases.keys(), reverse=True) performs lexicographic string sorting
  • Impact: Misleading version order in package version queries
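
To see why the string sort misorders these two versions, it helps to compare them character by character. The snippet below is a minimal illustration that uses only built-in string comparison; it is not taken from the project's code.

```python
# Plain string sorting compares code points from left to right.
versions = ["5.2.5", "5.2rc1"]

# Both strings start with "5.2"; the first differing characters are
# "." (code point 46) and "r" (code point 114).
print(ord("."), ord("r"))  # 46 114

# Because "r" > ".", "5.2rc1" is the larger string, so a descending
# string sort lists the release candidate first.
print(sorted(versions, reverse=True))  # ['5.2rc1', '5.2.5']
```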

Solution Implemented

1. Added Semantic Version Sorting Function

File: /pypi_query_mcp/core/version_utils.py

def sort_versions_semantically(versions: list[str], reverse: bool = True) -> list[str]:
    """Sort package versions using semantic version ordering.
    
    This function properly sorts versions by parsing them as semantic versions,
    ensuring that pre-release versions (alpha, beta, rc) are ordered correctly
    relative to stable releases.
    """

Key Features:

  • Uses the packaging.version.Version class, which parses and compares versions according to PEP 440
  • Handles pre-release versions correctly (alpha < beta < rc < stable)
  • Gracefully handles invalid versions by falling back to string sorting
  • Maintains original version strings in output
  • Comprehensive logging for debugging
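
The excerpt above shows only the signature and docstring. The following is a rough sketch of how a function with these features could be written. It is not the project's actual implementation: the name sort_versions_sketch, the log message, and the choice to append invalid versions after the valid ones are assumptions made for illustration.

```python
import logging

from packaging.version import InvalidVersion, Version

logger = logging.getLogger(__name__)


def sort_versions_sketch(versions: list[str], reverse: bool = True) -> list[str]:
    """Hypothetical stand-in for sort_versions_semantically."""
    parsed: list[tuple[Version, str]] = []
    invalid: list[str] = []

    for raw in versions:
        try:
            # Keep the original string next to its parsed form so the
            # output can return the strings unchanged.
            parsed.append((Version(raw), raw))
        except InvalidVersion:
            logger.debug("Cannot parse version %r; using string order", raw)
            invalid.append(raw)

    # Valid versions are ordered by their parsed value.
    parsed.sort(key=lambda pair: pair[0], reverse=reverse)
    # Unparseable versions fall back to plain string ordering.
    invalid.sort(reverse=reverse)

    return [raw for _, raw in parsed] + invalid


print(sort_versions_sketch(["5.2rc1", "5.2.5"]))  # ['5.2.5', '5.2rc1']
```

Storing each parsed Version alongside its original string means every input is parsed once and the returned list contains the untouched input strings.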

2. Updated Package Query Functions

File: /pypi_query_mcp/tools/package_query.py

Changes Made:

  1. Import: Added from ..core.version_utils import sort_versions_semantically
  2. format_version_info(): Replaced basic sorting with semantic sorting
  3. format_package_info(): Updated available_versions to use semantic sorting

Before:

# Sort versions (basic sorting, could be improved with proper version parsing)
sorted_versions = sorted(releases.keys(), reverse=True)

After:

# Sort versions using semantic version ordering
sorted_versions = sort_versions_semantically(list(releases.keys()), reverse=True)

Test Results

1. Unit Tests - Semantic Version Sorting

Test 1 - Pre-release ordering:
  Input:  ['5.2rc1', '5.2.5', '5.2.0', '5.2a1', '5.2b1']
  Output: ['5.2.5', '5.2.0', '5.2rc1', '5.2b1', '5.2a1']
  ✅ PASS: Correct pre-release ordering

2. Task Requirement Validation

Task requirement validation:
  Input: ['5.2rc1', '5.2.5']
  Output: ['5.2.5', '5.2rc1']
  Requirement: '5.2rc1' should come after '5.2.5'
  ✅ PASS: Requirement met!

3. Pre-release Ordering Validation

Pre-release ordering validation:
  Input: ['1.0.0', '1.0.0rc1', '1.0.0b1', '1.0.0a1']
  Output: ['1.0.0', '1.0.0rc1', '1.0.0b1', '1.0.0a1']
  Expected order: stable > rc > beta > alpha
  ✅ PASS: Pre-release ordering correct!

4. Real Package Testing

Django (complex versioning with pre-releases):

Recent versions: ['5.2.5', '5.2.4', '5.2.3', '5.2.2', '5.2.1', '5.2', '5.2rc1', '5.2b1', '5.2a1', '5.1.11']
String-sorted:   ['5.2rc1', '5.2b1', '5.2a1', '5.2.5', '5.2.4', '5.2.3', '5.2.2', '5.2.1', '5.2', '5.1.9']
✅ Semantic sorting correctly places stable versions before pre-releases

NumPy (simple versioning):

Recent versions: ['2.3.2', '2.3.1', '2.3.0', '2.2.6', '2.2.5', '2.2.4', '2.2.3', '2.2.2', '2.2.1', '2.2.0']
✅ Both sorting methods produce identical results (as expected for simple versions)
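
String and version sorting agree for the NumPy list only because every component there is a single digit. The Django output above already hints at the other failure mode: the string-sorted list ends in 5.1.9 where the version-sorted list has 5.1.11. The sketch below reproduces that effect with a hypothetical numeric_key helper. The helper handles only dotted numeric versions and is a simplified stand-in, not the approach used by the fix.

```python
versions = ["5.1.9", "5.1.11", "5.2"]

# Descending string sort: at the fifth character "9" > "1", so 5.1.9
# is ranked above 5.1.11.
by_string = sorted(versions, reverse=True)
print(by_string)  # ['5.2', '5.1.9', '5.1.11']


def numeric_key(version: str) -> tuple[int, ...]:
    """Hypothetical helper: compare components as integers."""
    return tuple(int(part) for part in version.split("."))


# Comparing integers ranks 11 above 9, as expected.
by_number = sorted(versions, key=numeric_key, reverse=True)
print(by_number)  # ['5.2', '5.1.11', '5.1.9']
```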

5. Edge Cases Testing

Complex versions with dev, post, and invalid versions:

Input:  ['1.0.0', '1.0.0.post1', '1.0.0.dev0', '1.0.0a1', '1.0.0b1', '1.0.0rc1', '1.0.1', 'invalid-version', '1.0']
Output: ['1.0.1', '1.0.0.post1', '1.0.0', '1.0', '1.0.0rc1', '1.0.0b1', '1.0.0a1', '1.0.0.dev0', 'invalid-version']
✅ Handles all edge cases correctly

6. Regression Testing

poetry run python -m pytest tests/ -v
============================= test session starts ==============================
64 passed in 9.25s
✅ All existing tests continue to pass

Implementation Details

Semantic Version Ordering Rules

  1. Stable versions rank above pre-release versions of the same base version
  2. Pre-release ordering: alpha < beta < rc < stable
  3. Development versions (dev) rank below alpha versions
  4. Post-release versions (post) rank above the stable version they follow
  5. Invalid versions are sorted lexicographically and placed after all valid versions in the output
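
The ordering among valid versions can be checked directly against packaging.version.Version. The snippet below is a small illustrative check, not part of the project's test suite, and the version strings are chosen only for the example.

```python
from packaging.version import Version

# Ascending order: dev < alpha < beta < rc < stable < post.
chain = ["1.0.dev0", "1.0a1", "1.0b1", "1.0rc1", "1.0", "1.0.post1"]
parsed = [Version(v) for v in chain]

# Each version compares strictly lower than the one after it.
assert all(a < b for a, b in zip(parsed, parsed[1:]))

# Trailing zeros do not matter: 1.0 and 1.0.0 compare equal.
assert Version("1.0") == Version("1.0.0")

# packaging reports dev releases as pre-releases too.
prereleases = [v for v in chain if Version(v).is_prerelease]
print(prereleases)  # ['1.0.dev0', '1.0a1', '1.0b1', '1.0rc1']
```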

Error Handling

  • Invalid version strings are gracefully handled
  • Falls back to string sorting for unparseable versions
  • Logs each invalid version (at debug level)
  • Maintains all original version strings in output

Performance Considerations

  • Minimal performance impact (parsing is fast)
  • Uses efficient sorting algorithms
  • Parses each version once per sort operation and reuses the parsed value for every comparison
  • No breaking changes to existing API
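
Parsing each version once per sort is what Python's built-in sorting provides when a key function is used: the key is computed once per element and then reused for all comparisons. The sketch below demonstrates this with a hypothetical recording_key function. Whether the project gets the behavior this way or through an explicit cache is an assumption.

```python
seen: list[str] = []


def recording_key(version: str) -> tuple[int, ...]:
    """Hypothetical key that records every call before parsing."""
    seen.append(version)
    return tuple(int(part) for part in version.split("."))


versions = ["2.3.2", "2.2.6", "2.3.0", "2.3.1", "2.2.5"]
result = sorted(versions, key=recording_key, reverse=True)

print(result)  # ['2.3.2', '2.3.1', '2.3.0', '2.2.6', '2.2.5']
# The key ran once per element, not once per comparison.
print(len(seen) == len(versions))  # True
```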

Verification Commands

# Run standalone semantic version tests
python test_version_sorting_standalone.py

# Test with real PyPI packages
poetry run python test_real_packages.py

# Test specific task requirement
poetry run python test_specific_case.py

# Run full test suite
poetry run python -m pytest tests/ -v

Files Modified

  1. /pypi_query_mcp/core/version_utils.py: Added sort_versions_semantically() function
  2. /pypi_query_mcp/tools/package_query.py: Updated to use semantic version sorting

Dependencies

  • Uses existing packaging library (already a dependency in pyproject.toml)
  • No new dependencies added
  • Compatible with Python 3.10+

Conclusion

The semantic version sorting fix successfully resolves the issue where pre-release versions were incorrectly appearing before stable versions. The implementation:

  • Fixes the specific problem mentioned (5.2rc1 vs 5.2.5)
  • Handles all pre-release types correctly (alpha, beta, rc)
  • Manages edge cases (dev, post, invalid versions)
  • Maintains backward compatibility
  • Passes all existing tests
  • Uses robust, industry-standard version parsing

The fix provides accurate, intuitive version ordering that matches user expectations and the PEP 440 ordering implemented by the packaging library.