astro-discovery/docs/src/content/docs/how-to/customize-llm-instructions.md
Ryan Malloy 74cffc2842 Complete how-to guide documentation
2025-11-08 23:32:22 -07:00

title: Customize LLM Instructions
description: Provide custom instructions for AI assistants using llms.txt

Configure how AI assistants interact with your site by customizing instructions in llms.txt.

Prerequisites

  • Integration installed and configured
  • Understanding of your site's main use cases
  • Knowledge of your API endpoints (if applicable)

Add Basic Instructions

Provide clear guidance for AI assistants:

// astro.config.mjs
discovery({
  llms: {
    description: 'Technical documentation for the Discovery API',
    instructions: `
When helping users with this site:
1. Check the documentation before answering
2. Provide code examples when relevant
3. Link to specific documentation pages
4. Use the search API for queries
    `.trim()
  }
})
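
If the list of instructions grows, it can be easier to keep the steps in an array and generate the numbered text. The following is a minimal sketch; `formatInstructions` is a hypothetical helper, not part of the integration, and it only produces the string passed to `instructions`.

```javascript
// Hypothetical helper (not part of the integration): builds a numbered
// instruction block from an intro line and a list of steps.
function formatInstructions(intro, steps) {
  const numbered = steps.map((step, index) => `${index + 1}. ${step.trim()}`);
  return [intro.trim(), ...numbered].join('\n');
}

const instructions = formatInstructions('When helping users with this site:', [
  'Check the documentation before answering',
  'Provide code examples when relevant',
  'Link to specific documentation pages',
  'Use the search API for queries',
]);

// Pass the result as the `instructions` value:
// discovery({ llms: { instructions } })
```

Reordering or removing a step then renumbers the list automatically.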

Highlight Key Features

Guide AI assistants to important capabilities:

discovery({
  llms: {
    description: 'E-commerce platform for sustainable products',
    keyFeatures: [
      'Carbon footprint calculator for all products',
      'Subscription management with flexible billing',
      'AI-powered product recommendations',
      'Real-time inventory tracking'
    ]
  }
})

Document Important Pages

Direct AI assistants to critical resources:

discovery({
  llms: {
    importantPages: [
      {
        name: 'API Documentation',
        path: '/docs/api',
        description: 'Complete API reference with examples'
      },
      {
        name: 'Getting Started Guide',
        path: '/docs/quick-start',
        description: 'Step-by-step setup instructions'
      },
      {
        name: 'FAQ',
        path: '/help/faq',
        description: 'Common questions and solutions'
      }
    ]
  }
})
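
A typo in a page entry is easy to miss until an assistant follows a dead link. As a sketch, a small pre-flight check can catch the obvious problems before the build. `findPageProblems` is a hypothetical helper, and the rule that paths start with `/` is an assumption based on the examples above, not a documented requirement of the integration.

```javascript
// Hypothetical pre-flight check (not part of the integration): returns a list
// of human-readable problems found in an importantPages array.
function findPageProblems(pages) {
  const problems = [];
  const seenPaths = new Set();
  pages.forEach((page, index) => {
    const label = page.name || `entry ${index}`;
    if (!page.name) problems.push(`${label}: missing name`);
    // Assumption: paths are site-relative, as in the examples in this guide.
    if (typeof page.path !== 'string' || !page.path.startsWith('/')) {
      problems.push(`${label}: path should start with "/"`);
    } else if (seenPaths.has(page.path)) {
      problems.push(`${label}: duplicate path ${page.path}`);
    } else {
      seenPaths.add(page.path);
    }
    if (!page.description) problems.push(`${label}: missing description`);
  });
  return problems;
}

const pageProblems = findPageProblems([
  { name: 'API Documentation', path: '/docs/api', description: 'Complete API reference with examples' },
  { name: 'FAQ', path: 'help/faq', description: 'Common questions and solutions' },
]);
// pageProblems now holds one message about the FAQ path.
```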

Describe Your APIs

Help AI assistants use your endpoints correctly:

discovery({
  llms: {
    apiEndpoints: [
      {
        path: '/api/search',
        method: 'GET',
        description: 'Search products by name, category, or tag'
      },
      {
        path: '/api/products/:id',
        method: 'GET',
        description: 'Get detailed product information'
      },
      {
        path: '/api/calculate-carbon',
        method: 'POST',
        description: 'Calculate carbon footprint for a cart'
      }
    ]
  }
})
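
Endpoint lists are often assembled by hand, so inconsistent method casing or a mistyped method can slip in. A minimal sketch of a normalizing step follows; `normalizeEndpoints` is a hypothetical helper, and defaulting a missing method to `GET` is an assumption made for this example.

```javascript
const HTTP_METHODS = new Set(['GET', 'POST', 'PUT', 'PATCH', 'DELETE', 'HEAD', 'OPTIONS']);

// Hypothetical helper (not part of the integration): upper-cases each method
// and rejects anything that is not a common HTTP method.
function normalizeEndpoints(endpoints) {
  return endpoints.map((endpoint) => {
    // Assumption for this sketch: a missing method means GET.
    const method = String(endpoint.method || 'GET').toUpperCase();
    if (!HTTP_METHODS.has(method)) {
      throw new Error(`Unsupported HTTP method "${endpoint.method}" for ${endpoint.path}`);
    }
    return { ...endpoint, method };
  });
}

const apiEndpoints = normalizeEndpoints([
  { path: '/api/search', method: 'get', description: 'Search products by name, category, or tag' },
  { path: '/api/calculate-carbon', method: 'Post', description: 'Calculate carbon footprint for a cart' },
]);

// discovery({ llms: { apiEndpoints } })
```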

Set Brand Voice Guidelines

Maintain consistent communication style:

discovery({
  llms: {
    brandVoice: [
      'Professional yet approachable',
      'Focus on sustainability and environmental impact',
      'Use concrete examples, not abstract concepts',
      'Avoid jargon unless explaining technical features',
      'Emphasize long-term value over short-term savings'
    ]
  }
})

Load Content Dynamically

Pull important pages from content collections:

import { getCollection } from 'astro:content';

discovery({
  llms: {
    importantPages: async () => {
      const docs = await getCollection('docs');

      // Filter to featured pages only
      return docs
        .filter(doc => doc.data.featured)
        .map(doc => ({
          name: doc.data.title,
          path: `/docs/${doc.slug}`,
          description: doc.data.description
        }));
    }
  }
})
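
The filter-and-map step above can be pulled into a pure function so it can be tested without a running Astro build. This is a sketch only: `toImportantPages` is a hypothetical name, and the entry shape (`slug`, `data.title`, `data.description`, `data.featured`) simply mirrors the example above, so adjust it to your own collection schema.

```javascript
// Pure transform, separated from getCollection() so it can be unit tested.
// Hypothetical helper; the entry shape mirrors the example in this guide.
function toImportantPages(entries, basePath = '/docs') {
  return entries
    .filter((entry) => entry.data.featured)
    .map((entry) => ({
      name: entry.data.title,
      path: `${basePath}/${entry.slug}`,
      // Fall back to an empty string when an entry has no description.
      description: entry.data.description ?? '',
    }));
}

// Mock entries standing in for the result of getCollection('docs').
const mockEntries = [
  { slug: 'quick-start', data: { title: 'Quick Start', description: 'Setup steps', featured: true } },
  { slug: 'changelog', data: { title: 'Changelog', description: 'Release notes', featured: false } },
  { slug: 'api', data: { title: 'API', featured: true } },
];

const featuredPages = toImportantPages(mockEntries);
```

Inside the config, the async function would then reduce to `toImportantPages(await getCollection('docs'))`.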

Add Custom Sections

Include specialized information:

discovery({
  llms: {
    customSections: {
      'Data Privacy': `
We are GDPR compliant. User data is encrypted at rest and in transit.
Data retention policy: 90 days for analytics, 7 years for transactions.
      `.trim(),

      'Rate Limits': `
API rate limits:
- Authenticated: 1000 requests/hour
- Anonymous: 60 requests/hour
- Burst: 20 requests/second
      `.trim(),

      'Support Channels': `
For assistance:
- Documentation: https://example.com/docs
- Email: support@example.com (response within 24h)
- Community: https://discord.gg/example
      `.trim()
    }
  }
})
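
The sections above keep their text flush with the left margin because `.trim()` only strips the ends of the string, not the indentation of each line. If you would rather indent the text along with the surrounding code, a small dedent step works. The `dedent` function below is a hypothetical helper written for this sketch, not something the integration provides.

```javascript
// Hypothetical helper: strips leading blank lines, trailing whitespace, and
// the indentation shared by all non-empty lines of a template literal.
function dedent(text) {
  const lines = text.replace(/^\n+|\s+$/g, '').split('\n');
  const indents = lines
    .filter((line) => line.trim().length > 0)
    .map((line) => line.match(/^[ \t]*/)[0].length);
  const common = indents.length ? Math.min(...indents) : 0;
  return lines.map((line) => line.slice(common)).join('\n');
}

const customSections = {
  'Rate Limits': dedent(`
    API rate limits:
    - Authenticated: 1000 requests/hour
    - Anonymous: 60 requests/hour
    - Burst: 20 requests/second
  `),
};

// discovery({ llms: { customSections } })
```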

Environment-Specific Instructions

Provide different instructions for development and production:

discovery({
  llms: {
    instructions: import.meta.env.PROD
      ? `Production site - use live API endpoints at https://api.example.com`
      : `Development site - API endpoints may be mocked or unavailable`
  }
})
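
When most of the guidance is the same in both environments, one option is to build the string from a shared part plus an environment note. A minimal sketch, assuming a hypothetical `instructionsFor` helper that takes the production flag as an argument:

```javascript
// Hypothetical helper: combines shared guidance with an environment note.
function instructionsFor(isProduction) {
  const shared = 'Check the documentation before answering.';
  const environmentNote = isProduction
    ? 'Production site - use live API endpoints at https://api.example.com'
    : 'Development site - API endpoints may be mocked or unavailable';
  return `${shared}\n${environmentNote}`;
}

const productionInstructions = instructionsFor(true);
const developmentInstructions = instructionsFor(false);

// discovery({ llms: { instructions: instructionsFor(import.meta.env.PROD) } })
```

Taking the flag as a parameter keeps the function testable outside the Astro config.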

Verify Your Configuration

Build and check the output:

npm run build
npm run preview
curl http://localhost:4321/llms.txt

Look for your instructions, features, and API documentation in the formatted output.
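
The manual check can also be automated. The sketch below looks for second-level headings in the generated text and reports any that are missing. `findMissingHeadings` is a hypothetical helper, and the heading names are taken from the sample output in this guide, so confirm them against your own llms.txt.

```javascript
// Hypothetical helper: returns the expected "## " headings that do not
// appear in the given llms.txt content.
function findMissingHeadings(llmsText, expectedHeadings) {
  const present = new Set(
    llmsText
      .split('\n')
      .filter((line) => line.startsWith('## '))
      .map((line) => line.slice(3).trim())
  );
  return expectedHeadings.filter((heading) => !present.has(heading));
}

// Sample content standing in for the response from /llms.txt.
const sampleOutput = [
  '# example.com',
  '',
  '> E-commerce platform for sustainable products',
  '',
  '## Key Features',
  '',
  '- Carbon footprint calculator for all products',
].join('\n');

const missingHeadings = findMissingHeadings(sampleOutput, [
  'Key Features',
  'Instructions for AI Assistants',
  'API Endpoints',
]);
```

To run it against the preview server, fetch http://localhost:4321/llms.txt (for example with the global `fetch` in Node 18 or later) and pass the response text as the first argument.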

Expected Result

Your llms.txt will contain structured information similar to the following (abridged):

# example.com

> E-commerce platform for sustainable products

---

## Key Features

- Carbon footprint calculator for all products
- AI-powered product recommendations

## Instructions for AI Assistants

When helping users with this site:
1. Check the documentation before answering
2. Provide code examples when relevant

## API Endpoints

- `GET /api/search`
  Search products by name, category, or tag
  Full URL: https://example.com/api/search

AI assistants can use this information to provide accurate, context-aware help.

Alternative Approaches

Multiple llms.txt files: Create llms-full.txt for comprehensive documentation and keep llms.txt as a short summary.

Dynamic generation: Use a build script to extract API docs from OpenAPI specs.

Language-specific versions: Generate different files for different locales (llms-en.txt, llms-es.txt).
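
For the dynamic generation approach, the core of such a build script is a conversion from an OpenAPI `paths` object to the `apiEndpoints` shape used in this guide. The following is a sketch under stated assumptions: `endpointsFromOpenApi` is a hypothetical function, the spec is already parsed into a plain object, and each operation's `summary` (or `description`) is used as the endpoint description.

```javascript
const OPENAPI_METHODS = ['get', 'put', 'post', 'delete', 'options', 'head', 'patch', 'trace'];

// Hypothetical converter: maps an already-parsed OpenAPI document to an
// array of { path, method, description } objects.
function endpointsFromOpenApi(spec) {
  const endpoints = [];
  for (const [path, pathItem] of Object.entries(spec.paths ?? {})) {
    for (const method of OPENAPI_METHODS) {
      const operation = pathItem[method];
      if (!operation) continue;
      endpoints.push({
        // Convert OpenAPI `{id}` placeholders to the `:id` style used in this guide.
        path: path.replace(/\{([^}]+)\}/g, ':$1'),
        method: method.toUpperCase(),
        description: operation.summary ?? operation.description ?? '',
      });
    }
  }
  return endpoints;
}

// Minimal sample spec for illustration only.
const sampleSpec = {
  paths: {
    '/api/products/{id}': { get: { summary: 'Get detailed product information' } },
    '/api/calculate-carbon': { post: { description: 'Calculate carbon footprint for a cart' } },
  },
};

const generatedEndpoints = endpointsFromOpenApi(sampleSpec);
```

The resulting array can be passed as `apiEndpoints`, so the API section of llms.txt stays in sync with the spec.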

Common Issues

Too much information: Keep it concise. Focused, actionable guidance is easier for AI assistants to apply than exhaustive detail.

Outdated instructions: Use lastUpdate: 'auto' or automate updates from your CMS.

Missing context: Don't assume knowledge. Explain domain-specific terms and workflows.

Unclear priorities: List the most important pages and features first. AI assistants may prioritize early content.