Complete all explanation pages
- why-discovery: Core rationale and evolution
- robots-explained: robots.txt mechanics and best practices
- llms-explained: AI assistant guidance and context
- humans-explained: Human-readable credits and culture
- security-explained: RFC 9116 responsible disclosure
- canary-explained: Warrant canaries and transparency
- webfinger-explained: RFC 7033 federated discovery
- seo: Discovery files impact on search optimization
- ai-integration: Strategy for AI-first discovery
- architecture: Internal design and extensibility

All pages follow Diátaxis explanation style: understanding-oriented, provide context, explain design decisions, discuss alternatives.
parent 74cffc2842
commit 0191d08d14
@ -1,31 +1,264 @@
|
|||||||
---
|
---
|
||||||
title: AI Assistant Integration
|
title: AI Assistant Integration Strategy
|
||||||
description: How AI assistants use discovery files
|
description: How AI assistants use discovery files and how to optimize for them
|
||||||
---
|
---
|
||||||
|
|
||||||
Learn how AI assistants discover and use information from your site.
|
The relationship between websites and AI assistants is fundamentally different from traditional search engines. Understanding this difference is key to optimizing your site for AI-mediated discovery.
|
||||||
|
|
||||||
:::note[Work in Progress]
|
## Beyond Indexing: AI Understanding
|
||||||
This page is currently being developed. Check back soon for complete documentation.
|
|
||||||
:::
|
|
||||||
|
|
||||||
## Coming Soon
|
Search engines **index** your site - they catalog what exists and where. AI assistants **understand** your site - they build mental models of what you do, why it matters, and how to help users interact with you.
|
||||||
|
|
||||||
This section will include:
|
This shift from retrieval to comprehension requires different discovery mechanisms.
|
||||||
- Detailed explanations
|
|
||||||
- Code examples
|
|
||||||
- Best practices
|
|
||||||
- Common patterns
|
|
||||||
- Troubleshooting tips
|
|
||||||
|
|
||||||
## Related Pages
|
### Traditional Search Flow
|
||||||
|
|
||||||
- [Configuration Reference](/reference/configuration/)
|
1. User searches for keywords
|
||||||
- [API Reference](/reference/api/)
|
2. Engine returns ranked list of pages
|
||||||
- [Examples](/examples/ecommerce/)
|
3. User clicks and reads
|
||||||
|
4. User decides if content answers their question
|
||||||
|
|
||||||
## Need Help?
|
### AI Assistant Flow
|
||||||
|
|
||||||
- Check our [FAQ](/community/faq/)
|
1. User asks conversational question
|
||||||
- Visit [Troubleshooting](/community/troubleshooting/)
|
2. AI synthesizes answer from multiple sources
|
||||||
- Open an issue on [GitHub](https://github.com/withastro/astro-discovery/issues)
|
3. AI provides direct response with citations
|
||||||
|
4. User may or may not visit original sources
|
||||||
|
|
||||||
|
In the AI flow, your site might be the source without getting the click. Discovery files help ensure you're at least properly represented and attributed.
|
||||||
|
|
||||||
|
## The llms.txt Strategy
|
||||||
|
|
||||||
|
llms.txt is your primary tool for AI optimization. Think of it as **briefing an employee** who'll be answering questions about your company.
|
||||||
|
|
||||||
|
### What to Emphasize
|
||||||
|
|
||||||
|
**Core value proposition**: Not just what you do, but why you exist
|
||||||
|
|
||||||
|
```
|
||||||
|
We're not just another e-commerce platform - we're specifically
|
||||||
|
focused on sustainable products with carbon footprint tracking.
|
||||||
|
```
|
||||||
|
|
||||||
|
This context helps AI assistants understand when to recommend you versus competitors.
|
||||||
|
|
||||||
|
**Key differentiators**: What makes you unique
|
||||||
|
|
||||||
|
```
|
||||||
|
Unlike other platforms, we:
|
||||||
|
- Calculate carbon footprint for every purchase
|
||||||
|
- Offset shipping emissions by default
|
||||||
|
- Partner directly with sustainable manufacturers
|
||||||
|
```
|
||||||
|
|
||||||
|
This guides AI to highlight your strengths.
|
||||||
|
|
||||||
|
**Common questions**: What users typically ask
|
||||||
|
|
||||||
|
```
|
||||||
|
When users ask about sustainability, explain our carbon tracking.
|
||||||
|
When users ask about pricing, mention our price-match guarantee.
|
||||||
|
When users ask about shipping, highlight our carbon-offset program.
|
||||||
|
```
|
||||||
|
|
||||||
|
This provides explicit guidance for common scenarios.
|
||||||
|
|
||||||
|
### What to Avoid
|
||||||
|
|
||||||
|
**Overpromising**: AI may cross-check claims against your actual site
|
||||||
|
**Marketing fluff**: Be informative, not promotional
|
||||||
|
**Exhaustive detail**: Link to comprehensive docs instead
|
||||||
|
**Outdated info**: Keep current or use dynamic generation
|
||||||
|
|
||||||
|
## Coordinating Discovery Files
|
||||||
|
|
||||||
|
AI assistants use multiple discovery mechanisms together:
|
||||||
|
|
||||||
|
### robots.txt → llms.txt Flow
|
||||||
|
|
||||||
|
1. AI bot checks robots.txt for permission
|
||||||
|
2. Finds reference to llms.txt
|
||||||
|
3. Reads llms.txt for context
|
||||||
|
4. Crawls site with that context in mind
|
||||||
|
|
||||||
|
Ensure your robots.txt explicitly allows AI bots:
|
||||||
|
|
||||||
|
```
|
||||||
|
User-agent: GPTBot
|
||||||
|
User-agent: Claude-Web
|
||||||
|
User-agent: Anthropic-AI
|
||||||
|
Allow: /
|
||||||
|
```
|
||||||
|
|
||||||
|
### llms.txt → humans.txt Connection
|
||||||
|
|
||||||
|
humans.txt provides tech stack info that helps AI answer developer questions:
|
||||||
|
|
||||||
|
User: "Can I integrate this with React?"
|
||||||
|
AI: *checks humans.txt, sees React in tech stack*
|
||||||
|
AI: "Yes, it's built with React and designed for React integration."
|
||||||
|
|
||||||
|
The files complement each other.
|
||||||
|
|
||||||
|
### sitemap.xml → AI Content Discovery
|
||||||
|
|
||||||
|
Sitemaps help AI find comprehensive content:
|
||||||
|
|
||||||
|
```xml
|
||||||
|
<url>
|
||||||
|
<loc>https://example.com/docs/api</loc>
|
||||||
|
<priority>0.9</priority>
|
||||||
|
</url>
|
||||||
|
```
|
||||||
|
|
||||||
|
High-priority pages in your sitemap can hint at importance to AI crawlers, although `<priority>` is advisory and crawlers are free to ignore it.
|
||||||
|
|
||||||
|
## Dynamic Content Generation
|
||||||
|
|
||||||
|
Static llms.txt works for stable information. Dynamic generation handles changing contexts:
|
||||||
|
|
||||||
|
### API Endpoint Discovery
|
||||||
|
|
||||||
|
```typescript
|
||||||
|
llms: {
|
||||||
|
apiEndpoints: async () => {
|
||||||
|
const spec = await loadOpenAPISpec();
|
||||||
|
return spec.paths.map(path => ({
|
||||||
|
path: path.url,
|
||||||
|
method: path.method,
|
||||||
|
description: path.summary
|
||||||
|
}));
|
||||||
|
}
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
This keeps AI's understanding of your API current without manual updates.
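In a standard OpenAPI document, `paths` is an object keyed by URL path, with one entry per HTTP method beneath each path. If the loader returns the raw document, a flattening step is needed before the endpoints can be listed. The sketch below shows one way to do that; `OpenAPISpecLike`, `EndpointSummary`, and `extractEndpoints` are hypothetical names used for illustration and are not confirmed parts of the integration's API.

```typescript
// Minimal, assumed shape of the parts of an OpenAPI document used here.
interface OpenAPISpecLike {
  paths: Record<string, Record<string, unknown>>;
}

interface EndpointSummary {
  path: string;
  method: string;
  description: string;
}

// HTTP methods that OpenAPI allows as keys under a path item.
const HTTP_METHODS = ['get', 'put', 'post', 'delete', 'options', 'head', 'patch', 'trace'];

// Flatten `paths` (keyed by URL, then by method) into a flat list of endpoints.
function extractEndpoints(spec: OpenAPISpecLike): EndpointSummary[] {
  const endpoints: EndpointSummary[] = [];
  for (const [path, pathItem] of Object.entries(spec.paths)) {
    for (const [method, operation] of Object.entries(pathItem)) {
      // Skip non-operation keys such as `parameters`.
      if (!HTTP_METHODS.includes(method)) continue;
      const summary = (operation as { summary?: unknown }).summary;
      endpoints.push({
        path,
        method: method.toUpperCase(),
        description: typeof summary === 'string' ? summary : '',
      });
    }
  }
  return endpoints;
}
```

A function like this could be returned from an `apiEndpoints` callback, assuming that option accepts a list of `{ path, method, description }` objects as the example above suggests.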
|
||||||
|
|
||||||
|
### Feature Flags and Capabilities
|
||||||
|
|
||||||
|
```typescript
|
||||||
|
llms: {
|
||||||
|
instructions: () => {
|
||||||
|
const features = getEnabledFeatures();
|
||||||
|
return `
|
||||||
|
Current features:
|
||||||
|
${features.map(f => `- ${f.name}: ${f.description}`).join('\n')}
|
||||||
|
|
||||||
|
Note: Feature availability may change. Check /api/features for current status.
|
||||||
|
`;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
AI assistants know what's currently available versus planned or deprecated.
|
||||||
|
|
||||||
|
## Measuring AI Representation
|
||||||
|
|
||||||
|
Unlike traditional SEO, AI impact is harder to quantify directly:
|
||||||
|
|
||||||
|
### Qualitative Monitoring
|
||||||
|
|
||||||
|
**Ask AI assistants about your site**: Periodically query Claude, ChatGPT, and others about your product. Do they:
|
||||||
|
- Describe you accurately?
|
||||||
|
- Highlight key features?
|
||||||
|
- Use correct terminology?
|
||||||
|
- Provide appropriate warnings/caveats?
|
||||||
|
|
||||||
|
**Monitor AI-generated content**: Watch for your site being referenced in:
|
||||||
|
- AI-assisted blog posts
|
||||||
|
- Generated code examples
|
||||||
|
- Tutorial content
|
||||||
|
- Comparison tables
|
||||||
|
|
||||||
|
**Track citation patterns**: When AI cites your site, is it:
|
||||||
|
- For the right reasons?
|
||||||
|
- In appropriate contexts?
|
||||||
|
- With accurate information?
|
||||||
|
- Linking to relevant pages?
|
||||||
|
|
||||||
|
### Quantitative Signals
|
||||||
|
|
||||||
|
**Referrer analysis**: Some AI tools send referrer headers showing they're AI-mediated traffic
|
||||||
|
|
||||||
|
**API usage patterns**: AI-assisted developers may show different integration patterns than manual developers
|
||||||
|
|
||||||
|
**Support question types**: AI-informed users may ask more specific questions
|
||||||
|
|
||||||
|
**Time-on-site**: AI-briefed visitors may be more targeted, spending less time but converting better
|
||||||
|
|
||||||
|
## Brand Voice Consistency
|
||||||
|
|
||||||
|
AI assistants can adapt tone to match your brand if you provide guidance:
|
||||||
|
|
||||||
|
```
|
||||||
|
## Brand Voice
|
||||||
|
|
||||||
|
- Professional but approachable
|
||||||
|
- Technical accuracy over marketing speak
|
||||||
|
- Always mention privacy and security first
|
||||||
|
- Use "we" language (community-oriented)
|
||||||
|
- Avoid: corporate jargon, buzzwords, hype
|
||||||
|
```
|
||||||
|
|
||||||
|
This helps ensure AI-generated content about you feels consistent with your actual brand.
|
||||||
|
|
||||||
|
## Handling Misconceptions
|
||||||
|
|
||||||
|
Use llms.txt to correct common misunderstandings:
|
||||||
|
|
||||||
|
```
|
||||||
|
## Common Misconceptions
|
||||||
|
|
||||||
|
WRONG: "We're a general e-commerce platform"
|
||||||
|
RIGHT: "We specifically focus on sustainable products"
|
||||||
|
|
||||||
|
WRONG: "We offer all payment methods"
|
||||||
|
RIGHT: "We support major cards and PayPal, but not cryptocurrency"
|
||||||
|
|
||||||
|
WRONG: "Free shipping on all orders"
|
||||||
|
RIGHT: "Free carbon-offset shipping over $50"
|
||||||
|
```
|
||||||
|
|
||||||
|
This proactive clarification reduces AI-generated misinformation.
|
||||||
|
|
||||||
|
## Privacy and Training Data
|
||||||
|
|
||||||
|
A common concern: "Doesn't llms.txt help AI companies train on my content?"
|
||||||
|
|
||||||
|
Key points:
|
||||||
|
|
||||||
|
**Training happens regardless**: Public content is already accessible for training
|
||||||
|
**llms.txt doesn't grant permission**: It provides context, not authorization
|
||||||
|
**robots.txt controls access**: Block AI crawlers there if you don't want them
|
||||||
|
**Better representation**: Context helps AI represent you accurately when it does access your site
|
||||||
|
|
||||||
|
Think of llms.txt as **quality control** for inevitable AI consumption, not invitation.
|
||||||
|
|
||||||
|
## Future-Proofing
|
||||||
|
|
||||||
|
AI capabilities are evolving rapidly. Future trends:
|
||||||
|
|
||||||
|
**Agentic AI**: Assistants that take actions, not just answer questions
|
||||||
|
**Multi-modal understanding**: AI processing images, videos, and interactive content
|
||||||
|
**Real-time data**: AI querying live APIs versus static crawls
|
||||||
|
**Semantic graphs**: Deep relationship mapping between concepts
|
||||||
|
|
||||||
|
llms.txt is likely to evolve alongside these capabilities. Adopting it now positions you to benefit from future enhancements.
|
||||||
|
|
||||||
|
## The Long Game
|
||||||
|
|
||||||
|
AI integration is a marathon, not a sprint:
|
||||||
|
|
||||||
|
**Start simple**: Basic llms.txt with description and key features
|
||||||
|
**Monitor and refine**: See how AI represents you, adjust accordingly
|
||||||
|
**Add detail gradually**: Expand instructions as you identify gaps
|
||||||
|
**Stay current**: Update as your product evolves
|
||||||
|
**Share learnings**: The community benefits from your experience
|
||||||
|
|
||||||
|
The integration makes the technical part easy. The strategic part - what to say and how - requires ongoing attention.
|
||||||
|
|
||||||
|
## Related Topics
|
||||||
|
|
||||||
|
- [LLMs.txt Explained](/explanation/llms-explained/) - Deep dive into llms.txt
|
||||||
|
- [SEO Strategy](/explanation/seo/) - Traditional vs. AI-mediated discovery
|
||||||
|
- [Customizing Instructions](/how-to/customize-llm-instructions/) - Practical guidance optimization
|
||||||
|
|||||||
@ -3,29 +3,454 @@ title: Architecture & Design
|
|||||||
description: How @astrojs/discovery works internally
|
description: How @astrojs/discovery works internally
|
||||||
---
|
---
|
||||||
|
|
||||||
Technical explanation of the integration architecture and design decisions.
|
Understanding the integration's architecture helps you customize it effectively and troubleshoot when needed. The design prioritizes simplicity, correctness, and extensibility.
|
||||||
|
|
||||||
:::note[Work in Progress]
|
## High-Level Design
|
||||||
This page is currently being developed. Check back soon for complete documentation.
|
|
||||||
:::
|
|
||||||
|
|
||||||
## Coming Soon
|
The integration follows Astro's standard integration pattern:
|
||||||
|
|
||||||
This section will include:
|
```
|
||||||
- Detailed explanations
|
astro.config.mjs
|
||||||
- Code examples
|
↓ integrates discovery()
|
||||||
- Best practices
|
↓
|
||||||
- Common patterns
|
Integration hooks into Astro lifecycle
|
||||||
- Troubleshooting tips
|
↓
|
||||||
|
Injects route handlers for discovery files
|
||||||
|
↓
|
||||||
|
Route handlers call generators
|
||||||
|
↓
|
||||||
|
Generators produce discovery file content
|
||||||
|
```
|
||||||
|
|
||||||
## Related Pages
|
Each layer has a specific responsibility, making the system modular and testable.
|
||||||
|
|
||||||
- [Configuration Reference](/reference/configuration/)
|
## The Integration Layer
|
||||||
- [API Reference](/reference/api/)
|
|
||||||
- [Examples](/examples/ecommerce/)
|
|
||||||
|
|
||||||
## Need Help?
|
`src/index.ts` implements the Astro integration interface:
|
||||||
|
|
||||||
- Check our [FAQ](/community/faq/)
|
```typescript
|
||||||
- Visit [Troubleshooting](/community/troubleshooting/)
|
export default function discovery(config: DiscoveryConfig): AstroIntegration {
|
||||||
- Open an issue on [GitHub](https://github.com/withastro/astro-discovery/issues)
|
return {
|
||||||
|
name: '@astrojs/discovery',
|
||||||
|
hooks: {
|
||||||
|
'astro:config:setup': ({ injectRoute, updateConfig }) => { /* inject routes and sitemap */ },
|
||||||
|
'astro:build:done': ({ logger }) => { /* log generated files */ },
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
This layer:
|
||||||
|
|
||||||
|
- Validates configuration
|
||||||
|
- Merges user config with defaults
|
||||||
|
- Injects dynamic routes
|
||||||
|
- Integrates @astrojs/sitemap
|
||||||
|
- Reports build results
|
||||||
|
|
||||||
|
## Configuration Strategy
|
||||||
|
|
||||||
|
Configuration flows through several stages:
|
||||||
|
|
||||||
|
### 1. User Configuration
|
||||||
|
|
||||||
|
User provides partial configuration in astro.config.mjs:
|
||||||
|
|
||||||
|
```typescript
|
||||||
|
discovery({
|
||||||
|
llms: {
|
||||||
|
description: 'My site'
|
||||||
|
}
|
||||||
|
})
|
||||||
|
```
|
||||||
|
|
||||||
|
### 2. Validation and Defaults
|
||||||
|
|
||||||
|
`src/validators/config.ts` validates and merges with defaults:
|
||||||
|
|
||||||
|
```typescript
|
||||||
|
export function validateConfig(userConfig: DiscoveryConfig): ValidatedConfig {
|
||||||
|
return {
|
||||||
|
robots: mergeRobotsDefaults(userConfig.robots),
|
||||||
|
llms: mergeLLMsDefaults(userConfig.llms),
|
||||||
|
// ...
|
||||||
|
}
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
This ensures:
|
||||||
|
- Required fields are present
|
||||||
|
- Types are correct
|
||||||
|
- Defaults fill gaps
|
||||||
|
- Invalid configs are caught early
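As a rough illustration of the validate-then-merge step, the sketch below checks types and then lets user values override defaults. The field names (`enabled`, `description`, `keyFeatures`) and the default values are assumptions made for this example; the real `mergeLLMsDefaults` may differ.

```typescript
// Hypothetical, simplified config shape for illustration only.
interface LLMsConfig {
  enabled?: boolean;
  description?: string;
  keyFeatures?: string[];
}

type ValidatedLLMsConfig = Required<LLMsConfig>;

// Assumed defaults; the integration's actual defaults are not shown in this page.
const LLMS_DEFAULTS: ValidatedLLMsConfig = {
  enabled: true,
  description: '',
  keyFeatures: [],
};

// Check types first, then let user values override the defaults.
function mergeLLMsDefaults(user: LLMsConfig = {}): ValidatedLLMsConfig {
  if (user.description !== undefined && typeof user.description !== 'string') {
    throw new TypeError('llms.description must be a string');
  }
  if (user.keyFeatures !== undefined && !Array.isArray(user.keyFeatures)) {
    throw new TypeError('llms.keyFeatures must be an array of strings');
  }
  return { ...LLMS_DEFAULTS, ...user };
}
```

Throwing during validation means a bad config surfaces when the integration loads rather than as a malformed discovery file later.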
|
||||||
|
|
||||||
|
### 3. Global Storage
|
||||||
|
|
||||||
|
`src/config-store.ts` provides global access to validated config:
|
||||||
|
|
||||||
|
```typescript
|
||||||
|
let globalConfig: DiscoveryConfig;
|
||||||
|
|
||||||
|
export function setConfig(config: DiscoveryConfig) {
|
||||||
|
globalConfig = config;
|
||||||
|
}
|
||||||
|
|
||||||
|
export function getConfig(): DiscoveryConfig {
|
||||||
|
return globalConfig;
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
This allows route handlers to access configuration without passing it through Astro's context (which has limitations).
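One thing a module-level store cannot guarantee on its own is that the config was set before it is read. A defensive variant, sketched here with a trimmed stand-in type (`StoredConfig` is hypothetical), throws instead of returning `undefined`:

```typescript
// Trimmed, hypothetical stand-in for the integration's validated config type.
interface StoredConfig {
  llms?: { description?: string };
}

let storedConfig: StoredConfig | undefined;

function setConfig(config: StoredConfig): void {
  storedConfig = config;
}

// Fail loudly if a route handler runs before a config has been stored.
function getConfig(): StoredConfig {
  if (storedConfig === undefined) {
    throw new Error('[@astrojs/discovery] Config requested before it was set.');
  }
  return storedConfig;
}
```

This is a design suggestion under the stated assumptions, not a description of how `src/config-store.ts` currently behaves.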
|
||||||
|
|
||||||
|
### 4. Virtual Module
|
||||||
|
|
||||||
|
A Vite plugin provides configuration as a virtual module:
|
||||||
|
|
||||||
|
```typescript
|
||||||
|
vite: {
|
||||||
|
plugins: [{
|
||||||
|
name: '@astrojs/discovery:config',
|
||||||
|
resolveId(id) {
|
||||||
|
if (id === 'virtual:@astrojs/discovery/config') {
|
||||||
|
return '\0' + id;
|
||||||
|
}
|
||||||
|
},
|
||||||
|
load(id) {
|
||||||
|
if (id === '\0virtual:@astrojs/discovery/config') {
|
||||||
|
return `export default ${JSON.stringify(config)};`;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}]
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
This makes config available during route execution.
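One consequence of serializing with `JSON.stringify` is worth keeping in mind: function-valued properties are omitted from the output, so function-based options would need to be resolved or handled separately before the config is embedded in a virtual module. The snippet below only demonstrates that standard JavaScript behavior; the config shape is a made-up example.

```typescript
// Made-up config mixing a static value and a function-valued option.
const exampleConfig = {
  llms: {
    description: 'My site',
    instructions: () => 'Computed at build time',
  },
};

// JSON.stringify silently skips properties whose value is a function.
const moduleSource = `export default ${JSON.stringify(exampleConfig)};`;
// moduleSource is: export default {"llms":{"description":"My site"}};
```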
|
||||||
|
|
||||||
|
## Route Injection
|
||||||
|
|
||||||
|
The integration injects routes for each enabled discovery file:
|
||||||
|
|
||||||
|
```typescript
|
||||||
|
if (config.robots?.enabled !== false) {
|
||||||
|
injectRoute({
|
||||||
|
pattern: '/robots.txt',
|
||||||
|
entrypoint: '@astrojs/discovery/routes/robots',
|
||||||
|
prerender: true
|
||||||
|
});
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
**Key decisions:**
|
||||||
|
|
||||||
|
**Pattern**: The URL where the file appears
|
||||||
|
**Entrypoint**: Module that handles the route
|
||||||
|
**Prerender**: Whether to generate at build time (true) or runtime (false)
|
||||||
|
|
||||||
|
Most routes prerender (`prerender: true`) for performance. WebFinger uses `prerender: false` because its response depends on the `resource` query parameter, which is only known at request time.
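To make the per-file decision concrete, the following sketch builds the list of routes to inject from a table, applying the same `enabled !== false` rule to every file. Only the robots entrypoint appears in the source; the other entrypoint strings are guesses by analogy, and `RouteSpec`, `ROUTES`, and `routesToInject` are hypothetical names.

```typescript
interface RouteSpec {
  pattern: string;
  entrypoint: string;
  prerender: boolean;
}

type FileKey = 'robots' | 'llms' | 'humans' | 'security' | 'webfinger';
type EnabledFlags = Partial<Record<FileKey, { enabled?: boolean }>>;

// Entrypoints other than robots are assumed by analogy, not taken from the package.
const ROUTES: Record<FileKey, RouteSpec> = {
  robots: { pattern: '/robots.txt', entrypoint: '@astrojs/discovery/routes/robots', prerender: true },
  llms: { pattern: '/llms.txt', entrypoint: '@astrojs/discovery/routes/llms', prerender: true },
  humans: { pattern: '/humans.txt', entrypoint: '@astrojs/discovery/routes/humans', prerender: true },
  security: { pattern: '/.well-known/security.txt', entrypoint: '@astrojs/discovery/routes/security', prerender: true },
  // WebFinger answers per-request queries, so it cannot be prerendered.
  webfinger: { pattern: '/.well-known/webfinger', entrypoint: '@astrojs/discovery/routes/webfinger', prerender: false },
};

// A file is injected unless the user explicitly sets `enabled: false`.
function routesToInject(config: EnabledFlags): RouteSpec[] {
  return (Object.keys(ROUTES) as FileKey[])
    .filter((key) => config[key]?.enabled !== false)
    .map((key) => ROUTES[key]);
}
```

Each returned object has the shape passed to `injectRoute` in the example above, so the integration hook could simply loop over the result.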
|
||||||
|
|
||||||
|
## Generator Pattern
|
||||||
|
|
||||||
|
Each discovery file type has a dedicated generator:
|
||||||
|
|
||||||
|
```
|
||||||
|
src/generators/
|
||||||
|
robots.ts - robots.txt generation
|
||||||
|
llms.ts - llms.txt generation
|
||||||
|
humans.ts - humans.txt generation
|
||||||
|
security.ts - security.txt generation
|
||||||
|
canary.ts - canary.txt generation
|
||||||
|
webfinger.ts - WebFinger JRD generation
|
||||||
|
```
|
||||||
|
|
||||||
|
Generators are pure functions:
|
||||||
|
|
||||||
|
```typescript
|
||||||
|
export function generateRobotsTxt(
|
||||||
|
config: RobotsConfig,
|
||||||
|
siteURL: URL
|
||||||
|
): string {
|
||||||
|
// Generate content
|
||||||
|
return robotsTxtString;
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
This makes them:
|
||||||
|
- Easy to test (no side effects)
|
||||||
|
- Easy to customize (override with your own function)
|
||||||
|
- Easy to reason about (input → output)
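A minimal sketch of what such a pure generator might look like follows. The config fields (`allowAgents`, `disallow`) are hypothetical, and the site URL is taken as a string here to keep the example dependency-free, whereas the signature above uses a `URL`. The sitemap file name follows what @astrojs/sitemap emits.

```typescript
// Hypothetical, simplified config; the real RobotsConfig may differ.
interface RobotsConfig {
  allowAgents?: string[]; // bots to allow explicitly, e.g. ['GPTBot']
  disallow?: string[];    // paths hidden from every crawler
}

// Pure function: the output depends only on the arguments.
function generateRobotsTxt(config: RobotsConfig, siteURL: string): string {
  const lines: string[] = ['User-agent: *'];
  const disallow = config.disallow ?? [];
  if (disallow.length === 0) lines.push('Allow: /');
  for (const path of disallow) lines.push(`Disallow: ${path}`);

  // Group all explicitly allowed bots under a single Allow rule.
  const agents = config.allowAgents ?? [];
  if (agents.length > 0) {
    lines.push('');
    for (const agent of agents) lines.push(`User-agent: ${agent}`);
    lines.push('Allow: /');
  }

  // Strip trailing slashes so the sitemap URL has exactly one separator.
  const base = siteURL.replace(/\/+$/, '');
  lines.push('', `Sitemap: ${base}/sitemap-index.xml`);
  return lines.join('\n') + '\n';
}
```

Because nothing outside the arguments is read or written, a unit test can compare the returned string against a fixture without any Astro setup.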
|
||||||
|
|
||||||
|
## Route Handler Pattern
|
||||||
|
|
||||||
|
Route handlers bridge Astro routes and generators:
|
||||||
|
|
||||||
|
```typescript
|
||||||
|
// src/routes/robots.ts
|
||||||
|
import { getConfig } from '../config-store.js';
|
||||||
|
import { generateRobotsTxt } from '../generators/robots.js';
|
||||||
|
|
||||||
|
export async function GET({ site }) {
|
||||||
|
const config = getConfig();
|
||||||
|
const content = generateRobotsTxt(config.robots, new URL(site));
|
||||||
|
|
||||||
|
return new Response(content, {
|
||||||
|
headers: {
|
||||||
|
'Content-Type': 'text/plain',
|
||||||
|
'Cache-Control': `public, max-age=${config.caching?.robots ?? 3600}`
|
||||||
|
}
|
||||||
|
});
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
Responsibilities:
|
||||||
|
|
||||||
|
1. Retrieve configuration
|
||||||
|
2. Call generator with config and site URL
|
||||||
|
3. Set appropriate headers (Content-Type, Cache-Control)
|
||||||
|
4. Return response
|
||||||
|
|
||||||
|
## Type System
|
||||||
|
|
||||||
|
`src/types.ts` defines the complete type hierarchy:
|
||||||
|
|
||||||
|
```typescript
|
||||||
|
export interface DiscoveryConfig {
|
||||||
|
robots?: RobotsConfig;
|
||||||
|
llms?: LLMsConfig;
|
||||||
|
humans?: HumansConfig;
|
||||||
|
security?: SecurityConfig;
|
||||||
|
canary?: CanaryConfig;
|
||||||
|
webfinger?: WebFingerConfig;
|
||||||
|
sitemap?: SitemapConfig;
|
||||||
|
caching?: CachingConfig;
|
||||||
|
templates?: TemplateConfig;
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
This provides:
|
||||||
|
- IntelliSense in editors
|
||||||
|
- Compile-time type checking
|
||||||
|
- Self-documenting configuration
|
||||||
|
- Safe refactoring
|
||||||
|
|
||||||
|
Types are exported so users can import them:
|
||||||
|
|
||||||
|
```typescript
|
||||||
|
import type { DiscoveryConfig } from '@astrojs/discovery';
|
||||||
|
```
|
||||||
|
|
||||||
|
## Dynamic Content Support
|
||||||
|
|
||||||
|
Several discovery files support dynamic generation:
|
||||||
|
|
||||||
|
### Function-based Configuration
|
||||||
|
|
||||||
|
```typescript
|
||||||
|
llms: {
|
||||||
|
description: () => {
|
||||||
|
// Compute at build time
|
||||||
|
return `Generated at ${new Date()}`;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
### Async Functions
|
||||||
|
|
||||||
|
```typescript
|
||||||
|
llms: {
|
||||||
|
apiEndpoints: async () => {
|
||||||
|
const spec = await loadOpenAPISpec();
|
||||||
|
return extractEndpoints(spec);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
Generators handle both static values and functions transparently.
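One plausible way to treat static values, functions, and async functions uniformly is a small normalizing helper. `MaybeDynamic` and `resolveValue` are hypothetical names for this sketch, and it assumes the resolved value is never itself a function.

```typescript
// A config value may be given directly, or as a sync or async function.
type MaybeDynamic<T> = T | (() => T | Promise<T>);

// Normalize all three forms to a Promise so callers handle a single case.
async function resolveValue<T>(value: MaybeDynamic<T>): Promise<T> {
  if (typeof value === 'function') {
    return await (value as () => T | Promise<T>)();
  }
  return value as T;
}
```

A generator could then `await resolveValue(config.description)` without caring which form the user supplied.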
|
||||||
|
|
||||||
|
### Content Collection Integration
|
||||||
|
|
||||||
|
WebFinger integrates with Astro content collections:
|
||||||
|
|
||||||
|
```typescript
|
||||||
|
webfinger: {
|
||||||
|
collections: [{
|
||||||
|
name: 'team',
|
||||||
|
resourceTemplate: 'acct:{slug}@example.com',
|
||||||
|
linksBuilder: (entry) => [...]
|
||||||
|
}]
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
The WebFinger route:
|
||||||
|
1. Calls `getCollection('team')`
|
||||||
|
2. Applies templates to each entry
|
||||||
|
3. Matches against query parameter
|
||||||
|
4. Generates JRD response
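Steps 2 through 4 can be sketched as a pure lookup function. The entry shape and the names `TeamEntry`, `applyTemplate`, and `findJrd` are assumptions for illustration; only `subject` and `links` (with `rel`, `href`, and optional `type`) come from the JRD format defined in RFC 7033.

```typescript
// Hypothetical entry shape; real content collection entries carry more fields.
interface TeamEntry {
  slug: string;
  data: { name: string };
}

interface JrdLink {
  rel: string;
  href: string;
  type?: string;
}

interface Jrd {
  subject: string;
  links: JrdLink[];
}

// Fill `{slug}` in a template such as 'acct:{slug}@example.com'.
function applyTemplate(template: string, entry: TeamEntry): string {
  return template.split('{slug}').join(entry.slug);
}

// Return the JRD for the requested resource, or undefined when nothing matches.
function findJrd(
  resource: string,
  entries: TeamEntry[],
  resourceTemplate: string,
  linksBuilder: (entry: TeamEntry) => JrdLink[],
): Jrd | undefined {
  for (const entry of entries) {
    const subject = applyTemplate(resourceTemplate, entry);
    if (subject === resource) {
      return { subject, links: linksBuilder(entry) };
    }
  }
  return undefined;
}
```

In a route handler, an `undefined` result would map to a 404 response and a match would be serialized with the `application/jrd+json` content type.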
|
||||||
|
|
||||||
|
## Cache Control
|
||||||
|
|
||||||
|
Each discovery file has configurable cache duration:
|
||||||
|
|
||||||
|
```typescript
|
||||||
|
caching: {
|
||||||
|
robots: 3600, // 1 hour
|
||||||
|
llms: 3600, // 1 hour
|
||||||
|
humans: 86400, // 24 hours
|
||||||
|
security: 86400, // 24 hours
|
||||||
|
canary: 3600, // 1 hour
|
||||||
|
webfinger: 3600, // 1 hour
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
Routes set `Cache-Control` headers based on these values:
|
||||||
|
|
||||||
|
```typescript
|
||||||
|
headers: {
|
||||||
|
'Cache-Control': `public, max-age=${cacheDuration}`
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
This balances:
|
||||||
|
- **Performance**: Cached responses serve faster
|
||||||
|
- **Freshness**: Short durations keep content current
|
||||||
|
- **Server load**: Reduces regeneration frequency
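A small helper can centralize the fallback logic. This sketch reuses the durations listed above as defaults; `cacheControlFor` and `DEFAULT_MAX_AGE` are hypothetical names. It uses `??` rather than `||` so that an explicit `0` (no caching) is respected instead of being replaced by the default.

```typescript
type DiscoveryFile = 'robots' | 'llms' | 'humans' | 'security' | 'canary' | 'webfinger';
type CachingConfig = Partial<Record<DiscoveryFile, number>>;

// Defaults in seconds, mirroring the values listed above.
const DEFAULT_MAX_AGE: Record<DiscoveryFile, number> = {
  robots: 3600,
  llms: 3600,
  humans: 86400,
  security: 86400,
  canary: 3600,
  webfinger: 3600,
};

// Build the Cache-Control header value for one discovery file.
function cacheControlFor(file: DiscoveryFile, caching: CachingConfig = {}): string {
  // `??` keeps an explicit 0, which `||` would replace with the default.
  const maxAge = caching[file] ?? DEFAULT_MAX_AGE[file];
  return `public, max-age=${maxAge}`;
}
```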
|
||||||
|
|
||||||
|
## Sitemap Integration
|
||||||
|
|
||||||
|
The integration includes @astrojs/sitemap automatically:
|
||||||
|
|
||||||
|
```typescript
|
||||||
|
updateConfig({
|
||||||
|
integrations: [
|
||||||
|
sitemap(config.sitemap || {})
|
||||||
|
]
|
||||||
|
});
|
||||||
|
```
|
||||||
|
|
||||||
|
This ensures:
|
||||||
|
- Sitemap is always present
|
||||||
|
- Configuration passes through
|
||||||
|
- robots.txt references correct sitemap URL
|
||||||
|
|
||||||
|
Users don't need to install @astrojs/sitemap separately.
|
||||||
|
|
||||||
|
## Error Handling
|
||||||
|
|
||||||
|
The integration validates aggressively at startup:
|
||||||
|
|
||||||
|
```typescript
|
||||||
|
if (!astroConfig.site) {
|
||||||
|
throw new Error(
|
||||||
|
'[@astrojs/discovery] The `site` option must be set in your Astro config.'
|
||||||
|
);
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
This fails fast with clear error messages rather than generating incorrect output.
|
||||||
|
|
||||||
|
Generators also validate input:
|
||||||
|
|
||||||
|
```typescript
|
||||||
|
if (!config.contact) {
|
||||||
|
throw new Error('security.txt requires a contact field');
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
RFC compliance is enforced at generation time.
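RFC 9116 requires both a `Contact` and an `Expires` field in security.txt, so a generator can check for both before emitting anything. The sketch below assumes a simplified config shape (`contact` as one or more URIs, `expires` as a `Date`); the real `SecurityConfig` may have more fields and different names.

```typescript
// Hypothetical, simplified config; the integration's SecurityConfig may differ.
interface SecurityConfig {
  contact?: string | string[]; // URIs such as 'mailto:security@example.com'
  expires?: Date;
}

function generateSecurityTxt(config: SecurityConfig): string {
  const contacts =
    config.contact === undefined
      ? []
      : Array.isArray(config.contact)
        ? config.contact
        : [config.contact];

  // Both fields are mandatory in RFC 9116, so refuse to generate without them.
  if (contacts.length === 0) {
    throw new Error('security.txt requires a contact field');
  }
  if (!config.expires) {
    throw new Error('security.txt requires an expires field');
  }

  const lines = contacts.map((contact) => `Contact: ${contact}`);
  lines.push(`Expires: ${config.expires.toISOString()}`);
  return lines.join('\n') + '\n';
}
```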
|
||||||
|
|
||||||
|
## Extensibility Points
|
||||||
|
|
||||||
|
Users can extend the integration in several ways:
|
||||||
|
|
||||||
|
### Custom Templates
|
||||||
|
|
||||||
|
Override any generator:
|
||||||
|
|
||||||
|
```typescript
|
||||||
|
templates: {
|
||||||
|
robots: (config, siteURL) => `
|
||||||
|
User-agent: *
|
||||||
|
Allow: /
|
||||||
|
|
||||||
|
# Custom content
|
||||||
|
Sitemap: ${new URL('sitemap-index.xml', siteURL).href}
|
||||||
|
`
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
### Custom Sections
|
||||||
|
|
||||||
|
Add custom content to humans.txt and llms.txt:
|
||||||
|
|
||||||
|
```typescript
|
||||||
|
humans: {
|
||||||
|
customSections: {
|
||||||
|
'PHILOSOPHY': 'We believe in...'
|
||||||
|
}
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
### Dynamic Functions
|
||||||
|
|
||||||
|
Generate content at build time:
|
||||||
|
|
||||||
|
```typescript
|
||||||
|
canary: {
|
||||||
|
statements: () => computeStatements()
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
## Build Output
|
||||||
|
|
||||||
|
At build completion, the integration logs generated files:
|
||||||
|
|
||||||
|
```
|
||||||
|
✨ @astrojs/discovery - Generated files:
|
||||||
|
✅ /robots.txt
|
||||||
|
✅ /llms.txt
|
||||||
|
✅ /humans.txt
|
||||||
|
✅ /.well-known/security.txt
|
||||||
|
✅ /sitemap-index.xml
|
||||||
|
```
|
||||||
|
|
||||||
|
This provides immediate feedback about what was created.
|
||||||
|
|
||||||
|
## Performance Considerations
|
||||||
|
|
||||||
|
The integration is designed for minimal build impact:
|
||||||
|
|
||||||
|
**Prerendering**: Most routes prerender at build time (no runtime cost)
|
||||||
|
**Pure functions**: Generators have no side effects (safe to call multiple times)
|
||||||
|
**Caching**: HTTP caching reduces server load
|
||||||
|
**Lazy loading**: Generators only execute for enabled files
|
||||||
|
|
||||||
|
Build time impact is typically <200ms for all files.
|
||||||
|
|
||||||
|
## Testing Strategy
|
||||||
|
|
||||||
|
The codebase uses a layered testing approach:
|
||||||
|
|
||||||
|
**Unit tests**: Test generators in isolation with known inputs
|
||||||
|
**Integration tests**: Test route handlers with mock Astro context
|
||||||
|
**Type tests**: Ensure TypeScript types are correct
|
||||||
|
**E2E tests**: Deploy and verify actual output
|
||||||
|
|
||||||
|
This ensures correctness at each layer.
|
||||||
|
|
||||||
|
## Why This Architecture?
|
||||||
|
|
||||||
|
Key design decisions:
|
||||||
|
|
||||||
|
**Separation of concerns**: Generators don't know about Astro, routes don't know about content formats
|
||||||
|
**Composability**: Each piece is independently usable
|
||||||
|
**Testability**: Pure functions are easy to test
|
||||||
|
**Type safety**: TypeScript catches errors at compile time
|
||||||
|
**Extensibility**: Users can override any behavior
|
||||||
|
**Performance**: Prerendering and caching minimize runtime cost
|
||||||
|
|
||||||
|
The architecture prioritizes **correctness** and **simplicity** over cleverness.
|
||||||
|
|
||||||
|
## Related Topics
|
||||||
|
|
||||||
|
- [API Reference](/reference/api/) - Complete API documentation
|
||||||
|
- [TypeScript Types](/reference/typescript/) - Type definitions
|
||||||
|
- [Custom Templates](/how-to/custom-templates/) - Overriding generators
|
||||||
|
|||||||