Operating steps
- Collect the server-card and tools/list output.
- Break the payload into tool name, description, input schema, enum, and example buckets.
- Estimate the token load with the target model's tokenizer and compare it against the context window.
- Set token thresholds that decide whether each tool is loaded eagerly, loaded lazily, or removed.
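
The steps above can be sketched roughly as follows. This is a minimal illustration, not a definitive implementation. It assumes the tools/list output is a dict with a "tools" array whose entries carry "name", "description", and "inputSchema". The 4-characters-per-token heuristic, the 200 and 1000 token thresholds, and all function names are hypothetical placeholders; a real budget would use the target model's tokenizer.

```python
import json

CHARS_PER_TOKEN = 4  # assumption: rough heuristic, not a real tokenizer


def estimate_tokens(text: str) -> int:
    """Approximate token count; swap in the target model's tokenizer for real budgets."""
    return -(-len(text) // CHARS_PER_TOKEN)  # ceiling division


def split_schema(node, enums, examples):
    """Copy a schema node, moving "enum" and "examples" values into the given lists.

    Sketch limitation: a property literally named "enum" or "examples"
    would also be moved.
    """
    if isinstance(node, dict):
        rest = {}
        for key, value in node.items():
            if key == "enum":
                enums.append(value)
            elif key == "examples":
                examples.append(value)
            else:
                rest[key] = split_schema(value, enums, examples)
        return rest
    if isinstance(node, list):
        return [split_schema(item, enums, examples) for item in node]
    return node


def bucket_tokens(tool: dict) -> dict:
    """Estimate tokens per bucket: name, description, input schema, enum, examples."""
    enums, examples = [], []
    schema = split_schema(tool.get("inputSchema", {}), enums, examples)
    compact = {"separators": (",", ":")}
    buckets = {
        "name": tool.get("name", ""),
        "description": tool.get("description", ""),
        "input_schema": json.dumps(schema, **compact),
        "enum": json.dumps(enums, **compact) if enums else "",
        "examples": json.dumps(examples, **compact) if examples else "",
    }
    return {bucket: estimate_tokens(text) for bucket, text in buckets.items()}


def classify(total_tokens: int, eager_max: int = 200, lazy_max: int = 1000) -> str:
    """Map a token total to a loading decision using hypothetical thresholds."""
    if total_tokens <= eager_max:
        return "eager"
    if total_tokens <= lazy_max:
        return "lazy"
    return "remove"


def budget_report(tools_list: dict) -> list:
    """Build one report row per tool in a tools/list style payload."""
    report = []
    for tool in tools_list.get("tools", []):
        buckets = bucket_tokens(tool)
        total = sum(buckets.values())
        report.append({
            "tool": tool.get("name", ""),
            "buckets": buckets,
            "total": total,
            "action": classify(total),
        })
    return report
```

Calling budget_report on the parsed tools/list payload yields one row per tool, which can be sorted by "total" to find the heaviest tools first.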
Common risks
- Token estimates vary by tokenizer and model family.
- Tiny tools can still be expensive when schemas repeat deeply nested objects.
- Budget reports go stale whenever the server-card changes, so regenerate them on each change.
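
The repeated-schema risk can be checked mechanically. The sketch below is one possible approach, again assuming a tools/list style dict with "tools" entries that carry "inputSchema". The function names and the "wasted_chars" measure are hypothetical, and the count is in characters rather than tokens.

```python
import json
from collections import Counter


def object_subschemas(node):
    """Yield every nested dict that declares "type": "object", including the root."""
    if isinstance(node, dict):
        if node.get("type") == "object":
            yield node
        for value in node.values():
            yield from object_subschemas(value)
    elif isinstance(node, list):
        for item in node:
            yield from object_subschemas(item)


def repeated_subschemas(tools_list: dict, min_count: int = 2) -> list:
    """Find object sub-schemas that appear at least min_count times across all tools."""
    counts = Counter()
    for tool in tools_list.get("tools", []):
        for sub in object_subschemas(tool.get("inputSchema", {})):
            # Canonical JSON so that key order does not hide duplicates.
            key = json.dumps(sub, sort_keys=True, separators=(",", ":"))
            counts[key] += 1
    repeats = [
        {"schema": key, "count": n, "wasted_chars": len(key) * (n - 1)}
        for key, n in counts.items()
        if n >= min_count
    ]
    # Largest duplicated payload first.
    return sorted(repeats, key=lambda r: r["wasted_chars"], reverse=True)
```

Sub-schemas near the top of this list are candidates for flattening or simplification before the budget is recomputed.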