Supported Features

This table summarizes the status of features on Spyre. Unless noted otherwise, these features are planned to be developed on the vLLM V1 engine.

| Feature | Status | Note |
|---|---|---|
| Chunked Prefill | 🗓️ | |
| Automatic Prefix Caching | 🗓️ | |
| LoRA | 🗓️ | |
| Prompt Adapter | | Being deprecated in vLLM vllm#13981 |
| Speculative Decoding | 🗓️ | |
| Guided Decoding | 🗓️ | |
| Enc-dec | | No plans for now |
| Multi Modality | 🗓️ | |
| LogProbs | | |
| Prompt logProbs | 🚧 | |
| Best of | | Deprecated in vLLM vllm#13361 |
| Beam search | | |
| Tensor Parallel | | |
| Pipeline Parallel | 🗓️ | |
| Expert Parallel | 🗓️ | |
| Data Parallel | 🗓️ | |
| Prefill Decode Disaggregation | 🗓️ | |
| Quantization | ⚠️ | |
| Sleep Mode | 🗓️ | |
| Embedding models | | |

  • ✅ Fully operational.
  • ⚠️ Experimental support.
  • 🚧 Under active development.
  • 🗓️ Planned.
  • ⛔ Not planned or deprecated.