https://github.com/Dao-AILab/flash-attention
https://github.com/ollama/ollama/blob/main/docs/faq.md#how-can-i-enable-flash-attention
OLLAMA_FLASH_ATTENTION=1
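According to the FAQ linked above, the variable must be set in the environment of the Ollama server when it starts. A minimal sketch of doing that from Python, assuming the `ollama` binary is on `PATH`; the helper names `flash_attention_env` and `start_server` are made up for illustration and are not part of Ollama:

```python
import os
import subprocess


def flash_attention_env(base=None):
    """Return a copy of the environment with Flash Attention enabled."""
    # Copy so the caller's environment is not modified in place.
    env = dict(os.environ if base is None else base)
    env["OLLAMA_FLASH_ATTENTION"] = "1"
    return env


def start_server():
    """Start `ollama serve` with the variable set (hypothetical helper)."""
    # Assumption: `ollama` is installed and on PATH.
    # Popen returns immediately; the server keeps running in the child process.
    return subprocess.Popen(["ollama", "serve"], env=flash_attention_env())
```

Calling `start_server()` launches the server as a child process. If Ollama runs as a system service instead, the variable has to be set in that service's environment rather than in a script like this.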