Llama Stack Server for development purposes
A few Next Gen UI Agent modules use the Llama Stack server as an LLM inference abstraction.
For local development purposes, you can run a Llama Stack server on localhost, configured to serve the LLM of your choice. You can also run the LLM itself locally, e.g. using Ollama, if you have reasonable hardware. A GPU capable of accelerating AI workloads is definitely good to have.
For more details see LLAMASTACK_DEV.md.
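
As an illustration, here is a minimal Python sketch of how a module might talk to a locally running Llama Stack server. It assumes the `llama-stack-client` package, the default port 8321, and a placeholder model id; the actual port, model id, and client API depend on your Llama Stack version and configuration described in LLAMASTACK_DEV.md.

```python
# Minimal sketch: query a locally running Llama Stack server.
# Assumes `pip install llama-stack-client`, a server listening on localhost:8321,
# and a model id matching the model your server is actually configured to serve.
from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url="http://localhost:8321")

# Verify the connection by listing the models the server knows about.
for model in client.models.list():
    print(model.identifier)

# Run a simple chat completion against one of the served models.
response = client.inference.chat_completion(
    model_id="meta-llama/Llama-3.2-3B-Instruct",  # placeholder, use your configured model
    messages=[{"role": "user", "content": "Say hello in one sentence."}],
)
print(response.completion_message.content)
```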