Next Gen UI MCP Server
This module is part of the Next Gen UI Agent project.
This package wraps the Next Gen UI Agent in Model Context Protocol (MCP) tools using the official Python MCP SDK.
Since MCP adoption is strong these days and there is an appetite to use this protocol for agentic AI as well, we also deliver the UI Agent this way.
The most common way of utilising MCP tools is to provide them to an LLM to choose and execute with certain parameters. This approach of using the Next Gen UI Agent makes sense if you want to give your AI orchestrator a chance to decide about UI component generation, for example to select which backend data loaded during processing needs to be visualized in the UI. You have to prompt the LLM to pass the correct user prompt and the structured backend data content into the UI MCP tool unaltered, to prevent unexpected UI errors.
An alternative approach is to invoke this MCP tool directly (or even through another AI framework binding) with the parameters as part of your main application logic, at a specific moment in the flow after gathering the structured backend data for the response. This approach is a bit more reliable, helps reduce main LLM processing cost (tokens) and saves processing time, but is less flexible.
Provides
- __main__.py to run the MCP server as a standalone server
- NextGenUIMCPServer to embed the UI Agent MCP server into your Python code
Installation
Note: alternatively, you can use the container image to easily install and run the server.
Depending on your use case you may need additional packages for the inference provider or design system component renderers. More about this in the next sections.
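For example, installation with pip (a sketch that assumes the package is published under the name next-gen-ui-mcp; the exact package name may differ):

pip install next-gen-ui-mcp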
Usage
Running the standalone server
To get help on how to run the server and pass the arguments, run it with the -h parameter:
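python -m next_gen_ui_mcp -h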
A few examples:
# Run with MCP sampling (default - leverages client's LLM)
python -m next_gen_ui_mcp
# Run with OpenAI inference
python -m next_gen_ui_mcp --provider openai --model gpt-3.5-turbo
# Run with OpenAI compatible API of Ollama (local)
python -m next_gen_ui_mcp --provider openai --model llama3.2 --base-url http://localhost:11434/v1 --api-key ollama
# Run with MCP sampling and custom max tokens
python -m next_gen_ui_mcp --sampling-max-tokens 4096
# Run with SSE transport (for web clients)
python -m next_gen_ui_mcp --transport sse --host 127.0.0.1 --port 8000
# Run with streamable-http transport
python -m next_gen_ui_mcp --transport streamable-http --host 127.0.0.1 --port 8000
# Run with Red Hat Design System (rhds) component system
python -m next_gen_ui_mcp --component-system rhds
# Run with rhds component system via SSE transport
python -m next_gen_ui_mcp --transport sse --component-system rhds --port 8000
As the above examples show, you can choose between the mcp sampling, openai, or anthropic-vertexai inference providers.
You have to add the necessary dependencies to your Python environment to do so, otherwise the application will complain about the missing packages.
See the detailed documentation below.
Similarly, pluggable component systems such as rhds require additional packages, next_gen_ui_rhds_renderer in this particular case.
The json renderer is installed by default.
Configuration Reference
The server can be configured using commandline arguments or environment variables. Commandline arguments take precedence over environment variables.
| Commandline Argument | Environment Variable | Default Value | Description |
|---|---|---|---|
| --config-path | NGUI_CONFIG_PATH | - | Path to YAML configuration files (to merge multiple YAML files, repeat the commandline argument, or comma-separate the paths in the env variable). |
| --component-system | NGUI_COMPONENT_SYSTEM | json | UI component system (json + any installed). Overrides the value from the YAML config file if used. |
| --transport | MCP_TRANSPORT | stdio | Transport protocol for MCP (stdio, sse, streamable-http). |
| --host | MCP_HOST | 127.0.0.1 | Host to bind to (for sse and streamable-http transports). |
| --port | MCP_PORT | 8000 | Port to bind to (for sse and streamable-http transports). |
| --tools | MCP_TOOLS | - | List of enabled tools (comma separated). All are enabled by default. |
| --structured_output_enabled | MCP_STRUCTURED_OUTPUT_ENABLED | true | Enable or disable structured output. |
| --provider | NGUI_PROVIDER | mcp | LLM inference provider (mcp, openai, anthropic-vertexai); for details see below. |
| --model | NGUI_MODEL | - | Model name. Required for providers other than mcp. |
| --base-url | NGUI_PROVIDER_API_BASE_URL | - | Base URL for the API, provider-specific defaults. Used by openai, anthropic-vertexai. |
| --api-key | NGUI_PROVIDER_API_KEY | - | API key for the LLM provider. Used by openai, anthropic-vertexai. |
| --temperature | NGUI_PROVIDER_TEMPERATURE | - | Temperature for model inference, float value (defaults to 0.0 for deterministic responses). Used by openai, anthropic-vertexai. |
| --sampling-max-tokens | NGUI_SAMPLING_MAX_TOKENS | - | Maximum LLM generated tokens, integer value. Used by mcp (defaults to 2048) and anthropic-vertexai (defaults to 4096). |
| --anthropic-version | NGUI_PROVIDER_ANTHROPIC_VERSION | - | Anthropic version value used in the API call (defaults to vertex-2023-10-16). Used by anthropic-vertexai. |
| --debug | - | - | Enable debug logging. |
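The same options can be provided via environment variables, for example the Ollama setup from the examples above:

# Equivalent of the --provider/--model/--base-url/--api-key arguments via environment variables
NGUI_PROVIDER=openai NGUI_MODEL=llama3.2 \
NGUI_PROVIDER_API_BASE_URL=http://localhost:11434/v1 NGUI_PROVIDER_API_KEY=ollama \
python -m next_gen_ui_mcp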
LLM Inference Providers
The Next Gen UI MCP server supports multiple inference providers, controlled by the --provider commandline argument / NGUI_PROVIDER environment variable:
Provider mcp
Uses Model Context Protocol sampling to leverage the client's LLM capabilities. No additional configuration is required as it uses the connected MCP client's model; only a few optional settings are available.
The MCP client has to support the Sampling feature and the optional settings it uses!
Requires:
- NGUI_SAMPLING_MAX_TOKENS (optional): Maximum LLM generated tokens, integer value (defaults to 2048).
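Many MCP clients launch stdio servers from an mcpServers-style JSON configuration. A minimal sketch (the exact configuration format depends on your client, and the client must support the Sampling feature for this provider):

{
  "mcpServers": {
    "next-gen-ui": {
      "command": "python",
      "args": ["-m", "next_gen_ui_mcp"]
    }
  }
}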
Provider openai
Uses the LangChain OpenAI inference provider, so it can be used with any OpenAI-compatible API, e.g. the OpenAI API itself, Ollama for localhost inference, or a Llama Stack server v0.3.0+.
Requires an additional package to be installed:
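A sketch of the install, assuming the standard LangChain OpenAI integration package is what is needed (the exact set of required extras may differ):

pip install langchain-openai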
Requires:
- NGUI_MODEL: Model name (e.g., gpt-4o, llama3.2).
- NGUI_PROVIDER_API_KEY: API key for the provider.
- NGUI_PROVIDER_API_BASE_URL (optional): Custom base URL for OpenAI-compatible APIs like Ollama or Llama Stack. OpenAI API by default.
- NGUI_PROVIDER_TEMPERATURE (optional): Temperature for model inference (defaults to 0.0 for deterministic responses).
Base URL examples:
- OpenAI: https://api.openai.com/v1 (default)
- Ollama at localhost: http://localhost:11434/v1
- Llama Stack server at localhost port 5001 called from the MCP server running in a container image: http://host.containers.internal:5001/v1
Provider anthropic-vertexai
Uses Anthropic/Claude models from a proxied Google Vertex AI API endpoint.
The called API URL is constructed as {BASE_URL}/models/{MODEL}:streamRawPredict.
The API key is sent as a Bearer token in the Authorization HTTP request header.
Requires:
- NGUI_MODEL: Model name.
- NGUI_PROVIDER_API_BASE_URL: Base URL of the API.
- NGUI_PROVIDER_API_KEY: API key for the provider.
- NGUI_PROVIDER_TEMPERATURE (optional): Temperature for model inference (defaults to 0.0 for deterministic responses).
- NGUI_PROVIDER_ANTHROPIC_VERSION (optional): Anthropic version to use in the API call (defaults to vertex-2023-10-16).
- NGUI_SAMPLING_MAX_TOKENS (optional): Maximum LLM generated tokens, integer value (defaults to 4096).
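For example (the base URL and model name below are placeholders, not real values):

python -m next_gen_ui_mcp \
  --provider anthropic-vertexai \
  --model claude-3-5-sonnet-v2@20241022 \
  --base-url https://my-vertex-proxy.example.com/api \
  --api-key "$VERTEX_API_TOKEN"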
YAML configuration
Common Next Gen UI YAML configuration files can be used to configure UI Agent functionality.
A configuration file extension is available to fine-tune the descriptions of
the MCP tools and their arguments, to get better performance in your AI assistant/orchestrator.
For details see the mcp field in the Schema Definition.
Example of the mcp YAML configuration extension:
mcp:
tools:
generate_ui_multiple_components:
description: Generate multiple UI components for given user_prompt and input data.\nAlways get fresh data from another tool first.
argument_descriptions:
user_prompt: "Original user prompt without any changes, so UI components have necessary context. Do not generate this."
# other UI Agent configurations
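Assuming the extension above is saved as ngui_mcp_config.yaml (a hypothetical file name), the server can pick it up via --config-path:

python -m next_gen_ui_mcp --config-path ./ngui_mcp_config.yaml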
Running Server locally from Git Repo
If you are running this from inside our Next Gen UI Agent GitHub repo, the pants build system used in our monorepo can help you satisfy all dependencies. In that case you can run the commands in the following way:
# Run with MCP sampling (default - leverages client's LLM)
pants run libs/next_gen_ui_mcp/server_example.py:extended
# Run with streamable-http transport and Red Hat Design System component system for rendering
pants run libs/next_gen_ui_mcp/server_example.py:extended --run-args="--transport streamable-http --component-system rhds"
# Run directly
PYTHONPATH=./libs python libs/next_gen_ui_mcp -h
Testing with MCP Client
As part of the GitHub repository we also provide an example client. This example client implementation uses the MCP SDK client libraries and Ollama as the inference provider for MCP sampling.
You can run it via this command:
The --concurrent parameter is there only to allow calling it while you use pants run to start the server; by default pants restricts parallel invocations.
Using NextGenUI MCP Agent through Llama Stack
The Llama Stack documentation for tools nicely shows how to register an MCP server, and it also shows code for invoking a tool directly, as sketched below.
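A hedged sketch of that pattern, assuming the MCP server runs with --transport sse --port 8000 and a Llama Stack server listens on localhost port 5001 (exact llama-stack-client calls may differ between Llama Stack versions):

# sketch based on the Llama Stack tools documentation; APIs may differ between versions
from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url="http://localhost:5001")

# register the Next Gen UI MCP server as an MCP tool group
client.toolgroups.register(
    toolgroup_id="mcp::next_gen_ui",
    provider_id="model-context-protocol",
    mcp_endpoint={"uri": "http://localhost:8000/sse"},
)

# invoke the UI generation tool directly, outside of an agent loop
result = client.tool_runtime.invoke_tool(
    tool_name="generate_ui_multiple_components",
    kwargs={
        "user_prompt": "Show me details of the Toy Story movie",
        "structured_data": [
            {
                "id": "movie-1",
                "type": "movie_detail",
                # illustrative payload; see spec/mcp/generate_ui_input.schema.json for the exact shape
                "data": "{\"movie_detail\": {\"title\": \"Toy Story\", \"year\": 1995}}",
            }
        ],
    },
)
print(result.content)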
Available MCP Tools
generate_ui_multiple_components
The main tool that wraps the entire Next Gen UI Agent functionality.
This single tool handles:
- Component selection based on user prompt and data
- Data transformation to match selected components
- Design system rendering to produce final UI
Parameters:
- user_prompt (str, required): User's prompt which we want to enrich with UI components
- structured_data (List[Dict], required): List of structured input data. Each object has to have id, data and type fields.
- session_id (str, optional): Session ID. Not used, present just for compatibility purposes.
You can find the input schema in spec/mcp/generate_ui_input.schema.json.
Returns:
Object containing:
- UI blocks with rendering and configuration
- Textual summary of the UI Blocks generation
When an error occurs during execution, the valid UI blocks are still rendered. The failing UI block is mentioned in the summary and does not appear in the blocks field.
The textual summary is useful to give the calling LLM a chance to "understand" what happened and react accordingly, e.g. include info about the UI in its natural language response.
By default the result is provided as structured content, where the structured content contains the JSON object and the text content contains just the human readable summary. It is beneficial to send only the text summary to the agent for LLM processing and use the structured content for UI rendering on the client side.
If structured output is disabled via --structured_output_enabled=false, there is no structured content in the result and the text content contains the same data as a serialized JSON string.
For compatibility the JSON object contains the summary as well.
Example:
{
"blocks": [
{
"id": "e5e2db10-de22-4165-889c-02de2f24c901",
"rendering": {
"id": "e5e2db10-de22-4165-889c-02de2f24c901",
"component_system": "json",
"mime_type": "application/json",
"content": "{\"component\":\"one-card\",\"image\":\"https://image.tmdb.org/t/p/w440_and_h660_face/uXDfjJbdP4ijW5hWSBrPrlKpxab.jpg\",\"id\":\"e5e2db10-de22-4165-889c-02de2f24c901\",\"title\":\"Toy Story Movie Details\",\"fields\":[{\"id\": \"title\",\"name\":\"Title\",\"data_path\":\"$..movie_detail.title\",\"data\":[\"Toy Story\"]},{\"id\": \"year\",\"name\":\"Release Year\",\"data_path\":\"$..movie_detail.year\",\"data\":[1995]},{\"id\": \"imdbRating\",\"name\":\"IMDB Rating\",\"data_path\":\"$..movie_detail.imdbRating\",\"data\":[8.3]},{\"id\": \"runtime\",\"name\":\"Runtime (min)\",\"data_path\":\"$..movie_detail.runtime\",\"data\":[81]},{\"id\": \"plot\",\"name\":\"Plot\",\"data_path\":\"$..movie_detail.plot\",\"data\":[\"A cowboy doll is profoundly threatened and jealous when a new spaceman figure supplants him as top toy in a boy's room.\"]}]}"
},
"configuration": {
"data_type": "movie_detail",
"input_data_transformer_name": "json",
"json_wrapping_field_name": "movie_detail",
"component_metadata": {
"id": "e5e2db10-de22-4165-889c-02de2f24c901",
"title": "Toy Story Movie Details",
"component": "one-card",
"fields": [
{
"id": "title",
"name": "Title",
"data_path": "$..movie_detail.title"
},
{
"id": "year",
"name": "Release Year",
"data_path": "$..movie_detail.year"
},
{
"id": "imdbRating",
"name": "IMDB Rating",
"data_path": "$..movie_detail.imdbRating"
},
{
"id": "runtime",
"name": "Runtime (min)",
"data_path": "$..movie_detail.runtime"
},
{
"id": "plot",
"name": "Plot",
"data_path": "$..movie_detail.plot"
},
{
"id": "posterUrl",
"name": "Poster",
"data_path": "$..movie_detail.posterUrl"
}
]
}
}
}
],
"summary": "Components are rendered in UI.\nCount: 1\n1. Title: 'Toy Story Movie Details', type: one-card"
}
You can find the schema for the response in spec/mcp/generate_ui_output.schema.json.
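For illustration, the tool can also be invoked programmatically with the official MCP Python SDK. A minimal sketch, assuming the server is started with a non-mcp provider (this simple client does not implement the sampling callback needed by the default mcp provider); the data payload is illustrative only:

import asyncio
import json

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client


async def main() -> None:
    # launch the server over stdio with an OpenAI-compatible provider (Ollama here)
    server = StdioServerParameters(
        command="python",
        args=[
            "-m", "next_gen_ui_mcp",
            "--provider", "openai", "--model", "llama3.2",
            "--base-url", "http://localhost:11434/v1", "--api-key", "ollama",
        ],
    )
    async with stdio_client(server) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            result = await session.call_tool(
                "generate_ui_multiple_components",
                {
                    "user_prompt": "Show me details of the Toy Story movie",
                    "structured_data": [
                        {
                            "id": "movie-1",
                            "type": "movie_detail",
                            # illustrative payload; see spec/mcp/generate_ui_input.schema.json
                            "data": json.dumps({"movie_detail": {"title": "Toy Story", "year": 1995}}),
                        }
                    ],
                },
            )
            print(result.content)


asyncio.run(main())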
generate_ui_component
A tool that wraps the entire Next Gen UI Agent functionality, with the single input object decomposed into individual arguments.
Useful for agents which are able to pass one tool call result to another.
When an error occurs, the whole tool execution fails.
Parameters:
- user_prompt (str, required): User's prompt which we want to enrich with UI components
- data (str, required): Raw input data to render within the UI components
- data_type (str, required): Data type
- data_id (str, optional): ID of the data. If not present, an ID is generated.
- session_id (str, optional): Session ID. Not used, present just for compatibility purposes.
Returns:
Same result as the generate_ui_multiple_components tool.
Available MCP Resources
system://info
Returns system information about the Next Gen UI Agent including:
- Agent name
- Component system being used
- Available capabilities
- Description