Eval Tests
The Tableau MCP project uses Vitest for eval tests. Eval tests are located in the `tests/eval` directory and are named `*.test.ts`.
What is an Eval test?
Eval tests (also known as Evals) evaluate MCP tool implementations using LLM-based scoring. The tests assess accuracy, completeness, relevance, clarity, and reasoning, and help answer questions like:
- Can the model consistently choose the correct tools to answer the user prompt?
- Can the model generate the correct tool inputs based on the user prompt?
- Does the tool implementation accurately answer the user prompt?
- Is the tone suitable for the target audience?
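To make the idea concrete, here is a minimal sketch of what aggregating LLM-based scores might look like. The criterion names match the ones above, but the types, the averaging scheme, and the `0.8` pass threshold are illustrative assumptions, not the project's actual implementation:

```typescript
// Illustrative shape of an LLM-as-judge scoring result (assumed types,
// not the project's actual ones). Each criterion is scored 0 to 1.
type EvalScores = {
  accuracy: number;
  completeness: number;
  relevance: number;
  clarity: number;
  reasoning: number;
};

// Average the criterion scores and compare against a pass threshold.
// The default 0.8 threshold is a made-up example value.
function passes(scores: EvalScores, threshold = 0.8): boolean {
  const values = Object.values(scores);
  const mean = values.reduce((sum, v) => sum + v, 0) / values.length;
  return mean >= threshold;
}
```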
Running
The eval tests can only be run:
- Locally.
- If you have access to a site the tests understand. Currently, that's only https://10ax.online.tableau.com/#/site/mcp-test/.
- If you have an OpenAI API key or access to an OpenAI-compatible gateway.
To run them locally:
- Ensure you do not have a `.env` file in the root of the project.
- Create a `tests/.env` file with contents:
```
SERVER=https://10ax.online.tableau.com
SITE_NAME=mcp-test
AUTH=direct-trust
JWT_SUB_CLAIM=<your email address>
CONNECTED_APP_CLIENT_ID=<redacted>
CONNECTED_APP_SECRET_ID=<redacted>
CONNECTED_APP_SECRET_VALUE=<redacted>
```
- Create a `tests/.env.reset` file with the same contents, except all the env var values are empty. (Environment variables are set at the beginning of each test and cleared at the end of each test.)
- Create a `tests/eval/.env` file with contents:

```
OPENAI_API_KEY=<your OpenAI API key>
```
- Run `npm run test:eval`, or select the `vitest.config.eval.ts` config in the Vitest extension and run the tests from your IDE.
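The per-test environment handling described above (values from `tests/.env` applied before each test, then cleared via the empty values in `tests/.env.reset`) can be sketched roughly as follows. The parser and helper names here are simplified assumptions for illustration; the project's actual setup code may differ:

```typescript
// Minimal .env-style parser (a simplified assumption; a real setup
// might use a library such as dotenv instead).
function parseEnvFile(contents: string): Record<string, string> {
  const vars: Record<string, string> = {};
  for (const line of contents.split("\n")) {
    const match = line.match(/^([A-Z0-9_]+)=(.*)$/);
    if (match) vars[match[1]] = match[2];
  }
  return vars;
}

// Apply a parsed env file to process.env. Empty values (as in
// tests/.env.reset) effectively clear the variable.
function applyEnv(vars: Record<string, string>): void {
  for (const [key, value] of Object.entries(vars)) {
    if (value === "") delete process.env[key];
    else process.env[key] = value;
  }
}
```

In a Vitest setup file, this kind of apply/clear pair would typically run in `beforeEach`/`afterEach` hooks so each test starts from a clean environment.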
Environment Variables
The following environment variables are used by the Eval tests:
- `OPENAI_API_KEY`: The OpenAI API key.
- `ENABLE_LOGGING`: When `true`, LLMs stream their output to the console and tool call information is also logged.
- `OPENAI_BASE_URL`: The base URL for the OpenAI-compatible gateway.
- `EVAL_TEST_MODEL`: The model to use for the Eval tests. If not set, the default model is used.
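A small sketch of how these variables might be resolved into a configuration object. The fallback base URL and the `"default-model"` placeholder are illustrative assumptions; the project's real defaults are not stated here:

```typescript
// Resolve eval configuration from environment variables. The fallback
// base URL and model name below are assumptions for illustration only.
type EvalConfig = {
  apiKey: string;
  baseUrl: string;
  model: string;
  loggingEnabled: boolean;
};

function resolveEvalConfig(env: Record<string, string | undefined>): EvalConfig {
  const apiKey = env.OPENAI_API_KEY;
  if (!apiKey) {
    throw new Error("OPENAI_API_KEY is required to run the Eval tests");
  }
  return {
    apiKey,
    baseUrl: env.OPENAI_BASE_URL ?? "https://api.openai.com/v1",
    model: env.EVAL_TEST_MODEL ?? "default-model", // placeholder default
    loggingEnabled: env.ENABLE_LOGGING === "true",
  };
}
```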
Running the Eval tests against a different site
To run the Eval tests locally against a different site, you need to:
- Have a site that has the Superstore sample datasource and workbook (which exist with every new site). The tests query this datasource and workbook.
- Create and enable a Direct Trust Connected App in the site.
- Create a Pulse Metric Definition named `Tableau MCP`. Its details don't matter.
- Update the `environmentData` object in `tests/constants.ts` with the new site details.
- Follow the steps in the Running section, providing these new site details in the `tests/.env` file.
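For orientation, the `environmentData` object in `tests/constants.ts` might look something like the following. This shape is a hypothetical reconstruction inferred from the `tests/.env` keys above; the actual field names in the project may differ:

```typescript
// Hypothetical shape of the environmentData object in tests/constants.ts,
// inferred from the tests/.env keys. The real field names may differ.
const environmentData = {
  server: "https://10ax.online.tableau.com",
  siteName: "mcp-test",
  auth: "direct-trust",
} as const;
```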
Debugging
If you are using VS Code or a fork, you can use the Vitest extension to run and debug the Eval tests.