8000 feat: add ability to specify mlflow tag for tesseract by nbowen3 · Pull Request #426 · pasteurlabs/tesseract-core · GitHub
[go: up one dir, main page]

Skip to content

Conversation

@nbowen3
Copy link
Contributor
@nbowen3 nbowen3 commented Dec 17, 2025

Relevant issue or PR

Description of changes

This adds the ability to specify mlflow tags for Tesseract runs via the TESSERACT_MLFLOW_TAGS env variable. This is useful to embed additional metadata into the MLFlow run.

n.b. As we want to ensure that all runs have the tags if set, we need to make a small change here to check if there is an active run. Previously, if there was no an active run and you logged something, it would implicitly create one. That is a problem as it will not include the tags. Instead we now explicitly check if there an active run; and, if not, we create one with the tags.

Testing done

Local and unit testing

@codecov
Copy link
codecov bot commented Dec 17, 2025

Codecov Report

❌ Patch coverage is 81.81818% with 4 lines in your changes missing coverage. Please review.
✅ Project coverage is 76.73%. Comparing base (5284bb6) to head (0a86797).

Files with missing lines Patch % Lines
tesseract_core/runtime/mpa.py 80.95% 1 Missing and 3 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #426      +/-   ##
==========================================
+ Coverage   66.91%   76.73%   +9.81%     
==========================================
  Files          29       29              
  Lines        3431     3452      +21     
  Branches      535      538       +3     
==========================================
+ Hits         2296     2649     +353     
+ Misses        945      570     -375     
- Partials      190      233      +43     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@dionhaefner
Copy link
Contributor

Thanks for the PR. I'm trying to understand what these tags are used for in practice, and whether it really makes sense to set them server-wide (as opposed to per request, like the run_id parameter we have introduced for a similar purpose). What's the primary use case @nbowen3?

Would also be interested in @johnbcoughlin's take on this.

@johnbcoughlin
Copy link
Contributor

@nbowen3 and I spoke about this, here are my takeaways:

  • P4D uses tesseract run, not tesseract serve, so from P4D's perspective it's much the same to set the env var for the lifetime of the Tesseract.
  • Passing tags on a per-call basis is definitely more useful than on a per-serve basis, however:
  • In my opinion the auto-start_run functionality (which uses the TESSERACT_MLFLOW_ family of env vars) is still not useful except for callers who just require a "lowest common denominator" logging experience (like P4D invoking an arbitrary Tesseract).
  • To make the auto-start_run functionality competitive with just calling start_run yourself, we would need a way to
    • thread through most of the parameters listed here, including run_name and parent_run_id.
    • disable automatically starting a run in jacobian_vector_product and other autodiff calls.

Given all of this, I think we should be wary about adding call-time start_run parameters to the Tesseract surface area. The benefits of making them call-time apply to non-P4D use cases, and don't begin to materialize until at least 2 or 3 more have been added.

Comment on lines +205 to +208
def _ensure_active_run(self) -> None:
"""Ensure there is an active MLflow run, starting one if needed."""
if self.mlflow.active_run() is None:
self.start_run()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need this all of a sudden?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly protection - someone can call log_... without having an active run which starts one implicitly, which wouldn't include tags. This is to ensure if someone does that we start the run with the tags.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As an example, we don't seem to call start_run in the example metrics tesseract: https://github.com/pasteurlabs/tesseract-core/blob/main/examples/metrics/tesseract_api.py#L19-L21.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC we always start a run internally before every request, so this should be unnecessary unless you actually observed this failure mode in action.

@dionhaefner
Copy link
Contributor

thread through most of the parameters listed here, including run_name and parent_run_id.

So why not add a generic setting then, like mlflow_extra_args? Then the feature in this PR becomes TESSERACT_MLFLOW_EXTRA_ARGS='{"tags": {"foo": "bar"}, "run_name": "myrun", ...}'.

I'd rather have something less pretty but more likely to be useful for a wide range of users than several specialized configs that are always 1 step behind what people actually need.

@nbowen3
Copy link
Contributor Author
nbowen3 commented Dec 19, 2025

thread through most of the parameters listed here, including run_name and parent_run_id.

So why not add a generic setting then, like mlflow_extra_args? Then the feature in this PR becomes TESSERACT_MLFLOW_EXTRA_ARGS='{"tags": {"foo": "bar"}, "run_name": "myrun", ...}'.

I'd rather have something less pretty but more likely to be useful for a wide range of users than several specialized configs that are always 1 step behind what people actually need.

@dionhaefner That's fair. I'm not super experienced with python but I believe you could parse that env variable into a dict then do something like mlflow.start_run(**extra_args_dict)? That way we don't have to parse out the keys themselves to individual vars and don't have to make additional parsing logic each time a field is added. We do lose some safety this way but could compare keys against a known list.

@dionhaefner
Copy link
Contributor

Yes, for example you can use ast.literal_eval to transform the string contained in the envvar into a Python dict safely.

@nbowen3
Copy link
Contributor Author
nbowen3 commented Dec 19, 2025

Yes, for example you can use ast.literal_eval to transform the string contained in the envvar into a Python dict safely.

@dionhaefner I'll take a pass at implementing that then re-ping. Regarding envvar vs as part of the request, are we fine to keep it as the envvar for now?

@dionhaefner
Copy link
Contributor

Yes that's fine. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants

0