8000 Support for dataclasses at a surface level · Issue #634 · nipype/pydra · GitHub
[go: up one dir, main page]

Skip to content

Support for dataclasses at a surface level #634

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
ghisvail opened this issue Mar 24, 2023 · 4 comments
Closed

Support for dataclasses at a surface level #634

ghisvail opened this issue Mar 24, 2023 · 4 comments
Labels
enhancement New feature or request

Comments

@ghisvail
Copy link
Collaborator

I am a big advocate for class-based definition of input and output task specifications over dynamic constructs via pydra.specs.SpecInfo(..., fields=[...]).

One annoying thing though is that class-based specifications aren't working with built-in dataclasses although they could from a pure semantic viewpoint. I understand the rationales (mostly pickling) for the transition at the implementation level, but I was wondering whether we could keep support for dataclasses at the surface API level.

For instance, let's take the following interface for fslreorder2std:

@attrs.define(slots=False, kw_only=True)
class FSLReorient2StdSpec(pydra.specs.ShellSpec):
    """Specificiations for fslreorient2std."""

    input_image: os.PathLike = attrs.field(
        metadata={
            "help_string": "input image",
            "mandatory": True,
            "argstr": "",
            "position": -2,
        }
    )

    output_image: str = attrs.field(
        metadata={
            "help_string": "output reoriented image",
            "argstr": "",
            "position": -1,
            "output_file_template": "{input_image}_r2std",
        }
    )

    output_matrix: str = attrs.field(
        metadata={
            "help_string": "output transformation matrix",
            "argstr": "-m",
            "output_file_template": "{input_image}_r2std.mat",
            "keep_extension": False,
        }
    )


class FSLReorient2Std(pydra.engine.ShellCommandTask):
    """Task definition for fslreorient2std."""

    executable = "fslreorient2std"

    input_spec = pydra.specs.SpecInfo(
        name="FSLReorient2StdInput", bases=(FSLReorient2StdSpec,)
    )

There really is nothing specific to attrs which is not covered by dataclasses:

diff --git a/pydra/tasks/fsl/fslreorient2std.py b/pydra/tasks/fsl/fslreorient2std.py
index c35a914..28b1724 100644
--- a/pydra/tasks/fsl/fslreorient2std.py
+++ b/pydra/tasks/fsl/fslreorient2std.py
@@ -15,18 +15,18 @@ Examples

 __all__ = ["FSLReorient2Std"]

-import os

-import attrs
+import dataclasses
+import os

 import pydra


-@attrs.define(slots=False, kw_only=True)
+@dataclasses.dataclass(kw_only=True)
 class FSLReorient2StdSpec(pydra.specs.ShellSpec):
     """Specificiations for fslreorient2std."""

-    input_image: os.PathLike = attrs.field(
+    input_image: os.PathLike = dataclasses.field(
         metadata={
             "help_string": "input image",
             "mandatory": True,
@@ -35,7 +35,7 @@ class FSLReorient2StdSpec(pydra.specs.ShellSpec):
         }
     )

-    output_image: str = attrs.field(
+    output_image: str = dataclasses.field(
         metadata={
             "help_string": "output reoriented image",
             "argstr": "",
@@ -44,7 +44,7 @@ class FSLReorient2StdSpec(pydra.specs.ShellSpec):
         }
     )

-    output_matrix: str = attrs.field(
+    output_matrix: str = dataclasses.field(
         metadata={
             "help_string": "output transformation matrix",
             "argstr": "-m",

As you can see the semantics are equivalent (not suprising considering their historical synergies), yet dataclasses has the benefits of being built-in and having its documentation right there with the rest of Python. Perhaps by transforming the dataclasses-based declaration into an attrs representation internally?

@ghisvail ghisvail added the enhancement New feature or request label Mar 24, 2023
@ghisvail
Copy link
Collaborator Author

Keep in mind that I might be missing some relevant context. I am reporting this issue from the viewpoint of a task contributor and a novice developer to the project. I know quite a few project which made the jump to attrs (for instance pytest lately if I am not mistaken), yet the surface-level API does not leak any attrs specific features or limitations to my knowledge.

@satra
Copy link
Contributor
satra commented Mar 25, 2023

@ghisvail - this is a path we took after careful consideration of all the issues we ran into with dataclasses. we realized we were building attrs when working with dataclasses. having two different object types makes it a lot more complicated, even if it's hidden away internally.

hence, i would suggest going down this road only when the following aspects have been considered/demonstrated for dataclasses:

  • validation
  • dynamic inheritance
  • pickling of dynamic classes

pydra is not just for CLI tools. we wanted pydra to be flexible enough to handle python functions, also in other libraries, as well and to be able to use them in workflows. in fact the pydra-ml library that we built doesn't have any command line tools for interfaces.

we have even considered how to simplify the user specification, but have repeatedly come back to the notion that providing a bit more syntactic sugar doesn't help significantly beyond the current construction possibilities of specs.

personally, i am always open to reconsidering options, but in this case given all the other aspects of pydra we have, we should really consider what we loose by switching back to dataclasses.

@ghisvail
Copy link
Collaborator Author

Hey @satra, thanks for taking the time to provide some more context to this 👍

@ghisvail - this is a path we took after careful consideration of all the issues we ran into with dataclasses. we realized we were building attrs when working with dataclasses. having two different object types makes it a lot more complicated, even if it's hidden away internally.

Just to be sure, I am not suggesting to replace attrs by dataclasses everywhere within pydra, but having at least simpler use cases which are fully expressible with dataclasses (like the example) not require attrs.

hence, i would suggest going down this road only when the following aspects have been considered/demonstrated for dataclasses:

* validation

* dynamic inheritance

* pickling of dynamic classes

Those are for Pydra internals, no? I am only talking about the surface level API here.

pydra is not just for CLI tools. we wanted pydra to be flexible enough to handle python functions, also in other libraries, as well and to be able to use them in workflows. in fact the pydra-ml library that we built doesn't have any command line tools for interfaces.

Indeed. Though I see no class-based definitions of specifications in pydra-ml, only decorator-style task definitions. I am not proposing to touch those, they are totally fine and don't expose attrs.

we have even considered how to simplify the user specification, but have repeatedly come back to the notion that providing a bit more syntactic sugar doesn't help significantly beyond the current construction possibilities of specs.

I personnaly find most of the UX for task definition, be it via pydra.mark decorators or pydra.specs.SpecInfo, really good. None of it exposes nor requires attrs specific knowledge, which is a sign of good encapsulation. I just wish static style definitions would not leak attrs as a requirement.

personally, i am always open to reconsidering options, but in this case given all the other aspects of pydra we have, we should really consider what we loose by switching back to dataclasses.

Once again, I am not proposing to switch Pydra's internals from attrs to dataclasses, only suggesting to also allow usage of dataclasses for simple static task definitions.

It might not be feasible for reasons I have overlooked, hence my motivation for opening this issue. Perhaps it's a stupid idea actually 😅

@tclose
Copy link
Contributor
tclose commented Apr 1, 2025

#768

@tclose tclose closed this as completed Apr 1, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

3 participants
0