8000 Use attrs instead of dataclasses by satra · Pull Request #174 · nipype/pydra · GitHub
[go: up one dir, main page]

Skip to content

Use attrs instead of dataclasses #174

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 18 commits into from
Jan 18, 2020
Merged

Use attrs instead of dataclasses #174

merged 18 commits into from
Jan 18, 2020

Conversation

satra
Copy link
Contributor
@satra satra commented Jan 11, 2020

Types of changes

  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Summary

Switch to attrs from dataclasses

Checklist

  • All tests passing
  • I have added tests to cover my changes
  • I have updated documentation (if necessary)
  • My code follows the code style of this project
    (we are using black: you can pip install pre-commit,
    run pre-commit install in the pydra directory
    and black will be run automatically with each commit)

Acknowledgment

  • I acknowledge that this contribution will be available under the Apache 2 license.

@satra satra changed the title Use attrs instead of dataclasses WIP: Use attrs instead of dataclasses Jan 11, 2020
@satra
Copy link
Contributor Author
satra commented Jan 11, 2020

@djarecka @effigies - this is very much a WIP to go in parallel with #173. i have only run test_specs.py for now. was even able to un-xfail a few tests.

i do think even based on this, that attrs will simplify a lot of the code. in this particular instance i'm doing a very superficial change (keeping the pickling stuff for instance). but overall it seems quite straightforward to move over.

here is what i have used.

dc.dataclass --> @attr.s(auto_attribs=True, kw_only=True)
dc.fields(klass) --> attr.fields(klass)
dc.fields(instance) --> attr_fields(instance) #defined this function
dc.field --> attr.ib
dc.make_dataclass --> attr.make_class (with the fields having to be attr.ib)

allows better inheritance. everything is a keyword arg, and defaults work (@djarecka - so default_value may need to be removed or altered)

@codecov
Copy link
codecov bot commented Jan 11, 2020

Codecov Report

Merging #174 into master will decrease coverage by 0.01%.
The diff coverage is 90.32%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #174      +/-   ##
==========================================
- Coverage   86.72%   86.71%   -0.02%     
==========================================
  Files          17       17              
  Lines        2644     2702      +58     
  Branches      680      700      +20     
==========================================
+ Hits         2293     2343      +50     
- Misses        228      230       +2     
- Partials      123      129       +6
Flag Coverage Δ
#unittests 86.71% <90.32%> (-0.02%) ⬇️
Impacted Files Coverage Δ
pydra/engine/audit.py 88.88% <100%> (ø) ⬆️
pydra/engine/helpers_state.py 91.08% <100%> (+0.01%) ⬆️
pydra/engine/task.py 84.54% <73.33%> (-2.24%) ⬇️
pydra/engine/helpers.py 85.79% <81.81%> (-0.19%) ⬇️
pydra/engine/helpers_file.py 79.69% <87.87%> (+1.64%) ⬆️
pydra/engine/core.py 87.84% <96.29%> (+0.07%) ⬆️
pydra/engine/specs.py 91.45% <98.57%> (+1.52%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 51fdbbf...752ac5d. Read the comment docs.

@satra
Copy link
Contributor Author
satra commented Jan 12, 2020

@djarecka - just a note that default_value will likely not be required. i've turned all the inputs/outputs into keyword only classes. i haven't removed default_value yet.

@djarecka
Copy link
Collaborator

@satra - great! I tried to read this quickly, but I'd need more time to understand all changes, will do it later

@satra
Copy link
Contributor Author
satra commented Jan 12, 2020

@djarecka - for me all tests pass under engine except 3 in test_node_task.py

@djarecka
Copy link
Collaborator

@satra - i can take a look

@satra
Copy link
Contributor Author
satra commented Jan 17, 2020

@effigies and @djarecka - i am opening this PR for discussion before we merge.

i am going to vote yes since attrs appears to have a nice approach to validation as well, which can be globally disabled. so this should allow us to do type validation quite easily.

as far as i could tell, attrs still does not address the pickling issue.

@satra satra changed the title WIP: Use attrs instead of dataclasses Use attrs instead of dataclasses Jan 17, 2020
if modified_inputs is not None:
self.inputs = attr.evolve(self.inputs, **modified_inputs)
cmdline = " ".join(self.container_args + self.command_args)
self.inputs = attr.evolve(self.inputs, **orig_inputs)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

was this needed to pass the existing tests?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this was updated to reflect the fact that one can call cmdline before calling run. and this property is not used for running.

but to answer your original question, i don't think there was a test that tested this.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm trying to find situation when your modification changes anything. If you think about a single task and running cmdline before run, this worked without your changes. If you think about workflows, this doesn't work, there are still LF. Could you give me some examples or revert the changes for now.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if you prefer to keep it, I'll just make a note to think about it later, but couldn't find a good example

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you will also have to take the changes away from run. to test the issue, take the current master branch, print cmdline from a container task, run it, and then print cmdline again.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in master this modified test will fail. (i'll push it to this branch)

def test_docker_inputspec_3(plugin, tmpdir):
    """ input file is in the container, so metadata["container_path"]: True,
        the input will be treated as a str """
    filename = "/proc/1/cgroup"

    cmd = "cat"

    my_input_spec = SpecInfo(
        name="Input",
        fields=[
            (
                "file",
                File,
                dc.field(
                    metadata={
                        "mandatory": True,
                        "position": 1,
                        "help_string": "input file",
                        "container_path": True,
                    }
                ),
            )
        ],
        bases=(DockerSpec,),
    )

    docky = DockerTask(
        name="docky",
        image="busybox",
        executable=cmd,
        file=filename,
        input_spec=my_input_spec,
        strip=True,
    )

    cmdline = docky.cmdline # in master this will fail
    res = docky()
    assert "docker" in res.output.stdout
    assert cmdline == docky.cmdline

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really don't see a difference between the old cmdline and the new one: you can see: satra#12

@djarecka
Copy link
Collaborator

sure, let's change it!

@effigies
Copy link
Contributor

I'm okay with changing it. Also, do we have a precise description of what the pickling problem is?

@satra
Copy link
Contributor Author
satra commented Jan 17, 2020

do we have a precise description of what the pickling problem is?

concurrent futures uses regular pickle, so cloudpickle cannot be used.

@satra
Copy link
Contributor Author
satra commented Jan 17, 2020

I'm okay with changing it.

i'll let @djarecka hit the merge button once she is ok with it. i'll make a separate dataclasses branch from master in case we wan't to easily find it.

@@ -10,7 +10,7 @@ A simple dataflow engine with scalable semantics.
The goal of pydra is to provide a lightweight Python dataflow engine for DAG construction, manipulation, and distributed execution.

Feature list:
1. Python 3.7+ using type annotation and dataclasses
1. Python 3.7+ using type annotation and [attrs](https://www.attrs.org/en/stable/)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we still use 3.7+ features, or can we relax to 3.6? (I know that at this point 3.7 isn't cutting edge, so maybe it doesn't matter.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

my current intent was to get this merged, then do additional trimming/testing/relaxing. i did mention the 3.6 expansion to dorota when we met.

@djarecka
Copy link
Collaborator
djarecka commented Jan 18, 2020 via email

@djarecka djarecka merged commit 8bee39f into nipype:master Jan 18, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants
0