8000 [MRG+2] Model persistence doc by pignacio · Pull Request #3317 · scikit-learn/scikit-learn · GitHub
[go: up one dir, main page]

Skip to content

[MRG+2] Model persistence doc #3317

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Jun 27, 2014

Conversation

pignacio
Copy link
Contributor

Taking over #3084. Fixes #1332

Comments on #3084 should be addressed in this patch.
I took the liberty and simplified the "Security and maintenance limitations" section as it felt overly verbose for me.

Feel free to comment any improvements/corrections on this

@pignacio
Copy link
Contributor Author

Apparently the travis box ran out of memory while running tests under python 3. Lots of tracebacks ending in

  File "/home/travis/anaconda/envs/testenv/lib/python3.4/multiprocessing/popen_fork.py", line 70, in _launch
    self.pid = os.fork()
OSError: [Errno 12] Cannot allocate memory

The builds for python 2.6/2.7 passed without errors

@jnothman
Copy link
Member

Travis servers seem to have been relatively unstable lately... sorry i
haven't looked at your changes yet.

On 25 June 2014 23:25, pignacio notifications@github.com wrote:

Apparently the travis box ran out of memory while running tests under
python 3. Lots of tracebacks ending in

File "/home/travis/anaconda/envs/testenv/lib/python3.4/multiprocessing/popen_fork.py", line 70, in _launch
self.pid = os.fork()
OSError: [Errno 12] Cannot allocate memory

The builds for python 2.6/2.7 passed without errors


Reply to this email directly or view it on GitHub
#3317 (comment)
.

@pignacio
Copy link
Contributor Author

sorry i haven't looked at your changes yet.

No rush :)

I just got curious when the build failed, and once I found why, i guessed it wouldn't harm posting it here

@ogrisel
Copy link
Member
ogrisel commented Jun 26, 2014

Apparently the travis box ran out of memory while running tests under python 3.

Yes 8000 tests involving forks started to fail randomly since a couple of days on travis. They use to be fine for the past year. I hope this is just temporary.

Would be interesting to check wether those failures correlate with the time where travis active builds reach 100% worker capacity: http://status.travis-ci.com/


* Never unpickle untrusted data
* Models saved in one version of scikit-learn might not to load in another
version.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would add:

It is strongly advised to save metadata information along with the model pickle files about the training data (such as the reference to an immutable snapshot), the Python source of the code used to train the model, the versions of scikit-learn and its dependencies and finally the cross-validation score obtained on the training data. This should make it possible to rebuild a similar model with future versions of scikit-learn and check that the cross-validation score is still in the same range.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 for more detail on bookkeeping with pickles.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pignacio My phrasing is bad. Maybe rephrase that long conjunction as a 4 items bullet list to improve readability.

@pignacio
Copy link
Contributor Author

@ogrisel there's the metadata paragraph

On the other hand, general question: I'd love to squash the fixup commits into the older commits to get them out of the way. Any ideas if doing that will break in any way the PR? I've never done that before in GH. We can always do the squashing right before the merge, tough.

@jnothman
Copy link
Member

As long as you're opening a new PR, it shouldn't be a problem. I often
squash before merge anyway.

On 26 June 2014 11:37, pignacio notifications@github.com wrote:

@ogrisel https://github.com/ogrisel there's the metadata paragraph

On the other hand, general question: I'd love to squash the fixup commits
into the older commits to get them out of the way. Any ideas if doing that
will break in any way the PR? I've never done that before in GH. We can
always do the squashing right before the merge, tough.


Reply to this email directly or view it on GitHub
#3317 (comment)
.

@jnothman
Copy link
Member

LGTM!

@jnothman jnothman changed the title Model persistence doc [MRG+1] Model persistence doc Jun 27, 2014
@ogrisel
Copy link
Member
ogrisel commented Jun 27, 2014

+1 for merge.

@jnothman I don't understand your remark about opening a new PR. One could squash the commits and force push to update the existing PR. I don't think this is a problem here as this only about documentation changes and AFAIK nobody is working on a concurrent fork of this branch.

@jnothman
Copy link
Member

Sorry, I thought @pignacio was talking about doing so on another PR. Yes,
sure, you can squash and force-update, or the person merging it can squash
and just merge it into master without updating the PR.

On 27 June 2014 06:30, Olivier Grisel notifications@github.com wrote:

+1 for merge.

@jnothman https://github.com/jnothman I don't understand your remark
about opening a new PR. One could squash the commit and force push to
update the existing PR. I don't think this is a problem here as this only
about documentation changes and AFAIK nobody is working on a concurrent
fork of this branch.


Reply to this email directly or view it on GitHub
#3317 (comment)
.

@jnothman
Copy link
Member

@pignacio if you would like to do the squashing yourself, I shall leave it to you. Otherwise I'm happy to squash and merge this shortly.

@jnothman jnothman changed the title [MRG+1] Model persistence doc [MRG+2] Model persistence doc Jun 27, 2014
@pignacio
Copy link
Contributor Author

@jnothman There we go. Is that ok?

@coveralls
Copy link

Coverage Status

Coverage remained the same when pulling 4b79f22 on pignacio:model_persistence_doc into afcb384 on scikit-learn:master.

@jnothman
Copy link
Member

Sure. I would have just squashed all your commits, but it's fine.

jnothman added a commit that referenced this pull request Jun 27, 2014
DOC Documentation for model persistence
@jnothman jnothman merged commit 6460479 into scikit-learn:master Jun 27, 2014
@jnothman
Copy link
Member

Thanks!

@ogrisel
Copy link
Member
ogrisel commented Jun 27, 2014

Thanks @pignacio for finishing up this doc fix.

@pignacio pignacio deleted the model_persistence_doc branch June 27, 2014 14:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

model persistence documentation
4 participants
0