Unit test times #339


Closed
robertlayton opened this issue Sep 3, 2011 · 13 comments

Labels: Easy (Well-defined and straightforward way to resolve), Enhancement
Milestone: 0.10

Comments

@robertlayton
Member

Following a discussion on the mailing list about the total build time, including testing, it was suggested that some of the unit tests are slow. The functions below take the most cumulative time when running the test suite, and give insight into which tests should be addressed first. The hmm/gmm tests are currently the worst offenders. Can someone please have a look at these tests and work out if it's possible to make them faster?

I've taken some liberty with the following output, but it's basically the result of running $ python profile_sklearn.py | grep "test"

         8564043 function calls (8511572 primitive calls) in 54.349 CPU seconds

   Ordered by: cumulative time
   List reduced from 4738 to 10 due to restriction <10>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000   54.389   54.389 __init__.py:27(test)
        1    0.001    0.001   54.389   54.389 nosetester.py:249(test)
       17    0.002    0.000   13.312    0.783 test_hmm.py:211(train_hmm_and_keep_track_of_log_likelihood)
        5    0.001    0.000    6.735    1.347 test_hmm.py:315(test_fit)
        6    0.001    0.000    3.641    0.607 test_hmm.py:610(test_fit)
       12    0.001    0.000    3.594    0.300 test_gmm.py:202(test_train)
     2475    0.007    0.000    3.343    0.001 test_hmm.py:216(<genexpr>)
        7    0.000    0.000    2.653    0.379 test_gaussian_process.py:35(test_2d)
        1    0.001    0.001    2.420    2.420 test_ridge.py:237(test_dense_sparse)
        4    0.001    0.000    2.369    0.592 test_hmm.py:353(test_fit_with_priors)
        1    0.004    0.004    1.893    1.893 test_image.py:57(test_connect_regions)
        1    0.000    0.000    1.782    1.782 test_gaussian_process.py:83(test_ordinary_kriging)
        2    0.046    0.023    1.622    0.811 test_ridge.py:115(_test_ridge_loo)
        1    0.004    0.004    1.610    1.610 test_image.py:65(test_connect_regions_with_grid)
        2    0.000    0.000    1.504    0.752 test_sgd.py:255(test_auto_weight)
        1    0.000    0.000    1.470    1.470 test_cross_val.py:63(test_permutation_score)
        4    0.000    0.000    1.466    0.367 cross_val.py:812(permutation_test_score)
      404    0.040    0.000    1.242    0.003 cross_val.py:754(_permutation_test_score)
        2    0.000    0.000    1.179    0.590 test_hmm.py:495(test_fit)
        1    0.000    0.000    1.142    1.142 test_hmm.py:396(test_fit_startprob_and_transmat)
        1    0.001    0.001    1.141    1.141 test_coordinate_descent.py:137(test_enet_path)
        1    0.000    0.000    1.126    1.126 test_image.py:167(test_reconstruct_patches_perfect_color)
      602    0.006    0.000    1.116    0.002 doctest.py:780(find)
       77    0.001    0.000    1.104    0.014 doctest.py:2144(runTest)
       77    0.001    0.000    1.102    0.014 doctest.py:1344(run)
        1    0.006    0.006    1.000    1.000 test_neighbors.py:70(test_neighbors_sparse_classification)
       77    0.009    0.000    0.982    0.013 doctest.py:1196(__run)
        1    0.000    0.000    0.914    0.914 test_parallel.py:82(test_error_capture)
4804/2383    0.046    0.000    0.888    0.000 doctest.py:893(_find)
     1874    0.007    0.000    0.836    0.000 suite.py:370(_get_wrapped_tests)
     1370    0.002    0.000    0.809    0.001 suite.py:92(_get_tests)
        1    0.000    0.000    0.804    0.804 test_nmf.py:108(test_projgrad_nmf_sparseness)

For reference, profile_sklearn.py is:

import cProfile
import pstats

from sklearn import test

# Run the full test suite under the profiler and save the stats to disk.
cProfile.run("test()", 'sklearn.prof')

# Print the 500 most expensive entries, sorted by cumulative time.
p = pstats.Stats('sklearn.prof')
p.strip_dirs().sort_stats('cumulative').print_stats(500)
@fabianp
Member
fabianp commented Sep 7, 2011

The benchmark is really useful. Assigning milestone 0.10.

@acompa
acompa commented Sep 8, 2011

@robertlayton @fabianp Per Robert's suggestion on the list, I looked through a couple of issues and landed here.

I'd like to improve a couple of these unit tests. I'm assuming I can ignore the HMM tests for now, since the module might be deprecated soon?

@robertlayton
Member Author

It's a good place to start (if you don't mind a bit of a steep learning curve!).

Have you done python profiling before? If not, I'm happy to explain the format of the results.

@acompa
acompa commented Sep 8, 2011

No clue--but like I mentioned before, I'm here to learn. :D

What would you recommend as a reference on Python profilers? Is this a good start?

@robertlayton
Member Author

That's a good reference, much better than the other docs. I wouldn't worry too much about the detail, though; a single read to get the gist is fine.

The report I've got is sorted by cumulative time - this is the total amount of time spent within that function, including any functions it calls.
Obviously, the highest entry is __init__.py:27(test) - this is the function (test, in sklearn's __init__.py file) that I called, so when it is done, the program exits. test basically just invokes nose, which is why nosetester.py:249(test) is next on the list.

The real meat is the next line:

17    0.002    0.000   13.312    0.783 test_hmm.py:211(train_hmm_and_keep_track_of_log_likelihood)

This means that, of the 54 seconds it took to run all of the tests, 13 were spent in the train_hmm_and_keep_track_of_log_likelihood method. The method was called 17 times (the first column, ncalls), and each call took an average of 13.312 / 17 ≈ 0.783 seconds (the last percall column, cumtime divided by ncalls).

If we ignore the hmm and gmm ones, the most likely candidate for optimisation is

1    0.001    0.001    2.420    2.420 test_ridge.py:237(test_dense_sparse)

This one test takes 2.4 seconds. It is the test_dense_sparse function in this file: https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/linear_model/tests/test_ridge.py
Why does it take so long to run? A quick examination doesn't show too much wrong with it, but it does run a lot of different sub-tests, so you may need to profile just this function to see where it spends its time. It may simply be that the number of sub-tests is what makes it slow. Let me know if you need a pointer.
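
A minimal sketch of profiling just that one function, assuming test_dense_sparse takes no arguments (as the nose-style tests here do) and that the import path below still matches the current layout:

import cProfile
import pstats

# Import only the single test we want to time.
from sklearn.linear_model.tests.test_ridge import test_dense_sparse

# Profile this one test and print the 20 most expensive calls.
cProfile.run("test_dense_sparse()", 'ridge.prof')
pstats.Stats('ridge.prof').strip_dirs().sort_stats('cumulative').print_stats(20)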

@acompa
acompa commented Sep 8, 2011

This is a great rundown--thanks for taking the time to explain! I'll review over the next couple of days and let you know if I get stuck on something.

@robertlayton
Member Author

Good luck :)

@acompa
acompa commented Jul 20, 2012

Picking this up after a delay. The good news is that I'm now comfortable enough with Python to knock this out.

Here are the most recent results:

(scikit-learn) chomp@mac 0 ~ $ head -n 20 sklearn_test_results
Running unit tests and doctests for sklearn
test_memory setup
test_memory teardown
test_memory setup
test_memory teardown
        1    0.000    0.000   47.196   47.196 __init__.py:27(test)
        1    0.001    0.001   47.195   47.195 nosetester.py:249(test)
        1    0.000    0.000   47.031   47.031 unittest.py:800(__init__)
     1160    0.002    0.000   44.016    0.038 unittest.py:299(__call__)
     1160    0.013    0.000   44.013    0.038 unittest.py:264(run)
       16    0.003    0.000    3.593    0.225 test_hmm.py:184(train_hmm_and_keep_track_of_log_likelihood)
        1    0.000    0.000    2.127    2.127 unittest.py:609(loadTestsFromNames)
        1    0.000    0.000    2.059    2.059 test_robust_covariance.py:20(test_mcd)
        6    0.001    0.000    2.059    0.343 test_robust_covariance.py:42(launch_mcd_on_dataset)
        5    0.001    0.000    1.899    0.380 test_hmm.py:296(test_fit)
        1    0.004    0.004    1.630    1.630 test_image.py:58(test_connect_regions)
        1    0.001    0.001    1.577    1.577 test_forest.py:258(test_pickle)
       12    0.002    0.000    1.479    0.123 test_gmm.py:183(test_train)
        7    0.000    0.000    1.452    0.207 test_gaussian_process.py:37(test_2d)
        1    0.003    0.003    1.357    1.357 test_image.py:66(test_connect_regions_with_grid)

This function in test_hmm.py

def train_hmm_and_keep_track_of_log_likelihood(hmm, obs, n_iter=1, **kwargs):
    # Initial fit with a single EM iteration.
    hmm.n_iter = 1
    hmm.fit(obs)
    loglikelihoods = []
    # Refit one EM iteration at a time, recording the total log
    # likelihood over all observation sequences after each iteration.
    for n in xrange(n_iter):
        hmm.n_iter = 1
        hmm.init_params = ''  # keep the parameters learned so far
        hmm.fit(obs)
        loglikelihoods.append(sum(hmm.score(x) for x in obs))
    return loglikelihoods

is responsible for about 3.5s of build time. It gets called 16 times while building with a variety of values for n_iter--if we standardize that to, say, n_iter=5, I'm sure we'll see a few seconds of savings.
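
A hypothetical sketch of that standardization (n_iter=5 is the suggested cap, not what the tests currently pass):

# Hypothetical: every call site uses the same modest iteration count
# instead of a test-specific value.
loglikelihoods = train_hmm_and_keep_track_of_log_likelihood(hmm, obs, n_iter=5)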

We also see sklearn.covariance.tests.test_robust_covariance.launch_mcd_on_dataset() taking about 2s of build time over 6 calls. I'm not too sure what's happening there, exactly; sklearn.covariance.robust_covariance cites a paper for FastMCD, and I'll take a look at that.

@amueller
Member

Thanks for tackling this :)

I think it would be better to sort by tottime than cumtime. Well, at least looking at both is a good idea.
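
A minimal variation on the profiling script above (reusing the same assumed sklearn.prof file); tottime counts only the time spent inside a function itself, excluding its callees, so it surfaces the functions that are themselves expensive rather than those merely high in the call tree:

import pstats

# Sort by time spent inside each function, excluding called functions.
p = pstats.Stats('sklearn.prof')
p.strip_dirs().sort_stats('tottime').print_stats(20)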

ogrisel added a commit that referenced this issue Jul 20, 2012
Issue #339: minor optimization
@robertlayton
Member Author

I'd be happy to close this bug, as we now have much faster tests overall, and a new benchmarking framework that will help in the future. Is everyone happy with that?

@amueller
Member
amueller commented Sep 3, 2012

Actually I'm still unhappy with the timing ;)

@ogrisel
Member
ogrisel commented Sep 3, 2012

This utility is nice to find the worst offending tests: https://github.com/patjenk/nose-timer
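
For reference, nose-timer is a nose plugin enabled with a command-line flag (assuming it is installed from PyPI; per-test timings print after the run):

$ pip install nose-timer
$ nosetests --with-timer sklearn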

larsmans added a commit that referenced this issue Mar 17, 2013
larsmans added a commit that referenced this issue Jul 18, 2013
@larsmans
Member

Closing as this issue is too open-ended to be properly fixed. The codebase changes all the time, and so do the testing times.
