WIP simplify naive Bayes parametrization #1525
amueller wants to merge 3 commits into scikit-learn:master from
Conversation
Making the common tests pass is a bit trickier. I think this is a step forward in the interface, though.
Actually, I'm not so sure things have become more consistent now. I think I'd favor the former for simplicity: naive Bayes is a typical beginner's tool, so it should be simple. The latter would provide better API compatibility with other estimators, but NB's interpretation of …
I see your point about … I don't know what you mean by "prior taking …". That these are mutually exclusive is one of the reasons I think having a single parameter is more natural. One could also imagine …
Yes, but we could decouple them so that unrepresented classes get a weight of 1 and other classes can be up- or downweighted arbitrarily. We'd just have to fill in the ones and normalize, which seems to be what you're implying as well.
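That idea could be sketched roughly like this (a minimal illustration, not code from this PR; `resolve_class_weight` is a hypothetical helper): classes not mentioned in the dict get a weight of 1, then everything is normalized.

```python
import numpy as np

def resolve_class_weight(classes, class_weight=None):
    """Hypothetical helper sketching the idea above.

    `class_weight` may be None (uniform) or a dict mapping class -> weight.
    Unmentioned classes get a weight of 1; the result is normalized to sum to 1.
    """
    weights = np.ones(len(classes))
    if class_weight is not None:
        for i, c in enumerate(classes):
            if c in class_weight:
                weights[i] = class_weight[c]
    return weights / weights.sum()

# Class 2 is not in the dict, so it gets weight 1 before normalization;
# the weights [2, 1, 1] normalize to [0.5, 0.25, 0.25].
print(resolve_class_weight([0, 1, 2], {0: 2.0, 1: 1.0}))
```

With `class_weight=None`, every class gets the same prior, matching the "uniform" default discussed here.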
Ok, but then the meaning of the parameter would change a lot from the previous …
Ok, so you would keep the …
No, because I agree the API is too clumsy (and the code is more complicated than an NB estimator needs to be). Let me formulate a proposal:
This way, custom priors can still be given explicitly, but the parameter becomes much more flexible. Does that sound like something we could almost put in a docstring, aside from the [rationale]?
It sounds a lot like my current docstring, doesn't it ;) It sounds like a good solution to me. I must admit I haven't really thought about the algorithm, just shuffled the bits around. It actually sounds kinda simple the way you put it. Do you want to give it a shot, or should I try?
Ok, it sounds more like I should have written the docstring, actually ;) but that was what I meant, I only put it less elegantly.
We can actually just reuse the function from …
Looks good!
Ok, I'll add some more tests for dictionaries, and also for using lists in SVM and SGD.
Ok, so this doesn't seem to make much sense. Maybe: so I just unified two functions that do the opposite... I could still use the same function but take the inverse in the …
Hm, reusing the code is not so trivial given the different normalizations. Any ideas how to handle this better? Just leave the current parametrization?
closing this as it sucks. |
After moving `class_prior` to `class_weight` as an `__init__` parameter in #1499, the parametrization was a bit weird. I now got rid of `fit_prior` and let `class_weight` be `'auto'`, `None` (= uniform), or an array. This is a bit more consistent with the other estimators. I'll try to make it pass the common tests now, and I guess I'll also allow dictionaries.
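The parametrization described above could be sketched like this (illustrative only; `compute_class_prior` is a hypothetical helper, not code from this PR): `'auto'` estimates the prior from class frequencies, `None` gives a uniform prior, and an array is used as an explicit prior.

```python
import numpy as np

def compute_class_prior(y, class_weight=None):
    # Sketch of the proposed semantics (assumed, not the PR's actual code):
    #   'auto' -> empirical class frequencies
    #   None   -> uniform prior
    #   array  -> explicit prior, normalized to sum to 1
    classes, counts = np.unique(y, return_counts=True)
    if isinstance(class_weight, str):
        if class_weight != 'auto':
            raise ValueError("the only supported string value is 'auto'")
        prior = counts / counts.sum()
    elif class_weight is None:
        prior = np.full(len(classes), 1.0 / len(classes))
    else:
        prior = np.asarray(class_weight, dtype=float)
        prior = prior / prior.sum()
    return classes, prior

# Three of four samples are class 0, so 'auto' yields a [0.75, 0.25] prior.
classes, prior = compute_class_prior([0, 0, 0, 1], class_weight='auto')
print(classes, prior)
```

Under this reading, a single `class_weight` parameter subsumes both the old `class_prior` array and the `fit_prior` flag, which is the simplification the PR is after.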