Re-design ANN loss functions along with some bug fixes #2495

Closed · wants to merge 14 commits

Conversation

iamshnoo (Member) commented Jul 2, 2020:

Trying to solve #2444.
This also re-designs some of the loss functions to prepare them for a future PR that will add support for a 'none' reduction type.

Google Colab notebooks and a brief description of the problem for each loss function can be found in the first comment of the linked issue above. (I might be wrong about some of these; please let me know if I am, and how I should go about correcting them.)

Each of the first 11 commits on this branch covers a single loss function, so the changes should be easier to review.
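
A minimal sketch of the reduction pattern these changes apply, written as a hypothetical standalone helper rather than the actual mlpack layer API (the real changes live inside the ANN loss function layers; the names here are illustrative only):

#include <armadillo>

// Hypothetical helper: element-wise squared error, reduced by sum or mean.
double SquaredErrorForward(const arma::mat& prediction,
                           const arma::mat& target,
                           const bool reduction /* true = sum, false = mean */)
{
  // Unreduced, element-wise loss (same shape as the input).
  const arma::mat loss = arma::square(prediction - target);

  const double lossSum = arma::accu(loss);

  if (reduction)
    return lossSum;                    // 'sum' reduction

  return lossSum / prediction.n_elem;  // 'mean' reduction
}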

Comment on lines +37 to +42
typename InputType::elem_type lossSum = arma::accu(loss);

if (reduction)
  return lossSum;

return lossSum / input.n_elem;
Member:
I think if reduction = true, the mean loss should be returned.

iamshnoo (Member Author), Jul 2, 2020:

I am returning the sum when reduction = true.
I am doing that for every loss function, so there shouldn't be any non-uniformity.

Is there any specific reason why I should return the mean instead of the sum when reduction = true (apart from the fact that PyTorch does it)?

Member:

Not a big deal, but it just looked counter-intuitive. Any new user would expect the "reduced" form to be smaller than the "non-reduced" form. (It's like swapping the Home and End keys on your keyboard.)

iamshnoo (Member Author):

The non-reduced form is a vector / matrix / cube. For all the loss functions touched in this PR, the non-reduced loss is stored in the variable loss, and the reduced form is a single number: either the sum or the mean of all the values in loss.

The reason I preferred the sum reduction is simply that the sum is at least as large as the mean in every case, which might be more effective when backpropagating it through previous layers.

Let's wait for some more opinions on this. If more people feel this is counter-intuitive, I will swap them back to what you suggested :)
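
For concreteness, a minimal sketch of how the two reductions differ in the backward pass, matching the hypothetical squared-error helper above (again illustrative only, not the actual mlpack code): with the 'mean' reduction the gradient is divided by the number of elements, so the 'sum' reduction does produce larger gradients.

#include <armadillo>

// Hypothetical backward pass for the helper above.  The gradient of the
// summed squared error w.r.t. the prediction is 2 * (prediction - target);
// the 'mean' reduction additionally divides by the number of elements,
// shrinking the gradient as the input grows.
void SquaredErrorBackward(const arma::mat& prediction,
                          const arma::mat& target,
                          arma::mat& gradient,
                          const bool reduction /* true = sum, false = mean */)
{
  gradient = 2.0 * (prediction - target);

  if (!reduction)
    gradient /= prediction.n_elem;  // 'mean' reduction rescales the gradient
}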

Member:

If we do this consistently for all the other loss functions, which I think you do, it's fine with me.

@Abilityguy (Contributor):
Hi @iamshnoo, are you working on this? I could help with the merge fixes if needed.

iamshnoo (Member Author) commented Mar 6, 2021:

Feel free to go ahead with the merge fixes. I am not working on this at the moment. You can tag me in the PR if you need me to take a look at it.
