8000 Add more dtype details to docstring standard doc · Issue #14664 · scikit-learn/scikit-learn · GitHub
[go: up one dir, main page]

Skip to content

Add more dtype details to docstring standard doc #14664

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
thomasjpfan opened this issue Aug 15, 2019 · 12 comments · Fixed by #14744
Closed

Add more dtype details to docstring standard doc #14664

thomasjpfan opened this issue Aug 15, 2019 · 12 comments · Fixed by #14744

Comments

@thomasjpfan
Copy link
Member

Continuing #12356
Related to #14640

  1. When None is the default, we can just set it as default:
name: array-like of shape (n_samples,), default=None
  1. When the type of the ndarray is known:
name: int ndarray of shape (n_samples,)

CC @adrinjalali

@NicolasHug
Copy link
Member

submit a PR?

@adrinjalali
Copy link
Member

Makes sense +1

8000

@glemaitre
Copy link
Member
glemaitre commented Aug 19, 2019

OK, I found the discussion then :)

I am a bit in trouble with (2). It is even more confusing with something like str list. list of str is much better. Then, we get into trouble with array where we have shape: ndarray of int of shape (n_samples,). I don't have a good proposal :S

name: ndarray - dtype=int - shape=(n_samples,)

@thomasjpfan
Copy link
Member Author

Going with shape= may work slightly nicer:

name: ndarray of int, shape=(n_samples,)

world: ndarray, shape=(n_samples,)

@NicolasHug
Copy link
Member

I think we already went for of shape ().

maybe ndarry of shape (n_samples,), dtype=int?

That being said, I feel like specifying dtypes is mostly useful on private APIs, where we can be slightly more flexible on the style. IMO most user-facing docs shouldn't worry too much about dtypes. For example, I don't think drop_idx_ for OneHotEncoder benefits from specifying a dtype.

@thomasjpfan
Copy link
Member Author

What if we have a list that can have multiple types, like a list of tuples:

features : {(int, ), (int, int)} list

or

features : list of {(int, ), (int, int)}

@glemaitre
Copy link
Member

I would say second one

@thomasjpfan
Copy link
Member Author
thomasjpfan commented Aug 19, 2019

I think we already went for of shape ().

I would be open to changing it, to handle the situation where we have the dtype, and the shape of an array.

ndarry of shape (n_samples,), dtype=int

The dtype is pretty far from the ndarray now.

@glemaitre If we use the second one then:

name : ndarray of int of shape (n_samples,)

That feels like there is one two many "of"s. We kind of need two delimiters, one for type and another for shape.

@NicolasHug
Copy link
Member

We can go case by case? How many instances of these do we have? How many of them are public docs?

@jnothman
Copy link
Member
jnothman commented Aug 19, 2019 via email

@adrinjalali
Copy link
Member

I think here it's only about the dtype, I guess. And for that I personally prefer the dtype ndarray of shape(...)

@thomasjpfan thomasjpfan changed the title Add more details to docstring standard doc Add more dtype details to docstring standard doc Aug 21, 2019
@thomasjpfan
Copy link
Member Author

I'm not sure which part this sets out to discuss. When None is the default,
sometimes it means something quite different to "none" and that deserves
clarifying.

@jnothman This discussion is around the dtype. In the case of None, we could do one of the following:

  1. array-like of shape (n_samples,) or None, default=None

  2. array-like of shape (n_samples,) or None

  3. array-like of shape (n_samples,), default=None

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants
0