-
-
Notifications
You must be signed in to change notification settings - Fork 25.9k
Refactor _convert_container
#28681
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Refactor _convert_container
#28681
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR @Charlie-XIAO. Here are some comments. I'm +1 for the refactoring of _convert_container
.
Once the CIs are green, I'll review ;) (I am buying a bit of time). |
Now I have resolved all suggestions from Jérémie, and CI is green :) |
I'm going to resolve the conflicts and make a review. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A couple of comments before to look at the tests.
@Charlie-XIAO would you mind resolving conflicts? |
Will do soon (maybe within one day or two). |
Sorry for the super long delay @adrinjalali - I was super busy with coursework during the semester 😢. I have now resolved the conflicts, but not sure if anything has changed since my last update since it is affecting quite a lot of tests. |
The test failure seems to be because that |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work, thanks @Charlie-XIAO .
@@ -2365,6 +2365,15 @@ def _is_polars_df(X): | |||
return isinstance(X, pl.DataFrame) | |||
|
|||
|
|||
def _is_arrow_df(X): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we have a CI with arrow installed? (I think we should)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes I think we do have it in pylatest_conda_forge_mkl_linux-64
:
- pyarrow |
Follow up of #28521, see #28521 (comment).
This PR splits
constructor_name
intoconstructor_type
andconstructor_lib
, and adds two parameterssparse_container
andsparse_format
to reflect dense versus sparse. /cc @glemaitreMapping from the original
constructor_name
to the specifications in this PRconstructor_type
constructor_lib
sparse_container
sparse_format
"pandas"
None
"csr"
"list"
"list"
"tuple"
"tuple"
"slice"
"slice"
"array"
"array"
"sparse"
/"sparse_csr"
"array"
"matrix"
"sparse_csc"
"array"
"matrix"
"csc"
"sparse_csr_array"
"array"
"array"
"sparse_csc_array"
"array"
"array"
"csc"
"dataframe"
/"pandas"
"dataframe"
"polars"
"dataframe"
"polars"
"pyarrow"
"dataframe"
"pyarrow"
"series"
"series"
"polars_series"
"series"
"polars"
"index"
"index"
Some rules of when each parameter is applicable
sparse_container
has no effect unlessconstructor_type="array"
sparse_format
has no effect unlessconstructor_type="array"
andsparse_container
is notNone
constructor_lib
has no effect unlessconstructor_type
is one of"dataframe"
,"series"
, and"index"
min_version
has no effect unlessconstructor_lib
is usedcolumn_names
(renamed fromcolumns_name
) has no effect unlessconstructor_type="dataframe"
categorical_feature_names
has no effect unlessconstructor_type="dataframe"
Constraints on the shape of the input container
constructor_type="slice"
requires 1-dimensional container with exactly 2 elementsconstructor_type="dataframe"
requires 2-dimensional container (previously it converts(n,)
shape to(n, 1)
shape)constructor_type="series"
requires 1-dimensional container (previously it creates series with each element being a list when given 2-dimensional container)constructor_type="index"
requires 1-dimensional container (similar to"series"
)