feature selection: remove duplicated features #114
Comments
Hi @solegalli, is anyone working on this one? I'd be happy to take it on if that's OK with you.
@Tejash-Shah you are more than welcome!! A few things to consider:
Thanks a lot and look forward to working together again :)
Yes, that was my idea: use itertools to create feature pairs and iterate over the pairs. Please see the screenshot below for the implementation. It is only a sketch of the idea, though; we can only be sure it works once the tests are written. Also, let me know if you had a different itertools-based implementation in mind, I'd be interested to hear it. Thank you :)
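For readers without the screenshot, the pairwise idea described above might look roughly like the following. This is a minimal sketch, not the code from the screenshot or from the eventual PR: the function name `find_duplicated_features` is hypothetical, and the columns are held in a plain dict of lists to keep the example dependency-free.

```python
from itertools import combinations


def find_duplicated_features(columns):
    """Return the names of columns whose values repeat an earlier column.

    `columns` maps a feature name to its list of values (an assumption made
    for this sketch; the library itself works on pandas DataFrames).
    """
    duplicated = set()
    # combinations() yields every unordered pair of feature names once,
    # in the order the names were inserted into the dict.
    for left, right in combinations(columns, 2):
        # A column already flagged as a duplicate needs no further checks.
        if left in duplicated or right in duplicated:
            continue
        if columns[left] == columns[right]:
            # Keep the first column of the pair, flag the later one.
            duplicated.add(right)
    return duplicated


data = {
    "age": [20, 30, 40],
    "age_copy": [20, 30, 40],
    "city": ["a", "b", "c"],
    "city_copy": ["a", "b", "c"],
    "height": [1.6, 1.7, 1.8],
}
to_drop = find_duplicated_features(data)
reduced = {name: values for name, values in data.items() if name not in to_drop}
```

The first occurrence of each feature is kept and later copies are flagged, so the outcome depends on column order. The number of comparisons grows quadratically with the number of features, which is worth keeping in mind for wide datasets.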
@Tejash-Shah looks good! Looking forward to the PR.
Hi @solegalli, does the develop branch have an error in the style checks? Tox is throwing an error at
EDIT:
Yes, my bad. I need to fix that. Please ignore that error for the moment and push as usual. I will let you know when I have fixed it. Sorry :(
@Tejash-Shah fixed
* add DropDuplicateFeatures in init
* add fixture for duplicate features
* add DropDuplicateFeatures functionality
* add test for DropDuplicateFeatures
* add DropDuplicateFeatures in init
* add fixture for duplicate features
* add DropDuplicateFeatures functionality
* add test for DropDuplicateFeatures
* create drop duplicate transformer
* delete extra fixture

Co-authored-by: Soledad Galli <solegalli@protonmail.com>
* style formatting base scripts * reformat style creation modules * reformat style discretisers * reformat style imputers * rewords strings, minor changes * reformat codestyle outliers * reformat code style selection * reformat style transformers * reformat style wrappers * separate woe and ratio encoder closes issue #143 (#149) * Issue 143 * doc issue 143 documentation issue 143 * Update PRatioEncoder.rst 'C' Removed from RareLabelCEncoder whch was causing and error. Also enconder_dict_ ratio results added * Changes in docstrings #143 Changes requested by Sole in docstrings, after first pull request related to this issue. * More docstring changes related to #143 minor updates in docstrings * separate woe and ratio tests into functions #147 separate woe and ratio tests into functions #147 * move PRatioEncoder under WoEencoder in list as these are related transformers * add ' to log_ratio in docstring * fix docstring with intro to transformer funcion * add more detail about the encoding in docstrings * renamed test functions * reword tests Co-authored-by: Soledad Galli <solegalli@protonmail.com> * reorganised folders with jupyter notebooks (#155) * Drop duplicate features #114 (#144) * add DropDuplicateFeatures in init * add fixture for duplicate features * add DropDuplicateFeatures functionality * add test for DropDuplicateFeatures * add DropDuplicateFeatures in init * add fixture for duplicate features * add DropDuplicateFeatures functionality * add test for DropDuplicateFeatures * create drop duplicate transformer * delete extra fixture Co-authored-by: Soledad Galli <solegalli@protonmail.com> * reformat code style encoders * shorten dosctrings with flake8 Co-authored-by: NicoGalli <72278140+NicoGalli@users.noreply.github.com> Co-authored-by: Tejash Shah <stejash15@gmail.com>
* sort modules in subfolders, separate classes in individual modules, rename encoding classes, rename few other classes * fix WoE name in doc * sort selection in subfolder, separate class into modules * fix selection imports * update test_selection * update VERSION to 1.0.0 * match doc name with submodule name, fix imports and class names (#129) * reorganise test submodules as per package submodules, create individual test files for each class (#130) * rename class init params and defo values (#131) * reorganise base transformers and functions, clean imports (#133) * fix style check (#141) * improve code style throughout (#142) * remove iterable from init MathematicalCombination * replace listcomp by genexp in arbitrary discretiser * create abstraction decision tree discretiser * create abstraction in base cat encoder * replace listcomp by genexp * create abstraction base numerical imputer * create baseOutlier * final cleanup of code * replace listcomp by genexp in randomsampleimputer * split tests into individual functions (#147) * split imputation tests * fix typo, unify test names in math combination test * reformat line space in sklearwrapper test * remove comment in drop constant tests * split discretisation tests * split encoding tests * split test outliers * split transformer tests * separate woe and ratio encoder closes issue #143 (#149) * Issue 143 * doc issue 143 documentation issue 143 * Update PRatioEncoder.rst 'C' Removed from RareLabelCEncoder whch was causing and error. Also enconder_dict_ ratio results added * Changes in docstrings #143 Changes requested by Sole in docstrings, after first pull request related to this issue. 
* More docstring changes related to #143 minor updates in docstrings * separate woe and ratio tests into functions #147 separate woe and ratio tests into functions #147 * move PRatioEncoder under WoEencoder in list as these are related transformers * add ' to log_ratio in docstring * fix docstring with intro to transformer funcion * add more detail about the encoding in docstrings * renamed test functions * reword tests Co-authored-by: Soledad Galli <solegalli@protonmail.com> * reorganised folders with jupyter notebooks (#155) * Drop duplicate features #114 (#144) * add DropDuplicateFeatures in init * add fixture for duplicate features * add DropDuplicateFeatures functionality * add test for DropDuplicateFeatures * add DropDuplicateFeatures in init * add fixture for duplicate features * add DropDuplicateFeatures functionality * add test for DropDuplicateFeatures * create drop duplicate transformer * delete extra fixture Co-authored-by: Soledad Galli <solegalli@protonmail.com> * reformat code style with black (#153) * style formatting base scripts * reformat style creation modules * reformat style discretisers * reformat style imputers * rewords strings, minor changes * reformat codestyle outliers * reformat code style selection * reformat style transformers * reformat style wrappers * separate woe and ratio encoder closes issue #143 (#149) * Issue 143 * doc issue 143 documentation issue 143 * Update PRatioEncoder.rst 'C' Removed from RareLabelCEncoder whch was causing and error. Also enconder_dict_ ratio results added * Changes in docstrings #143 Changes requested by Sole in docstrings, after first pull request related to this issue. 
* More docstring changes related to #143 minor updates in docstrings * separate woe and ratio tests into functions #147 separate woe and ratio tests into functions #147 * move PRatioEncoder under WoEencoder in list as these are related transformers * add ' to log_ratio in docstring * fix docstring with intro to transformer funcion * add more detail about the encoding in docstrings * renamed test functions * reword tests Co-authored-by: Soledad Galli <solegalli@protonmail.com> * reorganised folders with jupyter notebooks (#155) * Drop duplicate features #114 (#144) * add DropDuplicateFeatures in init * add fixture for duplicate features * add DropDuplicateFeatures functionality * add test for DropDuplicateFeatures * add DropDuplicateFeatures in init * add fixture for duplicate features * add DropDuplicateFeatures functionality * add test for DropDuplicateFeatures * create drop duplicate transformer * delete extra fixture Co-authored-by: Soledad Galli <solegalli@protonmail.com> * reformat code style encoders * shorten dosctrings with flake8 Co-authored-by: NicoGalli <72278140+NicoGalli@users.noreply.github.com> Co-authored-by: Tejash Shah <stejash15@gmail.com> * reformat test code style (#157) * expand style check in tox, add black to test_req (#154) * update docs, shorten lines, test code (#158) * update docs, shorten lines, test code * include v1.0.0 changes in changelog * fix linebreaks in changelog * Add type hints, docstrings, and expand test. 
Introduce bug fix in _define_variables (#159) * Add .vscode in gitignore * add type hints and docstrings * Add type hints, docstrings, and introduce bug fix * add type hints and docstrings * add type hints, docstrings, and stylistic modifications * add type hints, docstrings, and stylistic modifications * some stylistic modifications * add test to check null values in dataframe * remove redundant docstring * stylistic modification * add new test and expanded existing one * fix flake8 suggestions * fix some indentation errors * fix indention error * add type hints and docstrings in boxcox transformer * add type hints and docstrings in log transformation * add type hints, docstrings, and recaftor constructor in power transformation * add type hints and docstrings in reciprocal transformer * add type hints * add type hints and docstrings * add test case to test the type of math operation argument * remove extra blank line * add test cases for parameter_checks * black it files * update type hints and docstring * black them and update type hints and docstrings * update type hints and docstrings * black it * update type hints * tidy imports Co-authored-by: NicoGalli <72278140+NicoGalli@users.noreply.github.com> Co-authored-by: Tejash Shah <stejash15@gmail.com> Co-authored-by: Nodar Okroshiashvili <n.okroshiashvili@gmail.com>