GitHub - wywongbd/autocluster: AutoML for clustering models in sklearn.

autocluster

autocluster is an automated machine learning (AutoML) toolkit for performing clustering tasks.

Report and presentation slides can be found here and here.

Prerequisites

Python 3.5 or above
Linux, or Windows via WSL (Windows Subsystem for Linux)

How to get started?

First, install SMAC:

sudo apt-get install build-essential swig
conda install gxx_linux-64 gcc_linux-64 swig
pip install smac==0.8.0

Then install autocluster:

pip install autocluster

How does it work?

autocluster automatically optimizes the configuration of a clustering problem. By configuration, we mean:
- choice of dimension reduction algorithm
- choice of clustering model
- setting of the dimension reduction algorithm's hyperparameters
- setting of the clustering model's hyperparameters

autocluster provides 3 different approaches to optimize the configuration (in order of increasing complexity):
- random optimization
- Bayesian optimization
- Bayesian optimization + meta-learning (warmstarting)
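
To make the idea of a configuration concrete, here is a minimal sketch of the simplest approach, random optimization, written directly against sklearn. It does not use autocluster's API. The search space, the helper names (`sample_configuration`, `evaluate`, `random_search`) and the use of the silhouette score as the objective are all assumptions made for illustration.

```python
import random

from sklearn.cluster import Birch, KMeans
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA, TruncatedSVD
from sklearn.metrics import silhouette_score

# Hypothetical search space: model name -> (constructor, hyperparameter sampler).
DIM_REDUCERS = {
    "PCA": (PCA, lambda rng: {"n_components": rng.randint(2, 5), "random_state": 0}),
    "TruncatedSVD": (
        TruncatedSVD,
        lambda rng: {"n_components": rng.randint(2, 5), "random_state": 0},
    ),
}
CLUSTERERS = {
    "KMeans": (
        KMeans,
        lambda rng: {"n_clusters": rng.randint(2, 8), "n_init": 10, "random_state": 0},
    ),
    "Birch": (Birch, lambda rng: {"n_clusters": rng.randint(2, 8)}),
}


def sample_configuration(rng):
    """Draw one configuration: a model choice plus hyperparameters per stage."""
    reducer_name = rng.choice(sorted(DIM_REDUCERS))
    clusterer_name = rng.choice(sorted(CLUSTERERS))
    return {
        "reducer": reducer_name,
        "reducer_params": DIM_REDUCERS[reducer_name][1](rng),
        "clusterer": clusterer_name,
        "clusterer_params": CLUSTERERS[clusterer_name][1](rng),
    }


def evaluate(config, X):
    """Fit the two-stage configuration and score the resulting clustering."""
    reducer = DIM_REDUCERS[config["reducer"]][0](**config["reducer_params"])
    X_reduced = reducer.fit_transform(X)
    clusterer = CLUSTERERS[config["clusterer"]][0](**config["clusterer_params"])
    labels = clusterer.fit_predict(X_reduced)
    if len(set(labels.tolist())) < 2:
        return -1.0  # silhouette is undefined for a single cluster
    # Assumed objective: silhouette score in the reduced space (higher is better).
    return float(silhouette_score(X_reduced, labels))


def random_search(X, n_iterations=20, seed=0):
    """Sample configurations independently and keep the best-scoring one."""
    rng = random.Random(seed)
    best_config, best_score = None, float("-inf")
    for _ in range(n_iterations):
        config = sample_configuration(rng)
        score = evaluate(config, X)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score


# Toy data, only to exercise the search loop.
X, _ = make_blobs(n_samples=300, n_features=8, centers=4, random_state=0)
best_config, best_score = random_search(X)
print(best_config, round(best_score, 3))
```

Random optimization treats every trial independently. The Bayesian approaches instead use the scores of earlier trials to decide which configuration to try next, and warmstarting seeds that process with configurations that worked well on similar datasets.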

Algorithms/Models supported

List of dimension reduction algorithms in sklearn supported by autocluster's optimizer.

List of clustering models in sklearn supported by autocluster's optimizer.

Examples

Examples are available in these notebooks.

Experimental results

This dataset consists of 16 Gaussian clusters in 128-dimensional space with N = 1024 points. The optimal configuration obtained by autocluster (SMAC + Warmstarting) consists of a Truncated SVD dimension reduction model + Birch clustering model.

This dataset consists of 15 Gaussian clusters in 2-dimensional space with N = 5000 points. The optimal configuration obtained by autocluster (SMAC + Warmstarting) consists of a TSNE dimension reduction model + Agglomerative clustering model.
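
For readers who want to see what a configuration of this shape looks like in plain sklearn, the sketch below chains Truncated SVD and Birch, the pair reported for the 128-dimensional dataset. It is only an illustration: the data is a synthetic stand-in generated with `make_blobs`, not the original dataset, and the hyperparameter values are placeholders, not the ones found by autocluster.

```python
from sklearn.cluster import Birch
from sklearn.datasets import make_blobs
from sklearn.decomposition import TruncatedSVD
from sklearn.metrics import adjusted_rand_score
from sklearn.pipeline import make_pipeline

# Synthetic stand-in: 16 Gaussian clusters, 128 dimensions, N = 1024 points.
X, y_true = make_blobs(n_samples=1024, n_features=128, centers=16, random_state=0)

# Placeholder hyperparameters (assumptions, not autocluster's tuned values).
pipeline = make_pipeline(
    TruncatedSVD(n_components=16, random_state=0),  # dimension reduction step
    Birch(n_clusters=16),                           # clustering step
)

# Pipeline.fit_predict transforms X with the SVD, then clusters the result.
labels = pipeline.fit_predict(X)

# Agreement with the generating labels; only possible because the data is synthetic.
print("adjusted Rand index:", adjusted_rand_score(y_true, labels))
```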

Links

Link to PyPI.
Great writeup by Martin Krasser on Bayesian Optimization.

Disclaimer

The project is experimental and still under development.
