Classifier metrics (such as accuracy, sensitivity, specificity, precision, ...) are highly uncertain when they are calculated from a small sample. Unfortunately, these point estimates are often treated as exact. We present a Bayesian method to quantify metric uncertainty. Our paper explains the underlying concepts and shows that many published classifiers have surprisingly large metric uncertainties. This repository contains the implementation in Python.
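To illustrate the core idea (this is a minimal sketch using SciPy, not the repository's API): with a uniform Beta(1, 1) prior, the posterior of accuracy after observing `k` correct predictions out of `n` is Beta(k + 1, n - k + 1), and its credible interval makes the uncertainty of the point estimate explicit.

```python
from scipy.stats import beta

def accuracy_posterior(correct, total, level=0.95):
    """Posterior median and equal-tailed credible interval for accuracy.

    Assumes a uniform Beta(1, 1) prior; the function name and signature
    are illustrative, not part of this repository.
    """
    posterior = beta(correct + 1, total - correct + 1)
    lo, hi = posterior.interval(level)
    return posterior.median(), lo, hi

# 90 correct out of 100: the point estimate is 0.9,
# but the 95% credible interval is noticeably wide.
median, lo, hi = accuracy_posterior(90, 100)
print(f"accuracy ~ {median:.3f}, 95% CI [{lo:.3f}, {hi:.3f}]")
```

With only 100 test samples, the interval spans several percentage points, which is exactly the kind of uncertainty the point estimate hides.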
The easiest way to calculate metric uncertainty is via our interactive, browser-based tool. The site may take a few minutes to load: it does not install any packages or execute any code on your machine, but the environment must first be started on the host, which causes the small delay. Please follow this link to the browser-based tool.
If you want to calculate metric uncertainties on a regular basis, or even integrate the method into your own workflow, feel free to clone this repository.
The notebook `tutorial.ipynb` should give you an idea of how to use the most important parts of the code. To ensure that all dependencies work as intended, run `pip install -r requirements.txt`.
The project builds on the following tools:

- PyMC3 (Gelman-Rubin diagnostics and tests)
- SymPy (metric definition)
- Voilà (turns my Jupyter notebook into a standalone application)
- Binder (hosts the application)
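As a rough sketch of what symbolic metric definition with SymPy can look like (the variable names below are illustrative and not taken from this repository): metrics are expressed as symbolic functions of the confusion-matrix entries, and substituting observed counts yields exact rational point estimates.

```python
import sympy as sp

# Confusion-matrix entries as positive symbols (illustrative names)
TP, FP, TN, FN = sp.symbols("TP FP TN FN", positive=True)

# Common metrics defined symbolically from the confusion matrix
sensitivity = TP / (TP + FN)
specificity = TN / (TN + FP)
accuracy = (TP + TN) / (TP + FP + TN + FN)

# Substituting observed counts gives an exact rational estimate
print(accuracy.subs({TP: 45, TN: 45, FP: 5, FN: 5}))  # 9/10
```

Defining metrics symbolically keeps a single source of truth for each formula, which the probabilistic machinery can then evaluate on posterior samples of the confusion-matrix entries.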
```bibtex
@article{toetsch2021classifier,
  title     = {Classifier uncertainty: evidence, potential impact, and probabilistic treatment},
  author    = {Tötsch, Niklas and Hoffmann, Daniel},
  journal   = {PeerJ Computer Science},
  volume    = {7},
  pages     = {e398},
  year      = {2021},
  publisher = {PeerJ Inc.}
}
```
If you have questions or comments, please create an issue.