Description
The OpenML API is now at a point where I think we can fairly easily implement a fetcher for OpenML datasets.
You can see some of the discussion here:
openml/OpenML#218 (comment)
The interface should probably accept either a name or an id. Names are not unique in OpenML; integer IDs are, but they are less user friendly.
My suggestion would be to do a search call like
https://openml.org/api/v1/json/data/list/data_name/anneal/limit/1
which searches for the anneal dataset. The result will contain the ID of the first dataset called anneal. Then we can fetch that with a second API call as a CSV.
Finally we probably need to also do a call for the JSON meta-data, which tells us which column is the target, and probably also which columns are categorical and which are continuous, and possibly more.
For our interface, we definitely need the target column, though.
This should be fairly straightforward.
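To make the name-or-id lookup concrete, here is a rough sketch of the first step only (resolving a name to an ID via the search call). The function names are hypothetical, and the JSON layout (`data` -> `dataset` -> `did`) is my assumption about what the search call returns, so it would need checking against a real response. The CSV and meta-data calls are left out because I have not pinned down their URLs.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Base URL taken from the search example in this issue.
API_ROOT = "https://openml.org/api/v1/json"


def search_url(name, limit=1):
    """Build the search URL for datasets called `name`."""
    return f"{API_ROOT}/data/list/data_name/{quote(name, safe='')}/limit/{limit}"


def first_dataset_id(search_result):
    """Return the integer ID of the first dataset in a parsed search result.

    ASSUMPTION: the result looks like {"data": {"dataset": [{"did": ...}]}}.
    """
    datasets = search_result["data"]["dataset"]
    if not datasets:
        raise ValueError("no dataset found")
    return int(datasets[0]["did"])


def fetch_json(url):
    """Download `url` and parse the body as JSON (needs network access)."""
    with urlopen(url) as response:
        return json.load(response)


def resolve_id(name_or_id):
    """Accept either an integer ID or a dataset name (hypothetical helper)."""
    if isinstance(name_or_id, int):
        # IDs are unique, so no search call is needed.
        return name_or_id
    # Names are not unique; take the first hit, as suggested above.
    return first_dataset_id(fetch_json(search_url(name_or_id)))
```

With this, `resolve_id("anneal")` would issue the search call shown above, while `resolve_id(2)` would skip it entirely. Taking the first hit silently is a design choice worth discussing, since several datasets can share a name.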