HTTP 500 error in fetch_mldata mauna-loa-atmospheric-co2 #11108
Comments
My understanding is that the long-term solution is the openml fetcher #9543 (not 100% sure what the status is). mldata.org has historically not been extremely reliable, but if these are just temporary glitches I would say we should ignore them as we have done so far.
The feeling I got when investigating #8588 is that mldata.org maintenance is not very active (no disrespect intended, just saying that there is not a staff of 10 full-time people behind it). Edit: more details about who maintains mldata.org: #8588 (comment).
If it starts to be too annoying to ignore, we could probably implement a retry mechanism, but someone should double-check that it actually fixes the problem. For example, when a glitch happens it may actually last for a few minutes, in which case a retry mechanism may not be a great fit.
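To make the retry idea concrete, here is a minimal sketch, not what `fetch_mldata` does. The helper name `urlopen_with_retries` and its `attempts`, `delay`, `opener` and `sleep` parameters are hypothetical; only `urlopen` and `HTTPError` come from the standard library. It retries on HTTP 5xx responses with an exponentially growing pause, and would still fail if the outage outlasts the total wait, which is exactly the concern raised above.

```python
import time
from urllib.error import HTTPError
from urllib.request import urlopen


def urlopen_with_retries(url, attempts=3, delay=1.0, opener=urlopen, sleep=time.sleep):
    """Hypothetical helper: call opener(url), retrying on HTTP 5xx errors."""
    for attempt in range(1, attempts + 1):
        try:
            return opener(url)
        except HTTPError as exc:
            # Client errors (4xx) and the last failed attempt are re-raised as-is.
            if exc.code < 500 or attempt == attempts:
                raise
            # Wait 1x, 2x, 4x, ... the base delay before the next attempt.
            sleep(delay * 2 ** (attempt - 1))
```

With the defaults this waits about 3 seconds in total across the two pauses, so it would only paper over very short glitches; the `opener` and `sleep` arguments exist mainly so the behaviour can be checked without touching the network.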
Apart from
Agreed. Let's close this for now and re-open later if needed.
@lesteve status is it's on my todo and I'm back from the dead (aka teaching)
or completing the openml PR and loading from there
…On 18 May 2018 9:26 pm, "Roman Yurchak" ***@***.***> wrote:
There have been several PRs (#11100 (review), #11106) where
CircleCI arbitrarily fails due to HTTP 500 errors when calling
fetch_mldata('mauna-loa-atmospheric-co2').
Partial traceback below:
Traceback (most recent call last):
  File "/home/circleci/project/examples/gaussian_process/plot_gpr_co2.py", line 75, in <module>
    data = fetch_mldata('mauna-loa-atmospheric-co2').data
  File "/home/circleci/project/sklearn/datasets/mldata.py", line 154, in fetch_mldata
    mldata_url = urlopen(urlname)
  File "/home/circleci/miniconda/envs/testenv/lib/python3.6/urllib/request.py", line 223, in urlopen
[...]
  File "/home/circleci/miniconda/envs/testenv/lib/python3.6/urllib/request.py", line 650, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 500: Internal Server Error
If this keeps repeating, possible workarounds could be:
- increasing the number of download attempts in fetch_mldata
- investigating the failures upstream with mldata.org
- copying this particular dataset to figshare (#7425), which seems to have a better quality of service, and adding a fallback URL there (https://github.com/scikit-learn/scikit-learn/blob/a24c8b464d094d2c468a16ea9f8bf8d42d949f84/sklearn/datasets/mldata.py#L29), used if the download from the mldata website fails.
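The fallback-URL workaround from the list above could look roughly like the sketch below. This is an illustration only: `open_first_available` and its `opener` parameter are hypothetical names, and the mirror URLs in the usage note are placeholders, since no figshare copy of the dataset is referenced in this thread.

```python
from urllib.error import URLError
from urllib.request import urlopen


def open_first_available(urls, opener=urlopen):
    """Hypothetical helper: return the response of the first URL that opens."""
    last_error = None
    for url in urls:
        try:
            return opener(url)
        except URLError as exc:
            # HTTPError subclasses URLError, so HTTP 500 responses land here too.
            last_error = exc
    if last_error is None:
        raise ValueError("no URLs given")
    # Every candidate failed: surface the error from the last one tried.
    raise last_error
```

A caller would pass the primary URL first and the mirror second, for example `open_first_available(["https://primary.example/co2", "https://mirror.example/co2"])` with both addresses being made-up placeholders. Unlike retrying the same host, this keeps working when the primary server is down for several minutes.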
There have been several PRs (#11100 (review), #11106) where CircleCI arbitrarily fails due to HTTP 500 errors when calling fetch_mldata('mauna-loa-atmospheric-co2').
Partial traceback below,
If this keeps repeating, a possible workaround could be,