10000 Add get_srml iotools function; deprecate read_srml_month_from_solardat by AdamRJensen · Pull Request #1779 · pvlib/pvlib-python · GitHub
[go: up one dir, main page]

Skip to content

Add get_srml iotools function; deprecate read_srml_month_from_solardat #1779

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 13 commits into from
Jun 29, 2023
Merged
Prev Previous commit
Next Next commit
Update error handling
  • Loading branch information
AdamRJensen committed Jun 19, 2023
commit 8f14cb86b6ec2f19f14315afd7d54bf4209a9c9b
14 changes: 8 additions & 6 deletions pvlib/iotools/srml.py
Original file line number Diff line number Diff line change
Expand Up @@ -3,7 +3,7 @@
"""
import numpy as np
import pandas as pd
import requests
import urllib
import warnings


Expand All @@ -28,8 +28,9 @@

def read_srml(filename, map_variables=True):
"""
Read University of Oregon SRML 1min .tsv file into pandas dataframe. The
SRML is described in [1]_.
Read University of Oregon SRML 1min .tsv file into pandas dataframe.

The SRML is described in [1]_.

Parameters
----------
Expand Down Expand Up @@ -172,8 +173,9 @@ def format_index(df):

def read_srml_month_from_solardat(station, year, month, filetype='PO',
map_variables=True):
"""Request a month of SRML data from solardat and read it into
a Dataframe. The SRML is described in [1]_.
"""Request a month of SRML data and read it into a Dataframe.

The SRML is described in [1]_.

Parameters
----------
Expand Down Expand Up @@ -297,7 +299,7 @@ def get_srml(station, start, end, filetype='PO', map_variables=True,
try:
dfi = read_srml(url + f, map_variables=map_variables)
dfs.append(dfi)
except requests.HTTPError:
except urllib.error.HTTPError:
warnings.warn(f"The following file was not found: {f}")

data = pd.concat(dfs, axis='rows')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about following this with a data = data.loc[start:end]? Right now I get back a full month even if I request just a single day, which isn't really ideal.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not opposed to this suggestion, although we may run into some issues related to timezone. I think this is why the start/end parameters in Solar Forecast Arbiter are required to be timezone localized - that seems like a hassle though.

There already exist several functions in pvlib that return a full month when requesting a single day btw, e.g. get_bsrn, and probably others too.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, that's a good point. Up to you on what is best here. If we keep the current behavior of returning complete months, might be worth a note in the docstring.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not opposed to this suggestion, although we may run into some issues related to timezone. I think this is why the start/end parameters in Solar Forecast Arbiter are required to be timezone localized - that seems like a hassle though.

This sounds right. And a complicating factor for SRML was nighttime 0s in the future if we requested a day from the current month.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's a raw data file for the current month that shows -999 and similar for future dates.

image

Here's a raw data file that includes -999 and similar as well as 0s

image

This caused a problem in SFA SolarArbiter/solarforecastarbiter-core#572 and it's reasonable to expect that it would cause a problem with other user code. I don't know if that's pvlib's problem to solve, but I think it's somewhat more likely to come up with this new function that accepts datetimes instead of entire months.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see how the new function makes this more of an issue, currently, I would still request the same months with the old function, but it would just be more manual work.

I do think that it deserves a Warning entry and perhaps we can also implement a line that cuts off future data? For example:

data = data.loc[:pd.Timestamp.today(), :]

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think many users would have different expectations of the new function that accepts start, end datetimes than the old function that accepts a year and a month.

Thanks for adding the warning. I'd rather see the previously rejected start:end slicing than .today() slicing. I'm also fine with just adding the warning and seeing if users complain.

Expand Down
0