Create weather data/iotools page in User's Guide by kandersolar · Pull Request #1754 · pvlib/pvlib-python · GitHub

Create weather data/iotools page in User's Guide #1754


Merged
merged 10 commits into from
Nov 22, 2023
further improvements
kandersolar committed Nov 10, 2023
commit c2ba0e6095539faa0a9729ee5460f8c2f32bde4d
141 changes: 94 additions & 47 deletions docs/sphinx/source/user_guide/weather_data.rst
@@ -7,46 +7,70 @@ Simulating the performance of a PV system requires irradiance and meteorological
as the inputs to a PV system model. Weather datasets are available
from many sources and in many formats. The :py:mod:`pvlib.iotools` module
contains functions to easily retrieve and import such datasets in a standardized
form that is convenient to use with the rest of pvlib. For a complete list
of functions related to retrieving and importing weather data, see :ref:`iotools`.
form that is convenient to use with the rest of pvlib.

The primary focus of :py:mod:`pvlib.iotools` is time series solar resource
data like the datasets from PVGIS and the NSRDB, but it also provides
functionality for other types of data useful for certain aspects of PV modeling
(e.g. precipitation data from :py:func:`~pvlib.iotools.get_acis_prism`
for soiling modeling, and horizon profiles from :py:func:`~pvlib.iotools.get_pvgis_horizon`
for horizon shade modeling).

Types of weather data sources
-----------------------------

Ground station measurements
***************************

From in-situ monitoring equipment. If properly maintained and quality-controlled,
these are the highest quality source of weather information. However, the coverage
depends on a weather station having been set up in advance for the location and
times of interest. There are datasets such as BSRN and SURFRAD which make their
measurement data publicly available.


Numerical Weather Prediction (NWP)
**********************************
For a complete list of functions related to retrieving and importing weather
data, see :ref:`iotools`.

These are mathematical simulations of weather systems. The data quality is much
lower than that of measurements, owing in part to coarser spatial and temporal
resolution, as well as many models not being optimised for solar irradiance for
PV applications. On the plus side, these models typically have worldwide coverage,
although some regional models (e.g. HRRR) sacrifice global coverage for somewhat higher
spatial and temporal resolution. Various forecast (e.g. GFS, ECMWF, ICON) and
reanalysis sources (ERA5, MERRA2) exist.

Types of weather data sources
-----------------------------

Satellite Data
**************

These sources process satellite imagery (typically from geostationary satellites)
to identify and classify clouds, and combine this with solar irradiance models to
produce irradiance estimates. The quality is generally much higher than NWP, but
still not as good as a well-maintained weather station. They have high spatial
and temporal resolution corresponding to the source satellite imagery, and are
generally optimised to estimate solar irradiance for PV applications. Free sources
such as PVGIS are available, and commercial sources such as SolarAnywhere,
Solcast and Solargis provide paid options, though often with free trials.
Weather data can be grouped into a few fundamental categories. Which
type is most useful depends on the application. Here we provide a high-level
overview of different types of weather data, and when you might want to use
them.

1. **Ground station measurements**:
From in-situ monitoring equipment. If properly maintained and
quality-controlled, these are the highest quality
source of weather information. However, the coverage depends on
a weather station having been set up in advance for the location and
times of interest. Some ground station networks like the BSRN and SURFRAD
make their measurement data publicly available.

Data from public ground station measurement networks are useful if you
want accurate, high-resolution data but have flexibility around the
specific measurement location.

2. **Satellite data**:
These sources process satellite imagery (typically from geostationary
satellites) to identify and classify clouds, and combine this with solar
irradiance models to produce irradiance estimates. The quality is
generally much higher than NWP, but still not as good as a well-maintained
weather station. They have high spatial and temporal resolution
corresponding to the source satellite imagery, and are generally
optimised to estimate solar irradiance for PV applications. Free sources
such as PVGIS are available, and commercial sources such as SolarAnywhere,
Solcast and Solargis provide paid options, though often with free trials.

Satellite data is useful when suitable ground station measurements are
not available for the location and/or times of interest.

3. **Numerical Weather Prediction (NWP)**:
These are mathematical simulations of weather systems.
The data quality is much lower than that of measurements and
satellite data, owing in part to coarser spatial and temporal
resolution, as well as many models not being optimised for solar
irradiance for PV applications. On the plus side, these models typically
have worldwide coverage, although some regional models (e.g. HRRR) sacrifice
global coverage for somewhat higher spatial and temporal resolution.
Various forecast (e.g. GFS, ECMWF, ICON) and reanalysis sources (ERA5,
MERRA2) exist.

NWP datasets are primarily useful for parts of the world not covered
by satellite-based datasets (e.g. the poles) or if extremely long time
ranges are needed.

For a more detailed comparison of the weather datasets available through
pvlib, see [1]_.


:py:mod:`pvlib.iotools` usage
@@ -59,17 +83,18 @@ a :py:class:`pandas.DataFrame` of the actual dataset, plus a metadata
dictionary. Most :py:mod:`pvlib.iotools` functions also have
a ``map_variables`` parameter to automatically translate
the column names used in the data file (which vary widely from dataset to dataset)
into standard pvlib names (see :ref:`variables_style_rules`). Typical usage
looks like this:
into standard pvlib names (see :ref:`variables_style_rules`).

Typical usage looks something like this:

.. code-block:: python

# reading a local data file:
df, metadata = pvlib.iotools.read_XYZ(filepath, map_variables=True, ...)

# retrieving data from an online service
df, metadata = pvlib.iotools.get_XYZ(location, date_range, map_variables=True, ...)
# get_pvgis_tmy returns two additional values besides df and metadata
df, _, _, metadata = pvlib.iotools.get_pvgis_tmy(latitude, longitude, map_variables=True)

This code will fetch a Typical Meteorological Year (TMY) dataset from PVGIS,
returning a :py:class:`pandas.DataFrame` containing the hourly weather data
and a python dict with information about the dataset.
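
As a rough illustration of what ``map_variables=True`` does, the sketch below
renames dataset-specific column names to pvlib-style names. The
``rename_columns`` helper and the ``EXAMPLE_VARIABLE_MAP`` dictionary are
hypothetical names introduced here, and the mapping is only assumed to resemble
the one pvlib applies to PVGIS data. Plain dictionaries stand in for the
DataFrame so the example runs without pandas or network access.

```python
# Hypothetical mapping from PVGIS-style column names to pvlib-style names.
# This is an illustrative assumption, not pvlib's internal table.
EXAMPLE_VARIABLE_MAP = {
    "G(h)": "ghi",
    "Gb(n)": "dni",
    "Gd(h)": "dhi",
    "T2m": "temp_air",
    "WS10m": "wind_speed",
}


def rename_columns(columns, variable_map):
    """Return a new dict of columns with keys translated via variable_map.

    Keys that are not in variable_map are kept unchanged.
    """
    return {variable_map.get(name, name): values
            for name, values in columns.items()}


# Two hourly records stored column-wise, standing in for a DataFrame.
raw = {
    "G(h)": [0.0, 153.2],
    "T2m": [11.4, 12.1],
    "IR(h)": [310.5, 318.9],  # no entry in the map, so the name is kept
}

renamed = rename_columns(raw, EXAMPLE_VARIABLE_MAP)
print(list(renamed))
```

With a real :py:class:`pandas.DataFrame`, the equivalent operation would be
``df.rename(columns=EXAMPLE_VARIABLE_MAP)``.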

Most :py:mod:`pvlib.iotools` functions work with time series datasets.
In that case, the returned ``df`` DataFrame has a datetime index, localized
@@ -86,9 +111,25 @@ Data retrieval
Several :py:mod:`pvlib.iotools` functions access the internet to fetch data from
online web APIs. For example, :py:func:`~pvlib.iotools.get_pvgis_hourly`
downloads data from PVGIS's webservers and returns it as a python variable.
Functions that retrieve data from the internet have names that begin with
``get_``: :py:func:`~pvlib.iotools.get_bsrn`, :py:func:`~pvlib.iotools.get_psm3`,
:py:func:`~pvlib.iotools.get_pvgis_tmy`, and so on.
Functions that retrieve data from the internet are named ``get_``, followed
by the name of the data source: :py:func:`~pvlib.iotools.get_bsrn`,
:py:func:`~pvlib.iotools.get_psm3`, :py:func:`~pvlib.iotools.get_pvgis_tmy`,
and so on.

For satellite/reanalysis datasets, the location is specified by latitude and
longitude in decimal degrees:

.. code-block:: python

lat, lon = 33.75, -84.39 # Atlanta, Georgia, United States
df, metadata = pvlib.iotools.get_psm3(lat, lon, map_variables=True, ...)
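
Coordinates are often published in degrees, minutes and seconds rather than
decimal degrees. The conversion can be sketched as follows; ``dms_to_decimal``
is a hypothetical helper written for this illustration and is not part of
pvlib. It assumes the sign convention used in the example above: positive for
north and east, negative for south and west.

```python
def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees/minutes/seconds to signed decimal degrees.

    hemisphere is one of "N", "S", "E", "W"; south and west are negative.
    """
    if hemisphere not in ("N", "S", "E", "W"):
        raise ValueError(f"unknown hemisphere: {hemisphere!r}")
    magnitude = degrees + minutes / 60 + seconds / 3600
    return -magnitude if hemisphere in ("S", "W") else magnitude


# Approximate coordinates of Atlanta: 33 deg 45' 00" N, 84 deg 23' 24" W
lat = dms_to_decimal(33, 45, 0, "N")
lon = dms_to_decimal(84, 23, 24, "W")
print(round(lat, 2), round(lon, 2))
```

The resulting ``lat`` and ``lon`` values can then be passed to a ``get_``
function in place of the hard-coded numbers.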


For ground station networks, the location identifier is the station ID:

.. code-block:: python

df, metadata = pvlib.iotools.get_bsrn(station='cab', start='2020-01-01', end='2020-01-31', ...)

Some of these data providers require registration. In those cases, your
access credentials must be passed as parameters to the function. See the
@@ -100,10 +141,16 @@ Reading local files

:py:mod:`pvlib.iotools` also provides functions for parsing data files
stored locally on your computer.
Functions that read and parse files in a particular format have names
that begin with ``read_``: :py:func:`~pvlib.iotools.read_tmy3`,
Functions that read and parse local data files are named ``read_``, followed by
the name of the file format they parse: :py:func:`~pvlib.iotools.read_tmy3`,
:py:func:`~pvlib.iotools.read_epw`, and so on.

For example, here is how to read a file in the TMY3 file format:

.. code-block:: python

df, metadata = pvlib.iotools.read_tmy3(r"C:\path\to\file.csv", map_variables=True)


References
----------