Core functionality

Functions

nemseer.compile_data(run_start: str, run_end: str, forecasted_start: str, forecasted_end: str, forecast_type: str, tables: Union[str, List[str]], raw_cache: str, processed_cache: Optional[str] = None, data_format: str = 'df') -> Optional[Union[Dict[str, DataFrame], Dict[str, Dataset]]]

Compiles queried data from raw_cache and/or processed_cache.

For each queried table, this function:

  1. If required, downloads raw forecast data for the table and converts to the requested data structure.

  2. Otherwise, compiles table data from either or both of the caches.

  3. Applies user-requested filtering to run times and forecasted times to any raw data.

If data_format = “df” (default), a dictionary of pandas.DataFrame objects is returned. Otherwise, if data_format = “xr”, a dictionary of xarray.Dataset objects is returned.

Examples

See compiling data examples.
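
A minimal sketch of a call, assuming nemseer is installed and NEMWeb is reachable. The cache path, the table name REGIONSOLUTION and the 'YYYY/MM/DD HH:MM' datetime strings are illustrative assumptions, not values given in this section.

```python
from typing import Any, Dict

# Illustrative query. The datetime string format, the table name and the
# cache path are assumptions; adjust them to your own setup.
QUERY: Dict[str, Any] = {
    "run_start": "2021/02/01 00:00",
    "run_end": "2021/02/01 00:30",
    "forecasted_start": "2021/02/01 00:30",
    "forecasted_end": "2021/02/01 01:00",
    "forecast_type": "P5MIN",
    "tables": "REGIONSOLUTION",
    "raw_cache": "./nemseer_raw_cache",
    "data_format": "df",
}


def run_query(query: Dict[str, Any] = QUERY):
    """Compile the requested tables and return whatever compile_data returns."""
    import nemseer  # imported lazily so this module loads without nemseer

    return nemseer.compile_data(**query)
```

Calling run_query() performs the download and compilation. With data_format set to 'df', the values of the returned dictionary should be pandas.DataFrame objects.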

Parameters:
  • run_start (str) – Forecast runs at or after this datetime are queried.

  • run_end (str) – Forecast runs before or at this datetime are queried.

  • forecasted_start (str) – Forecasts pertaining to times at or after this datetime are retained.

  • forecasted_end (str) – Forecasts pertaining to times before or at this datetime are retained.

  • forecast_type (str) – One of nemseer.forecast_types

  • tables (Union[str, List[str]]) – Table or tables required. A single table can be supplied as a string. Multiple tables can be supplied as a list of strings.

  • raw_cache (str) – Path to create or reuse as raw_cache. Files are downloaded to this directory and cached data is maintained in the parquet format.

  • processed_cache (Optional[str]) – Path to build or reuse as processed_cache. Should be distinct from raw_cache.

  • data_format (str) – Default is ‘df’, which returns pandas DataFrame. Can also request ‘xr’, which returns xarray.Dataset.

Return type:

Optional[Union[Dict[str, DataFrame], Dict[str, Dataset]]]

nemseer.download_raw_data(forecast_type: str, tables: Union[str, List[str]], raw_cache: str, run_start: Optional[str] = None, run_end: Optional[str] = None, forecasted_start: Optional[str] = None, forecasted_end: Optional[str] = None, keep_csv: bool = False) -> None

Downloads raw forecast data from NEMWeb MMSDM Historical Data SQLLoader.

Accepts exactly one datetime pair, which can be either of:

  1. run_start and run_end

  2. forecasted_start and forecasted_end

Examples

See downloading raw data examples.
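
The datetime-pair rule can be sketched as a small check. This is an illustration of the documented behaviour only, not nemseer's own validation code; check_datetime_pair is a hypothetical helper name.

```python
from typing import Optional


def check_datetime_pair(
    run_start: Optional[str] = None,
    run_end: Optional[str] = None,
    forecasted_start: Optional[str] = None,
    forecasted_end: Optional[str] = None,
) -> str:
    """Return which pair was supplied ('run' or 'forecasted'), or raise ValueError.

    Hypothetical helper that mirrors the documented rule.
    """
    run_pair = (run_start, run_end)
    forecasted_pair = (forecasted_start, forecasted_end)
    # A pair counts as complete only when both of its members are given.
    run_complete = all(v is not None for v in run_pair)
    forecasted_complete = all(v is not None for v in forecasted_pair)
    # A half-supplied pair is never valid.
    partial = any(
        any(v is not None for v in pair) and not all(v is not None for v in pair)
        for pair in (run_pair, forecasted_pair)
    )
    # Reject no pair, both pairs, or any partial pair.
    if partial or run_complete == forecasted_complete:
        raise ValueError("Supply exactly one complete datetime pair.")
    return "run" if run_complete else "forecasted"
```

For instance, supplying only run_start and run_end passes the check, while supplying all four datetimes or none of them raises ValueError.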

Parameters:
  • forecast_type (str) – One of nemseer.forecast_types

  • tables (Union[str, List[str]]) – Table or tables required. A single table can be supplied as a string. Multiple tables can be supplied as a list of strings.

  • raw_cache (str) – Path to create or reuse as raw_cache. Files are downloaded to this directory and cached data is maintained in the parquet format.

  • run_start (Optional[str]) – Forecast runs at or after this datetime are queried. If supplied, must be included with run_end.

  • run_end (Optional[str]) – Forecast runs before or at this datetime are queried. If supplied, must be included with run_start.

  • forecasted_start (Optional[str]) – Forecasts pertaining to times at or after this datetime are retained. If supplied, must be included with forecasted_end.

  • forecasted_end (Optional[str]) – Forecasts pertaining to times before or at this datetime are retained. If supplied, must be included with forecasted_start.

  • keep_csv (bool) – Default False. If True, downloaded csvs are retained in the raw_cache.

Raises:

ValueError – If a valid pair of datetimes is not supplied, or if more than a valid pair of datetimes is supplied.

Return type:

None

nemseer.generate_runtimes(forecasted_start: str, forecasted_end: str, forecast_type: str) -> Tuple[str, str]

For a particular forecast type, generates the earliest run_start and the latest run_end that can be queried for the supplied forecasted times.

In other words, this function returns the range of run times for forecasts that cover the supplied forecasted times. As such, using the run_start and run_end returned by this function with nemseer.compile_data() will ensure that most, if not all, of the data for the selected forecasted times is returned.

N.B. These have been determined based on AEMO documentation and actual data. This may not be accurate for all forecast types, e.g. MTPASA which is not run at a set time.

Examples

See getting valid run times for a set of forecasted times.
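
A rough sketch of chaining the returned run times into nemseer.compile_data(), assuming nemseer is installed. The two helper names below are hypothetical, and any datetime strings, table names or cache paths passed in are the caller's own choices.

```python
from typing import Any, Dict, Tuple


def compile_kwargs_from_runtimes(
    runtimes: Tuple[str, str],
    forecasted_start: str,
    forecasted_end: str,
    forecast_type: str,
    tables: str,
    raw_cache: str,
) -> Dict[str, Any]:
    """Build compile_data keyword arguments from a (run_start, run_end) tuple."""
    run_start, run_end = runtimes
    return {
        "run_start": run_start,
        "run_end": run_end,
        "forecasted_start": forecasted_start,
        "forecasted_end": forecasted_end,
        "forecast_type": forecast_type,
        "tables": tables,
        "raw_cache": raw_cache,
    }


def compile_for_forecasted_window(
    forecasted_start: str,
    forecasted_end: str,
    forecast_type: str,
    tables: str,
    raw_cache: str,
):
    """Look up the widest run-time window, then compile data for it."""
    import nemseer  # imported lazily so this module loads without nemseer

    runtimes = nemseer.generate_runtimes(
        forecasted_start, forecasted_end, forecast_type
    )
    kwargs = compile_kwargs_from_runtimes(
        runtimes, forecasted_start, forecasted_end, forecast_type, tables, raw_cache
    )
    return nemseer.compile_data(**kwargs)
```

Keeping the keyword-building step separate makes it easy to inspect the run-time window before any download starts.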

Parameters:
  • forecasted_start (str) – Forecasts pertaining to times at or after this datetime are retained.

  • forecasted_end (str) – Forecasts pertaining to times before or at this datetime are retained.

  • forecast_type (str) – One of nemseer.forecast_types

Returns:

Tuple of nemseer-valid string datetimes that correspond to valid run times

Raises:

ValueError – If supplied forecasted times are invalid.

Return type:

Tuple[str, str]

nemseer.get_data_daterange() -> Dict[int, List[int]]

Years and months with data on NEMWeb MMSDM Historical Data SQLLoader.

Examples

See querying date ranges.
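
As a small illustration of working with the returned mapping, the sketch below checks whether a given year and month are present. has_data is a hypothetical helper, and SAMPLE is made-up data in the documented Dict[int, List[int]] shape, not real availability.

```python
from typing import Dict, List


def has_data(daterange: Dict[int, List[int]], year: int, month: int) -> bool:
    """Return True if the year/month combination appears in the mapping."""
    return month in daterange.get(year, [])


# Hypothetical sample in the documented shape (year -> list of months).
SAMPLE: Dict[int, List[int]] = {2020: [11, 12], 2021: [1, 2, 3]}
```

In practice the mapping would come from nemseer.get_data_daterange() rather than from SAMPLE.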

Returns:

Each year mapped to a list of the months for which data is available.

Return type:

Dict[int, List[int]]

nemseer.get_tables(year: int, month: int, forecast_type: str, actual: bool = False) -> List[str]

Requestable tables of a particular forecast type on MMSDM Historical Data SQLLoader.

If actual = False, provides a list of tables that can be requested via nemseer.

If actual = True, returns actual tables available via NEMWeb, including all tables that are enumerated.

N.B.:
  • Removes numbering from enumerated tables for P5MIN - e.g. CONSTRAINTSOLUTION(x) are all reduced to CONSTRAINTSOLUTION

Examples

See querying table availability
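
The reduction of enumerated table names described in the note above might look roughly like the sketch below. It assumes that enumerated tables differ only by a trailing digit suffix; both that naming and the strip_enumeration helper are assumptions for illustration, not nemseer's implementation.

```python
import re
from typing import Dict, List


def strip_enumeration(tables: List[str]) -> List[str]:
    """Drop trailing digits from table names and de-duplicate, keeping order."""
    seen: Dict[str, None] = {}
    for name in tables:
        base = re.sub(r"\d+$", "", name)
        seen.setdefault(base, None)  # dicts preserve first-seen order
    return list(seen)


# Hypothetical input: the numbered names are assumed for illustration.
ACTUAL = ["CONSTRAINTSOLUTION1", "CONSTRAINTSOLUTION2", "REGIONSOLUTION"]
REQUESTABLE = strip_enumeration(ACTUAL)
```

Here REQUESTABLE plays the role of the actual = False list and ACTUAL the role of the actual = True list.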

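Parameters:
  • year (int) – Year to query.

  • month (int) – Month to query.

  • forecast_type (str) – One of nemseer.forecast_types

  • actual (bool) – Default False. If True, returns the actual tables available via NEMWeb, including enumerated tables.
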
Returns:

List of tables associated with that forecast type for that period

Return type:

List[str]

Data

Forecast types

Forecast types requestable through nemseer. See also forecast types, and pre-dispatch and PASA.

nemseer.forecast_types = ('P5MIN', 'PREDISPATCH', 'PDPASA', 'STPASA', 'MTPASA')
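
A short sketch of checking a user-supplied forecast type against this tuple before issuing a query. The tuple is copied from the value shown above so that the example runs without nemseer; validate_forecast_type is a hypothetical helper, and the check is deliberately case-sensitive.

```python
from typing import Tuple

# Copied from the documented value of nemseer.forecast_types.
FORECAST_TYPES: Tuple[str, ...] = (
    "P5MIN",
    "PREDISPATCH",
    "PDPASA",
    "STPASA",
    "MTPASA",
)


def validate_forecast_type(forecast_type: str) -> str:
    """Return forecast_type unchanged if it is requestable, else raise ValueError."""
    if forecast_type not in FORECAST_TYPES:
        raise ValueError(
            f"{forecast_type!r} is not one of {', '.join(FORECAST_TYPES)}"
        )
    return forecast_type
```

With nemseer installed, the same check can be made against nemseer.forecast_types directly.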
