How is the CERA-20C atmospheric model daily data organised in MARS?

In general, the data is organised as a large tree. The indentation below shows the successive levels down that tree.

  • stream: Atmospheric model, daily (enda)
    • type of data: Analysis (an), ensemble
      • Year: 1901
        • Month: January
          • type of level: Model levels (ml)
            • all dates of a month, times, levels, parameters and ensemble members (same tape file)
          • type of level: Potential temperature (pt)
            • all dates of a month, times, levels, parameters and ensemble members (same tape file)
          • type of level: Potential vorticity (pv)
            • all dates of a month, times, levels, parameters and ensemble members (same tape file)
          • type of level: Pressure levels (pl)
            • all dates of a month, times, levels, parameters and ensemble members (same tape file)
          • type of level: Surface (sfc)
            • all dates of a month, times, parameters and ensemble members (same tape file)
        • Month: February
          • ...
        • Month: ...
          • ...
        • Month: December
          • ...
      • Year: 1902
        • ...
      • Year: ... 
        • ...
      • Year: 2010
        • ...
    • type of data: Forecast (fc), ensemble

      • Year: 1901

        • Month: January

          • type of level: Model levels (ml)

            • Date: 1901-01-01

              • all steps of a date, levels, parameters and ensemble members (same tape file)
            • Date: 1901-01-02

              • all steps of a date, levels, parameters and ensemble members (same tape file)
            • Date: 1901-01-...

              • ...
          • type of level: Pressure levels (pl)

            • Date: 1901-01-01

              • all steps of a date, levels, parameters and ensemble members (same tape file)

            • Date: 1901-01-02

              • all steps of a date, levels, parameters and ensemble members (same tape file)

            • Date: 1901-01-...

              • ...
          • type of level: Surface (sfc)

            • all dates of a month, times, steps, parameters and ensemble members (same tape file)

        • Month: February
          • ...
        • Month: ...
          • ...
        • Month: December
          • ...
      • Year: 1902
        • ...
      • Year: ...
        • ...
      • Year: 2010
        • ...
    • type of data: Ensemble mean (em)
      • Year: 1901

        • Month: January

          • type of level: Model levels (ml)

            • all dates of a month, times, steps, levels and parameters (same tape file)

          • type of level: Pressure levels (pl)

            • all dates of a month, times, steps, levels and parameters (same tape file)

          • type of level: Surface (sfc)

            • all dates of a month, times, steps and parameters (same tape file)

        • Month: February
          • ...
        • Month: ...
          • ...
        • Month: December
          • ...
      • Year: 1902
        • ...
      • Year: ...
        • ...
      • Year: 2010
        • ...

What would be the natural way to group requests?

The idea is to request as much data as possible from the same tape file. In general, the most efficient way to group requests is by the lowest level of the tree above, i.e. the items marked "(same tape file)":

  • Analysis data: group by month, putting all dates, times, levels, desired parameters and ensemble members for a specific month in one request.
  • Forecast data (model/pressure levels): group by date, putting all steps, levels, desired parameters and ensemble members for a specific date in one request.
  • Forecast data (surface): group by month, putting all dates, times, steps, desired parameters and ensemble members for a specific month in one request.
  • Ensemble means: group by month, putting all dates, times, steps and desired parameters for a specific month in one request.
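The grouping rules above can be sketched as a small helper that returns one "date" value per request. This is only an illustration under stated assumptions: the function names `month_range`, `daily_dates` and `request_dates` are hypothetical and not part of the ECMWF API, and the type codes "an", "fc" and "em" and level type codes "ml", "pl" and "sfc" are the ones used elsewhere on this page.

```python
import calendar


def month_range(year, month):
    """Return 'YYYYMMDD/TO/YYYYMMDD' covering one calendar month."""
    last_day = calendar.monthrange(year, month)[1]
    return "%04d%02d%02d/TO/%04d%02d%02d" % (year, month, 1, year, month, last_day)


def daily_dates(year, month):
    """Return one 'YYYYMMDD' string per day of the month."""
    last_day = calendar.monthrange(year, month)[1]
    return ["%04d%02d%02d" % (year, month, day) for day in range(1, last_day + 1)]


def request_dates(data_type, levtype, year, month):
    """
    Return the 'date' values to use, one per request, so that each
    request stays within a single tape file (hypothetical helper).
    """
    # Forecasts on model or pressure levels: one tape file per date.
    if data_type == "fc" and levtype in ("ml", "pl"):
        return daily_dates(year, month)
    # Analyses, ensemble means and surface forecasts: one tape file per month.
    return [month_range(year, month)]


if __name__ == "__main__":
    print(request_dates("an", "pl", 2000, 2))
    print(len(request_dates("fc", "pl", 1901, 1)), "daily requests for January 1901")
```

Each returned string would then be passed as the "date" value of one request, so a month of forecast data on pressure levels results in one request per day, while a month of analysis data results in a single request.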

Examples

  • The objective of the examples below is to demonstrate how to iterate efficiently for a particular CERA-20C request.
  • At this point you may wish to have a look at the CERA-20C daily data availability and identify your desired level type, data type, time periods, parameters, etc.
  • The requests below can be used as a starting point. However, you will need to adapt them to your requirements, for example by changing some values (the right side of the "key":"value" pairs) or by adding or removing some keys.
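As a rough illustration of what adapting a request means in practice, the sketch below derives a new request from a base dictionary by changing values and removing keys. The helper `adapt_request` is hypothetical; the base keys and values are copied from the ensemble mean example on this page, and the changed values are arbitrary illustrations rather than recommendations.

```python
# Base request, copied from the ensemble mean (surface) example on this page.
BASE_REQUEST = {
    "class": "ep",
    "dataset": "cera20c",
    "stream": "enda",
    "expver": "1",
    "type": "em",
    "levtype": "sfc",
    "time": "00/03/06/09/12/15/18/21",
    "step": "0",
    "param": "165.128/166.128/167.128/168.128",
}


def adapt_request(base, changes=None, remove=()):
    """Return a copy of 'base' with some values changed and some keys removed."""
    request = dict(base)           # copy, so the base request stays untouched
    request.update(changes or {})  # change values or add new keys
    for key in remove:
        request.pop(key, None)     # drop keys that are not wanted
    return request


if __name__ == "__main__":
    # Illustration only: fewer times, a single parameter, one month of dates.
    request = adapt_request(
        BASE_REQUEST,
        changes={
            "time": "00/12",
            "param": "167.128",
            "date": "20000101/TO/20000131",
            "target": "cera20c_enda_em_200001_sfc.grb",
        },
    )
    print(request)
```

The resulting dictionary has the same shape as the ones passed to server.retrieve() in the examples below.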

Requesting ensemble mean (em), multiple years, surface (sfc)

In the data tree, 'type of data' is highest, followed by 'year', 'month' and 'type of level'. So we keep within one 'type of data' (em), go into the first year, into the first month, into the level type, and get the data we want. We don't want additional level types, so we move on to the next month.

The data volume in this example is about 70MB per month.

#!/usr/bin/env python
import calendar
from ecmwfapi import ECMWFDataServer
server = ECMWFDataServer()

def retrieve_cera20c_enda():
    """
       A function to demonstrate how to iterate efficiently over all months,
       for a list of years, for a CERA-20C daily data request.
       You can extend the number of years to adapt the iteration to your needs.
       You can use the variable 'target' to organise the requested data in files as you wish.
    """
    yearStart = 2000
    yearEnd = 2002
    monthStart = 1
    monthEnd = 12

    # enda is arranged by months, so we iterate over the months
    for year in range(yearStart, yearEnd + 1):
        for month in range(monthStart, monthEnd + 1):
            startDate = '%04d%02d%02d' % (year, month, 1)
            numberOfDays = calendar.monthrange(year, month)[1]
            lastDate = '%04d%02d%02d' % (year, month, numberOfDays)
            # we submit a data request for the current month
            target = "cera20c_enda_em_%04d%02d_sfc.grb" % (year, month)
            requestDates = (startDate + "/TO/" + lastDate)
            era20c_enda_sfc_request(requestDates, target)

def era20c_enda_sfc_request(requestDates, target):
    """      
        A CERA-20C request for ensemble mean, surface data.
        You can change the keywords below to adapt it to your needs.
        (eg add or remove  parameters, times etc)
    """
    server.retrieve({
        "class": "ep",
        "dataset": "cera20c",
        "stream": "enda",
        "expver": "1",
        "type": "em",
        "levtype": "sfc",
        "date": requestDates,
        "time": "00/03/06/09/12/15/18/21",
        "step": "0",
        "param": "165.128/166.128/167.128/168.128",
        "target": target,
    })

if __name__ == '__main__':
    retrieve_cera20c_enda()

Requesting ensemble, analysis (an), multiple years, pressure levels (pl)

In the data tree 'type of data' is highest, followed by 'year', 'month' and 'type of level'. So we keep within one 'type of data' (an), go into the first year, into the first month, into the level type, and get the data we want. We don't want additional level types, so we move on to the next month.

The data volume in this example is about 9GB per month.

#!/usr/bin/env python
import calendar
from ecmwfapi import ECMWFDataServer
server = ECMWFDataServer()

def retrieve_cera20c_enda():
    """
       A function to demonstrate how to iterate efficiently over all months,
       for a list of years, for a CERA-20C daily data request.
       You can extend the number of years to adapt the iteration to your needs.
       You can use the variable 'target' to organise the requested data in files as you wish.
    """
    yearStart = 2000
    yearEnd = 2002
    monthStart = 1
    monthEnd = 12

    # enda is arranged by months, so we iterate over the months
    for year in range(yearStart, yearEnd + 1):
        for month in range(monthStart, monthEnd + 1):
            startDate = '%04d%02d%02d' % (year, month, 1)
            numberOfDays = calendar.monthrange(year, month)[1]
            lastDate = '%04d%02d%02d' % (year, month, numberOfDays)
            # we submit a data request for the current month
            target = "cera20c_enda_ensemble_%04d%02d_pl.grb" % (year, month)
            requestDates = (startDate + "/TO/" + lastDate)
            cera20c_enda_pl_request(requestDates, target)

def cera20c_enda_pl_request(requestDates, target):
    """
        A CERA-20C request for U and V components of wind, pressure levels, all 10 ensemble members.
        You can change the keywords below to adapt it to your needs.
        (e.g. add or remove parameters, times etc.)
    """
    server.retrieve({
        "class": "ep",
        "dataset": "cera20c",
        "stream": "enda",
        "expver": "1",
        "type": "an",
        "levtype": "pl",
        "levelist": "1/2/3/5/7/10/20/30/50/70/100/125/150/175/200/225/250/300/350/400/450/500/550/600/650/700/750/775/800/825/850/875/900/925/950/975/1000",
        "number": "0/1/2/3/4/5/6/7/8/9",   # ensemble members
        "date": requestDates,
        "time": "00/03/06/09/12/15/18/21",
        "step": "0",
        "param": "131.128/132.128",   # Here: U and V components of wind. These parameters are not archived explicitly, but derived from vorticity and divergence on the fly.
        "target": target,
    })

if __name__ == '__main__':
    retrieve_cera20c_enda()
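
Before submitting a long loop it can help to estimate the total download. The sketch below multiplies the approximate per-month volumes quoted for the two examples on this page (about 70 MB and about 9 GB) by the number of monthly requests for the 2000 to 2002 range used in the scripts. The helper name `estimate_total_mb` is hypothetical, and treating 1 GB as 1000 MB is an assumption made only for this rough estimate.

```python
def estimate_total_mb(per_month_mb, year_start, year_end, month_start=1, month_end=12):
    """Return (number of monthly requests, approximate total volume in MB)."""
    n_requests = (year_end - year_start + 1) * (month_end - month_start + 1)
    return n_requests, n_requests * per_month_mb


# Ensemble mean, surface: about 70 MB per month (figure quoted on this page).
em_requests, em_total_mb = estimate_total_mb(70, 2000, 2002)

# Ensemble analysis, pressure levels: about 9 GB per month (figure quoted on
# this page), taken here as 9000 MB (assumption: 1 GB = 1000 MB).
an_requests, an_total_mb = estimate_total_mb(9000, 2000, 2002)

print("em/sfc: %d requests, about %.1f GB" % (em_requests, em_total_mb / 1000.0))
print("an/pl:  %d requests, about %.1f GB" % (an_requests, an_total_mb / 1000.0))
```

With these inputs each loop submits 36 monthly requests, giving roughly 2.5 GB for the ensemble mean example and roughly 324 GB for the pressure level example.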