This summarizes the issues discussed and decisions reached during a call between the CUAHSI HIS web service team and some of the USGS NWIS team on 2008-07-25. The core issue is having WaterML provide an unambiguous distinction between instantaneous measurements (observations) and statistical summaries of those observations.
Issues discussed included
- Representation of dates and times
- Representation of time zone information
The types of data that the USGS serves include
- Instantaneous discharge measurements (unit values)
- Daily mean discharge measurements
- Monthly mean discharge measurements
- Annual mean discharge measurements
- Statistics derived from all daily, monthly or annual discharge measurements for a period of record, or for a specified period
Date-time representation
WaterML represents date-time as the precise time at the beginning of the interval represented by the data value, for example 2007-01-01T00:00:00. The USGS is concerned that such a precise representation of date-time may lead to daily, monthly and annual mean values being interpreted as instantaneous values at the time specified.
Resolution
Precise date-times are to be used, because that is good practice in web services and direct web service users are generally more sophisticated.
Consider changing the attribute dateTime to startDateTime in the values element to be explicit that dateTime represents the start of the support interval.
The timeSupport element provides information on the period to which each observation applies.
Example of values element with precise date-time
<values count="3">
<value dateTime="2006-08-01T00:00:00" qualifiers="A">231</value>
<value dateTime="2006-08-02T00:00:00" qualifiers="A">222</value>
<value dateTime="2006-08-03T00:00:00" qualifiers="A">225</value>
<qualifier qualifierCode="A" network="USGS" vocabulary="dv_rmk_cd">Approved for publication. Processing and review completed.</qualifier>
<method methodID="1" />
</values>
Example of values element with dateTime attribute changed to startDateTime
<values count="3">
<value startDateTime="2006-08-01T00:00:00" qualifiers="A">231</value>
<value startDateTime="2006-08-02T00:00:00" qualifiers="A">222</value>
<value startDateTime="2006-08-03T00:00:00" qualifiers="A">225</value>
<qualifier qualifierCode="A" network="USGS" vocabulary="dv_rmk_cd">Approved for publication. Processing and review completed.</qualifier>
<method methodID="1" />
</values>
Example of timeSupport element
<timeSupport isRegular="true">
<unit>
<UnitName>day</UnitName>
<UnitType>Time</UnitType>
<UnitAbbreviation>d</UnitAbbreviation>
</unit>
<timeInterval>1</timeInterval>
</timeSupport>
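To make the relationship between the start date-time and the time support concrete, the following Python sketch derives the interval that each daily value covers. It is only an illustration under stated assumptions: the `UNIT_SECONDS` table and the two helper functions are hypothetical names introduced here and are not part of WaterML, and the XML is copied from the examples above with the qualifier and method elements omitted.

```python
from datetime import datetime, timedelta
import xml.etree.ElementTree as ET

VALUES_XML = """
<values count="3">
  <value dateTime="2006-08-01T00:00:00" qualifiers="A">231</value>
  <value dateTime="2006-08-02T00:00:00" qualifiers="A">222</value>
  <value dateTime="2006-08-03T00:00:00" qualifiers="A">225</value>
</values>
"""

TIME_SUPPORT_XML = """
<timeSupport isRegular="true">
  <unit>
    <UnitName>day</UnitName>
    <UnitType>Time</UnitType>
    <UnitAbbreviation>d</UnitAbbreviation>
  </unit>
  <timeInterval>1</timeInterval>
</timeSupport>
"""

# Assumption: only fixed-length units are mapped. Calendar units such as
# month and year have no fixed length and would need separate handling.
UNIT_SECONDS = {"second": 1, "minute": 60, "hour": 3600, "day": 86400}


def support_duration(time_support_xml):
    """Return the length of the support interval as a timedelta."""
    root = ET.fromstring(time_support_xml.strip())
    unit_name = root.findtext("unit/UnitName")
    interval = float(root.findtext("timeInterval"))
    return timedelta(seconds=UNIT_SECONDS[unit_name] * interval)


def value_intervals(values_xml, duration):
    """Return (start, end, value) for each value; end is exclusive."""
    root = ET.fromstring(values_xml.strip())
    result = []
    for element in root.findall("value"):
        start = datetime.fromisoformat(element.get("dateTime"))
        result.append((start, start + duration, float(element.text)))
    return result


intervals = value_intervals(VALUES_XML, support_duration(TIME_SUPPORT_XML))
for start, end, value in intervals:
    print(start.isoformat(), "to", end.isoformat(), value)
```

Read this way, the value 231 applies to the whole of 2006-08-01 rather than to the instant of midnight, which is the interpretation the USGS wants to protect.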
Note from David Tarboton after call:
The tag semantics here is a bit unclear. I suggest instead the following
E.g. for daily mean values
<timeScale isRegular="true">
<unit>
<UnitName>day</UnitName>
<UnitType>Time</UnitType>
<UnitAbbreviation>d</UnitAbbreviation>
</unit>
<timeSupport>1</timeSupport>
<timeSpacing>1</timeSpacing>
</timeScale>
E.g. for 15 min unit values that are effectively recorded "instantaneously"
<timeScale isRegular="true">
<unit>
<UnitName>minute</UnitName>
<UnitType>Time</UnitType>
<UnitAbbreviation>min</UnitAbbreviation>
</unit>
<timeSupport>0</timeSupport>
<timeSpacing>15</timeSpacing>
</timeScale>
E.g. for unit values that are effectively recorded "instantaneously" and the time step changes
<timeScale isRegular="false">
<unit>
<UnitName>minute</UnitName>
<UnitType>Time</UnitType>
<UnitAbbreviation>min</UnitAbbreviation>
</unit>
<timeSupport>0</timeSupport>
</timeScale>
Note that the timeSpacing element is not given when isRegular is "false", because an irregular series has no fixed spacing.
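The three timeScale variants suggested in the note can be told apart mechanically. The Python sketch below shows one way a client might do that, assuming the proposed element names (timeScale, timeSupport, timeSpacing) were adopted as written. The function `describe_time_scale` and the keys of the dictionary it returns are hypothetical and exist only for this illustration.

```python
import xml.etree.ElementTree as ET

DAILY_MEAN = """
<timeScale isRegular="true">
  <unit><UnitName>day</UnitName><UnitType>Time</UnitType>
        <UnitAbbreviation>d</UnitAbbreviation></unit>
  <timeSupport>1</timeSupport>
  <timeSpacing>1</timeSpacing>
</timeScale>
"""

FIFTEEN_MINUTE = """
<timeScale isRegular="true">
  <unit><UnitName>minute</UnitName><UnitType>Time</UnitType>
        <UnitAbbreviation>min</UnitAbbreviation></unit>
  <timeSupport>0</timeSupport>
  <timeSpacing>15</timeSpacing>
</timeScale>
"""

IRREGULAR = """
<timeScale isRegular="false">
  <unit><UnitName>minute</UnitName><UnitType>Time</UnitType>
        <UnitAbbreviation>min</UnitAbbreviation></unit>
  <timeSupport>0</timeSupport>
</timeScale>
"""


def describe_time_scale(xml_text):
    """Summarize a proposed timeScale element as a plain dictionary."""
    root = ET.fromstring(xml_text.strip())
    spacing_text = root.findtext("timeSpacing")  # None when the element is absent
    support = float(root.findtext("timeSupport"))
    return {
        "unit": root.findtext("unit/UnitName"),
        "regular": root.get("isRegular") == "true",
        # A support of zero marks a value recorded "instantaneously".
        "instantaneous": support == 0,
        "support": support,
        "spacing": float(spacing_text) if spacing_text is not None else None,
    }


for fragment in (DAILY_MEAN, FIFTEEN_MINUTE, IRREGULAR):
    print(describe_time_scale(fragment))
```

Keeping support and spacing as separate numbers lets a client distinguish a daily mean (support 1, spacing 1) from an instantaneous reading taken once a day (support 0, spacing 1), which a single timeInterval cannot express.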
See also discussions of scale at:
Time zones
USGS data is generally recorded and presented in local time because that is what is easiest for most customers. Whether to adjust local time for daylight saving is left to the discretion of districts and is hence not consistent across the agency. Information on the precise conventions used is not readily available to the NWIS web system. These limitations inhibit precise specification of time zone information with all date-time values. They also leave some ambiguity in the definition of daily and monthly averages, because it is unclear whether averages are computed using local standard or daylight saving time, and how the transition between the two is handled.
Resolution
Given this uncertainty the resolution was:
- Use precise date-times with time zone specified where this information is available
- If precise time zone information is unavailable, use date-times without a specified time zone and leave it to the user to infer the time zone from the timeZoneInfo element.
- The timeZoneInfo element to indicate whether or not daylight savings is used.
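A client following this resolution has to handle both cases: date-times that carry an offset and date-times that do not. The Python sketch below shows one possible approach, assuming the fallback offset has already been read from the defaultTimeZone entry of timeZoneInfo. The functions `parse_offset` and `resolve` are hypothetical helpers, and the sketch writes embedded offsets in the zero-padded form (-07:00) that ISO 8601 and `datetime.fromisoformat` expect.

```python
from datetime import datetime, timedelta, timezone


def parse_offset(text):
    """Turn an offset such as '-07:00' or '-7:00' into a fixed timezone."""
    sign = -1 if text.startswith("-") else 1
    hours, minutes = text.lstrip("+-").split(":")
    return timezone(sign * timedelta(hours=int(hours), minutes=int(minutes)))


def resolve(date_time_text, default_offset):
    """Return (aware datetime, zone_was_explicit).

    If the text carries its own offset, that offset is used. Otherwise the
    default offset is applied, which is an inference and not a recorded fact.
    """
    parsed = datetime.fromisoformat(date_time_text)
    if parsed.tzinfo is not None:
        return parsed, True
    return parsed.replace(tzinfo=parse_offset(default_offset)), False


explicit, explicit_flag = resolve("2006-08-01T00:00:00-07:00", "-07:00")
inferred, inferred_flag = resolve("2006-08-01T00:00:00", "-07:00")
print(explicit.isoformat(), explicit_flag)
print(inferred.isoformat(), inferred_flag)
```

Returning the flag alongside the date-time keeps the uncertainty visible, so downstream code can tell a recorded time zone from one that was only inferred from the site default.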
Example of values element with precise date-time and time zone specified
<values count="3">
<value dateTime="2006-08-01T00:00:00-07:00" qualifiers="A">231</value>
<value dateTime="2006-08-02T00:00:00-07:00" qualifiers="A">222</value>
<value dateTime="2006-08-03T00:00:00-07:00" qualifiers="A">225</value>
<qualifier qualifierCode="A" network="USGS" vocabulary="dv_rmk_cd">Approved for publication. Processing and review completed.</qualifier>
<method methodID="1" />
</values>
Example of timeZoneInfo element
<timeZoneInfo>
<defaultTimeZone ZoneAbbreviation="MST" ZoneOffset="-07:00" />
<daylightSavingsTimeZone xsi:nil="true" />
</timeZoneInfo>
The xsi:nil="true" on the daylightSavingsTimeZone element indicates that daylight savings is NOT used. This is not intuitive and could easily be misinterpreted. Consider reformatting for greater clarity. Can someone please add an example that shows how daylightSavingsTimeZone is specified when it IS in effect?
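As a starting point for that discussion, the Python sketch below reads a timeZoneInfo element and reports whether daylight savings applies. The `NO_DST` fragment repeats the example above with a zero-padded offset. The `WITH_DST` fragment is a guess at what the in-effect case might look like, assuming daylightSavingsTimeZone takes the same ZoneAbbreviation and ZoneOffset attributes as defaultTimeZone; that form was not confirmed on the call. The xsi namespace declaration is added so the fragments parse on their own, and `read_time_zone_info` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Fully qualified name of the xsi:nil attribute as ElementTree reports it.
XSI_NIL = "{http://www.w3.org/2001/XMLSchema-instance}nil"

NO_DST = """
<timeZoneInfo xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <defaultTimeZone ZoneAbbreviation="MST" ZoneOffset="-07:00" />
  <daylightSavingsTimeZone xsi:nil="true" />
</timeZoneInfo>
"""

# Assumption: this mirrors the attributes of defaultTimeZone. It is an
# illustrative guess, not a confirmed WaterML form.
WITH_DST = """
<timeZoneInfo xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <defaultTimeZone ZoneAbbreviation="MST" ZoneOffset="-07:00" />
  <daylightSavingsTimeZone ZoneAbbreviation="MDT" ZoneOffset="-06:00" />
</timeZoneInfo>
"""


def read_time_zone_info(xml_text):
    """Return the standard zone and, if used, the daylight savings zone."""
    root = ET.fromstring(xml_text.strip())
    default = root.find("defaultTimeZone")
    dst = root.find("daylightSavingsTimeZone")
    # Daylight savings is treated as unused when the element is absent or nil.
    uses_dst = dst is not None and dst.get(XSI_NIL) != "true"
    return {
        "standard": (default.get("ZoneAbbreviation"), default.get("ZoneOffset")),
        "daylight": (dst.get("ZoneAbbreviation"), dst.get("ZoneOffset"))
        if uses_dst
        else None,
    }


print(read_time_zone_info(NO_DST))
print(read_time_zone_info(WITH_DST))
```

The sketch also shows why the nil convention is easy to misread: a client has to know that a present but nil element means "not used", whereas an explicit boolean attribute would state the same thing directly.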
The following websites provide information on daylight savings and time zone information
Other items and issues that remain to be resolved
- What should the date-time be for a statistic that is a particular value, such as the maximum or minimum flow during a specified period (e.g. day, month, year)? Should the time reported be the "instant" at which that flow occurred, or the beginning of the period over which the max/min was taken? What should the time support be?
- Stat codes that are overloaded, e.g. time of measurement is a stat code for some observations in the 'daily' service.
- Dave Briar noted an error in the Hydroseek output.
- Hydroseek output has daily values output at noon.
- We will work with the USGS to QA the output, since this is an interface many users will use.
- Need to work on more automated data transfer from USGS.