Data input

Data within the UKFE package

The UKFE package includes several datasets that can be used in analyses. These are based on data from the National River Flow Archive (NRFA). A pre-processing script, found in the ‘inst’ folder of the package, converts new releases of the NRFA Peak Flow Dataset into data frames suitable for use within UKFE. UKFE is updated shortly after each release to use the latest data. The user can also input their own data.

The UKFE package contains five datasets. These are:

These datasets each have a help file and can be viewed by typing the name of the dataset into the console, or can be saved to an object to view:

# Load the package
library(UKFE)
# Save the 'QMEDData' data frame within the UKFE package to an object within your R 
# environment
QMEDData <- QMEDData

# View the first rows of the data in the console
head(QMEDData)
#>          AREA ALTBAR ASPBAR ASPVAR BFIHOST19 DPLBAR DPSBAR  FARL  FPEXT   LDP
#> 2001 553.2400    213    201   0.04     0.312  30.33   97.0 0.858 0.0555 56.78
#> 2002 423.4800    259    178   0.15     0.315  28.71   97.3 0.845 0.0553 52.37
#> 3002 237.1375    439     44   0.06     0.331  15.94  218.2 0.974 0.0377 31.93
#> 3003 331.6675    297     61   0.07     0.309  16.79  148.2 0.915 0.0488 31.20
#> 4003 202.3750    396    123   0.11     0.366  22.55  147.7 0.908 0.0374 43.06
#> 4005 123.6650    463    123   0.03     0.299  12.97  289.0 0.918 0.0366 28.09
#>      PROPWET RMED-1H RMED-1D RMED-2D SAAR SAAR4170 SPRHOST URBEXT2000     QMED
#> 2001    0.65     8.3    33.0    43.7 1117     1102   52.88      0e+00 170.2170
#> 2002    0.59     8.8    36.4    48.9 1217     1190   54.16      2e-04 153.4350
#> 3002    0.81     9.2    43.7    65.4 1784     2024   49.93      0e+00 177.8840
#> 3003    0.81     9.4    45.4    65.2 1896     1962   53.57      0e+00 347.4660
#> 4003    0.63     9.3    40.0    56.4 1366     1471   48.11      5e-04  80.2930
#> 4005    0.76     9.7    47.4    73.3 2145     2074   52.86      0e+00 101.5035
#>         QMEDcd      X      Y  QMEDfse  N URBEXT1990 BFIHOST
#> 2001 146.52743 284344 929781 1.039474 48      0e+00   0.324
#> 2002 126.36506 274454 916267 1.054355 30      0e+00   0.351
#> 3002 198.85830 240494 888016 1.047491 49      0e+00   0.436
#> 3003 237.03507 231272 901365 1.052648 45      0e+00   0.359
#> 4003  91.79051 253145 877495 1.041955 49      5e-04   0.385
#> 4005 118.31455 220288 850359 1.070837 38      0e+00   0.389

The user can also supply their own data for use in analyses; however, AM files would need to be in the same format as those from the NRFA. Catchment descriptors for ungauged sites can be imported as XML files; these should either be from the FEH Web Service or NRFA, or be in the same format as those.

Functions within the UKFE package for importing data

The package provides a range of functions for importing data, as set out in this section.

Annual maximum data

An annual maximum (AMAX) series can be obtained for sites suitable for pooling using the GetAM() function, which extracts data from the AMSP data frame embedded in the UKFE package. For other AMAX series available from the NRFA Peak Flow Dataset, either the AMImport() function or the GetDataNRFA() function (with Type = "AMAX") can be used. AMImport() imports the data from the AM files and excludes the years classed as rejected, whereas GetDataNRFA() extracts the AMAX using the NRFA API. If you have a flow time series, the AnnualStat() function can be used to extract the water year AMAX (or any other annual statistic of interest). The following example uses the GetAM() option.

# Extract the AMAX data for NRFA site 55002 and save to an object called 'AM.55002'
AM.55002 <- GetAM(55002)

# View the head of the AMAX series
head(AM.55002)
#>         Date    Flow    id
#> 1 1974-02-12 457.894 55002
#> 2 1975-01-22 410.053 55002
#> 3 1975-12-02 364.079 55002
#> 4 1977-02-03 286.690 55002
#> 5 1977-11-20 318.908 55002
#> 6 1978-12-14 397.771 55002

# Plot the AMAX data
AMplot(AM.55002)

Bar chart of annual maximum river flow. The x-axis shows years, and the y-axis shows peak flow in cubic metres per second. Each bar represents the highest flow in that year. The flows vary from year to year, with several notably high peaks in recent years.

The AMplot() function returns a time series bar plot of the AMAX series.
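
For the flow time series case mentioned in this section, the idea of a water year maximum can be sketched in base R. This is only an illustration and not the AnnualStat() implementation: the data frame 'flow', its synthetic values, and the 1 October water year start are all assumptions made for the example.

```r
# Hypothetical daily flow series covering three complete water years
# (synthetic values for illustration only, not real gauge data)
set.seed(1)
flow <- data.frame(
  Date = seq(as.Date("2000-10-01"), as.Date("2003-09-30"), by = "day")
)
flow$Flow <- round(runif(nrow(flow), min = 1, max = 100), 2)

# Label each day with its water year, assuming the water year starts on
# 1 October: October to December belong to the water year that starts in
# that calendar year, January to September to the one that started the year before
month <- as.integer(format(flow$Date, "%m"))
year <- as.integer(format(flow$Date, "%Y"))
flow$WaterYear <- ifelse(month >= 10, year, year - 1L)

# Take the maximum flow within each water year
amax <- aggregate(Flow ~ WaterYear, data = flow, FUN = max)
amax
```

The result has one row per water year, which is the same shape of summary that an AMAX series provides.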

Catchment descriptors

Catchment descriptors (CDs) from the NRFA can be brought into the ‘R’ environment using the GetCDs() function. For gauged sites that are suitable for pooling or QMED, these are extracted from the QMEDData data frame; otherwise, they are extracted using the NRFA API. Note that when the descriptors come from the NRFA API (for sites not suitable for QMED or pooling), some of them differ; for example, the gauge location is provided rather than the catchment centroid. A warning message is issued when this happens. An example of using the GetCDs() function to view the catchment descriptors for the gauge with an NRFA ID of 39001 is as follows:

# Extract and view catchment descriptors for NRFA gauge 39001
GetCDs(39001)
#>    Descriptor       Value
#> 1        AREA   9930.7975
#> 2      ALTBAR    109.0000
#> 3      ASPBAR    108.0000
#> 4      ASPVAR      0.0800
#> 5   BFIHOST19      0.6790
#> 6      DPLBAR    139.8700
#> 7      DPSBAR     42.0000
#> 8        FARL      0.9420
#> 9       FPEXT      0.1476
#> 10        LDP    269.5500
#> 11    PROPWET      0.3000
#> 12    RMED-1H     10.8000
#> 13    RMED-1D     32.7000
#> 14    RMED-2D     41.5000
#> 15       SAAR    706.0000
#> 16   SAAR4170    724.0000
#> 17    SPRHOST     26.9400
#> 18 URBEXT2000      0.0664
#> 19    Easting 462899.0000
#> 20   Northing 187850.0000
#> 21    QMEDfse      1.0280
#> 22          N    140.0000
#> 23 URBEXT1990      0.0426
#> 24    BFIHOST      0.6530

It’s useful to store them as an ‘object’ for use with other functions, in which case you can give them a name. You can assign the data to the named object using <-. For example:

# Extract catchment descriptors for NRFA gauge 39001 and store in an object called 
# 'CDs.39001'
CDs.39001 <- GetCDs(39001)

Then, when you wish to view them, the object name CDs.39001 can be entered into the console.
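
Once the descriptors are stored in an object, individual values can be looked up by name. The sketch below does not call GetCDs(); it builds a small stand-in data frame by hand (the name 'CDs.example' is hypothetical) with the same 'Descriptor' and 'Value' columns and four of the values printed above for gauge 39001, so that it runs without the package installed.

```r
# Stand-in for the object returned by GetCDs(39001): same two columns,
# but only four of the rows shown in the printed output above
CDs.example <- data.frame(
  Descriptor = c("AREA", "SAAR", "BFIHOST19", "URBEXT2000"),
  Value = c(9930.7975, 706, 0.679, 0.0664)
)

# Look up a single descriptor by name
area <- CDs.example$Value[CDs.example$Descriptor == "AREA"]

# Look up several descriptors at once
wanted <- c("SAAR", "BFIHOST19")
subset.CDs <- CDs.example[CDs.example$Descriptor %in% wanted, ]
```

The same indexing should apply to an object such as CDs.39001, assuming it keeps the two-column layout shown in the printed output.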

If you wish to derive CDs from an XML file for catchments that aren’t suitable for pooling or QMED, or are not gauged at all, you can use the CDsXML() function. The function takes the path to the XML file. On Windows operating systems, the backslashes in the path need to be changed to forward slashes, or the path needs to be written as a raw string (available from R 4.0.0) as follows: r"{my\file\path}". For example, you can import some descriptors downloaded from the FEH Web Service as follows:

# Extract catchment descriptors from an xml file and store in an object called 
# 'CDs.MySite'
CDs.MySite <- CDsXML("C:/Data/FEH_Catchment_384200_458200.xml")

# As above but retaining backslashes in the file path
CDs.MySite <- CDsXML(r"{C:\Data\FEH_Catchment_384200_458200.xml}")

Or if importing CDs from the NRFA Peak Flow Dataset:

# Extract catchment descriptors from an xml file and store in an object called 
# 'CDs.27003'
CDs.27003 <- CDsXML("C:/Data/NRFAPeakFlow_v13-0-2/suitable-for-neither/027003.xml")
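
The two ways of writing a Windows path described in this section can be checked against each other in base R. The sketch below only manipulates the path string from the FEH Web Service example; it does not read any file, and the raw string syntax assumes R 4.0.0 or later.

```r
# A Windows-style path written as a raw string, so the backslashes are kept as typed
win.path <- r"{C:\Data\FEH_Catchment_384200_458200.xml}"

# Convert the backslashes to forward slashes
# (fixed = TRUE treats the pattern as a literal backslash, not a regular expression)
fwd.path <- gsub("\\", "/", win.path, fixed = TRUE)

# Both forms describe the same location
fwd.path
```

Either form can then be passed as the file path; in an ordinary (non-raw) string, a single backslash is treated as the start of an escape sequence, which is why it cannot be typed as it appears in Windows.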

Other hydrological data retrieval functions using APIs

There are several functions with names starting with GetData that extract data from the websites of different organisations using their APIs. These are:

There are examples for all of these within each function’s help file.

QMED

The GetQMED() function can be used to import the QMED from the QMEDData data frame, where it has been derived from the AMAX data. If the site is not in that data frame, the function automatically imports the AMAX data using the GetAM() function and calculates the median.
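
The fallback calculation is simply the median of the AMAX flows, which can be sketched in base R. The example below uses only the six AMAX values printed earlier for site 55002 (the object name 'AM.example' is hypothetical), so the result is not the QMED of the full record and is not what GetQMED(55002) would return.

```r
# First six AMAX flows for site 55002, copied from the head() output shown earlier
AM.example <- data.frame(
  Date = as.Date(c("1974-02-12", "1975-01-22", "1975-12-02",
                   "1977-02-03", "1977-11-20", "1978-12-14")),
  Flow = c(457.894, 410.053, 364.079, 286.690, 318.908, 397.771)
)

# QMED is the median of the annual maximum flows
QMED.example <- median(AM.example$Flow)
QMED.example
```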