Preface. The code of this blog posting will only work with the latest
development-branch of the
section dataset from
oce provides a good example of a dataset
    library(oce)
    data(section)
    Sflag <- section[['salinityFlag']]
A good first step is to see what flags are being used.
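One way to do that, sketched here on the assumption that base R's `table()` is adequate for the job, is to count the occurrences of each flag value and then express the counts as fractions of the total. The names `flagCounts` and `flagFractions` are only illustrative.

```r
library(oce)
data(section)
Sflag <- section[["salinityFlag"]]

# Count how many salinity values carry each flag code
flagCounts <- table(Sflag)
print(flagCounts)

# Express the counts as fractions of the (non-missing) flags
flagFractions <- flagCounts / sum(flagCounts)
print(round(flagFractions, 2))
```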
This dataset uses the WHP convention for flags (see ?section), in which a
flag value of 2 is used to indicate data considered to be acceptable.
Tabulating the flags (e.g. with table(Sflag)) indicates that only about 3/4
of the salinity measurements are considered to be acceptable. That makes
this a good dataset to illustrate the handling of flags.
First, extract some relevant data.
    S <- section[['salinity']]
    T <- section[['temperature']]
    theta <- section[['theta']]
    p <- section[['pressure']]
    Sflag <- section[['salinityFlag']]
Next, plot salinity flag against salinity, with the plot symbol keyed to the flag value (pch=Sflag-1).
    plot(S, Sflag, pch=Sflag-1)
This suggests that, apart from one distinct outlier at a salinity of 26, the salinities of bad data are generally in the range of the salinities of good data. Next, examine temperature and salinity together.
    plotTS(as.ctd(S, T, p), pch=Sflag-1)
The last two plots suggest that one of the points marked as being bad (flag=4) is distinctly anomalous compared with all the other data. A detailed analysis could be made of that point (e.g. first isolate the station, then plot it in detail) but time may be better spent simply focussing on data that have been assessed as being reasonable during the data-archiving process.
    ok <- Sflag == 2
    plotTS(as.ctd(S[ok], T[ok], p[ok]))
Another approach is to use handleFlags to select the good data
    section2 <- handleFlags(section)
    plotTS(section2)
where we have used the fact that plotTS can recognize section objects. The
handleFlags approach is also recommended because it carries over to other
types of plots, e.g. a salinity section. For example, a salinity section of
all the data is produced with
    plot(section, which="salinity")
while one of just the acceptable data is produced with
    plot(section2, which="salinity")
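One way to picture why the handleFlags approach carries over to other plots is to assume that its default action is to replace flagged values with NA while leaving the shape of the data unchanged, so that any plotting function sees the same object with gaps in it. The following base-R sketch illustrates that idea on made-up vectors. It is not the oce implementation, and it assumes that anything other than flag 2 is treated as unacceptable, which may not match the defaults of handleFlags exactly.

```r
# Made-up salinities and WHP-style flags, for illustration only
toyS <- c(34.9, 35.1, 26.0, 35.0, 34.8)
toyFlag <- c(2, 3, 4, 2, 2)

# Replace values whose flag is not 'acceptable' (2) with NA,
# keeping the vector the same length as the original
toyClean <- ifelse(toyFlag == 2, toyS, NA)
print(toyClean)
```

Because the cleaned vector still lines up element-by-element with pressure, temperature and so on, it can be passed to the same plotting calls as the original.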
- Find which station has the very low salinity, and examine that station in more detail.
- Try as above, but only discarding data with salinityFlag==4, which are known to be bad (i.e. retain both acceptable and questionable data).
- Continue step 2, with other types of analysis (e.g. examine spatial dependence).
- Look online for the source of the section dataset, to find out more about how the data-quality flags were assigned.
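A possible starting point for the first two items is sketched below. It assumes that `section[["station"]]` returns a list of CTD objects, one per station, and that `plotProfile()` accepts `xtype="salinity"`. The names `minS`, `iLow` and `keep` are only illustrative, and this is one of several ways to approach the problem.

```r
library(oce)
data(section)

# Item 1: locate the station holding the lowest salinity.
stations <- section[["station"]]   # assumed: a list of CTD objects
minS <- sapply(stations, function(stn) min(stn[["salinity"]], na.rm = TRUE))
iLow <- which.min(minS)
cat("Lowest salinity", minS[iLow], "is at station index", iLow, "\n")
plotProfile(stations[[iLow]], xtype = "salinity")

# Item 2: discard only data flagged 4, keeping flags 2 and 3.
S <- section[["salinity"]]
T <- section[["temperature"]]
p <- section[["pressure"]]
Sflag <- section[["salinityFlag"]]
keep <- !is.na(Sflag) & Sflag != 4
plotTS(as.ctd(S[keep], T[keep], p[keep]))
```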