

National Oceanography Centre
Southampton Waterfront Campus
European Way, Southampton, SO14 3ZH
E.Frajka-Williams@noc.soton.ac.uk
564/16 National Oceanography Centre
Tel: +44 (0)23 8059 6044

World Ocean Database 2009 and Loading it into Matlab

World Ocean Database 2009

If the code or instructions below are helpful to you, I'd love to hear it. Always nice to know whether it's worthwhile to put these kinds of instructions online. (Send me an e-mail at the address above.) Even better, if you encounter and fix a relevant data-access issue that isn't in the "troubleshooting" below, let me know so I can add it to the page.

The World Ocean Database 2009 is a giant database of hydrographic profiles spanning the past 100+ years. These instructions describe how to download the data and parse it into Matlab.

URL: http://www.nodc.noaa.gov/OC5/SELECT/dbsearch/dbsearch.html

  1. Download data
    • From URL, choose "WOD select"
    • Search criteria to use: geographic coordinates, measured variables, dataset (and data exclusion, though I haven't done anything with this yet). Click "Build a query"
    • For the Faroes, I used 58 to 65 N, -20 to 0 E
      For the Lab Sea, I used 51 to 68 N, -67 to -43 E

      Dataset: OSD, CTD, XBT, MBT, PFL, DRB
      Variables: Check at least temp and salin in "1" column

      Click "Get an inventory"
    • For the Faroes, [58,65]N, [-20,0]E, there were 56k OSD, 4k CTD, 17k XBT, 29k MBT and 1k PFL casts, about 107k casts in total
      For the Lab Sea there were 19k OSD, 7k CTD, 6k XBT, 38k MBT and 8k PFL casts, about 79k casts in total

      Click "Download Data"
    • Choose "Comma delimited", leave the rest the same.
      Enter e-mail address. When e-mail arrives, download all files to a dedicated directory
  2. Clean data files. Tedious! Missing values are written as the string '---.---', and each one needs to be replaced with a blank. You can do this by hand, which may be quicker, by opening each file and doing a search and replace. If you use emacs, the query-replace command is
    esc %
    (also written M-%), and when the first instance is found, use a bang "!" to replace all.

    Otherwise, you can use the script clean_files.m which is slow, but you don't have to do anything.
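
    If you would rather script the replacement yourself, here is a minimal sketch of the same clean-up in Matlab. It is not the clean_files.m script mentioned above, just an illustration of the search-and-replace step. The directory name and the small demo file are made up so that the example runs on its own; point input_dir at your own download directory instead, and keep a backup, since the files are overwritten in place.

```matlab
% Sketch: blank out the missing-value marker '---.---' in every .csv file
% in one directory. input_dir is a placeholder, not a path from this page.
input_dir = fullfile(tempdir, 'wod09_clean_demo');
if ~exist(input_dir, 'dir')
    mkdir(input_dir);
end

% Write a tiny made-up file so the example is self-contained
% (these numbers are invented, not real WOD data).
demo_file = fullfile(input_dir, 'demo_CTD1.csv');
fid = fopen(demo_file, 'w');
fprintf(fid, '10.0,7.25,---.---\n20.0,---.---,35.10\n');
fclose(fid);

% Loop over all .csv files and replace the marker with a blank.
files = dir(fullfile(input_dir, '*.csv'));
for k = 1:numel(files)
    fname = fullfile(input_dir, files(k).name);
    txt = fileread(fname);               % whole file as one char vector
    txt = strrep(txt, '---.---', '');    % missing-value marker -> blank
    fid = fopen(fname, 'w');             % overwrite the file in place
    fprintf(fid, '%s', txt);
    fclose(fid);
end
```

    Reading each file in one go keeps the loop short, but very large files may be more comfortable to process line by line.
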
  3. Parse the .csv files into .mat data files. Use rowm_all_scr.m which calls read_ocldb_write_mat.m, clean_files.m, remove_str.m.

    Update rowm_all_scr.m, saving your values for
    input_dir (wherever you put the .csv files)
    filename1 (the root of the file names, everything before the instrument name, "CTD" or "OSD")
    rep (the number of files for each instrument. If CTD files run through CTD56, then rep should be 56 for the CTDs.)
    output_dir (where you want to save your .mat files)

    This script will create a .mat file for each .csv file, containing one variable, cast1, which is a structure. The fields of the structure include:
    inst- instrument type: a 3 character string
    number- cast number from the file
    lat,lon- latitude and longitude of the cast
    yy,mm,dd- year, month and day
    salin- empty if no salinity data
    flag- Value up to 3 if there's a bad quality flag
    index- datanumber
    depth- depth vector of samples
    temp- temperature vector

    It will also create the huge structure all_cast which contains all the casts in it. You can save this if you'd like to be able to access it more easily. (I recommend saving it.)
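
    As a rough sketch of saving and reusing all_cast, the example below builds two made-up casts with the field names listed above, saves them, reloads them, and keeps only the casts that have salinity. It assumes all_cast is a struct array with one element per cast, which I have not checked against the scripts; the values and the output file name are invented for illustration.

```matlab
% Two invented casts using the field names listed above (not real data).
% Assumption: all_cast is a struct array with one element per cast.
all_cast = struct('inst', 'CTD', 'number', 1, 'lat', 61.5, 'lon', -8.0, ...
    'yy', 1998, 'mm', 6, 'dd', 14, 'salin', [35.1; 35.2], 'flag', 0, ...
    'index', 1, 'depth', [10; 20], 'temp', [9.5; 9.1]);
all_cast(2) = struct('inst', 'XBT', 'number', 2, 'lat', 60.2, 'lon', -5.5, ...
    'yy', 2001, 'mm', 2, 'dd', 3, 'salin', [], 'flag', 0, ...
    'index', 2, 'depth', [10; 20; 30], 'temp', [8.0; 7.8; 7.5]);

% Save the structure once so it can be reloaded without re-parsing.
% output_file is a placeholder name.
output_file = fullfile(tempdir, 'all_cast_demo.mat');
save(output_file, 'all_cast');

% Later session: load it back.
clear all_cast
S = load(output_file);
all_cast = S.all_cast;

% salin is empty when a cast has no salinity, so test for that.
has_salin = ~cellfun(@isempty, {all_cast.salin});
ts_casts = all_cast(has_salin);   % casts with both temperature and salinity
```

    For a real all_cast holding tens of thousands of casts, the variable may be too large for the default .mat format; in that case save accepts the '-v7.3' option.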

    Next, make some cool plots of this huge amount of data!
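
    One possible starting point is sketched below: a map of cast positions next to the temperature profiles. The casts here are invented and only carry the fields needed for the plot; with real data you would pass in all_cast (again assuming it is a struct array with the lat, lon, depth and temp fields described above).

```matlab
% Invented casts with only the fields needed for plotting (not real data).
casts = struct('lat', 61.5, 'lon', -8.0, ...
    'depth', [10; 50; 100], 'temp', [9.5; 8.7; 8.1]);
casts(2) = struct('lat', 60.2, 'lon', -5.5, ...
    'depth', [10; 50; 100; 200], 'temp', [8.9; 8.2; 7.6; 7.0]);

fig = figure('Visible', 'off');   % set to 'on' to see the figure on screen

% Left panel: where the casts were taken.
subplot(1, 2, 1)
plot([casts.lon], [casts.lat], 'k.')
xlabel('Longitude (deg E)')
ylabel('Latitude (deg N)')

% Right panel: temperature against depth, one line per cast.
subplot(1, 2, 2)
hold on
for k = 1:numel(casts)
    plot(casts(k).temp, casts(k).depth)
end
set(gca, 'YDir', 'reverse')       % depth increases downwards
xlabel('Temperature (deg C)')
ylabel('Depth (m)')
```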
