WRF Tutorial: Difference between revisions

From Fluids Wiki

Revision as of 11:03, 24 October 2018

This is a guide for configuring, compiling, and running WRF (Weather Research and Forecasting) and the WRF Preprocessing System (WPS) using Sharcnet machines for real data simulations. This guide is meant to supplement the WRF User Tutorial. Graham has pre-compiled WRF modules available. However, you may find yourself needing to compile WRF and WPS yourself. Both of these options, along with obtaining real data, useful tips, and troubleshooting material, can be found below.

Using Pre-compiled Modules

The current (as of May 2018) versions of WRF and WPS that work together on Graham are WPS 3.8.1 and WRF 3.8.1. From your desired ~/projects directory, copy the following directories into the current directory (the trailing . is the destination):

cp -r /cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/MPI/intel2016.4/openmpi2.1/wrf/3.8.1/WRFV3 .
cp -r /cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/MPI/intel2016.4/openmpi2.1/wps/3.8.1/WPS .

To use the modules, type:

module load wrf/3.8.1
module load wps/3.8.1

at the start of each session. If you run WRF and WPS often, it is helpful to add these lines to your ~/.bashrc.

To view the options that were used to configure and compile WRF and WPS, see:

WRFV3/configure.wrf
WPS/configure.wps

If you are using WRF on Graham, it is important to note that the current (as of March 20 2018) version of OpenMPI (version 2.1.1) will cause segmentation faults when WRF is run on more than one node. To fix this problem, load an older version of OpenMPI:

module remove openmpi/2.1.1
module load openmpi/2.0.2

You will then need to configure and compile WRF by following the steps in the subsequent section. You will not need to re-configure or re-compile WPS.

Compiling WRF and WPS

Modules are available for both WPS and WRF on Graham. However, compiling WPS and WRF yourself may be necessary if you wish to use a newer version of WRF on Graham or if you are experiencing segmentation faults on more than one node.

Setting the OpenMPI version

Using WRF (any version) with more than one node will not work with OpenMPI version 2.1.1 (or presumably with newer versions, although this remains untested). To check which version of OpenMPI is loaded, type mpirun --version. If it reports 2.1.1, remove this version and load version 2.0.2:

module remove openmpi/2.1.1
module load openmpi/2.0.2
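
The version check can also be scripted. The following is a minimal sketch, assuming the first line of the mpirun --version output has the form "mpirun (Open MPI) X.Y.Z"; the function name parse_openmpi_version is hypothetical and not part of WRF or OpenMPI.

```shell
#!/bin/bash
# Hypothetical helper: pull the X.Y.Z version number out of the first line
# of `mpirun --version` output (assumed form: "mpirun (Open MPI) 2.1.1").
parse_openmpi_version() {
    printf '%s\n' "$1" | head -n 1 | grep -oE '[0-9]+\.[0-9]+\.[0-9]+' | head -n 1
}

# Only query mpirun if it is actually on the PATH.
if command -v mpirun >/dev/null 2>&1; then
    version=$(parse_openmpi_version "$(mpirun --version 2>&1)")
    if [ "$version" = "2.1.1" ]; then
        echo "OpenMPI $version is loaded; switch to 2.0.2 before compiling WRF."
    else
        echo "OpenMPI ${version:-unknown} is loaded."
    fi
fi
```

The script only reports the loaded version; the module commands above still have to be run by hand.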

Setting the netCDF variable

WRF and WPS need to know the path to the proper netCDF directory. You will need to set the netCDF path to the fortran-mpi netCDF folder. On Graham, this can be done by typing the following:

 export NETCDF=/cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/MPI/intel2016.4/openmpi2.1/netcdf-fortran-mpi/4.4.4

Check to make sure the following file exists: $NETCDF/include/netcdf.inc.

At this point, you may also choose to allow for large (>2GB) netCDF files. This is recommended for any high-resolution WRF simulations.

 export WRFIO_NCD_LARGE_FILE_SUPPORT=1
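
Before configuring, it may help to confirm that the NETCDF variable points at a usable directory. The following is a minimal sketch, assuming the netcdf.inc file mentioned above sits in the include subdirectory of the NETCDF path; the function name check_netcdf is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: succeed only if the given netCDF directory is
# non-empty and contains include/netcdf.inc (the file checked for above).
check_netcdf() {
    [ -n "$1" ] && [ -f "$1/include/netcdf.inc" ]
}

if check_netcdf "${NETCDF:-}"; then
    echo "NETCDF looks usable: $NETCDF"
else
    echo "netcdf.inc not found under '${NETCDF:-}'; check the NETCDF path." >&2
fi
```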

Compiling WRF

In the WRF or WRFV3 directory, type: ./configure. You will be given a list of computer options. Select intel, ifort/icc (dmpar). This is option 15 for WRF 4.0.

At the next prompt, you may wish to compile for nesting. Option 1, which covers all basic nests, is recommended.

This will create a configure.wrf file, containing configure options based on your environment. In very unlikely cases, some paths may be incorrect. You can edit paths in this file to their correct paths before continuing.

To compile for real-data cases, type ./compile em_real. This will take about an hour, and will create the following executables:

main/ndown.exe
main/wrf.exe
main/real.exe
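
Since the compile takes a long time, a quick check that all three executables were built can save a failed run later. This is a minimal sketch under the assumption that it is run from the WRF (or WRFV3) directory; the function name missing_wrf_executables is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: print each expected executable that is missing
# or not executable under the given WRF directory.
missing_wrf_executables() {
    local wrf_dir=$1 exe
    for exe in main/ndown.exe main/wrf.exe main/real.exe; do
        [ -x "$wrf_dir/$exe" ] || echo "$exe"
    done
}

# Usage: run from the WRF (or WRFV3) directory after ./compile em_real.
missing=$(missing_wrf_executables .)
if [ -z "$missing" ]; then
    echo "All em_real executables were built."
else
    echo "Missing executables:" $missing
fi
```

If any executable is missing, the compile output is the place to look for the first error.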

Compiling WPS

Make sure the netCDF variable is still set from before. In the WPS directory, type ./configure. You will be given a list of computer options. Choose linux/intel/dmpar: option 19 for WRF 4.0. This will create a configure.wps file, containing configure options based on your environment. If paths are incorrect, you can edit paths in this file to their correct paths before continuing. This includes the path to your WRF/WRFV3 directory.

To compile WPS, type ./compile. This will create the following executables:

geogrid/src/geogrid.exe
metgrid/src/metgrid.exe
ungrib/src/ungrib.exe

Acquiring Data

To run WRF using real-data, you will need to acquire both static geography data and GRIB (General Regularly-distributed Information in Binary form) data that describes the weather conditions.

Geography Data

Create and navigate to a /geog directory, download, and unzip the high-resolution geography data found here:

 wget http://www2.mmm.ucar.edu/wrf/src/wps_files/geog_complete.tar.gz

Even if you are using WRF V4.0, you will need to use the geographical data provided with WRF V3. Due to an error (as of Oct 24 2018) WRF will ask for data that is only provided in the previous dataset.

You will need to put the appropriate path from your WPS directory to your geog directory in your namelist.wps.
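
The path is set through the geog_data_path entry of namelist.wps. As a minimal sketch, the entry could be rewritten from the shell as below. It assumes GNU sed, that the entry already exists on a single line, and that the path contains no | or & characters; the function name set_geog_data_path and the ../geog path are hypothetical examples.

```shell
#!/bin/bash
# Hypothetical helper: rewrite the geog_data_path entry of a namelist.wps
# file in place. Assumes GNU sed, an existing single-line entry, and a path
# without '|' (used as the sed delimiter) or '&' characters.
set_geog_data_path() {
    local namelist=$1 geog_path=$2
    sed -i "s|^\( *geog_data_path *=\).*|\1 '${geog_path}'|" "$namelist"
}

# Usage (from the WPS directory), with ../geog as an example relative path:
#   set_geog_data_path namelist.wps ../geog
```

Editing the file by hand works equally well; the function is only a convenience when switching between several geog directories.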

GRIB Data

Download reanalysis data for your preferred simulation dates. We suggest NARR-A data, ordered through the HAS data access link on the NOAA website. Place this data in the /DATA directory. You can download your entire order by using the following command:

 wget -erobots=off -nv -m -np -nH --cut-dirs=2 --reject "index.html*" http://www1.ncdc.noaa.gov/pub/has/model/YOUR_HAS_ORDER/

To untar all the files in your order, type:

 for a in *.tar; do tar -xvf "$a"; done

You will need to remove anything that isn't a GRIB file (no extension), including the *.b files, from this directory.

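In the /WPS directory, link the GRIB files in DATA into WPS. Keep the trailing slash so that the files inside the directory are matched, rather than the directory itself:

  ./link_grib.csh path/to/data/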

If you're using NARR data, you will need to link the appropriate Variable Table. In the /WPS directory, type:

 ln -sf ungrib/Variable_Tables/Vtable.NARR Vtable

Running WPS

There are three components to WPS:

  • ungrib.exe takes GRIB meteorological data and converts it into an intermediate file format
  • geogrid.exe takes static geographical data and fits it to your specified grid
  • metgrid.exe takes output from geogrid.exe and ungrib.exe and interpolates the data to your domain for your specified times

In your WPS directory, edit namelist.wps to specify your desired domain. A list of best practices for the namelist can be found here. If you're running with NARR data, set interval_seconds = 10800, since it has output every 3 hours. To preview a map of your domain, use ncl:

module load ncl
ncl util/plotgrids_old.ncl

Ungrib and geogrid can be run independently of each other. Re-running one does not mean you have to re-run the other. However, metgrid must be run after both ungrib and geogrid. If you re-run either geogrid or ungrib, you will have to re-run metgrid. Furthermore, ungrib must be run on a single processor, whereas geogrid and metgrid can both be run on multiple processors.

Since geogrid and metgrid can both be run on multiple processors and take a small amount of time and computing power, it may be helpful to submit them in the same job after running ungrib. At the end of your WPS submit script, put the following:

srun ./geogrid.exe
srun ./metgrid.exe

Check your *.log files for errors. A met_em file should be created for each time in your simulation: met_em.d01.YYYY-MM-DD_HH:00:00.nc.
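
To check that no times are missing, the number of met_em files can be compared with the number expected from the simulation period and interval_seconds. The following is a minimal sketch, assuming GNU date and a single domain (d01); the function name expected_met_em_count and the example dates are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: number of met_em files expected for one domain,
# given start and end times (UTC) and interval_seconds from namelist.wps.
# Relies on GNU date to parse the "YYYY-MM-DD HH:MM:SS" strings.
expected_met_em_count() {
    local start_s end_s
    start_s=$(date -u -d "$1" +%s)
    end_s=$(date -u -d "$2" +%s)
    echo $(( (end_s - start_s) / $3 + 1 ))
}

# Example: one day of NARR data at 3-hourly output (interval_seconds = 10800).
expected=$(expected_met_em_count "2018-01-01 00:00:00" "2018-01-02 00:00:00" 10800)
actual=$(ls met_em.d01.*.nc 2>/dev/null | wc -l)
echo "expected $expected met_em files for d01, found $actual"
```

For the one-day example the expected count is 9, because both the start and end times are included.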

Running WRF

There are two components to WRF:

  • real.exe vertically interpolates the met_em files and creates boundary/initial conditions
  • wrf.exe generates the simulation

Run WRF from either your /WRF/test/em_real or /WRF/run directory. Link your met_em files to this directory:

 ln -sf path_to_met_em_files/met_em.d0* . 

Edit the namelist.input to match the WPS domain and choose physics parameterizations. Best practices can be found here. Run ./real.exe and then ./wrf.exe.

Check rsl.out.0000 and rsl.error.0000 for errors after each run. A wrfout_d01_YYYY-MM-DD_HH:00:00 file should be created.
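
A quick first check of the rsl files can be scripted. This is a minimal sketch, assuming that real.exe and wrf.exe print a line containing "SUCCESS COMPLETE" near the end of the rsl files after a clean run; the function name rsl_succeeded is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: succeed if one of the last few lines of an rsl file
# contains the "SUCCESS COMPLETE" message assumed to mark a clean run.
rsl_succeeded() {
    [ -f "$1" ] && tail -n 5 "$1" | grep -q "SUCCESS COMPLETE"
}

for f in rsl.out.0000 rsl.error.0000; do
    if rsl_succeeded "$f"; then
        echo "$f: run completed successfully"
    else
        echo "$f: no success message; inspect the end of this file" >&2
    fi
done
```

A missing success message does not say what went wrong, so the files still need to be read when the check fails.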

If you're running a high-resolution or large-domain simulation, you may run into memory allocation errors when defining the grid. It is useful to submit your job using complete nodes with #SBATCH --mem=125G to run ./wrf.exe.
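
A submit script using complete nodes might look like the sketch below, which writes an example job file to wrf_job.sh. The account name, node count, tasks per node, and time limit are placeholders or assumptions, not values from this guide; only the --mem=125G setting and the use of srun come from the text above.

```shell
#!/bin/bash
# Write a sketch of a SLURM submit script for wrf.exe to wrf_job.sh.
# Account, node count, tasks per node, and time limit are placeholders.
cat > wrf_job.sh <<'EOF'
#!/bin/bash
# Placeholder account: replace with your own allocation.
#SBATCH --account=def-someuser
# Placeholder node count; tasks per node assumes 32 cores per node.
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=32
# Request the full memory of each node, as suggested above.
#SBATCH --mem=125G
# Placeholder wall-clock limit.
#SBATCH --time=03:00:00

srun ./wrf.exe
EOF
echo "Wrote wrf_job.sh; submit it with: sbatch wrf_job.sh"
```

Any module commands needed in your environment would go before the srun line.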

Post Processing

For a quick look at your data (during or after a run), use ncview:

module load ncview
ncview wrfout_d01_YYYY-MM-DD_HH:00:00

For a more detailed look at your data and for creating nicer graphics, you can use NCL or VisIt. To use VisIt, append a .nc to the end of your wrfout files:

cp wrfout_d01_YYYY-MM-DD_HH:00:00 wrfout_d01_YYYY-MM-DD_HH:00:00.nc
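
With many output files, the copy can be done in a loop. The following is a minimal sketch; the function name add_nc_suffix is hypothetical, and it assumes all output files match wrfout_d* in one directory.

```shell
#!/bin/bash
# Hypothetical helper: make a .nc-suffixed copy of every wrfout file in a
# directory, skipping files that already end in .nc or already have a copy.
add_nc_suffix() {
    local dir=${1:-.} f
    for f in "$dir"/wrfout_d*; do
        [ -f "$f" ] || continue            # no matches: the glob stays literal
        case "$f" in *.nc) continue ;; esac
        [ -e "$f.nc" ] || cp "$f" "$f.nc"
    done
}

# Usage: add_nc_suffix /path/to/run/directory
```

Copying doubles the disk space used by the output, so it may be worth deleting the copies once the plots are made.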

Useful Tips

Modifying Land Use Data

One may wish to modify the type of land in a region (e.g. turn lakes into grassland) for sensitivity testing. To do so, change the value of LU_INDEX for this region in geo_em.d01.nc after geogrid.exe is run, but before metgrid.exe is run. Load the NetCDF Operators module, nco, by typing module load nco and use the netCDF Arithmetic Processor, ncap2, to modify LU_INDEX. Land use type is given by the USGS 24-category system:

Value Description
1 Urban and Built-Up Land
2 Dryland Cropland and Pasture
3 Irrigated Cropland and Pasture
4 Mixed Dryland/Irrigated Cropland and Pasture
5 Cropland/Grassland Mosaic
6 Cropland/Woodland Mosaic
7 Grassland
8 Shrubland
9 Mixed Shrubland/Grassland
10 Savanna
11 Deciduous Broadleaf Forest
12 Deciduous Needleleaf Forest
13 Evergreen Broadleaf Forest
14 Evergreen Needleleaf Forest
15 Mixed Forest
16 Water Bodies
17 Herbaceous Wetland
18 Wooded Wetland
19 Barren or Sparsely Vegetated
20 Herbaceous Tundra
21 Wooded Tundra
22 Mixed Tundra
23 Bare Ground Tundra
24 Snow or Ice
28 Inland Lake

For example, to change the land type to grassland in a rectangular region x1 to x2 and y1 to y2, type:

 ncap2 -O -s 'LU_INDEX(:,y1:y2,x1:x2)=7;' geo_em.d01.nc geo_em.d01.nc

where -O is to overwrite the original geo_em file.

You will also need to ensure the following line is in namelist.input so that real.exe will not overwrite surface inputs:

surface_input_source = 3

To change a body of water to land, other variables in this region must also be altered:

  • Change LANDMASK from 0 (water) to 1 (land)
  • Change LAKE_DEPTH to 10 (default)
  • Change SOILTEMP to 280
  • Change SCB_DOM and SCT_DOM to neighbouring soil values
  • Change LANDUSEF from 1 to 0 at soil category 20: ncap2 -O -s 'LANDUSEF(:,20,y1:y2,x1:x2)=0;' geo_em.d01.nc geo_em.d01.nc
  • Change SOILCBOT and SOILCTOP from 1 to 0 at category 13, similar to LANDUSEF.
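
The constant-valued changes in the list above follow the same ncap2 pattern as LU_INDEX. The sketch below only prints the commands so they can be reviewed before being run. It assumes that LANDMASK, LAKE_DEPTH, and SOILTEMP have the same (Time, south_north, west_east) dimensions as LU_INDEX, which should be confirmed with ncdump -h geo_em.d01.nc. The function name ncap2_region_cmd and the index ranges are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: print (without running) the ncap2 command that sets
# one 3D geo_em variable (Time, south_north, west_east) to a constant over a
# rectangular index region. Review the output, then run it by hand.
ncap2_region_cmd() {
    local var=$1 value=$2 y1=$3 y2=$4 x1=$5 x2=$6 file=${7:-geo_em.d01.nc}
    printf "ncap2 -O -s '%s(:,%s:%s,%s:%s)=%s;' %s %s\n" \
        "$var" "$y1" "$y2" "$x1" "$x2" "$value" "$file" "$file"
}

# Example region (indices are placeholders): rows 40-60, columns 100-120.
ncap2_region_cmd LU_INDEX   7   40 60 100 120
ncap2_region_cmd LANDMASK   1   40 60 100 120
ncap2_region_cmd LAKE_DEPTH 10  40 60 100 120
ncap2_region_cmd SOILTEMP   280 40 60 100 120
```

Printing rather than executing is a deliberate choice here, since -O overwrites the geo_em file; keeping a backup copy of geo_em.d01.nc before editing is a sensible precaution.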

Extracting time series from a specific location

WRF can record time series at specific station locations using tslist. It records the surface variables t, q, u, v, psfc, glw, gsw, hfx, lh, tsk, tslb, rainc, rainnc, and clw, along with vertical profiles at the given location for u, v, potential temperature, geopotential height, and water vapour mixing ratio. In the directory from which you run WRF, edit the tslist file to contain a station name, prefix, latitude, and longitude for each location. This must be done before WRF is run. If you wish to specify more than 5 locations, you must edit the max_ts_locs line in namelist.input to specify the desired number of locations.

After WRF is run, the following files containing time series information for the station locations will have been created:

pfx.dNN.TS, pfx.dNN.UU, pfx.dNN.VV, pfx.dNN.TH, pfx.dNN.PH, pfx.dNN.QV

To extract time series information as an array for a specific variable in pfx.dNN.TS, it is helpful to use the following bash script, wrfTimeSeriesToArray.sh:

 #!/bin/bash
 # Extract one column of a WRF time series file and append it, as a single
 # space-separated line, to the output file.
 if [ $# -ne 4 ]
  then
    echo "Required arguments are FILE_NAME COLUMN_NUMBER STARTING_ROW OUTFILE_NAME"
    echo "If the first column starts with the delimiter, you'll have to add 1 to the COLUMN_NUMBER"
    echo "Eg. For u10 in a prefix.d01.TS, type: bash wrfTimeSeriesToArray.sh prefix.d01.TS 9 2 'prefix-u10'"
    exit 1
 fi
 # Split each row on whitespace and keep column COL from row SR onwards.
 output=($(awk -v SR="$3" -v COL="$2" -F "[[:space:]]+" 'SR<=NR{ if($COL=="") print ""; else print $COL; }' "$1"))
 echo "${output[@]}" >> "$4"

Reducing the size of your output files

Since wrfout files contain a large number of variables, they can be very large, which makes them difficult to store and transfer. However, using ncks, it is easy to create a new netCDF file containing only your variables of interest.

To view all the variables that are stored in your netCDF wrfout file, use:

 ncdump -h wrfout_d01_YYYY-MM-DD_HH:00:00

You will need to keep the variables Times, XLAT, XLONG, XTIME, along with your specific variables of interest. For example, to create a new netCDF file containing only variables Times, XLAT, XLONG, XTIME, U, V, and W, type:

module load nco
ncks -v Times,XLAT,XLONG,XTIME,U,V,W original_wrfout_filename new_filename

Using WPS and NCL to plot atmospheric reanalysis data (eg. NARR data) on a domain

One way to plot large-scale atmospheric features on a domain of interest is to use WPS to superimpose GRIB reanalysis data onto a geographical domain. WPS is built to combine reanalysis data and geographical data, so you will not need to run WRF itself. Download the appropriate data (e.g. NARR data) and set up namelist.wps to describe your domain and the time range you wish to see. Run ungrib.exe, geogrid.exe, and metgrid.exe. You will use the met_em.d01.YYYY-MM-DD_HH:00:00.nc file to plot the values. In ncl, read the variables of interest at time 0 as follows:

a = addfile("met_em.d01.YYYY-MM-DD_HH:00:00.nc","r")  ; open the met_em file for reading
slp = a->PMSL(0,:,:)  ; Sea Level Pressure: a 2D variable at time 0
rh = a->RH(0,0,:,:)   ; Relative Humidity: a 3D variable at time 0 and vertical level 0

Use these variables to create a desired plot of reanalysis data.

Troubleshooting Resources

Debug level

When real.exe and wrf.exe don't run properly, take a look in the rsl.out.0000 and rsl.error.0000 files for error outputs. However, sometimes the outputs contained in these files don't specify where something goes wrong. To increase the number of outputs printed to rsl.out.0000 and rsl.error.0000, change the debug_level variable in namelist.input. This value can range from 0 to 1000. The larger the number, the more outputs printed to the files. It is worth noting that large debug_levels can cause the code to run slowly, or not at all. A safe debug_level number is around 100.

Segmentation faults on more than one node

If you are experiencing segmentation faults within the first timestep when running WRF on more than one node, it is likely that your version of OpenMPI is causing problems when transferring information between nodes. A good way to confirm this is by looking at your wrfout file. If squares of the domain (in variables PB, PH, PHB, T, etc.) are set to zero, this is likely the cause of the problem. You will need to load an earlier version of OpenMPI and re-configure/re-compile WRF. See Setting the OpenMPI version for more details.