WRF Tutorial
Revision as of 11:22, 16 August 2018
This is a guide for configuring, compiling, and running WRF (Weather Research and Forecasting) and the WRF Preprocessing System (WPS) using Sharcnet machines for real data simulations. This guide is meant to supplement the WRF User Tutorial. Graham has pre-compiled WRF modules available. However, you may find yourself needing to compile WRF and WPS yourself. Both of these options, along with obtaining real data, useful tips, and troubleshooting material, can be found below.
Using Pre-compiled Modules
The current (as of May 2018) versions of WRF and WPS that work together on Graham are WPS 3.8.1 and WRF 3.8.1. Copy the following directories into WRF and WPS directories in your desired ~/projects directory:
/cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/MPI/intel2016.4/openmpi2.1/wrf/3.8.1/WRF
/cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/MPI/intel2016.4/openmpi2.1/wps/3.8.1/WPS
To use the modules, type:
module load wrf/3.8.1
module load wps/3.8.1
at the start of each session. If you run WRF and WPS often, it is helpful to add these lines to your ~/.bashrc.
To view the options that were used to configure and compile WRF and WPS, see:
WRFV3/configure.wrf
WPS/configure.wps
If you are using WRF on Graham, it is important to note that the current (as of March 20 2018) version of OpenMPI (version 2.1.1) causes segmentation faults in WRF when running on more than one node. To fix this problem, load an older version of OpenMPI:
module remove openmpi/2.1.1
module load openmpi/2.0.2
You will then need to configure and compile WRF by following the steps in the subsequent section. You will not need to re-configure or re-compile WPS.
Compiling WRF and WPS
Compiling is necessary for use of any version of WRF on Orca or for use of WRF 4.0 on Graham. However, it is not recommended to compile WRF yourself on Graham, since modules are available.
Setting the OpenMPI version
Using WRF (any version) with more than one node will not work with OpenMPI version 2.1.1 (or presumably newer versions, although this remains untested). To check which version of OpenMPI is loaded, type mpirun --version. If it reports version 2.1.1, remove that version and load version 2.0.2:
module remove openmpi/2.1.1
module load openmpi/2.0.2
Setting the netCDF variable
WRF and WPS need to know the path to the proper netCDF directory. You will need to set the netCDF path to the fortran-mpi netCDF folder. On Graham, this can be done by typing the following:
export NETCDF=/cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/MPI/intel2016.4/openmpi2.1/netcdf-fortran-mpi/4.4.4
Check to make sure the following file exists: $NETCDF/include/netcdf.inc.
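The existence check above can be scripted. The following is a minimal sketch: the function name check_netcdf is hypothetical, and it assumes only that NETCDF has been exported as shown (a directory can also be passed explicitly).

```shell
#!/bin/bash
# Hypothetical helper: confirm that a netCDF directory (default: $NETCDF)
# contains the include/netcdf.inc file mentioned above.
check_netcdf() {
    local dir="${1:-$NETCDF}"
    if [ -z "$dir" ]; then
        echo "NETCDF is not set" >&2
        return 1
    fi
    if [ -f "$dir/include/netcdf.inc" ]; then
        echo "found $dir/include/netcdf.inc"
    else
        echo "missing $dir/include/netcdf.inc" >&2
        return 1
    fi
}
```

After exporting NETCDF, calling check_netcdf with no arguments reports whether the include file is present.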
At this point, you may also choose to allow for large (>2GB) netCDF files. This is recommended for any high-resolution WRF simulations.
export WRFIO_NCD_LARGE_FILE_SUPPORT=1
Compiling WRF
In the WRF or WRFV3 directory, type ./configure. You will be given a list of computer options. Select intel, ifort/icc (dmpar). This is option 15 for WRF 4.0.
At the next prompt, you may wish to compile for nesting. Option 1, which covers all basic nests, is recommended.
This will create a configure.wrf file, containing configure options based on your environment. In very unlikely cases, some paths may be incorrect. You can edit paths in this file to their correct paths before continuing.
To compile for real-data cases, type ./compile em_real. This will take about an hour, and will create the following executables:
main/ndown.exe
main/wrf.exe
main/real.exe
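Since the compile takes about an hour, it can be worth confirming that all three executables were actually produced. A minimal sketch, assuming the main/ layout listed above; the function name check_wrf_build is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: report which of the expected executables exist
# and are executable under a given WRF directory (default: current dir).
check_wrf_build() {
    local wrf_dir="${1:-.}" missing=0 exe
    for exe in ndown.exe wrf.exe real.exe; do
        if [ -x "$wrf_dir/main/$exe" ]; then
            echo "ok       main/$exe"
        else
            echo "MISSING  main/$exe"
            missing=1
        fi
    done
    # Return non-zero if anything was missing so the result can be tested.
    return "$missing"
}
```

Run check_wrf_build from the WRF directory, or pass the directory as the first argument.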
Compiling WPS
Make sure the netCDF variable is still set from before. In the WPS directory, type ./configure. You will be given a list of computer options. Choose linux/intel/dmpar: option 19 for WRF 4.0. This will create a configure.wps file, containing configure options based on your environment. If paths are incorrect, you can edit paths in this file to their correct paths before continuing. This includes the path to your WRF/WRFV3 directory.
To compile WPS, type ./compile. This will create the following executables:
geogrid/src/geogrid.exe
metgrid/src/metgrid.exe
ungrib/src/ungrib.exe
Acquiring Data
To run WRF using real data, you will need to acquire both static geography data and GRIB (General Regularly-distributed Information in Binary form) data that describes the weather conditions.
Geography Data
Create and navigate to a /geog directory and download the high-resolution geography data found here:
wget http://www2.mmm.ucar.edu/wrf/users/download/get_sources_wps_geog.html
You will need to put the appropriate path to this data in your namelist.wps.
GRIB Data
Download reanalysis data for your preferred simulation dates. NARR-A data, obtained through the HAS data access link on the NOAA website, is suggested. Place this data in the /DATA directory. You can download your entire order by using the following command:
wget -erobots=off -nv -m -np -nH --cut-dirs=2 --reject "index.html*" http://www1.ncdc.noaa.gov/pub/has/model/YOUR_HAS_ORDER/
To untar all the files in your order, type:
for a in *.tar; do tar -xvf "$a"; done
You will need to remove anything that isn't a GRIB file, including the *.b files, from this directory.
In the /WPS directory, link the path from WPS to DATA using:
./link_grib.csh path/to/data
If you're using NARR data, you will need to link the appropriate Variable Table. In the /WPS directory, type:
ln -sf ungrib/Variable_Tables/Vtable.NARR Vtable
Running WPS
There are three components to WPS:
- ungrib.exe takes the GRIB meteorological data and turns it into an intermediate file format
- geogrid.exe takes static geographical data and fits it to your specified grid
- metgrid.exe takes output from geogrid.exe and ungrib.exe and interpolates the data to your domain for your specified times
In your WPS directory, edit namelist.wps to specify your desired domain. A list of best practices for the namelist can be found here. If you're running with NARR data, set interval_seconds = 10800, since NARR provides output every 3 hours (3 x 3600 s). To preview a map of your domain, use ncl:
module load ncl
ncl util/plotgrids_old.ncl
Ungrib and geogrid can be run independently of each other. Re-running one does not mean you have to re-run the other. However, metgrid must be run after both ungrib and geogrid. If you re-run either geogrid or ungrib, you will have to re-run metgrid. Furthermore, ungrib must be run on a single processor, whereas geogrid and metgrid can both be run on multiple processors.
Since geogrid and metgrid can both be run on multiple processors and take a small amount of time and computing power, it may be helpful to submit them in the same job after running ungrib. At the end of your WPS submit script, put the following:
srun ./geogrid.exe
srun ./metgrid.exe
Check your *.log files for errors. A met_em file should be created for each time in your simulation: met_em.d01.YYYY-MM-DD_HH:00:00.nc.
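One way to confirm that metgrid produced a file for every time is to compare the number of met_em files with the number expected from the run length and interval_seconds. The sketch below is only illustrative: both function names are hypothetical, and it assumes the run length is a whole number of hours that is a multiple of the interval.

```shell
#!/bin/bash
# Number of met_em files expected per domain: one per input time,
# counting both the start and the end of the run.
expected_met_em_count() {
    local run_hours="$1" interval_seconds="$2"
    echo $(( run_hours * 3600 / interval_seconds + 1 ))
}

# Count the met_em files present for one domain (default d01) in a directory.
count_met_em() {
    local dir="${1:-.}" domain="${2:-d01}" n=0 f
    for f in "$dir"/met_em."$domain".*.nc; do
        [ -e "$f" ] && n=$(( n + 1 ))
    done
    echo "$n"
}
```

For a 24-hour run driven by NARR data, expected_met_em_count 24 10800 gives 9, which can be compared with the output of count_met_em in the WPS directory.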
Running WRF
There are two components to WRF:
- real.exe vertically interpolates the met_em files and creates boundary/initial conditions
- wrf.exe generates the simulation
Run WRF from either your /WRF/test/em_real or /WRF/run directory. Link your met_em files to this directory:
ln -sf path_to_met_em_files/met_em.d0* .
Edit the namelist.input to match the WPS domain and choose physics parameterizations. Best practices can be found here.
Run ./real.exe and then ./wrf.exe.
Check rsl.out.0000 and rsl.error.0000 for errors after each run. A wrfout_d01_YYYY-MM-DD_HH:00:00 file should be created.
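A quick way to check the rsl files is to look for the completion message near the end of the file. The sketch below assumes that real.exe and wrf.exe print a line containing "SUCCESS COMPLETE" when they finish cleanly, which is their usual behaviour; the function name rsl_succeeded is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: succeed only if the last lines of an rsl file
# (default: rsl.error.0000) contain the "SUCCESS COMPLETE" message.
rsl_succeeded() {
    local rsl_file="${1:-rsl.error.0000}"
    [ -f "$rsl_file" ] || { echo "no such file: $rsl_file" >&2; return 1; }
    tail -n 5 "$rsl_file" | grep -q "SUCCESS COMPLETE"
}
```

For example, `rsl_succeeded rsl.out.0000 && echo "run finished cleanly"` can be used after each of real.exe and wrf.exe. A missing message does not say what went wrong; the files still need to be read for the actual error.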
If you're running a high-resolution or large-domain simulation, you may run into memory allocation errors when defining the grid. It is useful to submit your job using complete nodes with #SBATCH --mem=125G to run ./wrf.exe.
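As an illustration of a whole-node request, the snippet below writes a small Slurm submit script to a file named submit_wrf.sh. Everything except the --mem=125G line and the wrf.exe command is an assumption to adapt: the account name, node count, 32 tasks per node, time limit, and file name are placeholders, not values from this guide.

```shell
#!/bin/bash
# Write a hypothetical Slurm submit script for wrf.exe.
# Placeholders to adapt: account, nodes, ntasks-per-node, time.
cat > submit_wrf.sh <<'EOF'
#!/bin/bash
#SBATCH --account=def-someuser
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=32
#SBATCH --mem=125G
#SBATCH --time=03:00:00

# Load the same modules that were used to build WRF before this line.
srun ./wrf.exe
EOF
echo "wrote submit_wrf.sh"
```

The resulting file would then be submitted from the run directory with sbatch submit_wrf.sh.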
Post Processing
For a quick look at your data (during or after a run), use ncview:
module load ncview
ncview wrfout_d01_YYYY-MM-DD_HH:00:00
For a more detailed look at your data and for creating nicer graphics, you can use NCL or VisIt. To use VisIt, append a .nc to the end of your wrfout files:
cp wrfout_d01_YYYY-MM-DD_HH:00:00 wrfout_d01_YYYY-MM-DD_HH:00:00.nc
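When a run produces many wrfout files, the same step can be applied in a loop. This is a sketch only: the function name add_nc_suffix is hypothetical, and it creates symbolic links rather than copies (an assumption made here to save disk space; replace ln -s with cp if real copies are needed).

```shell
#!/bin/bash
# Hypothetical helper: give every wrfout file in a directory a ".nc" twin.
add_nc_suffix() {
    local dir="${1:-.}" f
    for f in "$dir"/wrfout_d*; do
        [ -f "$f" ] || continue               # skip when nothing matches
        case "$f" in *.nc) continue ;; esac   # already has the suffix
        # The link target is relative to the directory holding the link.
        [ -e "$f.nc" ] || ln -s "$(basename "$f")" "$f.nc"
    done
}
```

Calling add_nc_suffix in the run directory is safe to repeat, since existing .nc names are left alone.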
Useful Tips
Modifying Land Use Data
One may wish to modify the type of land in a region (e.g. turn lakes into grassland) for sensitivity testing. To do so, change the value of LU_INDEX for this region in geo_em.d01.nc after geogrid.exe is run, but before metgrid.exe is run. Load the NetCDF Operators module, nco, by typing module load nco and use the netCDF Arithmetic Processor, ncap2, to modify LU_INDEX. Land use type is given by the USGS 24-category system:
Value | Description |
---|---|
1 | Urban and Built-Up Land |
2 | Dryland Cropland and Pasture |
3 | Irrigated Cropland and Pasture |
4 | Mixed Dryland/Irrigated Cropland and Pasture |
5 | Cropland/Grassland Mosaic |
6 | Cropland/Woodland Mosaic |
7 | Grassland |
8 | Shrubland |
9 | Mixed Shrubland/Grassland |
10 | Savanna |
11 | Deciduous Broadleaf Forest |
12 | Deciduous Needleleaf Forest |
13 | Evergreen Broadleaf Forest |
14 | Evergreen Needleleaf Forest |
15 | Mixed Forest |
16 | Water Bodies |
17 | Herbaceous Wetland |
18 | Wooded Wetland |
19 | Barren or Sparsely Vegetated |
20 | Herbaceous Tundra |
21 | Wooded Tundra |
22 | Mixed Tundra |
23 | Bare Ground Tundra |
24 | Snow or Ice |
28 | Inland Lake |
For example, to change the land type to grassland in a rectangular region x1 to x2 and y1 to y2, type:
ncap2 -O -s 'LU_INDEX(:,y1:y2,x1:x2)=7;' geo_em.d01.nc geo_em.d01.nc
where -O overwrites the original geo_em file.
You will also need to ensure the following line is in namelist.input so that real.exe will not overwrite surface inputs:
surface_input_source = 3
To change a body of water to land, other variables in this region must also be altered:
- Change LANDMASK from 0 (water) to 1 (land)
- Change LAKE_DEPTH to 10 (default)
- Change SOILTEMP to 280
- Change SCB_DOM and SCT_DOM to neighbouring soil values
- Change LANDUSEF from 1 to 0 at soil category 20:
ncap2 -O -s 'LANDUSEF(:,20,y1:y2,x1:x2)=0;' geo_em.d01.nc geo_em.d01.nc
- Change SOILCBOT and SOILCTOP from 1 to 0 at category 13, similar to LANDUSEF.
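The fixed-value changes in the list above can be collected into one helper that prints the ncap2 commands for review before anything is modified. This is a sketch under assumptions: the function name is hypothetical, and it assumes LANDMASK, LAKE_DEPTH, and SOILTEMP share the (time, y, x) layout used for LU_INDEX above. SCB_DOM, SCT_DOM, SOILCBOT, and SOILCTOP are left out because their values depend on the neighbouring cells.

```shell
#!/bin/bash
# Hypothetical helper: print (not run) ncap2 commands that convert a
# rectangular patch of water to grassland in a geo_em file.
# Arguments: FILE X1 X2 Y1 Y2 (grid indices as in the LU_INDEX example).
print_water_to_land_cmds() {
    local file="$1" x1="$2" x2="$3" y1="$4" y2="$5" region spec
    region="$y1:$y2,$x1:$x2"
    for spec in "LU_INDEX(:,$region)=7" \
                "LANDMASK(:,$region)=1" \
                "LAKE_DEPTH(:,$region)=10" \
                "SOILTEMP(:,$region)=280" \
                "LANDUSEF(:,20,$region)=0"; do
        echo "ncap2 -O -s '$spec;' $file $file"
    done
}
```

Printing the commands first makes it easy to check the index ranges against a copy of geo_em.d01.nc before overwriting the original.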
Extracting time series from a specific location
WRF can record time series at specific station locations using TSlist for surface variables: t, q, u, v, psfc, glw, gsw, hfx, lh, tsk, tslb, rainc, rainnc, clw, along with vertical profiles at the given location for u, v, potential temperature, geopotential height, and water vapour mixing ratio. In the directory in which you run WRF, edit the tslist file to contain a station name, prefix, latitude, and longitude for each location. This must be done before WRF is run. If you wish to specify more than 5 locations, you must edit the max_ts_locs line in namelist.input to specify the desired number of locations.
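As an illustration, the sketch below writes a tslist file with two stations. The layout is an assumption based on README.tslist (three header lines, then one line per station with a 25-character name, a 5-character prefix, latitude, and longitude), so check it against the README shipped with your WRF version. The station names, prefixes, and coordinates are made-up placeholders, and write_tslist is a hypothetical function name.

```shell
#!/bin/bash
# Write a tslist file containing two hypothetical stations.
# Assumed fixed-width layout: name (25 chars), prefix (5 chars), lat, lon.
write_tslist() {
    local out="${1:-tslist}"
    (
        # Force "." as the decimal separator regardless of the user's locale.
        LC_ALL=C; export LC_ALL
        echo "#-----------------------------------------------#"
        echo "# 24 characters for name | pfx |  LAT  |   LON  |"
        echo "#-----------------------------------------------#"
        printf '%-25s %-5s %7.3f %8.3f\n' "Waterloo" "wloo" 43.464 -80.520
        printf '%-25s %-5s %7.3f %8.3f\n' "Toronto Island" "tisl" 43.628 -79.396
    ) > "$out"
}
```

With these prefixes, the time series files produced for domain 1 would be named wloo.d01.TS, tisl.d01.TS, and so on.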
After WRF is run, the following files containing time series information for the station locations will have been created:
pfx.dNN.TS, pfx.dNN.UU, pfx.dNN.VV, pfx.dNN.TH, pfx.dNN.PH, pfx.dNN.QV
To extract time series information as an array for a specific variable in pfx.dNN.TS, it is helpful to use the following bash script, wrfTimeSeriesToArray.sh:
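#!/bin/bash
# Extract one column of a WRF time series file (pfx.dNN.TS) and append it
# to OUTFILE_NAME as a single space-separated line.
if [ $# -ne 4 ]
then
    echo "Required arguments are FILE_NAME COLUMN_NUMBER STARTING_ROW OUTFILE_NAME"
    echo "If the first column starts with the delimiter, you'll have to add 1 to the COLUMN_NUMBER"
    echo "E.g. For u10 in a prefix.d01.TS, type: bash wrfTimeSeriesToArray.sh prefix.d01.TS 9 2 'prefix-u10'"
    exit 1
fi
# Split on runs of whitespace and print the requested column from STARTING_ROW onward.
output=($(awk -v SR="$3" -v COL="$2" -F "[[:space:]]+" 'SR<=NR{ if($COL=="") print ""; else print $COL; }' "$1"))
# Append all values to the output file on one line.
echo "${output[@]}" >> "$4"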
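A possible variant, shown here only as a sketch, relies on awk's default field splitting, which ignores leading whitespace, so the column number needs no +1 offset. The function name ts_column and the demonstration file are hypothetical; the file's values are made up and only mimic the general shape of a pfx.dNN.TS file.

```shell
#!/bin/bash
# Hypothetical variant of the extraction step using awk's default
# whitespace splitting. Arguments: FILE COLUMN_NUMBER STARTING_ROW
ts_column() {
    awk -v start="$3" -v col="$2" 'NR >= start { print $col }' "$1"
}

# Self-contained demonstration on a synthetic file with made-up values.
demo_file=$(mktemp)
cat > "$demo_file" <<'EOF'
Demo station header line
  1  0.01  1  10  20  280.1  0.005  3.2  -1.1
  1  0.02  1  10  20  280.3  0.005  3.4  -1.0
EOF
# Print column 8 starting from row 2 (skipping the header line).
ts_column "$demo_file" 8 2
rm -f "$demo_file"
```

The demonstration prints the eighth column of the two data rows, one value per line.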
Reducing the size of your output files
Since wrfout files contain a large number of variables, they can become very large, which makes storing and transferring them difficult. However, using ncks, it is easy to create a new netCDF file containing only your variables of interest.
To view all the variables that are stored in your netCDF wrfout file, use:
ncdump -h wrfout_d01_YYYY-MM-DD_HH:00:00
You will need to keep the variables Times, XLAT, XLONG, XTIME, along with your specific variables of interest. For example, to create a new netCDF file containing only variables Times, XLAT, XLONG, XTIME, U, V, and W, type:
module load nco
ncks -v Times,XLAT,XLONG,XTIME,U,V,W original_wrfout_filename new_filename
Troubleshooting Resources
Debug level
When real.exe and wrf.exe don't run properly, take a look in the rsl.out.0000 and rsl.error.0000 files for error outputs. However, sometimes the outputs contained in these files don't specify where something goes wrong. To increase the number of outputs printed to rsl.out.0000 and rsl.error.0000, change the debug_level variable in namelist.input. This value can range from 0 to 1000. The larger the number, the more outputs are printed to the files. It is worth noting that a large debug_level can cause the code to run slowly, or not at all. A safe debug_level value is around 100.
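Changing the value by hand works fine; for repeated debugging runs, a small helper can do the edit. The sketch below assumes namelist.input already contains a line of the form "debug_level = N,"; the function name set_debug_level is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: set debug_level in a namelist file.
# Arguments: LEVEL [NAMELIST_FILE] (default file: namelist.input)
set_debug_level() {
    local level="$1" namelist="${2:-namelist.input}" tmp
    tmp=$(mktemp) || return 1
    # Rewrite only the debug_level line, keeping its indentation.
    sed "s/^\([[:space:]]*debug_level[[:space:]]*=\).*/\1 $level,/" "$namelist" > "$tmp" || return 1
    # Copy the contents back so the original file keeps its permissions.
    cat "$tmp" > "$namelist" && rm -f "$tmp"
}
```

For example, set_debug_level 100 raises the verbosity before a diagnostic run, and set_debug_level 0 restores quiet output afterwards.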
Segmentation faults on more than one node
If you are experiencing segmentation faults within the first timestep when running WRF on more than one node, it is likely that your version of OpenMPI is causing problems when transferring information between nodes. A good way to confirm this is by looking at your wrfout file. If squares of the domain (in variables PB, PH, PHB, T, etc.) are set to zero, this is likely the cause of the problem. You will need to load an earlier version of OpenMPI and re-configure/re-compile WRF. See Setting the OpenMPI version for more details.