Running AceCAST

This guide will demonstrate how to run AceCAST by walking you through an example with the Easter500 benchmark test case.

Before attempting to run, make sure AceCAST and its dependencies have been installed correctly (see Installation Guide) and that you have a valid license file (see Acquire A License) placed in your acecast-v3.2.2/acecast/run/ directory.

For this example we will assume your AceCAST installation is in your home directory (i.e. at ~/acecast-v3.2.2). If it is somewhere else you will need to modify the code examples accordingly.

Input Data

The first step in any AceCAST/WRF workflow is to generate input data for the model. AceCAST uses the same namelist.input, wrfbdy*, wrfinput*, etc. files that are used by the standard CPU-WRF model. The only restriction is that the specified namelist options must be supported by AceCAST (see Namelist Configuration). In this guide we will use the Easter500 benchmark from our standard Benchmarks, which is typically a good test case for a small number of GPUs.

Download Easter500 Test Case Data:

cd ~/acecast-v3.2.2/acecast
mkdir benchmarks
cd benchmarks
wget https://tqi-public.s3.us-east-2.amazonaws.com/datasets/v2/easter500.tar.gz
tar -xf easter500.tar.gz

At this point your acecast directory should look like this:

~/acecast-v3.2.2/acecast
├── benchmarks
│   ├── easter500
│   │   ├── met_em.d01.2020-04-12_00:00:00.nc
│   │   ├── << more met_em.* files >>
│   │   ├── met_em.d01.2020-04-13_12:00:00.nc
│   │   └── namelist.input
│   └── easter500.tar.gz
└── run
    ├── acecast-advisor.sh
    ├── acecast-trial.lic
    ├── acecast.exe
    ├── gpu-launch.sh
    ├── real.exe
    └── << various static data files >>

Setting Up the Simulation Run Directory

Next we will create a new directory to run the simulation in, containing all of the necessary runtime and input files. These typically include the following:

  • AceCAST Run Files (always found in ~/acecast-v3.2.2/acecast/run/):
    • executables (real.exe, acecast.exe, etc.)

    • acecast advisor script (acecast-advisor.sh)

    • license file (acecast.lic or acecast-trial.lic)

    • MPI wrapper script (gpu-launch.sh)

    • static data files (CCN_ACTIVATE.BIN, GENPARM.TBL, LANDUSE.TBL, etc.)

  • Simulation Specific Input Data and Configuration Files (found in ~/acecast-v3.2.2/acecast/benchmarks/easter500 for this example):
    • namelist file (namelist.input)

    • real.exe or acecast.exe input data (met_em* or wrfbdy*, wrfinput*, etc.)

Tip

We consider it best practice to create a new directory for each simulation you run. This can help you avoid common mistakes when running large numbers of simulations and also allows you to run multiple simulations simultaneously if you have the compute resources to do so.

For our example we will be using 4 GPUs and will set up this simulation run directory at ~/acecast-v3.2.2/acecast/easter500-4GPU:

# Create and cd to new run directory
mkdir ~/acecast-v3.2.2/acecast/easter500-4GPU
cd ~/acecast-v3.2.2/acecast/easter500-4GPU

# Link static acecast run files
ln -s ../run/* .

# Link input data files
ln -s ../benchmarks/easter500/met_em.* .

# Copy the namelist file
cp ../benchmarks/easter500/namelist.input .

Tip

We typically copy the namelist.input file rather than create a symbolic link like we do with all of the other files here. Since the namelist is modified regularly, it is best to edit a local copy rather than the original; if the namelist were linked and edited from multiple run directories, changes made in one directory would silently affect the others.

Verify Namelist Configuration

At this point we can use the acecast-advisor.sh script to verify that all of the options specified in the namelist are supported by AceCAST. We have an entire section of the documentation dedicated to this topic (see Namelist Configuration) but we will keep things simple for this example.

Note

The Easter500 benchmark is distributed with a fully supported namelist, but we recommend trying out the acecast-advisor.sh tool anyway to get a sense of how it works before you start using your own namelists rather than the one that we provide for this example.

AceCAST Advisor – Support Check Tool

# cd to the simulation run directory if you aren't already there
./acecast-advisor.sh --tool support-check

Setting Up Your Environment

Prior to running the executables in the following sections you will need to make sure your environment is set up correctly as described in the Installation Guide (see Environment Setup).
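
In practice this usually means putting the NVIDIA HPC SDK compilers and their bundled OpenMPI on your PATH. The exact commands depend on your system and SDK version; the lines below are only an illustrative sketch assuming a default SDK installation under /opt/nvidia/hpc_sdk (the version directory is a placeholder), and the Installation Guide remains the authoritative reference.

# Illustrative only -- adjust the SDK path and version to match your installation
NVHPC=/opt/nvidia/hpc_sdk/Linux_x86_64/<version>
export PATH=${NVHPC}/compilers/bin:${NVHPC}/comm_libs/mpi/bin:${PATH}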

Modify OpenMPI Settings (Optional)

The NVIDIA HPC SDK ships with an older version of OpenMPI (3.1.5). This version is performant and works well on a variety of systems, but it can produce some confusing warnings when running MPI jobs. These warnings can be suppressed by setting btl_base_warn_component_unused = 0 in your user-level MCA parameters file with the following commands.

mkdir -p ~/.openmpi
echo "btl_base_warn_component_unused = 0" > ~/.openmpi/mca-params.conf

Note that this only needs to be done once on any given system.

Running Real

To generate the wrfinput*, wrfbdy*, etc. inputs for AceCAST we need to run Real. This works the same way it does for standard WRF, so the process should be familiar to WRF users.

# cd to the simulation run directory if you aren't already there
mpirun -n <number of cpu cores> ./real.exe

Replace <number of cpu cores> with the number of CPU cores you would like to use to run real.exe.
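
For example, to run real.exe with 4 MPI tasks (one per core) on a single node:

mpirun -n 4 ./real.exe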

Note

The appropriate mpirun options vary with a number of factors, including the number of nodes, the CPU cores per node, and whether you are running under a resource manager (e.g., SLURM, Torque).
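
For instance, under SLURM a minimal batch script for the real.exe step might look like the following. This is only a sketch with placeholder resource requests; adapt the node counts, task counts, and environment setup to your site.

#!/bin/bash
#SBATCH --job-name=easter500-real
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=8
#SBATCH --time=00:30:00

# cd to the simulation run directory and launch real.exe with the requested task count
cd ~/acecast-v3.2.2/acecast/easter500-4GPU
mpirun -n ${SLURM_NTASKS} ./real.exe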

If real.exe ran successfully, you should see that it generated the input files for AceCAST (wrfinput*, wrfbdy*, etc.). You can also check for a successful completion message in the RSL log files:

tail -n 5 rsl.error.0000
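
The exact wording of the completion message may vary; in standard WRF, real.exe finishes with a line containing SUCCESS COMPLETE REAL_EM INIT. Assuming AceCAST's real.exe follows the same convention, you can search for it directly:

grep "SUCCESS COMPLETE" rsl.error.0000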

Running AceCAST

General AceCAST usage can be summarized as follows:

mpirun [MPIRUN_OPTIONS] gpu-launch.sh [--gpu-list GPU_LIST] acecast.exe

We always recommend using one MPI task per GPU you intend to run on. This is accomplished through the proper choice of MPIRUN_OPTIONS as well as the gpu-launch.sh MPI wrapper script. The goal of the former is to launch the correct number of MPI tasks on each node; the gpu-launch.sh script then sets the ACC_DEVICE_NUM environment variable (see NVHPC OpenACC Environment Variables) to a specific GPU id for each MPI task prior to launching the acecast.exe executable.
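
For intuition only, a minimal wrapper of this kind (not the shipped gpu-launch.sh, just a sketch assuming OpenMPI, which exposes each task's node-local rank via OMPI_COMM_WORLD_LOCAL_RANK) could look like this:

#!/bin/bash
# Sketch: map this MPI task's node-local rank to a GPU id for the OpenACC runtime,
# then exec the real command (e.g., acecast.exe) passed as arguments.
export ACC_DEVICE_NUM=${OMPI_COMM_WORLD_LOCAL_RANK}
exec "$@"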

Note

For more information about the gpu-launch.sh script check out GPU Mapping with MPI and the GPU Launch Script.

For our example we can run with 4 GPUs on a single node:

mpirun -n 4 ./gpu-launch.sh ./acecast.exe
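
If you later scale beyond a single node, the same pattern applies; you only need mpirun options that place one task per GPU on each node. For example, with OpenMPI and two nodes of 4 GPUs each (the hostfile name here is a placeholder):

mpirun -n 8 -npernode 4 -hostfile my_hosts ./gpu-launch.sh ./acecast.exe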

If AceCAST ran successfully, you should see that it generated the wrfout* files. You should also check for a successful completion message in the RSL log files:

tail -n 5 rsl.error.0000
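
As with real.exe, the wording may vary; standard WRF prints SUCCESS COMPLETE WRF at the end of a successful run, and assuming AceCAST does the same you can grep for it across the RSL logs:

grep "SUCCESS COMPLETE" rsl.error.*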

Summary and Next Steps

In this section we covered the basics of running AceCAST by walking through the Easter500 benchmark test case with 4 GPUs on a single node. By using input data from one of our benchmark test cases, we were able to focus on the fundamental mechanics of running the AceCAST software before moving on to other critical topics, which are covered in the next sections, Generating Input Data and Namelist Configuration.