As you may have noticed, the simulations take quite some time on a single processor: a series of about 500 daytime simulations takes about 2 hours. From now on, we will exploit the multi-processor environment to run simulations in parallel. To this end, the master script ‘batch_simulations.py’ launches multiple jobs, each of which runs the ‘simulations.py’ script from the previous tutorial. This is done in STEP 1 as follows:

STEP 1: Start CLASS4GL in parallel mode for workstations

python $CLASS4GL/simulations/batch_simulations.py \
  --station_id 74560 \
  --c4gl_path_lib $CLASS4GL \
  --exec $CLASS4GL/simulations/simulations.py \
  --path_forcing $CLASS4GL_DATA/forcing/IGRA_PAIRS_20190515/ \
  --path_experiments $CLASS4GL_DATA/experiments/IGRA_PAIRS_20190515/ \
  --experiments " BASE " \
  --multi_processing_mode pythonpool \
  --cpu_count 4
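Conceptually, the pythonpool mode dispatches the chunks of sounding records to a pool of worker processes. A minimal sketch of this idea (the function name and return value are hypothetical stand-ins, not the actual CLASS4GL implementation):

```python
from multiprocessing import Pool

def run_chunk(chunk_id):
    # hypothetical stand-in for one simulations.py invocation,
    # which processes one chunk of sounding records
    return f"chunk {chunk_id} done"

if __name__ == "__main__":
    # mirrors --cpu_count 4: at most 4 chunks run simultaneously
    with Pool(processes=4) as pool:
        results = pool.map(run_chunk, range(5))  # 5 chunks in this batch
    print(results)
```

With 5 chunks and 4 workers, the pool runs the first 4 chunks in parallel and starts the fifth as soon as a worker frees up.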

The script may warn you about an output directory that already exists. This is because the output will be stored in the same directory as in the previous tutorial. To delete the previously existing experiment(s), add the argument --cleanup_output_directories. Alternatively, you can specify another experiment name with --experiments_names, e.g., --experiments_names BASE_allsounding_days.

This starts a job similar to STEP 3 under “Running CLASS4GL simulations”, but now 5 jobs will be processed in parallel using 4 CPUs of your workstation. Hence, this speeds up the total execution time by roughly a factor of 4, provided that 4 CPUs on your workstation are available. Note that most recent machines have more than 4 cores, so you may increase --cpu_count according to your hardware specifications.

STEP 2 (optional): Start CLASS4GL in parallel mode for supercomputers

When processing multiple stations, the execution time remains long even when running on multiple processors. If you are on a supercomputer, you can launch the simulations as PBS jobs by modifying the arguments of the command above:

python $CLASS4GL/simulations/batch_simulations.py \
  --station_id 74560 \
  --c4gl_path_lib $CLASS4GL \
  --exec $CLASS4GL/simulations/simulations.py \
  --path_forcing $CLASS4GL_DATA/forcing/IGRA_PAIRS_20190515/ \
  --path_experiments $CLASS4GL_DATA/experiments/IGRA_PAIRS_20190515/ \
  --experiments " BASE " \
  --multi_processing_mode qsub

The output should be something like:

getting all stations from --path_forcing
Index(['Unnamed: 0', 'STNID', 'filename', 'latitude', 'longitude'], dtype='object')
defining all_stations_select
Selecting stations by --station_id
station numbers included in the whole batch (all chunks): [74560]
getting all records of the whole batch
Unnamed: 0 filename latitude longitude
74560 124 74560_ini.yaml 40.15 -89.33
splitting batch in --split_by=50 jobs.
total chunks of simulations (= size of array-job) per experiment: 5
Warning! Output directory '/data/gent/vo/000/gvo00090/D2D/data/SOUNDINGS//GLOBAL_EXPERIMENTS//BASE_test' exists! I'm removing it 10 seconds!' Press ctrl-c to abort.
Submitting array job for experiment BASE_test: qsub -l walltime=2:0:0 /user/home/gent/vsc422/vsc42247/software/class4gl/class4gl/simulations/batch_simulations.pbs -t 0-4 -v C4GLJOB_experiments=BASE,C4GLJOB_experiments_names=BASE_test,C4GLJOB_exec=/user/home/gent/vsc422/vsc42247/software/class4gl/class4gl/simulations/simulations.py,C4GLJOB_station_id=74560,C4GLJOB_subset_forcing=ini,C4GLJOB_split_by=50,C4GLJOB_c4gl_path_lib=/user/home/gent/vsc422/vsc42247/software/class4gl/class4gl,C4GLJOB_path_forcing=/data/gent/vo/000/gvo00090/D2D/data/SOUNDINGS//GLOBAL_20190115/,C4GLJOB_path_experiments=/data/gent/vo/000/gvo00090/D2D/data/SOUNDINGS//GLOBAL_EXPERIMENTS/
7566581
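The size of the array job follows from the number of records and the --split_by argument. A minimal sketch of one plausible splitting rule (plain ceiling division; this is an assumption for illustration, and the actual logic inside batch_simulations.py may differ):

```python
import math

def n_chunks(n_records: int, split_by: int) -> int:
    # each array-job element handles at most `split_by` records,
    # so the chunk count is the ceiling of records / split_by
    return math.ceil(n_records / split_by)
```

For example, 120 records with --split_by=50 would be spread over 3 array-job elements under this rule.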

This again starts a job similar to STEP 3 under “Running CLASS4GL simulations”, but now it is launched as 5 parallel jobs through qsub. Notice the last line of the output, which gives you the job identifier. Depending on your supercomputer environment, you can inspect the jobs with qstat -at, which gives the following output:

Job id     Username  Queue    Name      SessID  NDS  TSK  Req'd Memory  Req'd Time  S  Elap Time
---------  --------  -------  --------  ------  ---  ---  ------------  ----------  -  ---------
7566581[]  vsc42247  victini  c4gl_sim  --      1    1    --            02:00:00    Q  00:00:00

The ‘Q’ in the table means that the job is in the queue, whereas ‘R’ means that the job is running. The STDIN job is your current interactive session; the other jobs are the submitted simulations. After a while, the jobs will start:

Job id      Username  Queue    Name      SessID  NDS  TSK  Req'd Memory  Req'd Time  S  Elap Time
----------  --------  -------  --------  ------  ---  ---  ------------  ----------  -  ---------
7566581[0]  vsc42247  victini  c4gl_sim  --      1    1    --            02:00:00    R  00:03:51
7566581[1]  vsc42247  victini  c4gl_sim  --      1    1    --            02:00:00    R  00:03:51
7566581[2]  vsc42247  victini  c4gl_sim  --      1    1    --            02:00:00    R  00:03:51
7566581[3]  vsc42247  victini  c4gl_sim  --      1    1    --            02:00:00    R  00:03:20
7566581[4]  vsc42247  victini  c4gl_sim  --      1    1    --            02:00:00    R  00:03:20
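If you want to monitor the state flag programmatically, it can be pulled out of each qstat line. A small sketch, assuming the exact qstat -at layout shown above (the state column is the second-to-last field):

```python
def job_state(qstat_line: str) -> str:
    # in the qstat -at layout above, the state flag ('Q' queued,
    # 'R' running) is the second-to-last whitespace-separated field
    return qstat_line.split()[-2]
```

Different PBS installations format qstat output differently, so check the column layout on your system before relying on this.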

You will notice that this is much faster: the whole batch finishes in about 15 minutes instead of several hours.

You will also see that the job submission generates a lot of log files:

ls -larth

which gives:

-rw-------  1 vsc42247 vsc42247  84K 28 aug 18:01 c4gl_sim.o642906-7
-rw-------  1 vsc42247 vsc42247 285K 28 aug 18:01 c4gl_sim.o642906-0
-rw-------  1 vsc42247 vsc42247 108K 28 aug 18:01 c4gl_sim.o642907-7
-rw-------  1 vsc42247 vsc42247 330K 28 aug 18:01 c4gl_sim.o642907-0
-rw-------  1 vsc42247 vsc42247 143K 28 aug 18:01 c4gl_sim.o642907-1
-rw-------  1 vsc42247 vsc42247 105K 28 aug 18:01 c4gl_sim.o642906-6
-rw-------  1 vsc42247 vsc42247 142K 28 aug 18:01 c4gl_sim.o642907-5
-rw-------  1 vsc42247 vsc42247 326K 28 aug 18:01 c4gl_sim.o642907-3
-rw-------  1 vsc42247 vsc42247 181K 28 aug 18:01 c4gl_sim.o642906-2
-rw-------  1 vsc42247 vsc42247  97K 28 aug 18:01 c4gl_sim.o642906-1
-rw-------  1 vsc42247 vsc42247 308K 28 aug 18:01 c4gl_sim.o642907-6
-rw-------  1 vsc42247 vsc42247 254K 28 aug 18:02 c4gl_sim.o642906-4
-rw-------  1 vsc42247 vsc42247 304K 28 aug 18:02 c4gl_sim.o642907-4
-rw-------  1 vsc42247 vsc42247 100K 28 aug 18:02 c4gl_sim.o642906-5
-rw-------  1 vsc42247 vsc42247 308K 28 aug 18:02 c4gl_sim.o642907-2
-rw-------  1 vsc42247 vsc42247 307K 28 aug 18:02 c4gl_sim.o642906-3

These files are kept for debugging purposes. If no unexpected behaviour occurs, you can safely remove them.
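Cleaning them up can be done with a small helper. A sketch, assuming the file-name pattern from the listing above:

```python
import glob
import os

def remove_job_logs(pattern: str = "c4gl_sim.o*") -> int:
    """Delete PBS log files matching the pattern; return how many were removed."""
    files = glob.glob(pattern)
    for f in files:
        os.remove(f)
    return len(files)
```

A plain `rm c4gl_sim.o*` in the shell achieves the same thing; the function form just lets you report or log the count.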

If you remove the --station_id specification, all stations will be run. Note that this will result in an array of about 700 jobs:

ipython -i $CLASS4GL/simulations/batch_simulations.py -- \
  --c4gl_path_lib $CLASS4GL \
  --exec $CLASS4GL/simulations/simulations.py \
  --path_forcing $CLASS4GL_DATA/forcing/IGRA_PAIRS_20190515/ \
  --path_experiments $CLASS4GL_DATA/experiments/IGRA_PAIRS_20190515/ \
  --experiments BASE

The computation time largely depends on the queueing times and the computational power of the infrastructure, but the whole job array should finish within about 1 hour (for TIER2 systems).

Whenever necessary, you can abort all jobs in the batch/array with

qdel 7566581[]

Please replace the job number 7566581 with the actual job number according to qstat.

STEP 3: Explore your array experiment

As soon as all jobs are completed, you can open the data explorer:

ipython -i $CLASS4GL/interface/interface.py -- \
  --c4gl_path_lib $CLASS4GL \
  --path_forcing $CLASS4GL_DATA/forcing/IGRA_PAIRS_20190515/ \
  --path_experiments $CLASS4GL_DATA/experiments/IGRA_PAIRS_20190515/ \
  --experiments BASE

STEP 4: Move between stations

In case you have run multiple stations, you can now move not only between sounding days (see STEP 6 of the first experiment), but also between stations. An overview of the stations can be obtained with:

c4gldata['BASE'].frames['stats']['stations'].table

Next, you can move to a specific station (e.g., 74560) with:

c4gldata['BASE'].sel_station(STNID=74560)

Next: sensitivity experiments.