Running Gaussian 09 Jobs on Lewis

The current production version of the Gaussian 09 software package is installed on lewis under /share/apps/gaussian/g09. Example input files for the Gaussian 09 test jobs are in /share/apps/gaussian/g09/tests/com/.

Before you can run Gaussian 09 jobs on lewis, you must be authorized as a member of the gaussian group. To determine if you are authorized, enter the id command, and verify that gaussian is listed as one of the groups to which your account belongs. If you are not a member of the gaussian group, please contact the system administrator via email at support@rnet.missouri.edu.
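For example, the output of the id command should include gaussian in its list of groups (the user name and numeric IDs shown here are only illustrative):

$ id
uid=12345(jdoe) gid=12345(jdoe) groups=12345(jdoe),505(gaussian)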

Gaussian 09 on lewis must be invoked via the run_g09 command, and Gaussian 09 jobs must be run as batch jobs under LSF, submitted via the bsub command. For general information about how to submit LSF batch jobs on lewis, see Submitting Jobs via LSF.

A special LSF queue, named "gaussian", has been created for Gaussian jobs. The compute nodes that are scheduled by the gaussian queue have been configured specifically to support efficient execution of Gaussian jobs, including large locally attached scratch disk space for fast I/O to the work files used by Gaussian jobs. Gaussian jobs can be submitted to other queues, including the idle queue, but users are urged to submit Gaussian jobs to the gaussian queue.

You can submit jobs to the gaussian queue by using the "-q" option of the bsub command:

#BSUB -q gaussian (in the job script file)
or
bsub -q gaussian ... (on the command line)

When you submit Gaussian jobs to the gaussian queue, please be aware that each of the gaussian nodes has its own local /scratch file system that is used by Gaussian for temporary work files. Each gaussian node has about 11 TB of usable locally attached disk space for /scratch. There is a global /scratch file system that is accessible from the lewis login node and from all of the other compute nodes, but it is not the one used by the gaussian nodes. If you need to access the work files in /scratch belonging to a Gaussian job that was submitted to the gaussian queue, either while that job is running or after it has finished, you must log in via SSH to the specific compute node where the job ran. The node where the job ran is displayed by the bjobs command, and it is also shown in the standard output file for the job. For example, if the job runs on compute node c14a-05, you can log in to node c14a-05 from the lewis login node with the following command:

$ ssh c14a-05

When you are finished, log out of the compute node to return to the lewis login node. Remember that each gaussian node's /scratch file system is local to that node and separate from the global /scratch file system used by all of the other nodes.
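Gaussian work files in /scratch have names that begin with Gau- (for example, Gau-19592.rwf), so once you have logged in to the node you can list a job's work files with a command like:

$ ls -lh /scratch/Gau-*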

The LSF job script file for a Gaussian 09 job should look something like this:
#BSUB -J test1
#BSUB -oo test1.o%J
#BSUB -eo test1.e%J
#BSUB -q gaussian
#BSUB -n 2
#BSUB -R "span[hosts=1] rusage[mem=2048]"
run_g09 test001.com
This job script specifies that the job name will be test1, and that the standard output and error output files will be named test1.onnnnnn and test1.ennnnnn, respectively, where nnnnnn is the job number assigned by LSF. It also specifies that the job needs 2 CPUs on the same host, and 2048 MB of memory. The run_g09 command sets up the Gaussian 09 environment, and then invokes g09 with the specified input file, in this example test001.com. See Submitting Jobs via LSF for a more detailed description of the #BSUB parameters and how to submit the job via the bsub command.

The -n parameter specifies how many CPUs should be used for the job. You must not specify anything in your Gaussian 09 input file about how many processors your job will use. See the section Running Gaussian 09 with Linda below for the proper way to specify the use of multiple CPUs/nodes for a Gaussian 09 job on lewis. The mem= parameter specifies how much memory (in megabytes) LSF should allocate for the job. This amount of memory must match the amount of memory specified in the Gaussian 09 input file. (Note: The value specified for mem= when submitting the job to LSF must always be given as the number of megabytes required, regardless of what units are used to specify the memory amount in the Gaussian 09 input file.)
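For example, if the Gaussian 09 input file requests two gigabytes of memory with the %Mem Link 0 command, the corresponding LSF request is 2048 MB:

%Mem=2GB (in the Gaussian 09 input file)
#BSUB -R "span[hosts=1] rusage[mem=2048]" (in the job script file)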

All of the parameters on the #BSUB lines can also be specified with the bsub command on the command line, but it is convenient to put them in the job script file so that they are not forgotten.
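For example, the job script shown above could instead be submitted entirely from the command line:

$ bsub -J test1 -oo test1.o%J -eo test1.e%J -q gaussian -n 2 -R "span[hosts=1] rusage[mem=2048]" run_g09 test001.com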

Use the bjobs command to see which jobs are queued/executing/ended:
bjobs -a
The directory to use for temporary work files on lewis is /scratch. The run_g09 command uses that directory by default. If you explicitly specify the pathnames for work files in your .com files, please modify the pathnames accordingly. Files in the /scratch directory will be automatically deleted if they have not been accessed in more than 5 days.
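If you do specify work file pathnames explicitly, for example with Link 0 commands such as %Chk or %RWF, make sure that they point under /scratch (the file names shown here are only illustrative):

%Chk=/scratch/myjob.chk
%RWF=/scratch/myjob.rwf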

If you run Gaussian 09 utilities such as newzmat or formchk, you will need to set up the Gaussian 09 environment for your interactive login session. Place the following lines in the .bash_profile file in your home directory:
export g09root=/share/apps/gaussian
export GAUSS_SCRDIR=/scratch
source $g09root/g09/bsd/g09.profile
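With the environment set up, you can then run the utilities interactively. For example, to convert a binary checkpoint file into a formatted checkpoint file (the file names here are only illustrative):

$ formchk test001.chk test001.fchk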

Running Gaussian 09 with Linda


The version of Gaussian 09 on lewis includes Linda, which allows Gaussian 09 jobs to be run across multiple nodes in the cluster, instead of being limited to just a single node. Gaussian 09 with Linda is still invoked by executing the run_g09 command, as you have been doing previously on lewis. But there are some things that you will need to change.

First, Linda uses SSH to launch processes on multiple nodes in a way that is unique among the parallel applications on lewis. If you intend to run Gaussian 09 jobs across multiple nodes, you will need to modify your account's SSH configuration as follows:
  1. Copy the system-level SSH configuration to your SSH configuration directory with the following commands:
    $ cd ~/.ssh
    $ cp /etc/ssh/ssh_config config
  2. Edit the config file that you just copied and make the following changes: uncomment (remove the # at the beginning of) the StrictHostKeyChecking line and change "ask" to "no".
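After the edit, the relevant line in your ~/.ssh/config file should read:

StrictHostKeyChecking no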
You must not specify anything in your Gaussian 09 input file about how many processors or Linda workers your job will use. So, do not specify any of the following Link 0 commands in the input file:
%NProc
%NProcShared
%NProcLinda
%LindaWorkers
The number of processors to be used and the number and names of the Linda worker nodes are passed from LSF to Gaussian 09 via the Default.Route file, which the run_g09 command creates in the working directory from which the job is submitted. This means that you cannot use the Default.Route file for your own purposes, because run_g09 will overwrite your file with a new one. Any option that you would otherwise specify in the Default.Route file must be put into your Gaussian 09 input file instead.
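For example, a route keyword such as MaxDisk that you might otherwise have set as a default in Default.Route should instead be added to the route section of your input file (the method, basis set, and disk limit here are only illustrative):

# HF/6-31G(d) Opt MaxDisk=100GB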

There are now three ways to run Gaussian 09 using multiple processors (CPUs) on lewis:
  1. On a single node, using multiple threads with shared memory, and with all processors allocated on the same node.

  2. Across multiple nodes, with multiple processes communicating via TCP over the high-speed InfiniBand network, and with processors being allocated wherever they are available. Even if multiple processors happen to be allocated on the same node, the processes on that node will not use shared memory for communication.

  3. Across multiple nodes, using multiple threads with shared memory on each node, and an equal number of processors allocated on each node. Communication between threads on the same node will be via shared memory, and communication between processes on different nodes will be via TCP over the high-speed InfiniBand network.
In general, communication between threads using shared memory is faster than communication between processes via TCP. So, option 3 should perform better than option 2 for the same number of processors. However, option 2 provides LSF with more scheduling flexibility than option 3, and jobs using option 3 may have to wait longer in the queue for nodes and processors to become available.

The option to be used is determined entirely by how you specify the number of processors and the processor spanning requirements to LSF when you submit the job. Remember, you must not specify the number of processors to be used in your Gaussian 09 input file.

For option 1, with all processors on the same node, you would specify something like the following in your job script file:
#BSUB -n 4
#BSUB -R "span[hosts=1] rusage[mem=2048]"
For option 2, there is no particular spanning requirement, so you omit the span specification. In the following example, 16 processors will be allocated across multiple nodes, wherever they are available:
#BSUB -n 16
#BSUB -R "rusage[mem=2048]"
For option 3, you must specify how many processors are to be allocated on each node via the ptile spanning specification. In the following example, a total of 12 processors will be allocated on 3 nodes, with 4 processors on each node:
#BSUB -n 12
#BSUB -R "span[ptile=4] rusage[mem=2048]"
Please note that for option 3, the number of processors must be an exact multiple of the ptile value. If the value of ptile is equal to the number of processors requested, then the effect is the same as option 1.
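For example, the following is effectively the same as option 1, because all 8 processors will be allocated on a single node (the values shown are only illustrative):

#BSUB -n 8
#BSUB -R "span[ptile=8] rusage[mem=2048]"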

You may see messages like the following in the job's error output file:
eval server 0 on d17b-14-ib0.local has dropped it's connection.
subprocess pid = 19597 has exited. status = 0x0000, id = 0, state = 17. command was /share/sw/g09linda/g09/linda8.2/opteron-linux/bin/linda_sh /share/sw/g09linda/g09/linda-exe/l302.exel 134217728 /scratch/Gau-19592.chk 0 /scratch/Gau-19592.int 0 /scratch/Gau-19592.rwf 0 /scratch/Gau-19592.d2e 0 /scratch/Gau-19592.scr 0 /scratch/Gau-19591.inp 0 junk.out 0 /scratch/Gau-19592.nex 0 +LARGS 1 d17b-14-ib0.local 10.10.117.14 41168 15 1 .
died after signing in successfully
These messages are not indicative of a problem. They indicate that the Linda work in a Gaussian link is finished, and that Gaussian is continuing with a new link. They can be ignored.

Not all Gaussian 09 calculations can be parallelized via Linda. HF, CIS=Direct, and DFT calculations on molecules are Linda parallel, including energies, optimizations, and frequencies. TDDFT energies and gradients and MP2 energies and gradients are also Linda parallel. Portions of MP2 frequency and CCSD calculations are Linda parallel, but others are only SMP (shared memory) parallel, so they see some speedup from using a few nodes but no further improvement from larger numbers of nodes.

Also, the amount of speedup that you will see depends upon how much parallelism can be exploited for a given type of calculation. Gaussian 09 may not be able to keep all processors busy all of the time, and there is additional overhead due to network communication between nodes, so doubling the number of nodes used for a job will not cut the execution time in half. In general, as you increase the number of processors and nodes used for a job, you can expect diminishing returns in speedup. You will need to run tests with the types of calculations that you normally perform in order to determine how much parallelism results in the most effective use of resources.

For an online user's guide for Gaussian 09, see Gaussian 09 User's Reference.

For more information about accessing lewis in general, see Accessing Lewis.