VinaLC: Parallel Molecular Docking Program

Biochemical and Biophysical Systems Group
VinaLC version: 1.1.2

Users Guide
Software Installation

1. Standard Docking Calculation.

1.1. Prepare input files.

In the VinaLC/testground, there is a small test case contains some exmaple input files.

1.1.1. receptor

File contains list of receptor file name: recList.txt two receptor pdbqt files: 1KIJ_protH.pdbqt 1KIJ_protH1.pdbqt

1.1.2. ligand

File contains list of ligand file name: ligList.txt a "data" directory contains two ligand pdbqt files: data/ ligands1.pdbqt ligands2.pdbqt

Each pdbqt file can contain multiple structures. The pdbqt structures for ligands have to be separated by "MODEL" and "ENDMDL" keywords even if the file contains single structure.

1.1.3. grid file

Grid file is to save the docking grid information. Each line corresponds to each receptor in recList.txt. each line has six number where first 3 are center of active site and last 3 are grids dimension with unit of angstrom. The program will compute the num_grid_size on-the-fly by: xnum_grid_size=x_grid/granularity. granularity can be set with program option and its default value is 0.375 angstrom. the file is arranged as:

x_center y_center z_center x_grid y_grid z_grid 
x_center y_center z_center x_grid y_grid z_grid
....

1.1.4. pdbqt files

The pdbqt files for receptors and ligands are prepared from their pdb file by mgltools (http://mgltools.scripps.edu/) Two python scripts in MGLTools-1.5.6rc2/MGLToolsPckgs/AutoDockTools/Utilities24/ are used

for receptor

prepare_receptor4.py -r rec.pdb -o rec.pdbqt -A checkhydrogens 

for ligand

prepare_ligand4.py -l ligand.pdb

The pdbqt files can also be prepared by babel:

for recetpor <-xr : Output as a rigid molecule (i.e. no branches or torsion tree)>

babel -i<format> input.<format> -opdbqt out.<format> -xr

for ligand

babel -i<format> input.<format> -opdbqt out.<format>

1.2. Running program


          

1.2.1. To run the program with slurm in debug mode:

srun -N4 -n4 -c12 -ppdebug ./vina --recList recList.txt --ligList ligList.txt --geoList geoList.txt

-N4: 4 nodes will use -n4: 4 tasks will each task run on one node -c12: 12 threads will run on each node -ppdebug: use debug mode

1.2.2. Vina program option:

-----------------------------------------------------------------------------------------------------
./vinaBMPI --help

Input:
  --recList arg              receptor list file
  --fleList arg              flex part receptor list file
  --ligList arg              ligand list file
  --geoList arg              receptor geometry file
  --exhaustiveness arg (=8)  exhaustiveness (default value 8) of the global 
                             search (roughly proportional to time): 1+
  --granularity arg (=0.375) the granularity of grids (default value 0.375)
  --num_modes arg (=9)       maximum number (default value 9) of binding modes 
                             to generate
  --seed arg                 explicit random seed
  --randomize arg            Use different random seeds for complex
  --energy_range arg (=2)    maximum energy difference (default value 2.0) 
                             between the best binding mode and the worst one 
                             displayed (kcal/mol)
  --useScoreCF               Use score cutoff to save ligand with top score 
                             higher than certain critical value
  --scoreCF arg (=-8)        Score cutoff to save ligand with top score higher 
                             than certain value (default -8.0)

Information (optional):
  --help                display usage summary

-----------------------------------------------------------------------------------------------------
            

1.2.3. Run with different options

srun -N4 -n4 -c12 ./vinaBMPI --recList recList.txt --ligList ligList.txt
            --geoList geoList.txt --exhaustiveness 12 
srun -N4 -n4 -c12 ./vinaBMPI --recList recList.txt --ligList ligList.txt
--geoList geoList.txt --exhaustiveness 12 --granularity 0.333 ...

Options, --recList --ligList --geoList are required to be specified.

1.3. More on preparation of the input PDBQT files.

In the PDBQT file make sure you have MODEL and ENDMDL to mark beginning and end of the each structures/poses, even for single structure/pose.

1.3.1. babel

SDF file can save multiple ligands in one file. If you start with SDF files for the ligands, you can use the babel to convert the SDF to PDBQT directly:

babel -isdf <ligand-file-name>.sdf -opdbqt <ligand-file-name>.pdbqt

1.3.2. MGLTools (http://mgltools.scripps.edu/)

The AutoDock developer team provides graphic user interface, AutoDockTools (ADT), to prepare the input files. The receptor input file can also use ADT to convert the file format. There is a Vina video tutorial (http://vina.scripps.edu/tutorial.html) to show how to use ADT to prepare receptor, ligand, and determine the grid size that use in the program.

1.4. Grid size.

An important thing to remember when calculate the grid size:

x_grid=<number of poiont in x-dimension>*spacing 
y_grid=<number of poiont in y-dimension>*spacing
z_grid=<number of poiont in z-dimension>*spacing

spacing in ADT is equal to granularity in Vina.

2. Docking with flexible receptor/induced docking capabilities

2.1 Create rigid and flexible pdbqt files for recetpor

Split the receptor into two parts: rigid and flexible, with the latter represented somewhat similarly to how the ligand is represented. See the section "Flexible Receptor PDBQT Files" of the AutoDock4.2 User Guide (page 14) for how to do this in AutoDock Tools (ADT).

Flexibles option is used to assign these residues as flexible. As with the ligand, you can choose which bonds to keep rotatable by clicking on the bonds.

FlexibleResidues>Output>SaveRigidPDBQT: 
FlexibleResidues>Output>SaveFlexiblePDBQT:

these two commands launch a browser to write PDBQT files for the rigid portion of the receptor and the flexible portion of the receptor. Two PDBQT files will be saved for rigid and flexible portions respectively.

Then, you can issue this command:

srun -N4 -n4 -c12 -ppdebug ./vina --recList rigList --ligList
          ligList.txt --geoList geoList.txt --fleList fleList

Where --recList rigList has a list of PDBQT file names containing rigid part receptors, --fleList fleList has a list of PDBQT file names containing rigid part receptors.

2.2 Notes for induced docking

2.2.1. You must make two "special" pdbqt files: one of your receptor that is missing the residues that will be flexible, and one that contains the flexible residues (and also contains info on what of them that you wish to be flexible). It is very well explained (better than I possibly can) in the AutoDock4 - ADT manual, so it is suggested that you first exactly repeat the example for making those two pdbqt-files (with the program ADT) as described in that manual (without actually doing the docking but you may try Vina here and it will work!) and thereafter you're set to do the same with your own receptor and ligand.

2.2.2. From the ADL mailing list as well as the new Vina forum, It' shown that new dockers often are setting quite many receptor residues unnecessarily as flexible. Flexibility should be kept for only those few residues that really need it: too many flexibles hampers the prediction. An advise to new dockers to reflect more on your receptor (say, collect all known 3D structures and make a combined automatic overlap - e.g. the Swiss-PDB-viewer can do this - to see where the differences are, check the b-factors of suspect residues, consider the catalytic mechanism if your receptor is an enzyme, ...) before deciding which few residues must remain flexible. And even then you may consider to do test-dockings to possibly further eliminate some flexibles that only minimally affect ligand binding. Plus, the less flexibles, not only the better the prediction, but also quite faster the docking, which is a very important criterion if you are planning virtual screens with large number of compounds.

3. Running large number of exhaustiveness

Use the "randomize" options. An example to run the new program:

srun -N6 -n6 -c12 ./vina --recList recList --ligList ligList.txt
          --geoList geoList.txt --randomize 1

The tricky is: create multiple files of the same receptor. For example, in the recList:

rec1.pdbqt rec2.pdbqt rec3.pdbqt rec4.pdbqt rec5.pdbqt

They are the same structure in different file name. So you are actually running 5*12=60 MC which are equivalent to 60 exhaustiveness. And your 5 receptors can be sent to 5 slave nodes at the same time.

the same can also be done for ligList: lig1.pdbqt lig2.pdbqt lig3.pdbqt lig4.pdbqt lig5.pdbqt

Then the total exhaustiveness is 5*5*12=300 MC.

You will need to use 'pdbqt2sdf.py' script to parse the results. unzip the recList_ligList.txt.pdbqt.gz file And run

pdbqt2sdf.py recList_ligList.txt.pdbqt test.sdf -model 4 -pose 20

This will sort the top 4 models of all complex and save top 20 pose in the SDF file. You can import the SDF structures into maestro and see the Vina score on the maestro table.

4. Running with large number of ligands and only want to keep top a few ligand structures

Use the "useScoreCF" options. The Vina score cutoff option "--useScoreCF" will tell the VinaLC to save only the Vina score higher than scoreCF (default –8.0 kcal/mol)

srun -N4 -n4 -c12  vinaBMPI --recList recList.txt --ligList ligList.txt --geoList geoList.txt --useScoreCF

User can also specify the Vina score cutoff value

srun -N4 -n4 -c12  vinaBMPI --recList recList.txt --ligList ligList.txt --geoList geoList.txt --useScoreCF --scoreCF -9.0

There is a python script under the VinaLC/scripts/ directory: sortVinaLC.py can sorted the ligands by its top 1 Vina docking score. A example command line to sort the pdbqt structures would be:

python sortVinaLC.py -i recList_ligList.pdbqt.gz -o sort.pdbqt -n 10

Where recList_ligList.pdbqt.gz is the Gzip file produced by VinaLC, sort.pdbqt is the output sorted pdbqt file, "-n 10" is to tell program to save top 10 structures with highest Vina scores.

Here is the help manual for sortVinaLC.py

================================================================
python sortVinaLC.py -h
Usage: sortVinaLC.py -i  -o  -n N

    Optional parameters:
        [-i]    input *.pdbqt.gz file
        [-o]    output *.pdbqt file
        [-n]    keep top N ligands with highest Vina Scores
        [-h]    print command usage
================================================================