VinaLC: Parallel Molecular Docking Program |
Biochemical and Biophysical Systems Group |
1.1. Prepare input files.
In the VinaLC/testground, there is a small test case contains some exmaple input files.
1.1.1. receptor
File contains list of receptor file name: recList.txt two receptor pdbqt files: 1KIJ_protH.pdbqt 1KIJ_protH1.pdbqt
1.1.2. ligand
File contains list of ligand file name: ligList.txt a "data" directory contains two ligand pdbqt files: data/ ligands1.pdbqt ligands2.pdbqt
Each pdbqt file can contain multiple structures. The pdbqt structures for ligands have to be separated by "MODEL" and "ENDMDL" keywords even if the file contains single structure.
1.1.3. grid file
Grid file is to save the docking grid information. Each line corresponds to each receptor in recList.txt. each line has six number where first 3 are center of active site and last 3 are grids dimension with unit of angstrom. The program will compute the num_grid_size on-the-fly by: xnum_grid_size=x_grid/granularity. granularity can be set with program option and its default value is 0.375 angstrom. the file is arranged as:
x_center y_center z_center x_grid y_grid z_grid
x_center y_center z_center x_grid y_grid z_grid
....1.1.4. pdbqt files
The pdbqt files for receptors and ligands are prepared from their pdb file by mgltools (http://mgltools.scripps.edu/) Two python scripts in MGLTools-1.5.6rc2/MGLToolsPckgs/AutoDockTools/Utilities24/ are used
for receptor
prepare_receptor4.py -r rec.pdb -o rec.pdbqt -A checkhydrogensfor ligand
prepare_ligand4.py -l ligand.pdbThe pdbqt files can also be prepared by babel:
for recetpor <-xr : Output as a rigid molecule (i.e. no branches or torsion tree)>
babel -i<format> input.<format> -opdbqt out.<format> -xr
for ligand
babel -i<format> input.<format> -opdbqt out.<format>1.2. Running program
1.2.1. To run the program with slurm in debug mode:
srun -N4 -n4 -c12 -ppdebug ./vina --recList recList.txt --ligList ligList.txt --geoList geoList.txt-N4: 4 nodes will use -n4: 4 tasks will each task run on one node -c12: 12 threads will run on each node -ppdebug: use debug mode
1.2.2. Vina program option:
----------------------------------------------------------------------------------------------------- ./vinaBMPI --help Input: --recList arg receptor list file --fleList arg flex part receptor list file --ligList arg ligand list file --geoList arg receptor geometry file --exhaustiveness arg (=8) exhaustiveness (default value 8) of the global search (roughly proportional to time): 1+ --granularity arg (=0.375) the granularity of grids (default value 0.375) --num_modes arg (=9) maximum number (default value 9) of binding modes to generate --seed arg explicit random seed --randomize arg Use different random seeds for complex --energy_range arg (=2) maximum energy difference (default value 2.0) between the best binding mode and the worst one displayed (kcal/mol) --useScoreCF Use score cutoff to save ligand with top score higher than certain critical value --scoreCF arg (=-8) Score cutoff to save ligand with top score higher than certain value (default -8.0) Information (optional): --help display usage summary -----------------------------------------------------------------------------------------------------1.2.3. Run with different options
srun -N4 -n4 -c12 ./vinaBMPI --recList recList.txt --ligList ligList.txt --geoList geoList.txt --exhaustiveness 12
srun -N4 -n4 -c12 ./vinaBMPI --recList recList.txt --ligList ligList.txt
--geoList geoList.txt --exhaustiveness 12 --granularity 0.333 ...Options, --recList --ligList --geoList are required to be specified.
1.3. More on preparation of the input PDBQT files.
In the PDBQT file make sure you have MODEL and ENDMDL to mark beginning and end of the each structures/poses, even for single structure/pose.
1.3.1. babel
SDF file can save multiple ligands in one file. If you start with SDF files for the ligands, you can use the babel to convert the SDF to PDBQT directly:
babel -isdf <ligand-file-name>.sdf -opdbqt <ligand-file-name>.pdbqt1.3.2. MGLTools (http://mgltools.scripps.edu/)
The AutoDock developer team provides graphic user interface, AutoDockTools (ADT), to prepare the input files. The receptor input file can also use ADT to convert the file format. There is a Vina video tutorial (http://vina.scripps.edu/tutorial.html) to show how to use ADT to prepare receptor, ligand, and determine the grid size that use in the program.
1.4. Grid size.
An important thing to remember when calculate the grid size:
x_grid=<number of poiont in x-dimension>*spacing
y_grid=<number of poiont in y-dimension>*spacing
z_grid=<number of poiont in z-dimension>*spacingspacing in ADT is equal to granularity in Vina.
2.1 Create rigid and flexible pdbqt files for recetpor
Split the receptor into two parts: rigid and flexible, with the latter represented somewhat similarly to how the ligand is represented. See the section "Flexible Receptor PDBQT Files" of the AutoDock4.2 User Guide (page 14) for how to do this in AutoDock Tools (ADT).
Flexibles option is used to assign these residues as flexible. As with the ligand, you can choose which bonds to keep rotatable by clicking on the bonds.
FlexibleResidues>Output>SaveRigidPDBQT:
FlexibleResidues>Output>SaveFlexiblePDBQT:these two commands launch a browser to write PDBQT files for the rigid portion of the receptor and the flexible portion of the receptor. Two PDBQT files will be saved for rigid and flexible portions respectively.
Then, you can issue this command:
srun -N4 -n4 -c12 -ppdebug ./vina --recList rigList --ligList ligList.txt --geoList geoList.txt --fleList fleListWhere --recList rigList has a list of PDBQT file names containing rigid part receptors, --fleList fleList has a list of PDBQT file names containing rigid part receptors.
2.2 Notes for induced docking
2.2.1. You must make two "special" pdbqt files: one of your receptor that is missing the residues that will be flexible, and one that contains the flexible residues (and also contains info on what of them that you wish to be flexible). It is very well explained (better than I possibly can) in the AutoDock4 - ADT manual, so it is suggested that you first exactly repeat the example for making those two pdbqt-files (with the program ADT) as described in that manual (without actually doing the docking but you may try Vina here and it will work!) and thereafter you're set to do the same with your own receptor and ligand.
2.2.2. From the ADL mailing list as well as the new Vina forum, It' shown that new dockers often are setting quite many receptor residues unnecessarily as flexible. Flexibility should be kept for only those few residues that really need it: too many flexibles hampers the prediction. An advise to new dockers to reflect more on your receptor (say, collect all known 3D structures and make a combined automatic overlap - e.g. the Swiss-PDB-viewer can do this - to see where the differences are, check the b-factors of suspect residues, consider the catalytic mechanism if your receptor is an enzyme, ...) before deciding which few residues must remain flexible. And even then you may consider to do test-dockings to possibly further eliminate some flexibles that only minimally affect ligand binding. Plus, the less flexibles, not only the better the prediction, but also quite faster the docking, which is a very important criterion if you are planning virtual screens with large number of compounds.
Use the "randomize" options. An example to run the new program:
srun -N6 -n6 -c12 ./vina --recList recList --ligList ligList.txt --geoList geoList.txt --randomize 1The tricky is: create multiple files of the same receptor. For example, in the recList:
rec1.pdbqt rec2.pdbqt rec3.pdbqt rec4.pdbqt rec5.pdbqtThey are the same structure in different file name. So you are actually running 5*12=60 MC which are equivalent to 60 exhaustiveness. And your 5 receptors can be sent to 5 slave nodes at the same time.
the same can also be done for ligList: lig1.pdbqt lig2.pdbqt lig3.pdbqt lig4.pdbqt lig5.pdbqt
Then the total exhaustiveness is 5*5*12=300 MC.
You will need to use 'pdbqt2sdf.py' script to parse the results. unzip the recList_ligList.txt.pdbqt.gz file And run
pdbqt2sdf.py recList_ligList.txt.pdbqt test.sdf -model 4 -pose 20This will sort the top 4 models of all complex and save top 20 pose in the SDF file. You can import the SDF structures into maestro and see the Vina score on the maestro table.
Use the "useScoreCF" options. The Vina score cutoff option "--useScoreCF" will tell the VinaLC to save only the Vina score higher than scoreCF (default –8.0 kcal/mol)
srun -N4 -n4 -c12 vinaBMPI --recList recList.txt --ligList ligList.txt --geoList geoList.txt --useScoreCFUser can also specify the Vina score cutoff value
srun -N4 -n4 -c12 vinaBMPI --recList recList.txt --ligList ligList.txt --geoList geoList.txt --useScoreCF --scoreCF -9.0There is a python script under the VinaLC/scripts/ directory: sortVinaLC.py can sorted the ligands by its top 1 Vina docking score. A example command line to sort the pdbqt structures would be:
python sortVinaLC.py -i recList_ligList.pdbqt.gz -o sort.pdbqt -n 10Where recList_ligList.pdbqt.gz is the Gzip file produced by VinaLC, sort.pdbqt is the output sorted pdbqt file, "-n 10" is to tell program to save top 10 structures with highest Vina scores.
Here is the help manual for sortVinaLC.py
================================================================ python sortVinaLC.py -h Usage: sortVinaLC.py -i-o -n N Optional parameters: [-i] input *.pdbqt.gz file [-o] output *.pdbqt file [-n] keep top N ligands with highest Vina Scores [-h] print command usage ================================================================