User guide
LovoAlign can be used to align pairs of proteins, to align a protein to a database of protein structures, or to perform an all-on-all structural alignment of a database.
Index
1. Aligning a pair of proteins.
2. Aligning one structure to a protein database.
3. Performing an all-on-all database comparison.
4. Align general structures with customized selections.
5. Advanced and optional parameters.
6. Scripts for analysing results.
Aligning a pair of proteins
The simplest way to align a pair of protein structures with lovoalign is:
lovoalign -p1 pdb1.pdb -p2 pdb2.pdb -o pdbout.pdb
where pdb1.pdb and pdb2.pdb are the PDB files containing the structures of each protein, and pdbout.pdb is the name of the output file, which will contain the structure from pdb1.pdb aligned to the second structure.
If you want to align, for example, chain "A" of the first protein to chain "C" of the second protein, use:
lovoalign -p1 pdb1.pdb -c1 A -p2 pdb2.pdb -c2 C -o pdbout.pdb
If "-c1" is not given on the command line, all chains of the first protein are considered in the alignment. The same holds for "-c2" and the second protein.
When aligning a pair of proteins, you might want to obtain the RMSF profile of the alignment. To do so, add the option:
-rmsf rmsf.dat
The rmsf.dat file will contain the RMSF plot.
Additionally, you might want the RMSF trend, defined as the fraction of aligned atom pairs whose RMSF is smaller than each threshold. To obtain that plot, use:
-rmsftrend rmsftrend.dat
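When pairwise alignments are launched from a script, it can help to assemble the command line in one place. The following is a minimal Python sketch, not part of lovoalign: the function name build_pair_command and its keyword arguments are hypothetical, and only the flags described above (-p1, -p2, -c1, -c2, -o, -rmsf, -rmsftrend) are used.

```python
import shlex


def build_pair_command(pdb1, pdb2, out, chain1=None, chain2=None,
                       rmsf=None, rmsftrend=None, exe="lovoalign"):
    """Return the argument list for a pairwise lovoalign run.

    Hypothetical helper: it only assembles the flags described in this guide.
    """
    cmd = [exe, "-p1", pdb1]
    if chain1 is not None:
        cmd += ["-c1", chain1]      # restrict protein 1 to one chain
    cmd += ["-p2", pdb2]
    if chain2 is not None:
        cmd += ["-c2", chain2]      # restrict protein 2 to one chain
    cmd += ["-o", out]
    if rmsf is not None:
        cmd += ["-rmsf", rmsf]      # write the RMSF profile
    if rmsftrend is not None:
        cmd += ["-rmsftrend", rmsftrend]  # write the RMSF trend
    return cmd


cmd = build_pair_command("pdb1.pdb", "pdb2.pdb", "pdbout.pdb",
                         chain1="A", chain2="C", rmsf="rmsf.dat")
print(shlex.join(cmd))
# To execute it (assumes lovoalign is on the PATH):
# import subprocess; subprocess.run(cmd, check=True)
```

Returning a list rather than a single string lets the command be passed to subprocess.run without shell quoting problems.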
Aligning a protein structure to a structure database
The lovoalign package may be used to align a structure to a whole database of protein structures. Aligning one structure with about 300 CA atoms to the whole PDB (~30,000 structures) takes about half an hour on a typical personal computer. A database of protein structures consists of a collection of structure files in the PDB format (as in the PDB database itself). To align a single protein to a whole database, two simple steps must be taken:
1. Obtain a list of the files to which the structure will be compared. For example, if the database contains three PDB structures, the list (let's call it list.dat) would be:
file1.pdb
file2.pdb
file3.pdb
The list must give the path of each file as seen from the directory in which lovoalign will be run.
2. Assuming that structure.pdb is the PDB file of the protein to be compared to the database, run lovoalign with:
lovoalign -p1 structure.pdb -pdblist list.dat > align.log
In this case, the align.log output file contains very concise results for each alignment:
PROTS: B1yh1.pdb B1vl6.pdb 334 373
METHOD: 4 1 10.000 14 0.70389E+00 0.59591E+00
SCORE1: 0.19185E+04 267 0.14279E+02 19
The PROTS line contains the protein names and their number of CA atoms.
The METHOD line contains, in order: a specifier of the method used (default: 6 for database comparisons and 4 for a pair alignment); a specification of the type of initial point used (default: 1); the score penalty for gaps; the number of iterations used in the alignment; the time used in this alignment, in seconds; and the time used not considering post-processing of the data.
The SCORE1 line contains, in order: the STRUCTAL score obtained for this alignment; the number of atoms in the bijection; the RMSD of the bijected atoms; and the number of gaps in the bijection.
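The three-line records described above are easy to read programmatically. The following Python sketch is one possible parser, not a tool shipped with lovoalign: the function name parse_align_log and the dictionary keys are hypothetical, and the field order is taken from the description of the PROTS, METHOD and SCORE1 lines above.

```python
def parse_align_log(text):
    """Parse concise lovoalign output into a list of dicts, one per alignment.

    Assumes each record is a PROTS line followed by METHOD and SCORE1 lines,
    with fields in the order described in this guide.
    """
    records = []
    current = None
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        tag = fields[0]
        if tag == "PROTS:" and len(fields) >= 5:
            # A PROTS line starts a new record
            current = {"protein1": fields[1], "protein2": fields[2],
                       "n_ca1": int(fields[3]), "n_ca2": int(fields[4])}
            records.append(current)
        elif tag == "METHOD:" and current is not None and len(fields) >= 7:
            current.update(method=int(fields[1]), initial_point=int(fields[2]),
                           gap_penalty=float(fields[3]),
                           iterations=int(fields[4]),
                           time_total=float(fields[5]),
                           time_align=float(fields[6]))
        elif tag == "SCORE1:" and current is not None and len(fields) >= 5:
            current.update(score=float(fields[1]), n_bijected=int(fields[2]),
                           rmsd=float(fields[3]), n_gaps=int(fields[4]))
    return records


sample = """\
PROTS: B1yh1.pdb B1vl6.pdb 334 373
METHOD: 4 1 10.000 14 0.70389E+00 0.59591E+00
SCORE1: 0.19185E+04 267 0.14279E+02 19
"""
records = parse_align_log(sample)
# Rank database hits from best to worst STRUCTAL score
for rec in sorted(records, key=lambda r: r["score"], reverse=True):
    print(rec["protein1"], rec["protein2"], rec["score"], rec["rmsd"])
```

For a real run, the text would come from the log file, for example open("align.log").read().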
Performing an all-on-all database comparison
Performing an all-on-all database structural alignment with lovoalign is very simple. First, obtain a pdb file for each protein. Second, obtain a file containing a list of the pdb files to be considered, as was explained above. Once you have the list of pdb files, run lovoalign with:
lovoalign -pdblist list.dat > align.log
The output will be similar to the one explained for the single-protein to database comparison.
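The list file used for the database and all-on-all runs can be generated rather than typed by hand. The following is a minimal Python sketch under stated assumptions: the function name write_pdb_list and the directory name "pdbs" are hypothetical, and it assumes all structure files sit in one directory and end in .pdb.

```python
from pathlib import Path


def write_pdb_list(pdb_dir, list_file="list.dat", pattern="*.pdb"):
    """Write one PDB file path per line and return the number of files listed.

    If pdb_dir is a relative path, the written paths are relative to the
    current directory, so run lovoalign from that same directory.
    """
    paths = sorted(Path(pdb_dir).glob(pattern))
    with open(list_file, "w") as out:
        for path in paths:
            out.write(f"{path}\n")
    return len(paths)


# Example (hypothetical directory name):
# n = write_pdb_list("pdbs")
# then run: lovoalign -pdblist list.dat > align.log
```

Sorting the paths keeps the list reproducible between runs, which makes two log files easier to compare.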
Customizable alignments
Any selection of atoms (protein atoms or not) can be aligned with lovoalign. There are two ways to do so:
1. Create different files for each atom selection you want to align. Then, align the structures using the "-all" option, with which all the atoms in the structure will be considered.
2. Modify your PDB file by changing the beta-factor and occupancy values. With the options "-beta1", "-beta2", "-ocup1", "-ocup2", only the atoms with beta-factors or occupancy values different from zero will be considered. For example, using
lovoalign -p1 prota.pdb -p2 protb.pdb -beta1 -ocup2
the atoms of protein 1 with beta-factor different from zero and the atoms of protein 2 with occupancy different from zero will be considered. Alignments of general structures can be performed this way. Ranges of residues can be selected with the "-rmin1" ... "-rmax2" options.
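Editing the beta-factor column by hand is error-prone, so a small script can do the marking. The following Python sketch is an illustration only: the function name mark_selection, its arguments and the marker value are hypothetical. It relies on the fixed-column PDB format, in which the beta-factor occupies columns 61-66 of ATOM and HETATM records, the chain identifier column 22 and the residue number columns 23-26.

```python
def mark_selection(pdb_lines, residues, chain=None, mark=1.0):
    """Return PDB lines with the beta-factor set to `mark` for selected atoms
    and to 0.00 for all other atoms.

    residues: any container of residue numbers, e.g. range(10, 51)
    chain:    optional chain identifier; None accepts every chain
    mark:     assumed marker value; adjust it if a different threshold applies
    """
    out = []
    for line in pdb_lines:
        line = line.rstrip("\n")
        if line.startswith(("ATOM", "HETATM")):
            line = line.ljust(66)               # make sure columns 61-66 exist
            resseq = int(line[22:26])           # residue number, columns 23-26
            selected = resseq in residues and (chain is None or line[21] == chain)
            beta = mark if selected else 0.0
            line = line[:60] + f"{beta:6.2f}" + line[66:]
        out.append(line)
    return out


# Example with hypothetical file names:
# with open("prota.pdb") as f:
#     marked = mark_selection(f, range(10, 51), chain="A")
# with open("prota_sel.pdb", "w") as f:
#     f.write("\n".join(marked) + "\n")
# then run: lovoalign -p1 prota_sel.pdb -p2 protb.pdb -beta1
```

The same approach works for the occupancy column (columns 55-60) together with the "-ocup1" and "-ocup2" options.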
Advanced usage and optional parameters
The lovoalign package implements 6 methods for structural alignment and offers several optional parameters.
Interactive run:
The method and other parameters may be set interactively by running lovoalign without command-line arguments.
Setting optional parameters in the command-line:
Parameter | Possible values | Meaning |
 | 0 or 1 | Concise or extensive output |
-g | real number | Penalty for gaps in the bijection |
-all | none | Consider all atoms (not only CA) for the alignment |
-m | integer from 1 to 6 | Method used (see below) |
-maxtrial | integer | Maximum number of trials to obtain the global optimum alignment (default: 1000 for pairwise alignments, 4 for database comparisons) |
-ismall | 0 or 1 | Value 1 forces the smallest protein to be protein 1 in a pairwise alignment (affects the time of methods 5 and 6). |
-maxit | integer | Maximum number of iterations allowed for methods 2 to 6 |
-f | real number | Fraction of CA atoms to be considered for methods 5 and 6 |
-pfac | real number | Factor multiplying the internal distance in initial point |
-bijeoff | none | Do not output the bijection between atoms |
-beta1 | none | Consider only atoms of first protein that have beta > 0. |
-beta2 | none | Consider only atoms of second protein that have beta > 0. |
-ocup1 | none | Consider only atoms of first protein that have occupancy > 0. |
-ocup2 | none | Consider only atoms of second protein that have occupancy > 0. |
-rmin1 | integer | Consider only atoms of first protein with residue number > rmin1 |
-rmax1 | integer | Consider only atoms of first protein with residue number < rmax1 |
-rmin2 | integer | Consider only atoms of second protein with residue number > rmin2 |
-rmax2 | integer | Consider only atoms of second protein with residue number < rmax2 |
-seqfix | none | Use fixed sequence alignment based on input sequence of residues. |
-seqnum | none | Use fixed sequence alignment based on residue numbering of the PDB files. |
-fasta | filename | Use fixed sequence alignment based on fasta alignment file provided. |
-noini | none | Do not use pseudoprotein initial point. |
-nglobal | [integer] | Number of times the best alignment must be found before it is accepted as the global optimum. Default: 3. Runs faster if nglobal is smaller. |
-rmsf | filename | Writes rmsf profile of the alignment to file. |
-rmsftrend | filename | Writes rmsf trend profile to file (fraction of pairs with rmsf smaller than threshold). |
Scripts for analysing results