Appendix A


Appendix B  Pharmacophore File Format

Pharmacophore files are now written in comma-separated values (CSV) format, so pharmacophore data can be loaded directly from the file into a relational database (such as MySQL or Oracle). Previously, the file written by THINK had to be converted into a format that the relational database system could handle before it could be loaded.
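
As an illustration of direct loading, the following Python sketch inserts the pharmacophore records into a database table. It is a minimal sketch under stated assumptions, not a THINK utility: SQLite stands in for MySQL or Oracle, the records are assumed to include molecule names (CNAME = "Y", see Record 1 below), lines beginning with "#" are skipped as headers, and the function, table and column names are hypothetical.

```python
import csv
import sqlite3
from typing import Iterable


def load_records(conn: sqlite3.Connection, lines: Iterable[str]) -> int:
    """Insert CMOLE, CPHARM, RCOUNT records into a table; return the row count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS pharmacophore "
        "(cmole TEXT, cpharm TEXT, rcount REAL)"
    )
    # Header records start with "#"; only the data records are loaded.
    data_lines = (ln for ln in lines if ln.strip() and not ln.startswith("#"))
    rows = [
        (cmole.strip(), cpharm.strip(), float(rcount))
        for cmole, cpharm, rcount in csv.reader(data_lines)
    ]
    conn.executemany("INSERT INTO pharmacophore VALUES (?, ?, ?)", rows)
    conn.commit()
    return len(rows)
```

With a real file, the lines argument could simply be the open file object.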

A pharmacophore file begins with two header records (records 1 and 2 below), which are followed by the molecule data blocks. Each block starts with a molecule header record, and then contains the pharmacophore data. Each molecule in the file has a separate data block. All header records start with a "#" character.
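
The block structure described above can be walked with a short reader. The Python sketch below is illustrative only. It assumes that the first two lines are the file header and the distance bin record, that every later line starting with "#" opens a new molecule block, and that blank lines carry no data. The function name split_file is hypothetical.

```python
from typing import Iterable, List, Tuple

Block = Tuple[str, List[str]]  # (molecule header record, data records)


def split_file(lines: Iterable[str]) -> Tuple[str, str, List[Block]]:
    """Return (file header, distance bin record, molecule blocks)."""
    it = (ln.rstrip("\n") for ln in lines)
    file_header = next(it)   # record 1
    bin_record = next(it)    # record 2
    blocks: List[Block] = []
    for line in it:
        if not line.strip():
            continue  # assumption: blank lines are not significant
        if line.startswith("#"):
            # A "#" line after the two file headers starts a molecule block.
            blocks.append((line, []))
        elif blocks:
            # Any other line is pharmacophore data for the current molecule.
            blocks[-1][1].append(line)
        else:
            raise ValueError("data record found before any molecule header")
    return file_header, bin_record, blocks
```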

In the following record descriptions, the type of each variable is indicated by the initial letter of its name: C for character, I for integer, and D or R for real (floating-point) values.

Record 1 (file header)
Data: #IREV, IBINS, IMOLS, CCOUNT, D3DTOL, CNAME
Format: #I2, I3, I6, 1X, A1, F7.3, 1X, A1
Description:
  IREV    I2    File revision level (currently revision 3)
  IBINS   I3    Number of distance bins
  IMOLS   I6    Number of molecules in file
  CCOUNT  A1    "Y" if data includes pharmacophore counts, otherwise "N"
  D3DTOL  F7.3  3D tolerance. The pharmacophore tolerance ±x is calculated from 2x = D3DTOL
  CNAME   A1    "Y" if pharmacophore records include molecule names, otherwise "N"
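
The file header can be decoded by slicing the record at the column positions implied by the Format line. The Python sketch below assumes that the record is written in fixed-width columns exactly as the format suggests, with no literal commas between the fields. If the file does contain separators, the slice positions would need adjusting. The class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FileHeader:
    revision: int      # IREV
    n_bins: int        # IBINS
    n_molecules: int   # IMOLS
    has_counts: bool   # CCOUNT is "Y"
    d3dtol: float      # D3DTOL
    has_names: bool    # CNAME is "Y"

    @property
    def tolerance(self) -> float:
        """The pharmacophore tolerance x, from 2x = D3DTOL."""
        return self.d3dtol / 2.0


def parse_file_header(line: str) -> FileHeader:
    """Decode record 1, assuming the layout #I2, I3, I6, 1X, A1, F7.3, 1X, A1."""
    if not line.startswith("#"):
        raise ValueError("file header must start with '#'")
    return FileHeader(
        revision=int(line[1:3]),       # I2
        n_bins=int(line[3:6]),         # I3
        n_molecules=int(line[6:12]),   # I6, followed by 1X
        has_counts=line[13] == "Y",    # A1
        d3dtol=float(line[14:21]),     # F7.3, followed by 1X
        has_names=line[22] == "Y",     # A1
    )


# Hypothetical record built to match the assumed column layout.
example = f"#{3:2d}{10:3d}{250:6d} Y{2.0:7.3f} N"
header = parse_file_header(example)
```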

Record 2 (distance bins)
Data: #RBIN1, RBIN2, RBIN3, …
Format: # free format, values separated by spaces
Description:
  RBINn   Upper limit of distance bin n
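
The distance bin record can be read by splitting on white space. The Python sketch below also shows one possible way to look up the bin for a distance. It assumes that bin n covers distances greater than the upper limit of bin n-1, up to and including RBINn. Whether the upper limit itself belongs to the bin is not stated here, so that choice is an assumption, as are the function names.

```python
from bisect import bisect_left
from typing import List


def parse_bins(line: str) -> List[float]:
    """Return the upper limits RBIN1, RBIN2, ... from record 2."""
    if not line.startswith("#"):
        raise ValueError("distance bin record must start with '#'")
    return [float(token) for token in line[1:].split()]


def bin_number(distance: float, upper_limits: List[float]) -> int:
    """Return the 1-based number of the bin containing the distance.

    Assumption: a distance equal to an upper limit falls in that bin.
    """
    index = bisect_left(upper_limits, distance)
    if index == len(upper_limits):
        raise ValueError("distance exceeds the upper limit of the last bin")
    return index + 1
```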

Record 3 (molecule header)
Data: #ICENS, IPHARM, DCOUNT, CMOLE
Format: #I3, I11, F11.2, 1X, A
Description:
  ICENS   I3     Number of centres in pharmacophores
  IPHARM  I11    Number of unique pharmacophores in molecule
  DCOUNT  F11.2  Total number of pharmacophores for molecule if conformer-based counting is enabled, otherwise 0
  CMOLE   A      Molecule name
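
A molecule header can be decoded in the same way as the file header. The Python sketch below follows the Format line and the field descriptions above (ICENS, IPHARM, DCOUNT, CMOLE) and assumes fixed-width columns with no literal commas. The class and function names are hypothetical.

```python
from typing import NamedTuple


class MoleculeHeader(NamedTuple):
    n_centres: int      # ICENS
    n_unique: int       # IPHARM
    total_count: float  # DCOUNT
    name: str           # CMOLE


def parse_molecule_header(line: str) -> MoleculeHeader:
    """Decode record 3, assuming the layout #I3, I11, F11.2, 1X, A."""
    if not line.startswith("#"):
        raise ValueError("molecule header must start with '#'")
    return MoleculeHeader(
        n_centres=int(line[1:4]),        # I3
        n_unique=int(line[4:15]),        # I11
        total_count=float(line[15:26]),  # F11.2, followed by 1X
        name=line[27:].strip(),          # A (rest of the record)
    )
```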

Record 4 (pharmacophore data)
Data: If CNAME="N": CPHARM, RCOUNT
      If CNAME="Y": CMOLE, CPHARM, RCOUNT
Format: free format, values separated by commas
Description:
  CMOLE   Molecule name
  CPHARM  Data for pharmacophore n, in the format:
            xxxxdddddd  4 centres
            xxxddd      3 centres
            xxd         2 centres
          where:
            x is a centre, represented by a 1-letter code
            d is a distance bin, represented by a single digit or letter
  RCOUNT  Pharmacophore count
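
A pharmacophore with n centres has n(n-1)/2 inter-centre distances, which matches the three layouts listed above (6, 3 and 1 distance codes for 4, 3 and 2 centres). The Python sketch below splits a pharmacophore record into its fields and separates the centre codes from the distance bin codes. It assumes that RCOUNT is always present and that the presence of CMOLE is known from the file header. The centre letters in the usage are made up, the mapping from a distance code letter to a bin number is not given here and is not decoded, and the function names are hypothetical.

```python
import csv
from typing import Optional, Tuple


def parse_record(line: str, has_names: bool) -> Tuple[Optional[str], str, float]:
    """Return (CMOLE or None, CPHARM, RCOUNT) from one data record."""
    fields = [field.strip() for field in next(csv.reader([line]))]
    if has_names:
        name, cpharm, count = fields
    else:
        name = None
        cpharm, count = fields
    return name, cpharm, float(count)


def split_pharmacophore(cpharm: str, n_centres: int) -> Tuple[str, str]:
    """Split CPHARM into (centre codes, distance bin codes)."""
    n_distances = n_centres * (n_centres - 1) // 2
    if len(cpharm) != n_centres + n_distances:
        raise ValueError(f"{cpharm!r} is not a {n_centres}-centre pharmacophore")
    return cpharm[:n_centres], cpharm[n_centres:]
```

For example, with the number of centres taken from ICENS in the molecule header, split_pharmacophore("ADH123", 3) separates the hypothetical centres "ADH" from the distance codes "123".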

Because precision is lost when a floating-point number is converted into a character string, pharmacophores whose count is less than 0.01 are not written to the file. Consequently, the sum of the counts in the pharmacophore records may not equal the value of DCOUNT in the molecule header record.
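
The effect of the cut-off can be seen with a small numerical illustration. The Python sketch below uses made-up counts and assumes that DCOUNT is the total over all pharmacophores, including those too small to be written. The function name is hypothetical.

```python
from typing import List


def written_counts(counts: List[float], cutoff: float = 0.01) -> List[float]:
    """Return the counts that would be written, given the 0.01 cut-off."""
    return [count for count in counts if count >= cutoff]


# Hypothetical per-pharmacophore counts for one molecule.
counts = [1.5, 0.004, 0.003, 2.0]
dcount = sum(counts)            # assumed total reported as DCOUNT
kept = written_counts(counts)   # counts that appear in the data records
shortfall = dcount - sum(kept)  # part of DCOUNT missing from the records
```

Here two of the four pharmacophores fall below the cut-off, so the written counts add up to slightly less than the assumed DCOUNT.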


Appendix C