THINK is a modular system designed to assist with lead generation and lead optimisation in drug discovery.
It is designed to be used with other commercial and proprietary software programs.
The THINK system currently consists of the following modules:
| Module | Provides |
| Core module | Import and export of molecules via files; listings of molecular data; command scripts |
| Graphics/GUI module (currently only available for Windows) | Displays 2D and 3D representations of molecules; diversity plots; functional group distribution plots and histograms. The GUI provides a series of dialogs that may be used in place of commands |
| 2D module | Genetic algorithm to generate derivatives of a starting molecule; tertiary SAR analysis to create rejection criteria for the genetic algorithm; property diversity calculation and selection; 2D substructure and similarity searching |
| 3D module | Conformer generation; 3D coordinate generation (including fused rings); 3D searching |
| Pharmacophores | Calculation and export of pharmacophores; site searching (in conjunction with 3D module) |
THINK is controlled through commands or, if the graphics/GUI module is available, dialogs. The command interface offers the widest range of options; the dialogs are designed to present the most commonly used options within each facility.
Some Virtual Screening examples can be found in the appendix.
THINK can be started from the Windows menu START > All Programs > THINK or by double-clicking on think.exe in Windows Explorer. When THINK starts, the console and explorer windows open. The console window, which is the main window used by the program, is used to enter commands or open dialogs. Additional windows are created when necessary to display molecules or data, or to perform searches.

Items on the dialogs may appear as:
Each facility or utility in THINK has its own command, which may take one or more keywords to supply additional options. A keyword may take a value (eg to supply a filename). Commands may be specified in upper, lower or mixed case.
The syntax for a THINK command is:
Command keyword[=value[,value]] [ keyword[=value[,value]] ...]
where items in [] brackets are optional. A few commands (eg EXIT) do not take any keywords. Spaces are required to separate the command from any subsequent keywords, and between keywords. Any values required by a keyword must be specified using an "=" between the keyword and the value, with no spaces, ie keyword=value. If the keyword may take a list of values, these should be separated by commas, again with no spaces, ie keyword=value1,value2.
| OPEN FILE=capsaicin.smi |
| SUGGEST ... ACTIVITY=EC50 OPTIONS=HIGH,LOG ... |
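The syntax rules above can be illustrated with a small Python sketch. This is not part of THINK: the function name parse_command is hypothetical, and the sketch assumes that values contain no spaces or quoted strings. It upper-cases only the command, because the text states that commands are case-insensitive but does not say the same about keywords.

```python
def parse_command(line):
    """Split a THINK-style command line into (command, {keyword: [values]}).

    Illustrative only: quoted values containing spaces are not handled.
    """
    tokens = line.split()
    if not tokens:
        raise ValueError("empty command line")
    command = tokens[0].upper()  # commands may be upper, lower or mixed case
    keywords = {}
    for token in tokens[1:]:
        # A keyword and its value are joined by "=" with no spaces.
        keyword, sep, value = token.partition("=")
        # A list of values is separated by commas, again with no spaces.
        keywords[keyword] = value.split(",") if sep else []
    return command, keywords


print(parse_command("OPEN FILE=capsaicin.smi"))
print(parse_command("SUGGEST ACTIVITY=EC50 OPTIONS=HIGH,LOG"))
```

A keyword given without "=" is stored with an empty value list, which is one possible way of representing keywords that take no value.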
It is also possible to omit the keyword= prefix for mandatory keywords, provided they are specified in the same order as in the help file. Advanced users may find that this practice saves typing, but it should not be used in script files.
The HELP command lists all the commands. A full list of the keywords and values available for any command may be obtained by issuing HELP followed by the name of the command, or by issuing the command followed by a question mark.
| HELP |
| HELP SUGGEST |
| SAVE ? |
THINK v1.23 allows multiple commands to be specified on a single command line separated by a semi-colon (;).
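As a rough illustration of the semi-colon rule, the Python sketch below splits one command line into separate commands. The function name split_commands is hypothetical, and the treatment of a semi-colon inside a double-quoted string as ordinary text is an assumption; the manual does not say how THINK handles that case.

```python
def split_commands(line):
    """Split a command line on ';', ignoring ';' inside double quotes (assumed)."""
    commands, current, in_quotes = [], [], False
    for ch in line:
        if ch == '"':
            in_quotes = not in_quotes
        if ch == ";" and not in_quotes:
            # End of one command: store it and start collecting the next.
            commands.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    commands.append("".join(current).strip())
    # Drop empty entries produced by a trailing or doubled semi-colon.
    return [command for command in commands if command]


print(split_commands("OPEN FILE=capsaicin.smi; HELP SUGGEST"))
```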
When THINK detects an error it will display a pop-up error message containing the error number, some explanatory text and three buttons: Continue, Cancel and More. Picking Cancel will cancel the operation immediately; picking Continue will allow the calculation to continue if possible. Picking the More button will display a second pop-up error window giving more information about the error. This information may not be very helpful to the user, but can be useful to the THINK developers when determining the cause of the error. The second pop-up window contains Continue and Cancel buttons; these have the same effect as their equivalents on the first window.
All messages from the first pop-up error window are echoed to the THINK console window and written to the log file (see section 1.5).
1.4 Identifying Molecules and Atoms
Many commands require the user to identify a molecule. Occasionally the user is required to identify atoms by typing their atom specifications (eg when using the ROTATE command). To do this successfully requires some understanding of the molecule and atom specifications used by THINK.
Each molecule has a molecule name. This is either:
If the molecule is one of a set of conformers, its name will include a conformer number enclosed within parentheses (), eg ASP(3). The molecule name may be extended by the addition of an "@" character and the name of the file from which the molecule was read, eg "CAPSAICIN#1@capsaicin". This may be used to distinguish molecules with the same name that came from different sources.
The full specification for any atom consists of three parts: the atom, residue and molecule identifiers. If only one small molecule is present, or if the atom identifier alone uniquely identifies the atom, the residue and molecule identifiers may be omitted. The atom identifier has the form:
type(serial.group)name
where type is the atom type or element symbol for the atom; serial is the serial number, group is the group number and name is the atom name. The shortest acceptable atom identifiers are (serial) and name; thus atoms are normally identified by their serial numbers or names. The type and group fields are omitted unless they are required to uniquely identify the atom. If the type field is supplied, it must be followed by the parentheses, either with or without the serial number, to distinguish it from the name field. If the group field is not supplied, the leading "." should also be omitted.
If the molecule has been read from a PDB file the residue name, sequence number and insertion code, and chain id will be stored; for molecules read from other file formats these portions of the atom specification are set to blank strings. This information is preceded by an underscore "_" and has the form:
residue(sequence)chain
where residue is the residue name, sequence is the sequence number, including the insertion code if present (eg 370, 85A), and chain is the chain id.
If the molecule identifier is also required to pinpoint an atom it should be separated from the atom and residue identifiers by a "^" character. Thus, the full atom specification is:
type(serial.group)name_residue(sequence)chain^molecule@filename
although this is very rarely used.
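The way the separators "_", "^" and "@" divide an atom specification can be sketched in Python as follows. This is an illustration under stated assumptions, not THINK's own parser: the function name parse_atom_spec and the regular expressions are hypothetical, fields are assumed to be plain alphanumeric text, and serial number ranges such as (23:29) and symbol names are not handled.

```python
import re

# type(serial.group)name : a type must be followed by parentheses, so text
# without parentheses is treated as the atom name.
ATOM_RE = re.compile(
    r"^(?:(?P<type>[A-Za-z0-9]*)\((?P<serial>\d*)(?:\.(?P<group>\d+))?\))?"
    r"(?P<name>[A-Za-z0-9]*)$"
)
# residue(sequence)chain : the sequence may carry an insertion code, eg 85A.
RESIDUE_RE = re.compile(
    r"^(?P<residue>[A-Za-z0-9]*)(?:\((?P<sequence>[A-Za-z0-9]+)\))?"
    r"(?P<chain>[A-Za-z0-9]*)$"
)


def parse_atom_spec(spec):
    """Return the non-empty fields of an atom specification as a dict."""
    atom_and_residue, _, molecule_part = spec.partition("^")
    molecule, _, filename = molecule_part.partition("@")
    atom, _, residue = atom_and_residue.partition("_")
    atom_match = ATOM_RE.match(atom)
    residue_match = RESIDUE_RE.match(residue)
    if atom_match is None or residue_match is None:
        raise ValueError("unrecognised atom specification: " + spec)
    fields = dict(atom_match.groupdict())
    fields.update(residue_match.groupdict())
    fields["molecule"] = molecule
    fields["filename"] = filename
    # Keep only the fields that were actually supplied.
    return {key: value for key, value in fields.items() if value}


print(parse_atom_spec("CA_TYR(33)"))
print(parse_atom_spec("C(23.1)CA_TYR(85A)B^ASP(3)@asp"))
```

The short forms (serial) and name both parse, which matches the statement that these are the shortest acceptable atom identifiers.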
Atom and molecule identifiers may be entered in upper, lower or mixed case. Note that THINK v1.25 will report all filenames in lowercase when listed as part of the molecule identifier, even if they were entered as upper or mixed case names.
Symbols (see section 1.6.1) may be used to identify atoms or residues. This is a convenient method of identifying a group of atoms in a single operation (eg the active site of a protein). The symbol must be an array comprising one or more array elements, and each element contains a separate atom or residue specification. Each array element is taken in turn when the symbol is processed. The symbol name must contain more than four characters to avoid confusion with atom or residue names (eg SITEA would be interpreted as a symbol but SITE would be treated as a residue name).
The array is terminated by a blank or undefined element. When a symbol is used for the first time, this will occur automatically. However, if an array symbol is reused, and the new array contains fewer elements, it is important that a blank element is specified after the last valid element to ensure that THINK will determine the array length correctly. For instance, if the new array only contains three elements, then the fourth element should be set to a blank value.
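The effect of a missing terminator can be shown with a short Python model. The dictionary standing in for a symbol array and the function name array_length are hypothetical; the sketch only mirrors the rule that the first blank or undefined element ends the array.

```python
def array_length(elements):
    """Count elements from index 1 up to the first blank or undefined one."""
    count = 0
    while elements.get(count + 1, "").strip():
        count += 1
    return count


# First use: element 5 is undefined, so the array ends after element 4.
site = {1: "(19:24)", 2: "(45:47)", 3: "(63)", 4: "(70:72)"}
print(array_length(site))  # 4

# Reuse with only two new elements: elements 3 and 4 are stale but still count.
site[1], site[2] = "(5:9)", "(12)"
print(array_length(site))  # 4

# Setting the element after the last valid one to blank fixes the length.
site[3] = " "
print(array_length(site))  # 2
```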
If the symbol is to be used in place of an atom specification (eg when defining the bond for bond rotation) the array element may contain any portions of the full atom specification. For instance, CA_TYR(33) would refer to the alpha carbon in residue TYR(33); and (23:29) would refer to all atoms with serial numbers in the range 23-29. However, if the symbol is to be used in place of a residue specification, the array element must contain only the residue components of the full atom specification, ie residue(sequence)chain. The leading underscore "_" character must be supplied before the symbol name.
In the examples below, ATOM1, ATOM2 and SITEA are symbol names. See section 1.6.1 for full details on creating symbols.
| LET ATOM1(1) = CA_TYR(33) |
| LET ATOM1(2) = " " |
| LET ATOM2(1) = N_TYR(33) |
| LET ATOM2(2) = " " |
| ROTATE ABOUT=ATOM1-ATOM2 ANGLE=30 |

| LET SITEA(1) = (19:24) |
| LET SITEA(2) = (45:47) |
| LET SITEA(3) = (63) |
| LET SITEA(4) = " " |
| MODIFY INTERACTIONS=_SITEA |
All commands issued during a THINK session are recorded in a log file. The log file is created in the current working directory and is named "think[n].log", where n is a counter. If the user is using dialogs to control THINK, the log file contains the commands issued by the dialogs. A log file may sometimes be replayed subsequently by issuing the command "CALL xxx", where xxx is the name of the log file. Note that THINK automatically creates a new log file, incrementing n in the name "think[n].log", each time the program is started.
1.6 Command Scripts and Symbols
The THINK command set includes simple control commands such as LET, WHILE and IF to enable the user to create command scripts. Apart from LET commands, these must be saved in files (they cannot be entered directly into the console window) and are played by issuing the command "CALL xxx" where xxx is the name of the script file. The "@" character may be used in place of CALL. Commands within scripts may be in upper, lower or mixed case. LET commands may be issued from command scripts or typed directly into the console window.
Command scripts may be nested up to 10 levels deep. A nested script is invoked by the command "CALL xxx", where xxx is the name of the script. A nested script ends, and control is returned to the calling script, when a RETURN command is encountered. All nested command scripts should finish with a RETURN command.
If a command script is interrupted by typing <CTRL-C> or picking the Cancel button, all script files are closed and control is returned to the THINK console window.
Command scripts support local and global symbols. Local symbols only exist within the command script being executed (and any nested scripts below the current level) and are deleted at the end of the script. Global symbols persist after the script has terminated. Both types of symbol are set via the LET command: local symbols use the form "LET a = b" whereas global symbols use the form "LET a := b", replacing the "=" with ":=". Note that spaces are optional around operators and the "=" or ":=". Text strings should be enclosed within double quotes ""; names of symbols may be enclosed within single quotes '' or left unquoted. However, use of single quotes around symbols and spaces around operators is recommended to avoid ambiguities. Symbol names may use any alphanumeric characters (A-Z, 0-9) but must not conflict with the THINK commands (issuing the command HELP will generate a full list of commands). The symbols P1 to P9 have a special meaning within command scripts (see section 1.6.2) and should be avoided for user-defined symbols.
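The scoping rules for local and global symbols can be modelled with a stack of script levels, as in the Python sketch below. The class SymbolTable and its method names are hypothetical. Two points are assumptions rather than documented behaviour: a local symbol is taken to hide a global symbol of the same name, and a LET in a nested script is taken to create a new local symbol at that level rather than changing the caller's copy.

```python
class SymbolTable:
    """Toy model of local and global symbols across nested command scripts."""

    def __init__(self):
        self.global_symbols = {}
        self.levels = []  # one dict of local symbols per active script

    def call(self):
        """Enter a command script (CALL)."""
        self.levels.append({})

    def script_end(self):
        """Leave the current script; its local symbols are deleted."""
        self.levels.pop()

    def let_local(self, name, value):
        """Model of LET a = b."""
        self.levels[-1][name] = value

    def let_global(self, name, value):
        """Model of LET a := b."""
        self.global_symbols[name] = value

    def lookup(self, name):
        # Nested scripts can see local symbols of the scripts that called them.
        for level in reversed(self.levels):
            if name in level:
                return level[name]
        return self.global_symbols.get(name)


table = SymbolTable()
table.call()                            # outer script starts
table.let_local("FILE", "capsaicin.smi")
table.let_global("FACTOR", 2.0)
table.call()                            # nested script starts
print(table.lookup("FILE"))             # visible from the nested script
table.script_end()
table.script_end()
print(table.lookup("FILE"))             # None: local symbol deleted
print(table.lookup("FACTOR"))           # 2.0: global symbol persists
```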
A symbol (local or global) may contain a single scalar value or a one-dimensional array. Each member of an array is known as an array element and is identified by its position within the array (eg ATOMS(3) would be the third element in the array ATOMS). Each element within an array is set by a separate LET command (unlike some other scripting languages, there is no way to set the contents of the whole array through a single command).
THINK can extract (but not set) substrings of any symbol or array element for use in another operation. A substring is specified as the name of the symbol or array element, followed by the range of characters required. If the first value in the range is omitted or replaced with a "*", THINK will start at the beginning of the string; if the second value is omitted or replaced with "*", THINK will finish at the end of the string. For instance if the symbol ALPHA contained the alphabet in a single text string, ALPHA(5:8) would return the string "EFGH"; ALPHA(:3) would return "ABC" and ALPHA(23:*) would return "WXYZ".
| LET FILE = "capsaicin.smi" |
| LET X-ANGLE := 45.0 |
| LET ATOMS(1) = (9) |
| LET ATOMS(2) = (15) |
| LET ATOMS(3) = (21) |
| LET SUBST = TEXTARRAY(3)(5:17) |
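The substring ranges described above count characters from 1 and include both ends of the range. A minimal Python sketch of that convention follows; the function name substring is hypothetical, and None stands for a value that is omitted or given as "*".

```python
def substring(text, start=None, end=None):
    """Return characters start..end of text, counted from 1, both inclusive."""
    first = 1 if start is None else start       # omitted or "*": from the start
    last = len(text) if end is None else end    # omitted or "*": to the end
    return text[first - 1:last]


ALPHA = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
print(substring(ALPHA, 5, 8))    # EFGH
print(substring(ALPHA, None, 3)) # ABC
print(substring(ALPHA, 23))      # WXYZ
```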
The following arithmetic, string and bit operators are supported in LET statements:
| % | Modulus |
| ^ | Exponentiation |
| * | Multiplication |
| / | Division |
| + | Addition |
| - | Subtraction |
| . | String concatenation |
| ? xx ~ yy | String substitution - replace xx with yy |
| & | Bitwise AND |
| ! | Bitwise NOT |
| | | Bitwise OR |
| : | Bitwise EOR |
The arithmetic operators are processed in the order they are listed above, with exponentiation having the highest priority. The string substitution operator is very rarely used in user-written command scripts, but is used extensively in the scripts that generate the THINK dialogs. THINK also supports relational operators - these are described in section 1.6.3. The LET command may also be omitted where this does not cause any ambiguity.
| LET J1 = J1 + 2 |
| LET FACTOR := 'RANGE' / 'SIZE' |
| LET NEWFILE = 'NAME' . ".smi" |
Up to nine values may be passed as arguments to a command script. This allows a single script to be re-used (eg on different molecules) without having to change the script file before each repetition. The arguments are specified after the name of the command script in the CALL command, and are separated by spaces. Any text strings that include spaces must be enclosed in double quotes "" to ensure the whole string is treated as a single argument.
Within the command script, the arguments are identified by the special local symbols P1 to P9. Unlike other local symbols, P1-P9 only exist within the current command script and are not inherited by any nested command scripts below the current level.
| CALL COUNT.LOG ASP ASP.SMI |
| LET TEXT = "Counting atoms in " . P1 |
| OPEN FILE='P2' |
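The mapping from a CALL command to the symbols P1 to P9 can be sketched in Python as shown below. The function name bind_arguments and the tokenising pattern are hypothetical; the sketch handles only the CALL form (not the "@" shorthand) and assumes that a double-quoted string never contains another double quote.

```python
import re

# Either a double-quoted string (kept as one argument) or a run of non-spaces.
TOKEN_RE = re.compile(r'"([^"]*)"|(\S+)')


def bind_arguments(call_line):
    """Return (script name, {"P1": ..., "P2": ...}) for a CALL command line."""
    tokens = [unquoted or quoted for quoted, unquoted in TOKEN_RE.findall(call_line)]
    if len(tokens) < 2 or tokens[0].upper() != "CALL":
        raise ValueError("expected: CALL script [argument ...]")
    script, arguments = tokens[1], tokens[2:]
    if len(arguments) > 9:
        raise ValueError("at most nine arguments may be passed")
    return script, {"P%d" % i: value for i, value in enumerate(arguments, start=1)}


print(bind_arguments("CALL COUNT.LOG ASP ASP.SMI"))
print(bind_arguments('CALL COUNT.LOG "my molecule" ASP.SMI'))
```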
1.6.3 Control Commands and Relational Operators
Normally the statements in a command script are executed in the order in which they appear in the file. This order may be changed through the use of the GOTO, IF-ENDIF and WHILE-END commands.
The label used by a "GOTO xxx" command is identified by the statement "LABEL xxx". The IF-ENDIF commands have the form:
IF condition_1
block_1
ELSEIF condition_2
block_2
ELSE
block_3
ENDIF
where the statements in block_1 are executed if the condition condition_1 is true, otherwise the statements in block_2 are executed if condition_2 is true. If neither condition is true, the statements in block_3 will be executed. Either or both of the ELSEIF-block_2 and ELSE-block_3 sections may be omitted; only the IF-block_1 and ENDIF sections are mandatory.
The WHILE-END commands have the form:
WHILE condition
block_a
END
where the statements in block_a are repeatedly executed while condition is true. Note that block_a must include a statement that makes condition false, otherwise the loop will never terminate and THINK will not respond to any further commands.
The conditions used by the IF-ENDIF and WHILE-END commands compare the values of two symbols or constants using one of the following relational operators:
| = | Equal to |
| != | Not equal to |
| > | Greater than |
| >= | Greater than or equal to |
| < | Less than |
| <= | Less than or equal to |
| & | Logical AND |
| ! | Logical NOT |
| | | Logical OR |
| : | Logical EOR |
When a logical operation gives a non-zero result, the condition is considered to be TRUE.
Care must be taken when comparing text strings (they are case-sensitive) and when comparing real numbers using an "equal to" test. It is recommended that real numbers are compared using the "greater than" or "less than" relational operators.
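The reason for avoiding an "equal to" test on real numbers is not specific to THINK, so it can be shown in Python. The sketch below replaces the equality test with a "less than" test on the absolute difference, and compares strings after converting both to upper case. The function name nearly_equal and the tolerance value are illustrative choices; within a THINK script the $ABS and $UPCASE intrinsic functions would play the same roles.

```python
def nearly_equal(a, b, tolerance=1.0e-6):
    """Compare two real numbers using a 'less than' test instead of equality."""
    return abs(a - b) < tolerance


total = 0.1 + 0.2
print(total == 0.3)              # False: the two values differ in the last bits
print(nearly_equal(total, 0.3))  # True

# Text comparisons are case-sensitive unless both sides are normalised first.
print("asp" == "ASP")                  # False
print("asp".upper() == "ASP".upper())  # True
```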
If an error is encountered during a command script, THINK will execute an implicit GOTO command and jump to an error location. This may be defined in one of three ways:
Use of the ON_ERROR symbol allows the destination of the error jump to be changed whilst the script is executing, simply by changing the value of the symbol. The special value "CONTINUE" indicates that the error should be ignored and that THINK should execute the next line in the command script.
| LIST INFO=MOLECULES |
| ... |
| LABEL ON_ERROR |
| WRITE CONSOLE "Error listing molecules" |

| LET ON_ERROR = NOMOLS |
| LIST INFO=MOLECULES |
| ... |
| LABEL NOMOLS |
| WRITE CONSOLE "No molecules present" |
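The two examples above suggest how the destination of the error jump might be resolved. The Python sketch below is a guess at that logic, not a description of THINK's implementation: the function name error_target is hypothetical, the ON_ERROR symbol is assumed to take priority over an explicit "LABEL ON_ERROR", and the result None simply means that this sketch found no handler.

```python
def error_target(symbols, labels, current_line):
    """Return the line number to jump to after an error, or None.

    symbols: symbol name -> value; labels: label name -> line number.
    """
    target = symbols.get("ON_ERROR")
    if target is None:
        # No ON_ERROR symbol: use an explicit "LABEL ON_ERROR" if one exists.
        return labels.get("ON_ERROR")
    if target.upper() == "CONTINUE":
        # Ignore the error and carry on with the next line of the script.
        return current_line + 1
    # Otherwise the symbol holds the name of the label to jump to.
    return labels.get(target)


print(error_target({}, {"ON_ERROR": 7}, 2))
print(error_target({"ON_ERROR": "NOMOLS"}, {"NOMOLS": 9}, 2))
print(error_target({"ON_ERROR": "CONTINUE"}, {}, 2))
```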
Text strings can be written to an external file, the THINK log file or the console window through the WRITE command: "WRITE file text" where file is the name of the file to receive the data and text is one of the following:
File may be specified as LOGFILE or CONSOLE to write data to the THINK log file or console window respectively. If an error occurs whilst the data is being written out, THINK will execute an implicit GOTO command and will jump to the FILE_ERROR location. This is analogous to the ON_ERROR error location (see section 1.6.4) and may be an explicit label ("LABEL FILE_ERROR") or a symbol containing the name of a label or the value "CONTINUE".
Data may be read from a file into a symbol using the corresponding "READ file symbol" command, where file is the name of the file containing the data. Attempting to read past the end of the file will cause THINK to execute an implicit GOTO and jump to the FILE_END location. This may take the same range of values as the FILE_ERROR and ON_ERROR locations.
| READ mols.lis MOLNAME |
| LET TEXT = "Current molecule is " . MOLNAME |
| WRITE CONSOLE TEXT |
THINK includes a variety of numerical, string and system intrinsic functions, which are prefixed by "$". Values returned by intrinsic functions can be used like symbols. Each function takes a number of arguments and returns a single integer, real or string value that may be assigned to a variable.
In the table below, ival indicates an integer argument, rval a real argument and cval a character argument.
| Function | Return value | Description |
| $SQRT(rval) | Real | Returns the square root of rval |
| $EXP(rval) | Real | Returns e raised to the power rval (e^rval) |
| $LOG(rval) | Real | Takes the natural logarithm of rval |
| $LOG10(rval) | Real | Takes the common logarithm of rval |
| $ABS(rval) | Real | Returns the absolute value of rval |
| $INT(rval) | Integer | Returns the integer part of rval truncated towards zero, ie $INT(3.4) returns 3, $INT(-3.4) returns -3 |
| $NINT(rval) | Integer | Returns the nearest integer to rval. If rval>0 $NINT(rval) has the value $INT(rval+0.5). If rval≤0 $NINT(rval) has the value $INT(rval-0.5) |
| $CEILING(rval) | Integer | Returns the nearest integer that is greater than or equal to rval, ie $CEILING(3.1) returns 4, $CEILING(-3.1) returns -3 |
| $FLOOR(rval) | Integer | Returns the nearest integer that is less than or equal to rval, ie $FLOOR(6.3) returns 6, $FLOOR(-6.3) returns -7 |
| $TRUNCATE(rval1,ival2) | Real or integer | Truncates rval1 to ival2 decimal places, ie $TRUNCATE(2.468,2) returns 2.46. ival2 must be in the range 0-3. If ival2=0, $TRUNCATE returns the same value as $INT(rval1) |
| $ROUND(rval1,ival2) | Real or integer | Rounds rval1 to ival2 decimal places, ie $ROUND(2.468,2) returns 2.47. ival2 must be in the range 0-3. If ival2=0, $ROUND returns the same value as $NINT(rval1) |
| $MAX(rval1,rval2) | Real | Returns the larger value of rval1 and rval2 |
| $MIN(rval1,rval2) | Real | Returns the smaller value of rval1 and rval2 |
| $CPUTIME() | Real | Returns the number of seconds of CPU time used by the current THINK session |
| $ICHAR(cval) | Integer | Returns the ASCII value of the first character in cval |
| $CHAR(ival) | Character | Returns the character corresponding to the ASCII code ival |
| $INDEX(cval1,cval2) | Integer | Returns the starting position of substring cval2 within character string cval1. A value of 0 is returned if cval2 is not found |
| $LENGTH(cval) | Integer | Returns the length of character string cval |
| $TRIM(cval) | Character | Returns the string cval with all leading and trailing spaces removed |
| $LOWCASE(cval) | Character | Returns the string cval with all uppercase characters converted to their lowercase equivalents |
| $UPCASE(cval) | Character | Returns the string cval with all lowercase characters converted to their uppercase equivalents |
| $VERSION() | Character | Returns the THINK version number (eg 1.23b) |
| $FIELD(cval1,cval2) | Real | Returns the value stored in data field cval1 for molecule cval2 |
| $MOLECULE(ival) | Character | Returns the name of the ival'th molecule within THINK |
| $ATOM(ival) | Character | Returns the name of the ival'th atom within THINK |
| $QUERY() | Character | Returns the name of the query molecule |
| $FEXIST(cval) | Character | Returns TRUE if the file cval exists, otherwise returns FALSE |
| $FDELETE(cval) | Character | Attempts to delete the file cval and returns TRUE if the file is deleted successfully, otherwise returns FALSE |
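The rounding rules in the table can be checked against a short Python model. The function names below are hypothetical stand-ins for $NINT, $TRUNCATE, $ROUND and $INDEX, written directly from the table descriptions. Note that Python's built-in round() rounds halves to the nearest even integer, so it is not a substitute for the $NINT rule. Because the model uses binary floating point, a value that lies exactly on a decimal boundary may not behave as the decimal notation suggests.

```python
import math


def nint(x):
    """Model of $NINT: halves are rounded away from zero."""
    return int(x + 0.5) if x > 0 else int(x - 0.5)


def truncate(x, places):
    """Model of $TRUNCATE: cut x to 0-3 decimal places."""
    if not 0 <= places <= 3:
        raise ValueError("places must be in the range 0-3")
    scale = 10 ** places
    value = int(x * scale) / scale  # int() truncates towards zero
    return int(value) if places == 0 else value


def round_to(x, places):
    """Model of $ROUND: round x to 0-3 decimal places using the $NINT rule."""
    if not 0 <= places <= 3:
        raise ValueError("places must be in the range 0-3")
    scale = 10 ** places
    value = nint(x * scale) / scale
    return int(value) if places == 0 else value


def index(text, sub):
    """Model of $INDEX: position counted from 1, or 0 if sub is not found."""
    return text.find(sub) + 1


print(nint(2.5), nint(-2.5), round(2.5))           # 3 -3 2
print(truncate(2.468, 2), round_to(2.468, 2))      # 2.46 2.47
print(math.ceil(3.1), math.ceil(-3.1))             # 4 -3
print(math.floor(6.3), math.floor(-6.3))           # 6 -7
print(index("capsaicin", "sai"), index("capsaicin", "xyz"))  # 4 0
```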