Multi-Coordinate Driven Reaction Path Search

Again, I found something that probably useful for you for computational chemist. I think this stuff not only interesting for theoritical chemist but also applied science and engineering, especially the ones who work in chemical kinetic. It is a software that helps you to find the transition structure between reactants and products. More information could be obtained in http://www.chem.elte.hu/departments/elmkem/berente/mcd/readme.htm. This program is made by Imre Berente and named as MCD 1.2. Published work is presented in J. Phys. Chem. A 110 (2006) 772. Here, I will put the Readme file of the sofware and I hope it will clear to you.

Multi-Coordinate Driven Reaction Path Search

MCD 1.2 by Imre Berente

I. Introduction

This program is to calculate approximate reaction path and transition state candidates for chemical reactions. For scientific details see article J. Phys. Chem. A 110 (2006) 772. If you use this program in your scientific work creating a publication, please refer to this article.

The program is a driver for Gaussian 03, which makes the actual calculations. The Gaussian executable g03.exe is needed in the path.

The program is available as WIN32 and Linux executable and also as source code. You are free to examine, modify or take subroutines from it as long as you refer to this site in your readme. The compilations were made by Freepascal compiler.

The test calculations of the article are also available. You can examine them as input examples.

The results of this program are NEVER accurate intermediates/transition states, their structures and energies should not be used for any other purpose than getting a qualitative picture and as initial guess geometries for optimizations of the intermediates and saddle point searches for the TSs.

Than why use this program?
Because it is robust, deals with any reaction and you can test even the fanciest reaction mechanism theory. Compaired with the other robust methods it is pretty fast. It provides you good initial guess geometries for the geometry optimizations and saddle-point searches which are hard to create manually.

Requirements

The program needs the optimized geometry of the reactants and also the products. The user also must have an idea about the mechanism he wants to calculate with this program. Based on this mechanism internal coordinates needed to be selected which change monotonously with the reaction coordinate (active coordinates).

Active coordinate selection

You shall select all “important” coordinates of the reaction. The bond lengths of all forming and broken bond. The torsion angle of a rotation which inherently belongs to the reaction. Selecting too few coordinates can make the program fail.

Faliures because of lack of coordinates can be spotted by examining the Energy vs reaction coordinate plot created from the mcd.tbl file. You should see gauss-like peaks. If you see a “left sided peak” like on the image here, you can assume that an important coordinate is missing. The program lists the nonactive coordinate changes in the output, if it is much bigger than the stepsize, you can expect such error. The missing active coordinate is often between the jumping ones. In rare cases it is possible that the peak cannot be removed by including new coordinates to the active subspace. In such case the method is not able to solve the problem.

Defining a coordinate active which does not changes monotonously with the reaction coordinate can make artificial results and must be avoided.

Defining and “uninportant” coordinate which changes monotonously with the reaction coordinate increase computational cost but should not cause any faliures.

The program in brief

During the calculation, the program will do the following steps:

  1. Examine the forces and the Hessian on the active coordinates

  2. The program change these coordinates. The sum of the changes is the given stepsize and the changes are distributed among the active coordinates in order to archive minimal energy increase.

  3. The program runs the Gaussian to optimize all nonactive coordinates.

  4. Unless all coordinates reached their ending value, GOTO 1

Input files

The program does not use parameters, all input are in the input files.

  • mcd.par: the parameter file for the MCD program.

  • filename.chk: the checkpoint file of the optimized input structure. MCD will take the input geometry and the initial Hessian from this file. If you have an optimized reactant geometry with no checkpoint, you shall run an optimization which will converge in a few steps.

Output files

  • stepXXXX.gjf: input file for Gaussian to order the geometry optimization in the XXX-th step

  • stepXXXX.out/log: output file from the Gaussian containing the energy and the geometry in the XXX-th step

  • chksavXXX.chk: Gaussian checkpoint file containing lot of data from the XXX-th step. It is needed by the program for restarting. When finished, these huge files can be deleted.

  • mcd.out: the same file which is displayed on the screen during run. Usually unimportant, maybe some warnings should be checked.

  • mcd.tbl: summary of the calculation, contains the relative energies of the steps and also the values of the active coordinates.

How to check the result?

While the program runs it create steps. These steps are the milestones of the reaction path. You can load the step output files into GaussView and see if they are the ones you expected. When the run is complete, the last out file can be converged into the product geometry. If it is not and the difference is significiant than something is odd. Maybe the input was mistyped, the stepsize was too high, an important coordinate was not set active, maybe the reaction mechanism theory was bad. Using smaller stepsize or trying the opposite direction (products -> reactants) might help.

How to get internal minima?

After the calculation is finished, you shall load the mcd.tbl file into Excel or like and plot the Energy vs reaction coordinate. You shall check for hills and valleys. Some hills are TS-es and some valleys are internal minima. The others are just numerical actifacts. The number of artifacts can be greatly decreased by setting smaller stepsizes.

You shall start a local minimalization from every valley-bottom geometries which you think can belong to a minimum. If it is, you will get an optimized minumum. If it is an artifact you will find a minimum which you already have.

After that you shall find the TS candidates. There is exactly one TS between two minima, so the easiest way to find it is picking the highest peak between the two. Don’t forget that TS calculations need a good Hessian and unlike for optimizations the automatically generated approximate is not good enough. You shall specify calcfc in the input to order analytically calculated Hessian. If that calculation would be too expensive you shall start a job with maxcycle=1, calcfc on a smaller level of calculation (HF instead of DFT, smaller basis set) and read it’s Hessian by readfc in the real TS calculation.

  1. TS1 and it’s initial guess. They are very close to each other.

  2. Internal minimum and it’s initial guess. They are very close to each other.

  3. TS2

  4. The highest energy point between the internal minimum and the endpoint, therefore the initial guess of TS2. Though it is pretty far from it, it is still a good initial guess since TS2 was found by the default saddle point search method of the Gaussian 03 started from this geometry.

  5. Artifact minimum. An optimization from here goes to the optimized endpoint.

  6. Artificial TS. Since the minimum next to it found to be fake, we do not care about it.

Restarting a job

The job can be restarted at any step by deleting the .gjf and .out/log files belonging to the unfinsihed step and copying the chksavstepnum_of_the_last_completed_step.chk over the working checkpoint file. When you run mcd.exe again, it will not run the Gaussian for the steps which have an existing .out/log file on the disk.

Input file reference

The program reads the parameters you set from the mcd.par file. This file is a human-readable text file containing sections. Each section has a header and some lines below it. You can download an input file and use as template.

[checkpoint] section

This section contains only one line with the name of the working checkpoint file. This must have the optimized reactant geometry and should have a good Hessian matrix.

[job] section

This section will be copied to the Gaussian input file as header and route section. The opt and geom keywords should not be included, this is given by the program.

[scan] section

This section defines the active coordinates.

  • B 8 13 3.4 1.5
    This definition line means that a bond streching coordinate between atoms 8 and 13 is created. It’s value in the reactants is 3.4A, in the products it is 1.5A.

  • A 1 2 3 120 86
    This line defines an angle bend coordinate between atoms 1 2 3 which. The starting value is 120, the ending is 86 degrees.

  • D 1 2 3 4 120 + 150
    This definition defines a torsion coordinate between atoms 1, 2, 3, 4. The starting value is 120, the ending is 150 degrees, the direction is positive.

[frozen] section

This section defines frozen coordinates like
B 1 2 1.1
A 1 2 3 30
D 1 2 3 4 86
This section can be skipped if no frozen coordinates exist.

[mcd] section

This section contains seven lines. Each contains a keyword and a number. All keywords are neccesary and their order is fixed. Below you can see them, their meaning and their suggested value:

  • blocksize: the number of SCF+grad+geomopt steps the Gaussian makes before the result is examined. Too high value increase computational time. Too low has the chance of bad move if the convergence limits are loose enough. Suggested value: 3

  • stepsize: maximum size of the change of one coordinate in Angstroms (sum of the active coordinate changes/step). The actual stepsize depends on the convergence of the previous step. Too high value has the risk of error, too low wastes computational time. Suggested value: 0.2×number of active coordinates

  • maxstep: maximum active coordinate change in a step in Angstroms. Too high has big risk of error, too low wastes computational time Suggested value: 0.2.

  • rmsstepmin: the actual trust radius is stepsize×(rms_displacement-rmsstepmin)/(rmsstepmax-rmsstepmin). By setting rmsstepmin and max you can set the convergence level of the geometry optimization in the steps. Setting it too high can increase the number of artificial minima, setting it too low increase computational cost. Suggested value: 0.001

  • rmsstepmax if the RMS Displacement of the geometry optimization is bigger then this, the step is not accepted and another blocksize geometry optimization cycles are forced on the step (restart). Suggested value: 0.05

  • forcelimit if the Maximum Force of the geometry optimization is bigger then this, a restart is forced. Suggested value: 0.01

  • steplimit if the Maximum Displacement of of the geometry optimization is bigger then this, a restart is forced. Suggested value: 0.3

[opt] section

This section contains optimizer parameters which are copied to the Gaussian input. loose,tight,gdiis,calcfc and such parameters can be used. If you insert readfc, then the Hessian of the previous point is used creating the next. It is encouraged at small stepsized but not for large ones.

Error messages

The program sometimes ends with an error message. The meaning and possible fixes are listed here.

  • Cannot find final structure: The final step did not provided a complete gaussian output. Check stepend.out and stepend2.out for reasons. If you see curvilinear step not converged error message in the output, you set too high convergence limit in the [scan] section

  • “opt” or “geom” cannot be in the input job section: Error in input file. opt and geom keywords cannot be in the [job] section, you can set your parameters for the opt keyword in the [opt] section.

  • ‘[mcd] field is needed and must have all 7 values’: the input must contain the [mcd] section and the section must have all variables set. Use the example input as template.

  • the order of values in [mcd] field is fixed…: the [mcd] section must hold the variables in the same order as the example input .

  • No scan coordinate section found in the input: the [scan] section holds the list of the active coordinates therefore must exist

  • ‘Too many words in scan/frozen line’ or Wrong number of words in…: Type mismatch in the [scan] section.

  • Unknown type in scan/frozen line: The active coordinates can be bondstrechings, anglebendings and torsions

  • (Non)Optimized parameters section not found in filename: The gaussian job on filename was unsuccesful. Check it for reasons.

  • End of file while reading Hessian and Cannot read complete second derivative matrix: Uncomplete Hessian matrix was found in the last Gaussian file. Check it for reasons.

Warning messages

Some errors are not so fatal that makes termination of the calculation inevetable. Than a warning message is written to the screen and to the mcd.out. You shall consider if they are ignorable or they are signs of a problem.

  • the input structure is not perfectly optimized or the scan/frozen coordinates are not completely accurate: The step_1st.gjf orders Gaussian to optimize the initial structure found in the working checkpoint file applying the initial values of the active coordinates from the input file. Since the checkpoint is supposed to contain the optimized reactant geometries and the initial values of the active coordinates supposed to be equal to their values in the reactants, this step should converge in one geometry optimization cycle. If it does not, than you get this warning. Reasons can be incomplete or loose optimization of the reactants, inaccurate (like 2.15 instead of 2.14878) coordinate value or error in the input.

  • Warning: unexpected change in coordinate N: it means that this active or frozen coordinate has a different value in the output of the geometry optimization step than it was defined in the input. It is the problem of the redundant coordinate system modifier of the Gaussian, since the coordinates are correlated, changing one changes the other as well. Unless your active and frozen coordinates are dependent, fixing their value is possible, but Gaussian sometimes fails to make it. Every time you see this warning you shall look at the output to decide if the structure is acceptable or not. Usually the difference is small and diminishes in the next step. If it does not or the error is inacceptable, the only solution can be decreasing the stepsize.

Support

I tried to make as clear and readable code as possible but I know that clear and readable code is just as rare as honest politican. Feel free to contact me:

  • you need help in modifying the program, creating derivative work

  • you need help in recode it in a different language

  • the program behaves differently than written in this manual (bug)

  • the manual is unclear

  • you get a crash like:
    Runtime error 207 at 0×00401062
    0×00401062
    0×004010D5
    If you report a crash, please send me the input files as well so I can examine the reason of the crash

  • If you have an idea to make it better, need a new functionality


Do not ask for help in selecting appropriate active coordinate for your reaction. This program is not a black box machine for dummies but a tool for chemists. The active coordinate can only be defined on the basis of your mechanism theories. If you don’t have theories, too bad. :-)

You can contact me by sending a mail to imre.berente@mailbox.hu.
I hope this program will become a useful tool for you!
Imre Berente, 2006. Apr. 28, Székesfehérvár, Hungary

Post a Comment