Wednesday, June 26, 2013

Gamess (US) frequently asked questions Part 2: Installation in Linux boxes


This is a guest post by Kirill Berezovsky (Petrozavodsk State University), based on instructions he posted some time ago on the Gamess (US) list.



First, you need to get some software!
• Fortran compiler: gfortran or Intel Fortran (ifort);
• Math libraries: ACML (for AMD CPUs), ATLAS or MKL (Intel Math Kernel Library);
• (optionally) MPI: Intel MPI, OpenMPI, MVAPICH2, ... (if you are using MPI);
(bold is better)

  • I do not recommend OpenMPI because it is considerably slower than Intel MPI.
    But if you want to try it, go to this site. It gives a very short and informative recipe for building 64-bit OpenMPI.

For non-commercial use, you can get this software at
http://software.intel.com/en-us/non-commercial-software-development
(Intel Fortran and MKL are included in Intel® Fortran Composer XE 2013 for Linux)

Like the GAMESS (US) developers, I use:
• ifort 12.0.4 (included in Intel Fortran Composer XE 2011 update 4, “l_fcompxe_2011.4.191.tgz”)
• MKL 11.0 (included in Intel Fortran Composer XE 2013 initial release, “l_fcompxe_2013.0.079.tgz”).

Be aware, ifort 13.0.0 can’t compile some GAMESS (US) objects!

In any case, you can install individual components from these archives: read the installer options carefully. This will save disk space and may protect you from some mistakes.




Great, once you have these packages, let’s start configuring your system!
I think it is best to do this as root.

Before starting, configure shared memory with a few terminal commands. First, answer the question: how much RAM does your PC have? For example, my PC has 4 GB of RAM, so in bytes it will be:

4 GB = 4*1024 MB = 4*1024*1024 KB = 4*1024*1024*1024 bytes = 4294967296 bytes.

Then halve it: 4294967296 bytes / 2 = 2147483648 bytes.
This number is the maximum size of a single shared memory segment.

Next, the total size of shared memory, in pages of 4096 bytes, will be: 2147483648 / 4096 = 524288 pages.

Write these numbers to /etc/sysctl.conf:
echo "kernel.shmmax=2147483648" >> /etc/sysctl.conf
echo "kernel.shmall=524288" >> /etc/sysctl.conf
Then restart your machine.
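
The arithmetic above can also be scripted. What follows is only a minimal sketch and not part of the original instructions: the function name shm_settings is made up for illustration, and the default page size of 4096 bytes is the one assumed in the calculation above.

```shell
# Print the two sysctl lines for a machine with the given amount of RAM.
# $1 = RAM in bytes, $2 = page size in bytes (hypothetical default: 4096)
shm_settings() {
    ram_bytes=$1
    page_size=${2:-4096}
    # Largest single shared memory segment: half of the RAM.
    shmmax=$((ram_bytes / 2))
    # Total shared memory, expressed in pages.
    shmall=$((shmmax / page_size))
    printf 'kernel.shmmax=%s\nkernel.shmall=%s\n' "$shmmax" "$shmall"
}

# The 4 GB example from the text:
shm_settings $((4 * 1024 * 1024 * 1024))
# prints:
# kernel.shmmax=2147483648
# kernel.shmall=524288
```

On a real machine you would pass your own RAM size, and could append the output to /etc/sysctl.conf (as root) instead of typing the two numbers by hand. The page size of the running system can be checked with getconf PAGE_SIZE.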

Install the packages!
1. It is better to use 64-bit Linux, in whichever distribution you like. I’m using Debian 6.0.7;

2. In a terminal, install these packages (just in case):
apt-get install tcsh gcc g++ gfortran build-essential dpkg-dev binutils zlib1g-dev

3. Install Intel Fortran Composer XE 2011;
   1. Unpack the downloaded archive: tar xvf l_fcompxe_2011.4.191.tgz
   2. Go to the unpacked folder: cd l_fcompxe_2011.4.191
   3. Start the installer: ./install.sh
   4. ...And follow the instructions

4. Install GAMESS (US);
   1. Get the GAMESS archive;
   2. Unpack it (by default in /usr/local/);
   3. Go to the GAMESS folder and run the script: ./config
   4. Answer the questions:
      1. Target machine name: linux64
      2. GAMESS location: /usr/local/gamess
      3. Build location: /usr/local/gamess
      4. GAMESS executable version name (any name you want; in this example we will use "cpu"): cpu
      5. Fortran compiler (choose the one you use):
         • ifort --> version: 12
         • gfortran --> version (like 4.4), which you can find by running in another terminal: gfortran -v
      6. Math library: mkl
      7. Math library location: /opt/intel/composerxe_2011/mkl (verify it for your installation!)
      8. When it shows a string containing 'bin' and 'lib', type: skip
      9. If you are not using MPI, type: sockets and you'll finish the configuration;

      10. Otherwise, if you are using MPI, type: mpi
          1. Next, choose the MPI program: impi
          2. Select the MPI location directory, and go on.

      11. Answer “no” to the “LIBCCHEM” question. Even if you are using an NVIDIA GPU for calculations, answer “no” during this first configuration, go on, and don’t forget to read the important note after these steps.

5. Go to the ddi folder: cd ddi
6. Edit the 'compddi' script: gedit compddi
7. Find and change these lines:
   1. set MAXCPUS=4 (the number of cores: is it 4 in your machine?)
   2. set MAXNODES=1 (for a single node)
8. Run the 'compddi' script: ./compddi
   • If you're NOT using MPI, this produces the file ddikick.x. Move it up one level with: mv ddikick.x .. (the two dots mean the parent folder)

9. Go up one folder: cd ..
10. Then compile GAMESS (US): ./compall

11. Link with: ./lked gamess cpu which creates the gamess.cpu.x file. Of course, you can use any version name you want, not only 'cpu'


12. Edit 'rungms'-script:
  • See here the rungms-script rewritten by Kirill.
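
Steps 6 and 7 above (editing 'compddi' by hand) can also be done non-interactively. The sketch below is only an illustration: the function name set_ddi_limits is hypothetical, it relies on GNU sed (sed -i), and it assumes that your copy of compddi contains lines beginning with "set MAXCPUS" and "set MAXNODES" followed by an equals sign, so check the result before compiling.

```shell
# Rewrite the MAXCPUS and MAXNODES settings in a compddi-style script.
# $1 = script file, $2 = number of cores, $3 = number of nodes
set_ddi_limits() {
    file=$1
    cpus=$2
    nodes=$3
    # Keep a backup so the original script can be restored.
    cp "$file" "$file.bak"
    # Replace each whole "set NAME = value" line with the new value.
    sed -i \
        -e "s/^[[:space:]]*set[[:space:]]\{1,\}MAXCPUS[[:space:]]*=.*/set MAXCPUS = $cpus/" \
        -e "s/^[[:space:]]*set[[:space:]]\{1,\}MAXNODES[[:space:]]*=.*/set MAXNODES = $nodes/" \
        "$file"
}

# Example for a single node with 4 cores, run from inside the ddi folder:
if [ -f compddi ]; then
    set_ddi_limits compddi 4 1
    grep -n 'MAXCPUS\|MAXNODES' compddi   # inspect what was changed
fi
```

The backup copy (compddi.bak) makes it easy to compare the edited script against the original if the build misbehaves.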

Next you should create folders:
  • mkdir /scr
  • mkdir /scr/root
  • mkdir /root/scr

Add these environment variables to the ~/.bashrc file:

# iFort
export PATH=/opt/intel/composerxe-2011.4.191/bin/intel64:$PATH
export LD_LIBRARY_PATH=/opt/intel/composerxe-2011.4.191/compiler/lib/intel64:$LD_LIBRARY_PATH

# iMPI
export PATH=/opt/intel/impi/4.0.2.003/intel64/bin:$PATH
export LD_LIBRARY_PATH=/opt/intel/impi/4.0.2.003/intel64/lib:$LD_LIBRARY_PATH

# MKL
export LD_LIBRARY_PATH=/opt/intel/composer_xe_2013.1.117/mkl/lib/intel64:$LD_LIBRARY_PATH
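
Because the version numbers in these paths differ from one installation to another, it can help to add a directory only when it really exists. The following is a hedged sketch of that idea: the helper name prepend_if_dir is made up for illustration, and the path in the example is simply the one used above.

```shell
# Prepend a directory to a colon-separated search path, but only if the
# directory exists. $1 = directory, $2 = current value of the variable.
prepend_if_dir() {
    if [ -d "$1" ]; then
        # Add the ":" separator only when the old value is not empty.
        printf '%s' "$1${2:+:$2}"
    else
        echo "warning: $1 not found, skipping" >&2
        printf '%s' "$2"
    fi
}

# Example with the ifort directory used in this post:
PATH=$(prepend_if_dir /opt/intel/composerxe-2011.4.191/bin/intel64 "$PATH")
export PATH
```

The same helper works for LD_LIBRARY_PATH. It also avoids leaving a trailing colon when the variable was empty beforehand, which would otherwise add the current directory to the search path.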


To run GAMESS (US) just type:

/usr/local/gamess/rungms [input file] [optionally, the 'version name' of your GAMESS executable ("cpu" in this example)]
for example:

/usr/local/gamess/rungms BSi85H95.inp
/usr/local/gamess/rungms BSi85H95.inp cpu   will run exactly the same.

To make running simpler, edit ~/.bashrc:
gedit ~/.bashrc
• add the line alias gamess='/usr/local/gamess/rungms'
• apply the changes with source ~/.bashrc
• and now you can run it with:

gamess BSi85H95.inp cpu
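
If you have several inputs, the same call can be wrapped in a loop. This is a minimal sketch under stated assumptions: the function name run_all is hypothetical, and it assumes that your rungms script writes the GAMESS output to standard output, so that it can be redirected to a log file.

```shell
# Run every given input file through a runner script, one after another.
# $1 = runner (e.g. /usr/local/gamess/rungms), $2 = version name,
# remaining arguments = input files ending in .inp
run_all() {
    runner=$1
    version=$2
    shift 2
    for inp in "$@"; do
        # BSi85H95.inp -> BSi85H95.log
        log="${inp%.inp}.log"
        if ! "$runner" "$inp" "$version" > "$log" 2>&1; then
            echo "run failed for $inp, see $log" >&2
        fi
    done
}

# Example (not executed here):
#   run_all /usr/local/gamess/rungms cpu *.inp
```

Each job gets its own log file next to its input, and a failed job is reported without stopping the remaining ones.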

END!

Thursday, June 20, 2013

Science by press release

I woke up today to the news that researchers at the University of Aveiro had, "for the first time", altered the translational apparatus of an organism. I was outraged by the news: not by the science itself, but by the mindless hype surrounding it: actually, such a modification had already been performed in 2011 in C. elegans. I first thought that the "first time evah" pitch had been added by ignorant journalists, but the hype was already present in the press release from Univ. Aveiro!
The research publicized today is good and interesting, no doubt about that, but the quest for "good press" should never come at the expense of the truth. There is no excuse for that. Every bit of "good press" achieved with hype/exaggeration unfairly benefits those institutions and/or researchers with no moral qualms, leaving those researchers who are honest enough not to misrepresent their results at a disadvantage.

I've always disliked "science by press release", because (all other things being equal) it disproportionately benefits those who have access to the mass media, or who can afford publicists. Hyped press releases are even worse. And this can only end when science journalists stop relying on press releases to decide what is newsworthy. Though I strongly believe that such a day will not come in the next 5 × 10^9 years.


Addendum: Previous reports all reassigned a STOP codon to an unnatural amino acid. The report from Univ. Aveiro is indeed the first time that a non-STOP codon has been reassigned in an organism. This difference is unfortunately not present in the press release. I still stand by all other points in my post.

Wednesday, June 19, 2013

Gamess (US) frequently asked questions Part 1: SCF convergence

In spite of the very high quality of the Gamess(US) documentation, the Gamess(US) list is very often flooded with requests from new users regarding the lack of convergence of the SCF procedure. A few words of advice:

When your SCF does not converge, you should re-run the job including a $guess guess=moread $end line, as well as the complete $VEC group present in the output PUNCH file (usually called <jobname>.dat, and present in your scratch directory).

    Addendum:

    Whenever you read a $VEC group from a UHF run you must assign NORB in the $GUESS group. An additional problem is that by default the $VEC group only includes the occupied orbitals, which means that in UHF runs the $VEC group may not include equal numbers of alpha and beta orbitals (e.g., a run with 41 electrons and MULT=2 will have 21 alpha orbitals and 20 beta orbitals). Therefore, if you include

    $guess guess=moread NORB=21 $end

    Gamess will crash because there are not 21 beta orbitals, and if you input

    $guess guess=moread NORB=20 $end

    there will be another error, since there are more than 20 alpha orbitals. In these cases, you should check the number of alpha and beta orbitals. Then copy the coefficients of the extra alpha orbitals to the end of the beta orbitals. In my example above

    $guess guess=moread NORB=21 $end

    will yield no problems, since the modification of the VEC group yields equal numbers of alpha and beta orbitals. There is also an option to PUNCH every orbital (occupied+virtuals) at every step. In this case, Gamess always punches a full $VEC group, making it very easy to assign NORB as one can simply inspect the output file to learn the number of orbitals. However, this yields gigantic PUNCH files, and may therefore not be feasible.
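
The orbital counts in the example above follow directly from the number of electrons and the multiplicity. The sketch below is only an illustration (the function name uhf_orbital_counts is made up): it computes how many occupied alpha and beta orbitals to expect in the $VEC group of a UHF run.

```shell
# Number of occupied alpha and beta orbitals in a UHF run.
# $1 = number of electrons, $2 = spin multiplicity (MULT)
uhf_orbital_counts() {
    nelec=$1
    mult=$2
    # There are MULT-1 unpaired electrons, so nelec + mult must be odd.
    if [ $(( (nelec + mult) % 2 )) -ne 1 ]; then
        echo "error: $nelec electrons cannot have multiplicity $mult" >&2
        return 1
    fi
    nalpha=$(( (nelec + mult - 1) / 2 ))
    nbeta=$(( nelec - nalpha ))
    echo "alpha=$nalpha beta=$nbeta"
}

# The example from the text: 41 electrons, MULT=2
uhf_orbital_counts 41 2
# prints: alpha=21 beta=20
```

When the two numbers differ, the alpha count is the value to use for NORB once the beta set has been padded as described above.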




You should also experiment with changing convergers, damping, etc. Some systems are notoriously hard to converge, and may require several re-iterations of the whole process. 

Thursday, August 30, 2012

Advances in peptide chemistry

Protein synthesis is nowadays achieved through molecular biology techniques: the relevant gene is cloned in an appropriate vector, over-expressed with e.g. a poly-histidine tag, and then purified through high affinity chromatography. Peptide chemistry is therefore often forgotten by biochemists, unless we need to order a short customized peptide from a commercial source.
Danishefsky et al. have now combined solid-phase peptide synthesis, native chemical ligation and metal-free dethiylation to synthesize a number of analogues of human parathormone. Their strategy afforded native parathormone with higher purity than that obtained from commercial sources, as well as pure analogues not achievable by any other means. These analogues were shown to be much more stable (10% decomposition in 7 days) than parathormone (>90% loss in 7 days), and to be as active as parathormone when injected into mice.
This is very interesting work, which should pave the way towards the synthesis of long-lived synthetic peptide hormones, thus potentially decreasing the number of injections needed to control hormone levels in patients suffering from impaired endocrine function.

Friday, April 13, 2012

Drawing can be torture



Drawing complex three-dimensional molecules in two dimensions can be real torture. I am glad I have never had to draw anything as convoluted as palhinine A. Check the 3-D structure on the left, and try to draw it in less than 10 minutes in ChemDraw or ChemSketch. Good luck!
palhinine A


Thursday, March 15, 2012

QM/MM vs. QM-only studies of large cluster models

How large must a quantum model of an enzyme active site be to achieve optimum results? Proponents of the so-called "cluster model" argue that, most often, good results may be obtained even with small models (< 100 atoms). Fahmi Himo has repeatedly shown that fully including the first layer of amino acids surrounding the reacting substrate (i.e. up to about 150 atoms) yields results that are insensitive to the inclusion of a polarizable-continuum solvent field, and has concluded from these data that such models are sufficient to capture all the relevant enzymatic effects on catalysis.

Walter Thiel has now published a QM/MM analysis of the reaction mechanism of acetylene hydratase (previously studied by Fahmi Himo using increasingly large QM-only models). Inclusion of the surrounding protein dramatically changed the results for the largest model studied by Himo, due to the absence (in the "cluster model") of two negatively charged phosphate groups adjacent to the active site. Although these charges are quite "shielded" from the active site because of neighbouring positively-charged amino acids, they give rise to local charge asymmetries that interact differently with the active site during each step of the catalytic cycle. This effect is quite similar to the major influence of the internal protein dipoles on enzyme catalysis expounded by Arieh Warshel, and should be kept in mind by all of us who tend to prefer the QM-only approach: a polarizable-continuum model assumes a homogeneous environment surrounding the QM system, and in proteins "it ain't necessarily so".

Tuesday, November 29, 2011

An interesting hypothesis on the selection of glucose as major fuel source in neurons

Earlier this year, I wondered why neurons preferentially use glucose as fuel. I have now found an interesting paper by Dave Speijer regarding this problem. He proposes the following reasoning to explain this observation:
  • reactive oxygen species are generated in large amounts by NADH dehydrogenase (complex I) when the amount of oxidized ubiquinone is limited
  • generation of large amounts of FADH2 increases the rate of reduction of ubiquinone, and therefore increases indirectly the amount of harmful radical species generated by NADH dehydrogenase
  • glucose oxidation generates a much smaller amount of FADH2 than fatty-acid oxidation. Therefore:


  • Especially vulnerable cells may be expected to have evolved a preference for glucose.

    Incidentally, neurons do seem to lack large amounts of one of the enzymes involved in fatty acid oxidation: thiolase.
