Sunday Recap

5 minute read

Published:

Sed helps me prepare files for ForceBalance

In my workflows I use ForceBalance to optimize the force fields of my small molecules. I normally start from gaff2 or sage and then further optimize the bonded parameters as well as the charges (LJ parameters are a different beast and it is better to take the values from a pre-trained and highly tested force field like gaff2). To modify an Amber parameters file for use with ForceBalance, it is best to use sed and edit the files like this:

#!/bin/bash

CountParameters() {
        read -p "Enter name of file: " PARM_FILE
        bond_parms=$(( ($(sed -n '/BOND/,/ANGLE/p' "$PARM_FILE" | wc -l) - 3) * 2 ))
        echo "bond parms is $bond_parms"
        angle_parms=$(( ($(sed -n '/ANGLE/,/DIHE/p' $PARM_FILE | wc -l) - 3) * 2 ))
        echo "ang parms is $angle_parms"
        dih_parms=$(( ($(sed -n '/DIHE/,/IMPROPER/p' $PARM_FILE | wc -l) - 3)))
        echo "dih parms is $dih_parms"
        imp_parms=$(( ($(sed -n '/IMPROPER/,/NONBON/p' $PARM_FILE | wc -l) - 3)))
        echo "improper parms is $imp_parms"
        total=$(($bond_parms + $angle_parms + $dih_parms + $imp_parms))
        echo "total number of parameters is $total"
}

PasteFBKeyword() {
        read -p "Enter name of file: " PARM_FILE
        BOND_KW="# PRM 1 2"
        sed -n '/MASS/,/BOND/p' $PARM_FILE > temp_file.txt
        sed -n '/BOND/,/ANGLE/p' $PARM_FILE | sed "s/$/ $BOND_KW/" >> temp_file.txt
        ANGLE_KW="# PRM 1 2"
        sed -n '/ANGLE/,/DIHE/p' $PARM_FILE | sed "s/$/ $ANGLE_KW/" >> temp_file.txt
        DIH_KW="# PRM 2"
        sed -n '/DIHE/,/IMPROPER/p' $PARM_FILE | sed "s/$/ $DIH_KW/" >> temp_file.txt
        IMP_KW="# PRM 1"
        sed -n '/IMPROPER/,/NONBON/p' $PARM_FILE | sed "s/$/ $IMP_KW/" >> temp_file.txt
        sed -n '/NONBON/,$p' $PARM_FILE >> temp_file.txt
}

CountParameters
PasteFBKeyword

One function can count the number of bonded parameters to give an idea of the size of the optimization. The other function adds the FB keywords for parameters to be optimized. The file will need some further editing because the script adds some redundant lines but whatever.

Love for bash but there are limits

Bash and command line are awesome but they are not made for large-data analysis. Over the weekend I was analyzing the moment of inertia of multiple molecules and that requires a vast amount of inertia tensor diagonalization plus the sorting of the eigenvalues. For the former I used Amber’s cpptraj but for the latter I attempted to use bash:

N=150
T=7
Frames=5000

for ((m=0; m<=${T}; m++)); do
        mkdir -p temp_${m} && cd temp_${m}
        for ((i=1; i<=${N}; i++)); do
                mkdir -p molecule_${i} && cd molecule_${i}
                cat << EOF >  principal_${i}.in
parm ../../5cb.box.parm7
trajin ../../remd.nc.${m}
principal :${i} out all.eigens.frames_${i}.dat
EOF
                cpptraj -i principal_${i}.in
                cd ../
        done
        cd ../
done

for ((m=0; m<=${T}; m++)); do
        touch all-data-temp_${m}.dat
        for ((j=1; j<=${Frames}; j++)); do
                for ((i=1; i<=${N}; i++)); do
                        grep "^${j} EIGENVECTOR 2" temp_${m}/molecule_${i}/all.eigens.frames_${i}.dat >> all-data-temp_${m}.dat
                done
        done
done

The script is terribly slow at the grep part as it took roughly 3 hours to finish. I really should not use grep for stuff like that but it is just too easy to use. I will rewrite this in C++.

Binding thermodynamics calculation

I completed the binding free energy calculation between Sandercyanin (Receptor) and Biliverdin (Ligand) using the MMGBSA algorithm.

Do not drink classical water - Mariana Rossi’s visit at UNC

I attended a presentation by Mariana Rossi on ab initio Molecular Dynamics. She is incorporating quantum effects (like the tunneling of protons) into classical simulations. At some point, she mentioned that if we treat water completely classically then the pKa would be roughly 8.5 instead of 7. I am not sure I understand it but I found an old youtube video where she explains just that at the 36:38/52:59 timestamp. Worth giving this another watch and thought.

Package your code

Found this guide to packaging your code. Should have a look for sure.

Beautiful visuals and good paper

Read Timothy Duignan’s paper on the Carbonic anhydrase II and the visuals as well as the results are awesome. For the visuals he referenced logmd, will also have a look! About the deep neural network potentials I am still not sure because one might say that in ML there is no potential function, no Newton (for classical mechanics). All there is is data – such as x-ray structures (or sequences in what I did). No obvious (to me at least) physical insight results in the ML approach.

Awesome paper

Read this very awesome paper.

Prat’s lectures

Prat is a really good lecturer and has a very deep understanding of statistical mechanics. I am learning a lot from his lectures online even though he works in a different field, ML, and not necessarily on theoretical statistical mechanics. In this youtube video he mentioned something that sounded super interesting to me: temperature replica exchange is similar to quantum tunneling where your system is moved across free energy barriers by swapping to higher temperature replicas. Very clever, I liked it a lot. He codes are not easy to use but I definitely want to familiarize myself with all the ML tricks he has been cooking in Maryland.