
In protein structure prediction, a key measure of accuracy is how well the predicted energy or score correlates with the distance to the native conformation. A common distance measure is the all-atom root mean squared deviation (RMSD). A challenge, however, is that the energy is not expected to be discriminating far from the native conformation, so the assessment should be biased toward structures near the native conformation. The Pnear metric, defined in (Bhardwaj, et al., Nature, 2016), measures how "funnel-like" a score-vs-RMSD plot is. See the Pnear Rosetta Documentation.

Usage

Pnear(score, rmsd, lambda = 1.5, kbt = 0.62, verbose = FALSE)

Arguments

score

a vector of scores, e.g. Rosetta energies computed with the ref2015 scorefunction.

rmsd

a vector of root mean squared deviation (RMSD) values to the native conformation, e.g. computed over backbone atoms.

lambda

Lambda is a value in Angstroms indicating the breadth of the Gaussian used to define "native-like-ness". The bigger the value, the more permissive the calculation is to structures that deviate from native. Typical values for peptides range from 1.5 to 2.0, and for proteins from 2.0 to perhaps 4.0.
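To make the effect of lambda concrete, the sketch below tabulates the Gaussian weight exp(-RMSD^2/lambda^2) that appears in the Pnear numerator for a few RMSD values. The helper name `native_weight` and the RMSD values are made up for illustration and are not part of the package.

```r
# Hypothetical helper (not part of the package): Gaussian "native-like-ness"
# weight for an RMSD value, i.e. the term exp(-RMSD^2 / lambda^2).
native_weight <- function(rmsd, lambda) exp(-(rmsd / lambda)^2)

rmsd    <- c(0, 1, 2, 4)     # illustrative RMSD values in Angstroms
lambdas <- c(1.5, 2.0, 4.0)  # values from the typical ranges quoted above

# One row per RMSD value, one column per lambda
weights <- sapply(lambdas, function(l) native_weight(rmsd, l))
dimnames(weights) <- list(paste0("rmsd=", rmsd), paste0("lambda=", lambdas))
round(weights, 3)
```

A structure at 0 Angstroms always gets weight 1, and for any non-zero RMSD the weight grows with lambda, which is the sense in which larger values are more permissive.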

kbt

The value of k_B*T, in energy units, determines how large an energy gap must be in order for a sequence to be said to favor the native state. The default value, 0.62, should correspond to physiological temperature for ref2015 or any other scorefunction with units of kcal/mol.
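As a rough check of where 0.62 comes from, the arithmetic below multiplies the Boltzmann constant in kcal/(mol K) by an assumed physiological temperature of 310.15 K (37 degrees Celsius). The temperature is an assumption made here for illustration, not a value stated by the package.

```r
# Boltzmann constant (gas constant per mole) in kcal/(mol*K)
k_B <- 0.0019872041

# Assumed physiological temperature: 37 degrees Celsius in Kelvin
temperature_K <- 310.15

kbt <- k_B * temperature_K
round(kbt, 2)  # rounds to the default of 0.62

# Relative Boltzmann weight of a structure 1 kcal/mol above the minimum
# score, using the default kbt
rel_weight <- exp(-1 / 0.62)
rel_weight     # roughly 0.2
```

With this kbt, a structure scoring 1 kcal/mol worse than the best one contributes about a fifth as much to the sums as the best one does.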

verbose

give verbose output.

Value

numeric value.

Details

# subtract off the min-score as is done in the Rosetta Code
scores = scores - min(scores)

# write down the equation in more code-like notation
Pnear <- Sum_i[exp(-RMSD[i]^2/lambda^2)*exp(-scores[i]/k_BT)] /
         Sum_i[exp(-scores[i]/k_BT)]

# combine the terms in the first exponential
Pnear = Sum_i[exp(-RMSD[i]^2/lambda^2 - scores[i]/k_BT)] /
        Sum_i[exp(-scores[i]/k_BT)]

# name the two exponents
let a_i = -RMSD[i]^2/lambda^2 - scores[i]/k_BT
    b_i = -scores[i]/k_BT

Pnear = Sum_i[exp(a_i)] / Sum_i[exp(b_i)]

# Use the log-sum-exponential trick
log(Pnear) =   log_sum_exp(-RMSD[i]^2/lambda^2 - scores[i]/k_BT)
             - log_sum_exp(-scores[i]/k_BT)
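The derivation above can be sketched as runnable R. This is a minimal illustration under stated assumptions, not the package's own source: `log_sum_exp` and `pnear_sketch` are hypothetical names, and the score and RMSD values are made up.

```r
# Hypothetical helper: numerically stable log(sum(exp(x)))
log_sum_exp <- function(x) {
  m <- max(x)
  m + log(sum(exp(x - m)))
}

# Sketch of the Pnear calculation described above (not the package's code)
pnear_sketch <- function(score, rmsd, lambda = 1.5, kbt = 0.62) {
  stopifnot(length(score) == length(rmsd), lambda > 0, kbt > 0)
  # subtract off the minimum score
  score <- score - min(score)
  # log of numerator and denominator, each via log-sum-exp
  log_num <- log_sum_exp(-rmsd^2 / lambda^2 - score / kbt)
  log_den <- log_sum_exp(-score / kbt)
  exp(log_num - log_den)
}

# Made-up funnel-like data: lower scores sit at lower RMSD
score <- c(-120, -118, -110, -105, -100)
rmsd  <- c(0.5, 1.0, 3.0, 6.0, 9.0)
p <- pnear_sketch(score, rmsd)
p
```

Because each Gaussian factor is at most 1, the result always lies between 0 and 1. Working in log space also makes log(Pnear) available directly as `log_num - log_den` if that is the quantity of interest.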

Note

Unlike the Conway discrimination score, the PNear calculation uses no hard cutoffs. This is advantageous for repeated testing: if the scatter of points on the RMSD plot changes very slightly from run to run, the PNear value will only change by a small amount, whereas any metric dependent on hard cutoffs could change by a large amount if a low-energy point crosses an RMSD threshold.
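The toy comparison below illustrates this point. It contrasts a Gaussian-weighted metric with a hypothetical hard-cutoff metric when the lowest-energy point moves from 1.99 to 2.01 Angstroms. The hard-cutoff metric is a stand-in invented for this example; it is not the Conway discrimination score, and all function names and data are assumptions.

```r
# Normalised Boltzmann weights after subtracting the minimum score
boltz <- function(score, kbt = 0.62) {
  w <- exp(-(score - min(score)) / kbt)
  w / sum(w)
}

# Soft metric: Boltzmann-weighted Gaussian of RMSD (the Pnear form)
soft_metric <- function(score, rmsd, lambda = 2.0) {
  sum(boltz(score) * exp(-rmsd^2 / lambda^2))
}

# Hypothetical hard metric: Boltzmann-weighted fraction below an RMSD cutoff
hard_metric <- function(score, rmsd, cutoff = 2.0) {
  sum(boltz(score) * (rmsd < cutoff))
}

score       <- c(-100, -95, -90)  # made-up scores
rmsd_before <- c(1.99, 5, 8)      # lowest-energy point just inside 2 Angstroms
rmsd_after  <- c(2.01, 5, 8)      # same point just outside 2 Angstroms

soft_change <- soft_metric(score, rmsd_before) - soft_metric(score, rmsd_after)
hard_change <- hard_metric(score, rmsd_before) - hard_metric(score, rmsd_after)
c(soft = soft_change, hard = hard_change)
```

In this toy case the soft metric shifts by less than 0.01 while the hard-cutoff metric drops by nearly its full range, since the point that crosses the threshold carries almost all of the Boltzmann weight.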

Author

Vikram K. Mulligan (vmulligan@flatironinstitute.org); adapted from Rosetta.

Examples

if (FALSE) { # \dontrun{
 Pnear(score = score_a, rmsd = rmsd_a)
} # }