Bioinformatics Highlight: Assessing protein mass spectrometry data using Percolator – how to separate valuable information from the noise

Mass spectrometry (MS) is currently the most effective way to analyze proteins on a large scale, and hence one of the most important tools in proteomics. However, analyzing the wealth of data that MS-based experiments produce still poses difficult challenges.

Lukas Käll has been addressing these challenges for several years and has developed several open-source tools that are highly regarded in the proteomics community. His software has been incorporated into commercial offerings, and the majority of MS equipment for proteomics currently ships with it.

An important focus has been discriminating correct from incorrect peptide identifications in database searches. False identifications plague high-throughput biology, and Käll’s methods use modern machine-learning techniques to remove them and to estimate false-discovery rates. For examples of his software, see Percolator and qvalue. You can read more in the papers listed below.
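
To make the idea of estimating false-discovery rates concrete, here is a minimal sketch of one common approach. It is not Percolator's or qvalue's actual implementation. It assumes each peptide-spectrum match (PSM) has a score where higher is better, and that a search against a decoy database is available so that the number of decoy hits above a threshold approximates the number of false target hits. The function name `estimate_q_values` and its inputs are hypothetical, chosen only for this illustration, and tied scores are not treated specially.

```python
def estimate_q_values(scores, is_decoy):
    """Return one q-value per PSM, in the same order as the input.

    scores   -- list of PSM scores, higher meaning a better match (assumption)
    is_decoy -- list of booleans, True where the PSM matched a decoy sequence
    """
    # Rank PSMs from best to worst score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

    # Walk down the ranking and estimate the FDR at each score threshold
    # as (decoys accepted so far) / (targets accepted so far).
    fdrs = []
    targets = decoys = 0
    for i in order:
        if is_decoy[i]:
            decoys += 1
        else:
            targets += 1
        fdrs.append(min(1.0, decoys / max(targets, 1)))

    # The q-value of a PSM is the smallest FDR at which it would still be
    # accepted, so take a running minimum from the worst score upwards.
    q_sorted = fdrs[:]
    for k in range(len(q_sorted) - 2, -1, -1):
        q_sorted[k] = min(q_sorted[k], q_sorted[k + 1])

    # Map the q-values back to the original input order.
    q_values = [0.0] * len(scores)
    for rank, i in enumerate(order):
        q_values[i] = q_sorted[rank]
    return q_values


if __name__ == "__main__":
    example_scores = [9.0, 8.0, 7.0, 6.0, 5.0]
    example_decoys = [False, False, True, False, True]
    for s, d, q in zip(example_scores, example_decoys,
                       estimate_q_values(example_scores, example_decoys)):
        print(s, "decoy" if d else "target", round(q, 3))
```

Accepting only the target PSMs whose q-value falls below a chosen threshold (for example 0.01) then gives a list of identifications with an estimated false-discovery rate at that level.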


Granholm V, Kim S, Navarro JC, Sjölund E, Smith RD, Käll L (2013) “Fast and Accurate Database Searches with MS-GF+Percolator.” J Proteome Res
Granholm V, Noble WS, Käll L (2012) “A cross-validation scheme for machine learning algorithms in shotgun proteomics.” BMC Bioinformatics 13 Suppl 16, S3
Granholm V, Navarro JF, Noble WS, Käll L (2012) “Determining the calibration of confidence estimation procedures for unique peptides in shotgun proteomics.” J Proteomics 80C, 123-131
Granholm V, Noble WS, Käll L (2011) “On using samples of known protein content to assess the statistical calibration of scores assigned to peptide-spectrum matches in shotgun proteomics.” J Proteome Res 10(5), 2671-2678
Käll L, Storey JD, Noble WS (2008) “Non-parametric estimation of posterior error probabilities associated with peptides identified by tandem mass spectrometry.” Bioinformatics 24(16), i42-i48
Käll L, Canterbury JD, Weston J, Noble WS, MacCoss MJ (2007) “Semi-supervised learning for peptide identification from shotgun proteomics datasets.” Nat Methods 4(11), 923-925