MS-Clustering

Contact: Ari Frank

Summary

Tandem mass spectrometry (MS/MS) experiments often generate redundant datasets containing multiple spectra of the same peptides. Clustering of MS/MS spectra takes advantage of this redundancy by identifying multiple spectra of the same peptide and replacing them with a single representative spectrum. Analyzing only the representative spectra significantly speeds up MS/MS database searches. We present an efficient clustering approach for analyzing large MS/MS datasets (over ten million spectra) that can reduce the number of spectra submitted for further analysis by an order of magnitude. Compared to regular non-clustered searches, an MS/MS database search of clustered spectra produces fewer spurious hits to the database and increases the number of peptide identifications.

Our open source software, MS-Clustering, is designed to rapidly cluster large MS/MS datasets. The program merges similar spectra (those with similar m/z values, within a given tolerance) and creates a single consensus spectrum as their representative. Accepted input formats are dta, mgf, and mzXML. The output format is mgf. For more details, see the documentation in the downloaded zip file.
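The idea of merging similar spectra into a consensus can be illustrated with a minimal Python sketch. This is not the MS-Clustering implementation. It is a simple greedy scheme written under stated assumptions: the m/z tolerance is applied to the precursor m/z, similarity is a cosine score over binned peaks, and the consensus is a running average of member spectra. All names (`bin_peaks`, `Cluster`, `cluster_spectra`) and default thresholds (`mz_tol`, `min_sim`, `bin_width`) are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from math import sqrt


def bin_peaks(peaks, bin_width=1.0):
    """Sum peak intensities into fixed-width m/z bins (bin index -> intensity)."""
    binned = {}
    for mz, intensity in peaks:
        key = int(mz / bin_width)
        binned[key] = binned.get(key, 0.0) + intensity
    return binned


def cosine(a, b):
    """Cosine similarity between two binned spectra; 0.0 if either is empty."""
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


@dataclass
class Cluster:
    precursor_mz: float   # running mean of member precursor m/z values
    consensus: dict       # running mean of member binned intensities
    size: int = 1

    def add(self, precursor_mz, binned):
        """Fold one more spectrum into the running averages."""
        n = self.size
        self.precursor_mz = (self.precursor_mz * n + precursor_mz) / (n + 1)
        for key in set(self.consensus) | set(binned):
            old = self.consensus.get(key, 0.0)
            self.consensus[key] = (old * n + binned.get(key, 0.0)) / (n + 1)
        self.size = n + 1


def cluster_spectra(spectra, mz_tol=2.0, min_sim=0.7, bin_width=1.0):
    """Greedy single-pass clustering.

    spectra: iterable of (precursor_mz, [(mz, intensity), ...]) tuples.
    Each spectrum joins the most similar existing cluster whose precursor
    m/z lies within mz_tol and whose consensus scores at least min_sim;
    otherwise it starts a new cluster. Thresholds are illustrative only.
    """
    clusters = []
    for precursor_mz, peaks in spectra:
        binned = bin_peaks(peaks, bin_width)
        best, best_sim = None, min_sim
        for cluster in clusters:
            if abs(cluster.precursor_mz - precursor_mz) <= mz_tol:
                sim = cosine(cluster.consensus, binned)
                if sim >= best_sim:
                    best, best_sim = cluster, sim
        if best is None:
            clusters.append(Cluster(precursor_mz, dict(binned)))
        else:
            best.add(precursor_mz, binned)
    return clusters
```

Each returned `Cluster` holds one consensus spectrum, so a downstream database search would process `len(clusters)` spectra instead of the full input. A production tool for millions of spectra would need a far more scalable design than this quadratic loop.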


Download

Source Code

Publications

Clustering Millions of Tandem Mass Spectra.
Ari M. Frank, Nuno Bandeira, Zhouxin Shen, Stephen Tanner, Steven P. Briggs, Richard D. Smith and Pavel A. Pevzner.
To appear in J. of Proteome Research, 2007.

Latest Releases

PepNovo: 2008.07.08

MS-Clustering: 2008.06.09

Inspect, MS-Alignment: 2008.04.04

MS-Dictionary: 2007.11.30

MS-GeneratingFunction: 2007.11.28

Spectral Networks: Sept 2007

Media Coverage


A powerful tool for PTM discovery (Jan 2008, Journal of Proteome Research, Vol. 7, Issue 1)


From spectral networks to shotgun sequencing (June 2007, Nature Methods, Vol. 4 No. 6)

Identifying peptides without a database (May 2007, Journal of Proteome Research)

UCSD Computer Scientist Wins Young Investigator Award, Research on Snake Venom Proteins Highlighted (Nov 2006, UCSD)