PGA data analysis report
Introduction

In mass spectrometry (MS)-based proteomics, peptides are generally identified by comparing experimental mass spectra with theoretical mass spectra derived from a reference protein database. PGA constructs a customized protein database from RNA-Seq data, so that novel peptides can be identified by searching against this database.

Methods and Data

First, the PGA package was used to construct a customized protein database from RNA-Seq data. The MS/MS data were then searched against this database, and a refined FDR estimation approach was applied to the resulting identifications.
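
The report does not spell out how the refined FDR estimation works. As a minimal sketch, assuming a target-decoy search in which higher scores are better and in which q-values are estimated separately for canonical and novel peptides, the idea can be illustrated as follows. The function names `q_values` and `accepted_by_class` and the toy scores are hypothetical and are not part of PGA.

```python
from typing import Dict, List, Tuple


def q_values(psms: List[Tuple[float, bool]]) -> List[Tuple[float, bool, float]]:
    """Assign a q-value to each PSM given as (score, is_decoy).

    Assumption: higher score means a better match. Returns
    (score, is_decoy, q_value) tuples sorted by descending score.
    """
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)

    # FDR estimate at each score threshold: decoys / targets seen so far.
    fdrs = []
    targets = decoys = 0
    for _, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        fdrs.append(decoys / max(targets, 1))

    # q-value: the smallest FDR at this threshold or any looser one (capped at 1).
    qs = []
    running_min = 1.0
    for fdr in reversed(fdrs):
        running_min = min(running_min, fdr)
        qs.append(running_min)
    qs.reverse()

    return [(score, is_decoy, q) for (score, is_decoy), q in zip(ranked, qs)]


def accepted_by_class(
    psms: List[Tuple[float, bool, str]], threshold: float = 0.01
) -> Dict[str, int]:
    """Count target PSMs passing the threshold, with q-values computed per class."""
    groups: Dict[str, List[Tuple[float, bool]]] = {}
    for score, is_decoy, peptide_class in psms:
        groups.setdefault(peptide_class, []).append((score, is_decoy))

    return {
        peptide_class: sum(
            1 for _, is_decoy, q in q_values(items) if not is_decoy and q <= threshold
        )
        for peptide_class, items in groups.items()
    }


# Toy data (made up for illustration): (score, is_decoy, class).
demo = [
    (90.0, False, "canonical"), (80.0, False, "canonical"),
    (70.0, False, "canonical"), (60.0, True, "canonical"),
    (50.0, False, "canonical"),
    (85.0, False, "novel"), (40.0, True, "novel"), (30.0, False, "novel"),
]
print(accepted_by_class(demo))
```

Estimating the error rate per class keeps the much larger canonical set from masking a higher error rate among the novel identifications.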

Results
Quality plot

Figure 1.  Precursor ion error distribution.

Figure 2.  Number of unique spectra per protein.

Figure 3.  Number of unique peptides per protein.

Figure 4.  Comparison of the charge distributions of canonical and novel peptides. All peptides were filtered at a 1% false discovery rate.

Figure 5.  Comparison of the score distributions of canonical and novel peptides. All peptides were filtered at a 1% false discovery rate.

Figure 6.  Comparison of the mass distributions of canonical and novel peptides. All peptides were filtered at a 1% false discovery rate.

Peptide/protein identification

Table 1.  Summary of the identification results.

Item                          Value
No. of PSMs                   224700
No. of peptides               73459
No. of proteins               8399
No. of PSMs (novel peptides)  1453
No. of novel peptides         632
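
To put these counts in proportion, the short sketch below derives a few ratios from Table 1. It assumes that the novel counts are included in the overall PSM and peptide totals, which the report does not state explicitly. The dictionary keys are illustrative names.

```python
# Values copied from Table 1.
summary = {
    "psms": 224700,
    "peptides": 73459,
    "proteins": 8399,
    "novel_psms": 1453,
    "novel_peptides": 632,
}

# Share of identifications that are novel (assumes novel counts are part of the totals).
novel_peptide_share = summary["novel_peptides"] / summary["peptides"]
novel_psm_share = summary["novel_psms"] / summary["psms"]

# Average number of spectrum matches supporting each peptide.
psms_per_peptide = summary["psms"] / summary["peptides"]
psms_per_novel_peptide = summary["novel_psms"] / summary["novel_peptides"]

print(f"Novel peptides: {novel_peptide_share:.2%} of all peptides")
print(f"Novel PSMs: {novel_psm_share:.2%} of all PSMs")
print(f"PSMs per peptide: {psms_per_peptide:.2f}")
print(f"PSMs per novel peptide: {psms_per_novel_peptide:.2f}")
```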

Figure 7.  Pie chart of the novel peptides.

Table 2.  Novel peptide identification. Click the link in the last column to view the detailed information.

class              n    detail
SNV                508  See detail
INDEL              2    See detail
AS                 53   See detail
Novel transcripts  71   See detail
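
The class breakdown can be expressed as shares with a few lines of code. Note that the four class counts sum to 634, two more than the 632 novel peptides in Table 1. The report does not explain the difference, so this sketch simply computes each share against the class total.

```python
# Counts copied from Table 2.
class_counts = {
    "SNV": 508,
    "INDEL": 2,
    "AS": 53,
    "Novel transcripts": 71,
}

# Denominator is the sum of the class counts, not the Table 1 total.
class_total = sum(class_counts.values())
class_shares = {name: n / class_total for name, n in class_counts.items()}

for name, share in class_shares.items():
    print(f"{name}: {class_counts[name]} ({share:.1%})")
```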

Table 3.  Peptide identification result (excerpt of the full table).

index   charge  mass  delta_ppm  peptide                    Qvalue
407765  2       2500  -2.1       AADSQNSGEGNTGAAESSFSQEVSR  0
407766  2       2500  -1.1       AADSQNSGEGNTGAAESSFSQEVSR  0
376071  2       2200  -0.0045    IDTASLGDSTDSYIEVLDGSR      0
392037  2       2300  -2         TMMACGGSIQTSVNALSADVLGR    0
358782  2       2100  -5.5       IQAAASTPTNATAASDANTGDR     0
368366  2       2200  -0.068     EQSSEAAETGVSENEENPVR       0
368463  2       2200  -1.3       DGSTTAGNSSQVSDGAAAILLAR    0
394111  2       2400  -2.8       TMMACGGSIQTSVNALSADVLGR    0
403161  2       2400  -5.6       VQVLTAGSLMGLGDIISQQLVER    0
396020  2       2400  -1.4       TMMACGGSIQTSVNALSADVLGR    0
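
The sketch below parses the rows of this excerpt and summarizes the `delta_ppm` column. It assumes that each row is one peptide-spectrum match and that `delta_ppm` follows the conventional definition of relative mass error, (observed - theoretical) / theoretical x 10^6. Neither point is stated in the report, and `parse_rows` and `ppm_error` are hypothetical helper names.

```python
from statistics import median

# Rows copied from the Table 3 excerpt (whitespace-separated columns).
TABLE3 = """
407765 2 2500 -2.1 AADSQNSGEGNTGAAESSFSQEVSR 0
407766 2 2500 -1.1 AADSQNSGEGNTGAAESSFSQEVSR 0
376071 2 2200 -0.0045 IDTASLGDSTDSYIEVLDGSR 0
392037 2 2300 -2 TMMACGGSIQTSVNALSADVLGR 0
358782 2 2100 -5.5 IQAAASTPTNATAASDANTGDR 0
368366 2 2200 -0.068 EQSSEAAETGVSENEENPVR 0
368463 2 2200 -1.3 DGSTTAGNSSQVSDGAAAILLAR 0
394111 2 2400 -2.8 TMMACGGSIQTSVNALSADVLGR 0
403161 2 2400 -5.6 VQVLTAGSLMGLGDIISQQLVER 0
396020 2 2400 -1.4 TMMACGGSIQTSVNALSADVLGR 0
"""


def parse_rows(text):
    """Turn whitespace-separated rows into dictionaries with typed values."""
    records = []
    for line in text.strip().splitlines():
        index, charge, mass, delta_ppm, peptide, qvalue = line.split()
        records.append({
            "index": int(index),
            "charge": int(charge),
            "mass": float(mass),
            "delta_ppm": float(delta_ppm),
            "peptide": peptide,
            "qvalue": float(qvalue),
        })
    return records


def ppm_error(observed, theoretical):
    """Relative mass error in parts per million (conventional definition)."""
    return (observed - theoretical) / theoretical * 1e6


records = parse_rows(TABLE3)
distinct_peptides = {r["peptide"] for r in records}
median_ppm = median(r["delta_ppm"] for r in records)
worst_ppm = max(abs(r["delta_ppm"]) for r in records)

print(len(records), "rows,", len(distinct_peptides), "distinct peptides")
print("median delta_ppm:", median_ppm)
print("largest absolute delta_ppm:", worst_ppm)
```

Several rows share the same peptide sequence, which illustrates why the PSM count in Table 1 is larger than the peptide count.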

Table 4.  Protein identification result (excerpt of the full table).

Accession        Mass    NumOfUniqPeps  NumOfUniqSpectra
ENSP00000444169  100000  15             33
ENSP00000470310  26000   5              12
ENSP00000433153  18000   1              1
ENSP00000344002  290000  1              2
ENSP00000351339  280000  106            461
ENSP00000471999  280000  109            465
ENSP00000448520  31000   13             39
ENSP00000420315  39000   2              4
ENSP00000304467  80000   1              2
ENSP00000453092  8000    1              1