FAQs for de novo sequencing
Q1) What is de novo sequencing?
Ans: MS spectra are interpreted from first principles, using specialist software, to determine the amino acid sequence. This technique can readily provide sufficient data to show homology with other proteins or, for novel proteins, to obtain internal amino acid sequence information that enables cloning of the gene.
NOTE: Due to the complex nature of the spectra there is often more than one possible interpretation of the data.
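One well-known contributor to this ambiguity is that some residues, or pairs of residues, have the same or nearly the same mass. The sketch below is illustrative only: it uses standard monoisotopic residue masses for a handful of amino acids, and the function names and the tolerance parameter `tol_da` are assumptions made for this example, not part of the service software.

```python
# Standard monoisotopic residue masses (Da) for a few amino acids.
RESIDUE_MASS = {
    "G": 57.02146,
    "A": 71.03711,
    "L": 113.08406,
    "I": 113.08406,
    "N": 114.04293,
    "Q": 128.05858,
    "K": 128.09496,
}


def residue_mass_sum(seq):
    """Sum of residue masses for a sequence (water and charge are omitted)."""
    return sum(RESIDUE_MASS[aa] for aa in seq)


def indistinguishable(seq_a, seq_b, tol_da=0.01):
    """True if the two sequences differ in mass by no more than tol_da (hypothetical tolerance)."""
    return abs(residue_mass_sum(seq_a) - residue_mass_sum(seq_b)) <= tol_da


print(indistinguishable("L", "I"))             # Leu and Ile have identical mass
print(indistinguishable("GG", "N"))            # Gly-Gly matches Asn
print(indistinguishable("GA", "Q"))            # Gly-Ala matches Gln
print(indistinguishable("K", "Q"))             # about 0.036 Da apart
print(indistinguishable("K", "Q", tol_da=0.05))
```

With a 0.01 Da tolerance, Lys and Gln are separated, but at a looser tolerance they merge, which is one reason lower ranked sequences can remain plausible.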
Q2) When is de novo sequencing required?
Ans: When the genome is not available in public databases, de novo sequencing [Service 002] will be performed. This service must include Protein identification [Service 001]. For protein sequencing by MS/MS, the protein is digested into peptides and a spectrum is obtained for each of the major peptide ions.
Data analysis falls into two types:
1. Known genome [Service 001]
The sample spectra are compared to databases to look for identical sequences using MASCOT. This can be adequate to identify a known protein, or closely related proteins.
2. Unknown genome [Service 002]
The sample spectra are interpreted to obtain de novo protein sequence for each peptide ion using PEAKS. This is the equivalent of N-terminal Edman sequencing of internal peptides. The sequences can range in length from 6-20+ amino acids and these can be used to design a probe.
NOTE: The identification of several peptides may be required to produce a sequence that best suits design of an oligo probe.
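One reason several peptides may be needed is codon degeneracy: a peptide rich in Leu, Ser or Arg back-translates to many more DNA sequences than one rich in Met or Trp. The following is a minimal sketch, assuming the standard genetic code; the function names and the window length are hypothetical choices for illustration and are not part of the service.

```python
from math import prod

# Number of codons per amino acid in the standard genetic code (61 sense codons).
CODON_COUNT = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def degeneracy(peptide):
    """Number of distinct DNA sequences that encode the peptide."""
    return prod(CODON_COUNT[aa] for aa in peptide)


def least_degenerate_window(peptide, length=6):
    """Return (start, window, degeneracy) for the window with the fewest back-translations."""
    if len(peptide) < length:
        raise ValueError("peptide is shorter than the requested window")
    candidates = [
        (degeneracy(peptide[i:i + length]), i)
        for i in range(len(peptide) - length + 1)
    ]
    best_degeneracy, start = min(candidates)  # ties resolve to the earliest window
    return start, peptide[start:start + length], best_degeneracy


# Hypothetical peptide, used only to show the calculation.
print(least_degenerate_window("LSRMWKDE", length=4))
```

Comparing the best window from each sequenced peptide gives a simple way to choose which peptide to base a probe on.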
Q3) What length of peptide can be sequenced?
Ans: The best sequence is obtained from peptides of 6-15 amino acids. Longer sequences (15-20+) may be possible if large amounts of the peptide are available for analysis.
Q4) How are the de novo sequencing results interpreted?
Ans: The first sequence (Rank 0) is considered the most likely interpretation of the MS spectra, but the lower ranked sequences may also be possible interpretations. Please refer to the notes in the report.
After de novo sequence analysis is performed, the customer must perform BLAST searches to look for homology with biologically relevant proteins: http://blast.ncbi.nlm.nih.gov/Blast.cgi (select the protein blast option from the Basic BLAST list). Because the de novo data can have multiple interpretations, we do not perform this task.
The objective of the BLAST search is to find homologous sequences in the databases that indicate possible biological function. If the sequences do not match any known peptide sequence of the target genome, then it can be claimed that these sequences are novel. If the protein is novel, the de novo sequences can be used to design oligonucleotide probes to clone the gene.
Q2) How should the sample be prepared?
(1) The protein must be pure, i.e. no other proteins should be present.
(2) The protein must be sent lyophilised (freeze dried) from a low salt buffer. Please state the volume of the liquid before drying.
NOTE: Gel bands and PVDF membranes cannot be analysed.
(3) The protein concentration before drying must be at least 0.1mg/ml.
(4) The solution must be a low salt buffer, e.g. 50mM ammonium hydrogen carbonate; for TRIS or phosphate buffers the maximum salt strength is 20mM. NaCl and surfactants (e.g. SDS) cannot be present.
NOTE: We do not perform sample desalting; there are many methods described in the literature for de-salting.
Q3) How much sample is required?
Ans: For pure proteins a minimum of 100ng is required (e.g. if the sample was run on a gel then the band would be visible using Coomassie staining).
For pure peptides a minimum of 10ng is required.
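The amount in a sample follows from the concentration and the volume before drying. The sketch below is an illustration only, using the minimums stated in this FAQ (0.1mg/ml for proteins, 100ng protein, 10ng peptide); the function names are hypothetical, and applying the concentration limit only to proteins is an assumption based on how the requirement is worded.

```python
MIN_PROTEIN_CONC_MG_PER_ML = 0.1                     # stated minimum before drying
MIN_AMOUNT_NG = {"protein": 100.0, "peptide": 10.0}  # stated minimum amounts


def sample_amount_ng(conc_mg_per_ml, volume_ul):
    """Amount in ng: 1 mg/ml equals 1000 ng/ul."""
    return conc_mg_per_ml * 1000.0 * volume_ul


def check_sample(kind, conc_mg_per_ml, volume_ul):
    """Return a list of problems; an empty list means the stated minimums are met."""
    problems = []
    # Assumption: the concentration limit is applied to proteins only.
    if kind == "protein" and conc_mg_per_ml < MIN_PROTEIN_CONC_MG_PER_ML:
        problems.append("concentration below 0.1 mg/ml")
    amount = sample_amount_ng(conc_mg_per_ml, volume_ul)
    if amount < MIN_AMOUNT_NG[kind]:
        problems.append(f"only {amount:.0f} ng, minimum is {MIN_AMOUNT_NG[kind]:.0f} ng")
    return problems


print(check_sample("protein", 0.1, 10))   # 10 ul at 0.1 mg/ml
print(check_sample("protein", 0.05, 1))   # too dilute and too little
```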
Q2) The sample is on PVDF membrane, is it OK?
Ans: No, the protein must be lyophilised.
Q3) The sample is in solution, is it OK?
Ans: No, the sample must be freeze dried prior to shipping to ensure maximum stability. Please state the following:
(a) The volume of the liquid before drying
(b) Solvent/buffer used before drying
(c) Known or estimated concentration of sample
Q4) What is the accuracy of amino acid analysis (AAA)?
Ans: Most AAA laboratories accept a variation of 10 to 15% from the expected value. The results presented normally vary by less than 10%, but some amino acids are more difficult to measure.
For the high sensitivity AAA (no Cys or Trp) the recoveries for the control sample (BSA) are typically 90 to 95% for the total of the amino acids.
NOTE: This does not affect the AA composition, which is the relative ratio of amino acids (i.e. mole%).
For the amino acid composition, the run is accepted if the result for BSA for individual amino acids is within +/- 10% of the expected mole% value. Most amino acids are within 5%.
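The acceptance check described above can be sketched as follows. This is a hedged illustration: the function names are hypothetical, the numbers in the usage example are invented and are not BSA values, and the +/- 10% limit is interpreted here as a relative deviation from the expected mole%, which is an assumption.

```python
def mole_percent(amounts_nmol):
    """Convert measured amounts (nmol per amino acid) to mole%."""
    total = sum(amounts_nmol.values())
    return {aa: 100.0 * n / total for aa, n in amounts_nmol.items()}


def out_of_tolerance(measured_mol_pct, expected_mol_pct, rel_tol=0.10):
    """Return {amino acid: relative deviation} for residues outside rel_tol.

    Assumption: the tolerance is relative to the expected mole% value.
    """
    failed = {}
    for aa, expected in expected_mol_pct.items():
        deviation = (measured_mol_pct[aa] - expected) / expected
        if abs(deviation) > rel_tol:
            failed[aa] = deviation
    return failed


# Invented numbers, for illustration only.
measured = mole_percent({"A": 1.0, "G": 3.0})
print(measured)
print(out_of_tolerance(measured, {"A": 24.0, "G": 60.0}))
```

Because mole% is a ratio, a uniform loss in total recovery leaves it unchanged, which is why the total recovery figure does not affect the composition.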
Q5) Are some amino acids difficult to measure?
Ans: Yes. Serine is labile under acid conditions and typically for BSA there is a loss of 6-7% in the assay. Met can cause problems due to oxidation during hydrolysis. It is also the least abundant amino acid, and +/- 15% is allowed for it. Met can also be analysed by the Cys assay and some prefer this method of analysis. For Cys the recoveries are ~85% for BSA and +/- 15% from this value is acceptable. Trp can give variable results and the data are not accepted unless the BSA result is within +/- 15% of the expected result.
Q6) My peptide contains non-standard amino acids, can they be measured?
Ans: No, the following cannot be analysed:
(1) Mpr (mercaptopropionyl)
(4) Pal (pyridyl alanine)
(6) Nal (naphthyl alanine)