Zhiliang's Workbench:
Information / progress track
Pig array re-annotations using new pig genome annotation data
Objectives: Use current NCBI RefSeq annotation information to enrich/update
the annotations of the pig oligo sequences.
Works:
1. Feb 12, 2014: Pig RefSeq data sources
Source I: ftp://ftp.ncbi.nlm.nih.gov/genomes/Sus_scrofa/RNA/
- Contains 51,361 sequences.
Source II: http://www.ncbi.nlm.nih.gov/nuccore
a. Limit to 'sus scrofa'[orgn] and filter by RefSeq[property]
b. Manually filter out those that do not have annotation info:
CLEAN | - 4,562 "unplaced genomic scaffold" sequences
| - 5,652 "genomic scaffold" sequences
| - 6,195 "whole genome shotgun" sequences
c. This leaves 51,488 annotated RefSeq sequences.
ANALYZE | Data from 'Source I' and 'Source II' differ in size by (51,488-51,209)
| => 279 sequences, and share 51,150 sequences (with an additional
| 397 sequences found in only one source or the other).
d. Extract the sequence headers and combine the 51,547 annotated
sequences (51,150 shared + 397 source-specific) into a pig RefSeq target data set.
Source III: ftp://ftp.ncbi.nlm.nih.gov/refseq/S_scrofa/ (official)
- Contains 51,362 sequences.
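Note on the Source I / Source II comparison: a minimal Python sketch of how two sets of FASTA headers could be compared to get the shared, source-specific, and combined counts. This is not necessarily the procedure used for the numbers above. It assumes the first whitespace-delimited token of each header line is a usable sequence ID, and the accessions in the toy data are made up for illustration.

```python
def fasta_ids(lines):
    """Collect the first whitespace-delimited token of every FASTA header line."""
    ids = set()
    for line in lines:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                ids.add(fields[0])
    return ids


def compare_sources(ids_a, ids_b):
    """Count shared, source-specific, and combined IDs for two ID sets."""
    return {
        "shared": len(ids_a & ids_b),
        "only_a": len(ids_a - ids_b),
        "only_b": len(ids_b - ids_a),
        "combined": len(ids_a | ids_b),
    }


# Toy data (hypothetical accessions, not real pig RefSeq records).
source_1 = [">NM_000001.1 gene A", "ACGT", ">NM_000002.1 gene B", "GGCC"]
source_2 = [">NM_000002.1 gene B", "GGCC", ">XM_000003.1 gene C", "TTAA"]

summary = compare_sources(fasta_ids(source_1), fasta_ids(source_2))
print(summary)

# Arithmetic behind the figures quoted in this entry.
assert 51_488 - 51_209 == 279      # size difference between the two sources
assert 51_150 + 397 == 51_547      # shared + source-specific = combined set
```

With real data, the two `lines` arguments would be open file handles on the downloaded FASTA files rather than in-memory lists.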
2. Mar 14, 2014: BWA alignment, against the pig RefSeq target set, of the consensus
sequences from the "2006 pig transcripts consortium"
(downloaded from http://www.pigoligoarray.org; now stored at the
AnimalGenome.ORG data repository -> "pig_Oligo_consensus_seq.fa.gz")
o Briefly, of 18,224 "pig consortium (2006) oligo" sequences,
20,062 BWA matches to pig RefSeq were found for
16,784 unique sequences (download):
o Breakdown: 14,164 -> single match
              2,108 -> 2 matches
                404 -> 3 matches
                 79 -> 4 matches
                 24 -> 5 matches
                  3 -> 6 matches
                  1 -> 7 matches
                  1 -> 9 matches
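Note on the match counts: a minimal Python sketch of how a matches-per-sequence breakdown like the one above could be tallied from BWA's SAM output. The BWA command line and counting rules actually used are not recorded here, so this is only an assumption: every SAM record without the "unmapped" flag (0x4) is counted as one match, and the oligo/target names in the toy records are hypothetical.

```python
from collections import Counter


def matches_per_query(sam_lines):
    """Count mapped SAM records per query name (QNAME)."""
    per_query = Counter()
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and SAM header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # not a well-formed alignment record
        qname, flag = fields[0], int(fields[1])
        if flag & 0x4:
            continue  # 0x4 = query is unmapped
        per_query[qname] += 1
    return per_query


def break_down(per_query):
    """Map 'number of matches' -> 'number of query sequences with that many'."""
    return Counter(per_query.values())


# Toy SAM records (hypothetical names): QNAME, FLAG, RNAME, POS, MAPQ, CIGAR, ...
toy_sam = [
    "@SQ\tSN:refA\tLN:1000",
    "oligo1\t0\trefA\t10\t60\t70M\t*\t0\t0\t*\t*",
    "oligo2\t0\trefA\t200\t0\t70M\t*\t0\t0\t*\t*",
    "oligo2\t256\trefB\t50\t0\t70M\t*\t0\t0\t*\t*",
    "oligo3\t4\t*\t0\t0\t*\t*\t0\t0\t*\t*",
]

counts = matches_per_query(toy_sam)
print(sorted(break_down(counts).items()))
```

Whether secondary or supplementary records should count as separate matches depends on how BWA was run, so the flag test may need adjusting.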
3. Mar 19, 2014: BWA alignment, against the pig RefSeq target set, of the Swine
Protein-Annotated Oligonucleotide Microarray data (Illumina 70-mer oligo synthesis;
a.k.a. "GPL7435"; derived from the "2006 pig consensus transcripts consortium" work)
o Briefly, of 20,400 "GPL7435" sequences,
17,245 BWA matches to pig RefSeq were found for
17,207 unique sequences (download):
o Breakdown: 17,169 -> single match
                 38 -> 2 matches
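Cross-check of the two breakdowns (entries 2 and 3): a short Python sketch that recomputes the totals from the per-count rows. The figures are copied from the entries above; the helper name `totals` is just for illustration.

```python
def totals(break_down):
    """Return (unique matched sequences, total matches) for a breakdown.

    `break_down` maps 'matches per sequence' -> 'number of sequences'.
    """
    sequences = sum(break_down.values())
    matches = sum(n * count for n, count in break_down.items())
    return sequences, matches


# Rows copied from the 2006 consortium entry (Mar 14, 2014).
consortium_2006 = {1: 14_164, 2: 2_108, 3: 404, 4: 79, 5: 24, 6: 3, 7: 1, 9: 1}
# Rows copied from the GPL7435 entry (Mar 19, 2014).
gpl7435 = {1: 17_169, 2: 38}

print(totals(consortium_2006))  # (16784, 20062)
print(totals(gpl7435))          # (17207, 17245)
```

Both results agree with the unique-sequence and match totals quoted in the two entries.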
• Zhiliang Hu •