The sequencing data used for the analyses on this website are the latest publicly available data sets. All internally generated Illumina data can be downloaded from the appropriate subsection below. The remaining data sets come from the original manuscripts or from company websites; URLs to their online locations are listed below.
| Description | Platform | File Type | Suitable Analysis<sup>1</sup> | Download |
|---|---|---|---|---|
| Sequencing Runs | | | | |
| Chr 21 NA18507<sup>4</sup> | Illumina v3 cBOT chemistry HiSeq<sup>3</sup> | BAM | Q,C,V,FN | [Download] |
| Chr 19 NA18507<sup>4</sup> | Illumina v3 cBOT chemistry HiSeq<sup>3</sup> | BAM | Q,C,V,FN | [Download] |
| Chr 4 NA18507<sup>4</sup> | Illumina v3 cBOT chemistry HiSeq<sup>3</sup> | BAM | Q,C,V,FN | [Download] |
| Chr 21 NA19240<sup>2</sup> | Illumina GA IIx | BAM | Q,C,V,FN | [Download] |
| Chr 21 NA19240<sup>2</sup> | Illumina HiSeq | BAM | Q,C,V,FN | [Download] |
| Chr 21 NA18507<sup>2</sup> | Illumina GA IIx | BAM | Q,C,V,FN | [Download] |
| E. coli (MG1655) Read 1 | Illumina GA IIx | BAM | Q,C | [Download] |
| E. coli (MG1655) Read 2 | Illumina GA IIx | BAM | Q,C | [Download] |
| E. coli (MG1655) Paired | Illumina GA IIx | BAM | Q,C | [Download] |
| Variant Calls | | | | |
| NA18507 SNPs | Illumina GA IIx | TXT | T,FN | [Download] |
| NA18506 SNPs | Illumina GA IIx | TXT | T,FN | [Download] |
| NA18508 SNPs | Illumina GA IIx | TXT | T,FN | [Download] |
| NA19240 SNPs | Illumina GA IIx | TXT | FN | [Download] |
| NA19240 Indels | Illumina GA IIx | TXT | FN | [Download] |
| Coverage Output | | | | |
| NA19240 coverage<sup>2</sup> | Illumina GA IIx | TXT | G | [Download] |

<sup>1</sup> Q = Quality Scores; C = Coverage; V = Variant Calling; T = Trio Inheritance; FN = False Negatives; G = Gene Region Gaps

<sup>2</sup> Chromosome 21. All human data were aligned to the b36/hg18 reference genome.

<sup>3</sup> v3 cBOT Kit Chemistry, available Q2 2011.

<sup>4</sup> All data generated with v3 cBOT Kit Chemistry on HiSeq2000 and aligned to the b37/hg19 reference genome.

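The single-letter codes in the "Suitable Analysis" column can be expanded programmatically when filtering data sets. The following is a minimal sketch: the mapping is copied from footnote 1 above, while the names `ANALYSIS_CODES` and `expand_analysis_codes` are hypothetical helpers introduced only for illustration.

```python
# Legend copied from footnote 1 of the table above.
ANALYSIS_CODES = {
    "Q": "Quality Scores",
    "C": "Coverage",
    "V": "Variant Calling",
    "T": "Trio Inheritance",
    "FN": "False Negatives",
    "G": "Gene Region Gaps",
}


def expand_analysis_codes(cell):
    """Expand a 'Suitable Analysis' cell such as 'Q,C,V,FN' into full names.

    Hypothetical helper: raises KeyError for a code that is not in the legend.
    """
    names = []
    for code in cell.split(","):
        code = code.strip()
        if not code:
            continue  # tolerate stray or trailing commas
        names.append(ANALYSIS_CODES[code])
    return names


if __name__ == "__main__":
    print(expand_analysis_codes("Q,C,V,FN"))
```

A data set is suitable for a given analysis when that analysis name appears in the expanded list, for example `"Coverage" in expand_analysis_codes("Q,C")`.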
| Description | Platform | File Type | Suitable Analysis<sup>1</sup> | URL |
|---|---|---|---|---|
| Sequencing Runs | | | | |
| KB1 (Bushman)<sup>2</sup> | Illumina GA IIx | BAM | Q,C,V,FN | ftp://ftp.bx.psu.edu/data/bushman/hg18/bam/ as of 13-10-2010 |
| ABT (Bantu)<sup>2</sup> | SBL (version 3) | BAM | Q,C,V,FN | ftp://ftp.bx.psu.edu/data/bushman/hg18/bam/ as of 13-10-2010 |
| Human1 (NA18507)<sup>3</sup> | Illumina GA II | FASTQ | Q,C,V,FN | http://www.ncbi.nlm.nih.gov/sra/?term=SRA000271 as of 13-10-2010 |
| E. coli (MG1655) | Illumina GA IIx | FASTQ | Q,C | http://www.ebi.ac.uk/ena/data/view/ERP000092 as of 13-10-2010 |
| E. coli (DH10B) | SBL (version 4) | CSFASTA; QUAL | Q,C | http://solidsoftwaretools.com/gf/project/dh10bfrag/ as of 13-10-2010 |
| 1000 Genomes<sup>4</sup> | Illumina, SBL | BAM | Q,C,V,T,FN | ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/data ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/data as of 13-10-2010 |
| Variant Calls | | | | |
| NA18507 SNPs | SBL (version 2) | TXT | FN | http://solidsoftwaretools.com/gf/project/yoruban as of 13-10-2010 |
| NA18507 Indels | SBL (version 2) | TXT | FN | http://solidsoftwaretools.com/gf/project/yoruban as of 13-10-2010 |

<sup>1</sup> Q = Quality Scores; C = Coverage; V = Variant Calling; T = Trio Inheritance; FN = False Negatives

<sup>2</sup> Complete Khoisan and Bantu genomes from southern Africa. All human data were aligned to the b36/hg18 reference genome. (Schuster SC, et al. Nature. 2010 Feb 18; 463 (7283): 943-947)

<sup>3</sup> Accurate whole human genome sequencing using reversible terminator chemistry. All human data were aligned to the b36/hg18 reference genome. (Bentley DR, et al. Nature. 2008 Nov 6; 456 (7218): 53-59)

<sup>4</sup> 1000 Genomes files can be used by individuals for personal script assessment, but cannot be published or used for competitive purposes. Data from 1000 Genomes are aligned to the b37/hg19 reference genome.

| Description | File Type | Download |
|---|---|---|
| Reference Genomes | | |
| NCBI build 36 | FASTA | http://hgdownload.cse.ucsc.edu/goldenPath/hg18/chromosomes/ as of 13-10-2010 |
| NCBI build 37 | FASTA | http://hgdownload.cse.ucsc.edu/goldenPath/hg19/chromosomes/ as of 13-10-2010 |
| NCBI build 36 Chr 21 | FASTA | [Download] |
| E. coli (MG1655) | FASTA | ftp://ftp.ncbi.nih.gov/genomes/Bacteria/Escherichia_coli_K_12_substr__MG1655 as of 13-10-2010 |
| E. coli (DH10B) Option 1 | FASTA | ftp://ftp.ncbi.nih.gov/genomes/Bacteria/Escherichia_coli_K_12_substr__DH10B/ as of 13-10-2010 |
| E. coli (DH10B) Option 2 | FASTA | http://solidsoftwaretools.com/gf/project/dh10bfrag/DH10B_WithDup_FinalEdit_validated.fasta.zip as of 13-10-2010 |
| Variant Databases | | |
| dbSNP130 | TXT | ftp://ftp.ncbi.nih.gov/snp/organisms/human_9606/database/b130_archive/ as of 13-10-2010 |
| NA19240<sup>1</sup> SNPs | TXT | [Download] |
| NA19240<sup>1</sup> Indels | TXT | [Download] |
| NA19240 & Yoruba<sup>2</sup> SNPs | TXT | [Download] |
| NA19240 & Yoruba<sup>2</sup> Indels | TXT | [Download] |
| NA19240 & NA18507 & Yoruba<sup>3</sup> SNPs | TXT | [Download] |
| NA19240 & NA18507 & Yoruba<sup>3</sup> Indels | TXT | [Download] |
| Gene Databases | | |
| OMIM genes<sup>4</sup> | BED | [Download] |

<sup>1</sup> Consists of those dbSNP 130 variants that were also observed in a capillary sequencing study of NA19240 (Kidd, et al. Nature. 2008 May 1; 453 (7191): 56-64)

<sup>2</sup> Consists of those dbSNP 130 variants that were also observed in a capillary sequencing study of NA19240 and in at least one other Yoruban individual (of NA18506, NA18507, and NA18508) (Kidd, et al. Nature. 2008 May 1; 453 (7191): 56-64)

<sup>3</sup> Consists of those dbSNP 130 variants that were also observed in a capillary sequencing study of NA19240 and NA18507, and in at least one other Yoruban individual (of NA18506 and NA18508) (Kidd, et al. Nature. 2008 May 1; 453 (7191): 56-64)

<sup>4</sup> Also available from http://genome.ucsc.edu/cgi-bin/hgTables?command=start

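The two human reference-genome entries in the table above point at directories rather than single files. As a rough sketch, the helper below builds the URL for one chromosome within those directories. The directory URLs are taken from the table; the `chr<N>.fa.gz` file-name pattern and the names `REFERENCE_DIRS` and `chromosome_fasta_url` are assumptions made for illustration, so check the directory listing before relying on them.

```python
# Directory URLs copied from the reference-genome table on this page.
REFERENCE_DIRS = {
    "hg18": "http://hgdownload.cse.ucsc.edu/goldenPath/hg18/chromosomes/",  # NCBI build 36
    "hg19": "http://hgdownload.cse.ucsc.edu/goldenPath/hg19/chromosomes/",  # NCBI build 37
}


def chromosome_fasta_url(build, chromosome):
    """Return the assumed URL of one chromosome's compressed FASTA file.

    ASSUMPTION: files in the directory are named 'chr<chromosome>.fa.gz'.
    """
    if build not in REFERENCE_DIRS:
        raise ValueError(
            f"unknown build {build!r}; expected one of {sorted(REFERENCE_DIRS)}"
        )
    return f"{REFERENCE_DIRS[build]}chr{chromosome}.fa.gz"


if __name__ == "__main__":
    # Chromosome 21 of build 36, matching the Chr 21 data sets listed on this page.
    print(chromosome_fasta_url("hg18", 21))
```

When choosing a build, match it to the alignment stated in the footnotes: the v3 cBOT HiSeq2000 data and the 1000 Genomes files are aligned to b37/hg19, while the other human data sets are aligned to b36/hg18.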
References