Table 2 Ortholog dataset used in current analysis

From: Comparison of RefSeq protein-coding regions in human and vertebrate genomes

Tax ID

Organism

Annotation release date

Genes in ortholog dataset

Annotated protein-coding genes

% Pipeline prediction

EST count

Assembly accession

Contig N50

Scaffold N50

7955

*zebrafish

3/24/2011

5481

26329

48

1,481,937

GCF_000002035.4

1,073,451

1,551,602

8128

nile tilapia

9/30/2011

5898

22130

100

120,196

GCF_000188235.1

29,493

2,802,423

8364

western clawed frog

7/29/2010

9611

21989

62

1,271,375

GCF_000004195.1

17,038

1,567,461

9031

*chicken

12/16/2011

11170

16725

71

600,433

GCF_000002315.3

279,750

12,877,381

9103

turkey

3/25/2011

8981

12129

100

17,435

GCF_000146605.1

12,520

857,645

9258

platypus

9/3/2011

6917

16477

99

9,699

GCF_000002275.2

11,554

958,970

9305

tasmanian devil

7/16/2012

12456

19365

100

0

GCF_000189315.1

20,139

1,847,106

9483

white-tufted-ear marmoset

6/8/2012

15191

19408

100

2,605

GCF_000004665.1

29,293

5,167,444

9544

rhesus macaque

6/2/2010

15228

22541

97

58,412

GCF_000002255.3

25,707

6,094,595

9555

olive baboon

9/5/2012

15583

21785

98

145,582

GCF_000264685.1

40,262

528,927

9593

western gorilla

12/6/2012

16250

22059

100

0

GCF_000151905.1

11,661

913,458

9597

pygmy chimpanzee

7/25/2012

16519

20463

100

0

GCF_000258655.1

66,775

10,124,892

9598

chimpanzee

10/27/2012

16997

21396

96

17,130

GCF_000001515.5

50,679

8,925,874

9601

sumatran orangutan

7/18/2012

14981

22822

86

46,981

GCF_000001545.4

15,648

747,460

9606

*human

10/30/2012

18421

19527

2

8,699,560

GCF_000001405.22

38,508,932

44,983,201

9615

*dog

2/2/2011

15784

19163

93

382,638

GCF_000002285.3

267,478

45,876,610

9646

giant panda

7/30/2010

14466

17892

100

0

GCF_000004335.1

39,886

1,281,781

9685

domestic cat

11/7/2012

15864

18201

98

919

GCF_000181335.1

20,621

4,658,941

9785

*african savanna elephant

8/25/2011

14259

18389

100

0

GCF_000001905.1

69,023

46,401,353

9796

*horse

6/28/2011

14668

18002

96

37,199

GCF_000002305.2

112,381

46,749,900

9823

pig

10/11/2011

12283

21992

84

1,624,129

GCF_000003025.5

69,669

576,008

9913

*Bos taurus (bovine)

12/2/2011

16013

21157

39

1,559,494

GCF_000003055.4

96,955

6,380,747

9940

sheep

12/2/2012

15588

19097

96

338,483

GCF_000298735.1

40,376

100,079,507

9986

rabbit

4/23/2010

9032

16117

94

34,938

GCF_000003625.2

64,648

35,972,871

10029

chinese hamster

10/17/2011

13835

19702

99

0

GCF_000223135.1

39,361

1,147,233

10090

*house mouse

10/1/2012

16142

21780

6

4,853,5*8

GCF_000001635.21

32,273,079

52,589,046

10116

*norway rat

6/20/2012

15718

22719

29

1,103,577

GCF_000001895.4

59,469

2,178,346

10141

*domestic guinea pig

10/3/2011

14436

18029

98

19,975

GCF_000151735.1

80,583

27,942,054

13616

gray short-tailed opossum

5/31/2011

12942

17924

98

265

GCF_000002295.2

108,014

59,809,810

27679

*bolivian squirrel monkey

9/9/2012

16089

19331

100

0

GCF_000235385.1

38,823

18,744,880

28377

*green anole

3/30/2011

10041

15645

100

156,802

GCF_000090745.1

79,867

4,033,265

30611

small-eared galago

7/18/2012

15596

19454

100

0

GCF_000181295.1

27,100

13,852,661

31033

torafugu

11/6/2012

5560

18592

98

0

GCF_000180615.1

52,883

928,938

61853

northern white-cheeked gibbon

5/6/2011

15209

19556

100

0

GCF_000146795.1

35,148

22,692,035

  1. The assembly name, assembly accession, contig N50, and scaffold N50 are reported from the NCBI Assembly resource. The % pipeline prediction column gives the percentage of annotated proteins that are computationally predicted (XP accession prefix) out of all annotated proteins (XP and NP accession prefixes). Reference species are flagged with * in the Organism column. EST count is the number of same-species ESTs available on the date of the annotation run; some annotation runs also used cross-species transcript data or 454 RNAseq data.
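
The % pipeline prediction definition in the footnote can be illustrated with a minimal Python sketch. It assumes the input is a plain list of protein accession strings; the function name `percent_pipeline_prediction` and the example accessions are hypothetical, invented for illustration, and are not taken from the paper or from any NCBI tool. How the authors counted proteins and rounded the table values is not stated, so the rounding here is also an assumption.

```python
def percent_pipeline_prediction(protein_accessions):
    """Return the percentage of XP_ accessions among all XP_ and NP_ accessions.

    Accessions with any other prefix are ignored, following the footnote's
    definition (XP out of XP plus NP).
    """
    xp_count = sum(1 for acc in protein_accessions if acc.startswith("XP_"))
    np_count = sum(1 for acc in protein_accessions if acc.startswith("NP_"))
    total = xp_count + np_count
    if total == 0:
        raise ValueError("no XP_ or NP_ accessions found")
    return 100.0 * xp_count / total


# Hypothetical accessions, made up for this example (not real RefSeq records).
example = [
    "XP_000000001.1",
    "XP_000000002.1",
    "XP_000000003.1",
    "NP_000000001.1",
]

# Rounded to a whole number, as the table values appear to be (an assumption).
print(round(percent_pipeline_prediction(example)))  # prints 75
```

Under this definition, a value of 100 (as for giant panda or western gorilla) would mean every annotated protein carries an XP prefix, while a low value (as for human or house mouse) would mean most carry an NP prefix.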