- Methodology article
- Open Access
Combination of novel and public RNA-seq datasets to generate an mRNA expression atlas for the domestic chicken
© The Author(s). 2018
- Received: 23 November 2017
- Accepted: 31 July 2018
- Published: 7 August 2018
The domestic chicken (Gallus gallus) is widely used as a model in developmental biology and is also an important livestock species. We describe a novel approach to data integration to generate an mRNA expression atlas for the chicken spanning major tissue types and developmental stages, using a diverse range of publicly-archived RNA-seq datasets and new data derived from immune cells and tissues.
Randomly down-sampling RNA-seq datasets to a common depth and quantifying expression against a reference transcriptome using the mRNA quantitation tool Kallisto ensured that disparate datasets explored comparable transcriptomic space. The network analysis tool Graphia was used to extract clusters of co-expressed genes from the resulting expression atlas, many of which were tissue or cell-type restricted, contained transcription factors that have previously been implicated in their regulation, or were otherwise associated with biological processes, such as the cell cycle. The atlas provides a resource for the functional annotation of genes that currently have only a locus ID. We cross-referenced the RNA-seq atlas to a publicly available embryonic Cap Analysis of Gene Expression (CAGE) dataset to infer the developmental time course of organ systems, and to identify a signature of the expansion of tissue macrophage populations during development.
Expression profiles obtained from public RNA-seq datasets – despite being generated by different laboratories using different methodologies – can be made comparable to each other. This meta-analytic approach to RNA-seq can be extended with new datasets from novel tissues, and is applicable to any species.
- Gallus gallus
- Expression atlas
- Network graph
Aggregation and meta-analysis of multiple large gene expression datasets based upon common microarray platforms is relatively commonplace in many species (e.g. [1–3]). Although RNA-seq is rapidly supplanting microarrays for gene expression profiling, it is not yet clear whether data from multiple different labs can be analysed together in an informative manner. Confounding variables reflect the many technical – and bias-prone – aspects of library preparation and sequencing (see reviews [4, 5]), with RNA-seq datasets often differing in read length , depth of coverage , strand specificity , RNA extraction and library selection methods , sequencing platform [10, 11] and the choice to sequence single- or paired-end reads . For a given dataset, these variables can together affect both the number and type of genes detectable and the accuracy of their expression level estimates. Expression quantification is also affected by sample quality  and storage method , irrespective of sequencing technique: RNA degrades with lengthier post-mortem intervals  (the extent of which is tissue-dependent ) with degradation resulting in inaccurate quantification, particularly for shorter transcripts . Sequencing composite biological structures (those with internal structures that have distinct functions), whether intentionally or inadvertently, can mask the signal of structure-specific differential expression . Despite these variables, meta-analysis combining mammalian gene expression datasets [19–21] suggests that RNA-seq datasets are generally robust to inter-study variation, with the expression profiles of homologous tissues clustering more closely with each other than with different samples from the same study or species .
Expression atlases are valuable resources for functional genomics. Groups of transcripts – members of which will have similar expression profiles – can be associated with a shared function, such as a particular pathway or biological process. This principle is known as ‘guilt by association’  and has previously been used to annotate genes of unknown function in human [2, 24, 25], pig , sheep  and mouse [28, 29] datasets. Co-expression information is also informative in genome-wide association studies (GWAS) of complex traits and disease susceptibility. The simple principle, that genes involved in the same trait or phenotype tend to be expressed in the same cell type or tissue, or otherwise participate in the same pathway, has been confirmed in multiple datasets [28, 30].
Because of the ease of access in ovo, the chicken (Gallus gallus) embryo has been widely used as a model system in cell and developmental biology, constrained only by methods for genomic manipulation in situ, or in the germ line. These constraints were largely overcome through the sequencing of the genome, and technological developments such as in vivo electroporation, more than 15 years ago [31, 32]. More recent innovations including the generation of reporter transgenes  and genome editing via primordial germ cells [34–36] have transformed the utility of the chicken as a model organism. However, the current genome build still has many unannotated or minimally annotated genes about which very little is known . Of the 18,347 protein-coding genes in version GalGal5 of the chicken genome in Ensembl89, 7275 (40%) have only been assigned an Ensembl placeholder ID.
The domestic chicken is also a major source of animal protein worldwide, with different lines heavily selected for optimal production traits such as increased egg production or rapid weight gain. The molecular basis for these traits is increasingly being associated with genomic loci through genome-wide association studies based upon high density SNP platforms . Both the application of the chick as a model organism, and for candidate gene analysis in genomic intervals associated with trait variation, would be expedited by improvements in functional genome annotation. In particular, it would be useful to identify the sets of protein-coding genes that share transcriptional regulation between the chick and the mouse, the most widely-studied mammalian model organism. For this purpose, we aimed to generate a comprehensive atlas of mRNA expression for the chicken.
With the removal of antibiotics from the food chain and threats from emerging diseases, there is also interest in the selection of birds with increased resistance to infection or resilience to disease . To support this activity, we were particularly interested in identifying and annotating genes expressed specifically at high levels in cells of the innate immune system. Such gene sets have been identified in previous studies of human [2, 24, 25], pig , sheep  and mouse .
The current version of the chicken assembly was largely derived from high-throughput (i.e. comparatively cheap but imprecise) short read sequencing and primarily contains protein-coding gene models. The recent use of long-read – PacBio SMRT Iso-Seq – data has demonstrated that the transcriptomic complexity of chickens is comparable to humans, with many additional lncRNA models (among others) scheduled for inclusion in future Ensembl annotations .
To identify the set of genes expressed in innate immune cells in both unchallenged and activated conditions, we generated pure cultures of bone marrow-derived macrophages (BMDMs) grown in the presence of recombinant chicken macrophage colony-stimulating factor (CSF1), and stimulated them with the archetypal microbial agonist, lipopolysaccharide (LPS) . To complement the data generated from macrophages in vitro, we also obtained RNA-seq libraries from the caecal tonsils of birds infected with Campylobacter, as well as from previous studies of macrophage, dendritic cell and heterophil populations. A global expression atlas for the chicken transcriptome was created by combining our immune-related data with 20 publicly archived RNA-seq datasets. Some were collated by the Avian RNA-seq Consortium , while others are drawn from a diverse range of existing publications, including studies that characterised the genetic basis of retinogenesis , the genetic determinants of meat tenderness , the morphological diversity of skin appendages , visceral fat metabolism , the transition between laying and brooding phases , the effect of heat stress upon pituitary development  and spleen function , the pathways involved in avian influenza resistance , the role of lncRNAs in the development of muscle , liver and adipose , and the transcriptional landscape of mRNA editing . In total, 279 RNA-seq libraries were obtained, representing 48 distinct tissue and cell types at developmental stages spanning early embryonic (5 days) to mature adult (70 weeks post-hatching). In addition, we accessed a recently published transcriptional analysis of chick development generated by Cap Analysis of Gene Expression (CAGE) , a technique which can be used to quantify gene expression based on the transcript start site . We show that the ‘guilt by association’ approach to functional annotation is viable even when combining disparate RNA-seq datasets, and utilise the meta-dataset to identify macrophage-specific and other informative co-expression clusters, providing a resource for genetic and genomic study of avian trait variation.
Selecting samples for inclusion in an RNA-seq meta-dataset
Many chicken RNA-seq datasets are available in public repositories, as detailed in . Robust co-expression clustering of any two genes depends upon sampling tissues and cells in which both vary across the widest possible range. To maximise the co-expression signal, we chose datasets to represent the greatest possible diversity of tissues and organ systems. Not all studies contain links to a publicly archived dataset, such as a study of induced ochratoxicosis in the kidney cortex  and two studies of the bursa of Fabricius [56, 57]. Samples containing less than 10 million reads were not used, such as those from a study of the follicular transcriptome throughout the ovulation cycle .
Datasets used are detailed in Additional file 1: Table S1, and have few commonalities: they were sequenced using a variety of Illumina instruments (HiSeq 2000/2500/3000/4000, Genome Analyzer II/IIx, NextSeq 500 and HiScanSQ), and include single- and paired-end, strand-specific and non-specific, polyA-selected (mRNA-seq) and rRNA-depleted (total RNA-seq) libraries at different read lengths and depths. For 12 tissues, independently sequenced RNA-seq datasets for the same tissue (Additional file 2: Table S2) also allow for internal tests of the validity of aggregating the data. Throughout this text studies are referred to by their NCBI BioProject ID.
Quantifying expression by iteratively revising a reference transcriptome
Expression was quantified – as transcripts per million (TPM) – using an RNA-seq processing pipeline  which iteratively runs the quantification tool Kallisto  with each iteration using an incrementally revised transcriptome. Kallisto requires that the user provide a set of transcripts, which are decomposed into k-mers. The expression of each transcript is quantified by matching this set of k-mers to the k-mers of the reads. For the first iteration of Kallisto, a non-redundant transcriptome (57,234 transcripts, representing 17,680 Ensembl protein-coding genes) was obtained by combining Ensembl transcript models with NCBI mRNA RefSeqs (see Materials and Methods).
Randomly down-sampling RNA-seq datasets does not quantitatively alter their expression profiles
Higher resolution expression profiles are dependent upon higher sequencing depths  with diminishing returns – after approximately 10 million reads – on the power to detect genes differentially expressed between conditions . For the purpose of functional annotation, it is more important to minimise variation between samples than to comprehensively capture transcripts. Accordingly, all datasets were randomly down-sampled to exactly 10 million reads before quantification.
Biologically meaningful expression profiles are identified even after combining disparate RNA-seq datasets
If a meta-analytic approach to RNA-seq is valid, subsets of transcripts enriched in a given tissue should have annotations functionally appropriate to that tissue. To test this, we calculated a preferential expression measure (PEM) for each gene , essentially the median expression divided by the mean. We then obtained the set of Gene Ontology (GO) terms enriched in each subset of genes with the highest PEM associated with a particular tissue (Additional file 10: Table S10) (see Materials and Methods). Consistent with the function of each tissue, the bursa of Fabricius (the site of B cell synthesis ) showed tissue-specificity for the expression of genes enriched for ‘defence response to bacterium’ (p = 8.3 × 10− 5), breast muscle for ‘striated muscle contraction’ (p = 1.9 × 10− 6), cerebrum for ‘synaptic transmission’ (p = 1.5 × 10− 4), claw epithelium for ‘bone mineralisation’ (p = 6.4 × 10− 4), heart for both ‘muscle contraction’ (p = 8.8 × 10− 6) and ‘cellular respiration’ (p = 4.6 × 10− 15), kidney for ‘oxidation-reduction process’ (p = 5.3 × 10− 5), pancreas for ‘proteolysis’ (p = 0.001), pituitary gland for ‘endocrine system development’ (p = 2 × 10− 4), retina for ‘visual perception’ (p = 7.2 × 10− 17), spleen for ‘immune response’ (p = 2.2 × 10− 6), and trachea for ‘cilium morphogenesis’ (p < 1 × 10− 30) (Additional file 10: Table S10).
Signals of co-expression allow for informative functional annotation
Network analysis of the meta-dataset was performed using Graphia Professional, a commercial version of BioLayout Express3D [67, 68], previously applied to pig , sheep  and mouse  microarray datasets and CAGE data from the FANTOM5 consortium [24, 25]. A Pearson’s correlation matrix for each gene-to-gene comparison was visualised as a network graph of 18,127 nodes (genes) linked by 632,038 edges (correlations above a certain threshold; in this case, r = 0.8). Clusters of interconnected nodes represent sets of genes that share a signal of co-expression. These clusters were identified by applying the Markov clustering (MCL) algorithm  to the network graph, at an inflation value (which determines cluster granularity) of 2.2. The contents of each cluster are given in Additional file 13: Table S13.
Many of the co-expression clusters comprised genes with a tissue- or process-specific expression profile. Additional file 14: Table S14 summarises the highest PEM value for a tissue in each of the clusters with > 25 members. Cluster 2 was largely brain-specific: of the 655 genes in this cluster, 281 (43%) had their highest PEM in the hypothalamus, 155 (24%) had their highest PEM in the cerebrum and 115 (18%) had their highest PEM in the cerebellum. Other clusters contained genes with expression enriched in liver (cluster 6), ovary (cluster 7), trachea (cluster 8), testis (cluster 10), retina (clusters 13 and 24), feather epithelium (cluster 14), breast muscle (cluster 16), kidney (cluster 17), pituitary gland (clusters 19 and 25), Campylobacter-infected caecal tonsils (cluster 20), spleen (clusters 21 and 22) and adipose (cluster 23).
The tissues in some of these clusters were represented by multiple independent projects combined in this meta-atlas. For instance, cluster 6 comprises genes that were enriched in the liver, with data from three separate BioProjects. Some variation in expression estimates between these independent liver samples did not affect their inclusion in the same co-expression cluster. Furthermore, the GO terms enriched in each cluster are functionally consistent with its observed tissue-specificity (Additional file 15: Table S15).
Some clusters were associated with processes shared by multiple tissues. The largest cluster, cluster 1, was enriched in embryo-derived samples, and the GO terms were strongly associated with the cell division cycle and DNA repair (Additional file 15: Table S15). The genes within this list include the key transcriptional regulator, FOXM1, and multiple cyclins (CCNA2/B2B3/C/E1/F and J), and overlap substantially with cell cycle-associated lists derived from previous cluster analysis [2, 70].
We used the ‘guilt by association’ principle to contextualise individual gene annotations – obtained by protein-level alignment and of varying quality (see Materials and Methods) – as there is an a priori expectation that by virtue of being co-expressed, the genes within a given cluster have related (that is, tissue- or process-specific) functions. In this respect, we can increase confidence in otherwise lower-quality alignments. Some examples and proposed annotations are summarised in Additional file 13: Table S13.
Genes in cluster 14 with known function
metalloprotease, possibly involved in tissue remodelling to form feather follicles
ankyrin repeat and kinase domain containing 1
interacts with keratin filaments
epithelial growth factor
tight junction membrane protein found in all epithelia
homeobox protein DLX4
homeobox protein that regulates epithelial-mesenchymal interactions
epidermal differentiation protein starting with MTF motif 4
markers of the feather barbule and members of the epidermal differentiation complex; this has a role in integumentary development, including feather pigmentation
epidermal differentiation protein starting with MTF and rich in histidine
feather keratin 21
feather keratin 27
patatin-like phospholipase domain-containing protein 4
enzyme with a role in retinol metabolism (retinol and related compounds regulate epithelial cell growth and differentiation)
Ras-related protein RAB38
GTPase involved in melanosome biogenesis and epithelial pigmentation
Ras association domain family member 10
tumour suppressor that mediates the epithelial-mesenchymal transition
epidermal retinol dehydrogenase 2
overexpressed in psoriatic human skin
Xg blood group
blood group antigen
Annotation of co-expression clusters associated with innate and acquired immunity and macrophage biology
The CSF1R gene was contained within cluster 27 (n = 129 genes, of which 32 [25%] are unannotated), which had an expression profile shared by both dendritic cells and macrophages. 36% of the genes in cluster 27 had their highest PEM for dendritic cells and 26% for untreated BMDMs (both samples from BioProject PRJEB7475), with the remaining 26% for BMDMs treated with CSF1 (from BioProject PRJEB7662) (Additional file 14: Table S14). This cluster also contained the lipopolysaccharide receptor and commonly used monocyte marker, CD14, several genes (C1QA/B/C, MARCO, P2RY12/13, and STAB1) that are associated with tissue-specific macrophage populations in mice , and a single myeloid-associated transcription factor, MAFB, which is required for tissue macrophage development in mice . The cells referred to as dendritic cells are bone marrow cells grown in GM-CSF (CSF2), rather than CSF1. As noted in previous analyses of mouse  and human  transcriptomes, cells differentiated in GM-CSF have much more in common with macrophages than with classical dendritic cells dependent upon FLT3-ligand.
The clusters associated with the acquired immune response, predominantly B and T cells, are somewhat smaller and poorly-annotated (clusters 20, 21, 22, 29 and 78). Cluster 21, expressed most highly in spleen, contains TIMD4 (ENSGALG00000003876), which promotes T-cell expansion and survival , and is enriched with B cell-associated genes, including the B cell transcription factors BATF, IRF4, PAX5, RUNX3, and SPIC, as well as the class II trans-activator CIITA, class II subunit CD74 and the class II MHC gene BLB2. The thymus-enriched cluster 29 contains CD4, the recombination activating genes RAG1 and RAG2, and the T cell transcription factors LEF1, RORC and TCF7.
Integrating gene expression and protein-protein interaction networks
Biological systems can be functionally organised into many different (and intersecting) networks based on the nature of their interaction, including – aside from gene co-expression networks – metabolic/biochemical networks, signal transduction networks, regulatory networks, and protein-protein interaction (PPI) networks . Data from different networks can be integrated: for instance, subunits of the same protein complex are known to be co-expressed , with those genes present in both a co-expression and PPI network having a high probability of performing similar functions . We therefore determined the set of genes present in both the same co-expression cluster and a PPI network (Additional file 16: Table S16), obtaining chicken PPI data by mapping human PPIs to orthologous chicken genes (see Materials and Methods). The PPI and co-expression data are mutually supportive. For example, there were 32 PPIs among the genes in the macrophage-specific cluster 4. These include STAT1 (signal transducer and activator of transcription 1-alpha/beta) – a critical mediator of the pro-inflammatory response of macrophages to LPS  – and the transcription factors ATF3, a known inducer of STAT1 , and SPI1/PU.1, which is essential for macrophage differentiation . Also in the network are the tyrosine kinase LYN, which is activated alongside STAT1 in response to IL5 (a key mediator of eosinophil activation ), and the adaptor protein GRB2, which facilitates the activation of ERK by tyrosine kinases  (ERK signalling is essential to macrophage development ). In addition, the network contained SOCS3, a negative regulator of cytokine signalling that inhibits the nuclear translocation of STAT1 in response to IFN stimulation , with this stimulation being a key constituent of classical macrophage activation .
Integrating gene expression and promoter expression networks
RNA-seq is a multi-step process of reverse transcription, amplification, fragmentation, purification, adaptor ligation and sequencing, with each step subject to error . Such laboratory-specific variation is also independent of intrinsic sequencing biases, which can influence the nucleotide composition of the reads  (leading to mismatches between the sequenced read and the original RNA fragment ), the GC content of the reads , and the sequencing error rate .
Despite all of these constraints, Fig. 3 shows that in a sample-to-sample network graph of many independently sequenced tissues, the signal of co-expression clearly outweighs the noise.
The critical step in reducing the noise, and making the datasets comparable, was to down-size the RNA-seq libraries so that the depth of coverage of the transcriptome was the same in each case. This has the effect of removing a great deal of the stochastic detection of more lowly-expressed transcripts. Figure 2 and Additional file 9: Table S9 show that the random sampling used to down-size does not substantially alter the relative expression estimates of any two genes within any given sample, with equivalent expression profiles reconstructed for each of 100 random samples. Combined with the use of Kallisto to quantify expression, which maps a common depth of k-mers to a standardised reference transcriptome, the method we have developed effectively ensured that each RNA-seq library was exploring an equivalent transcriptomic space.
The success of the aggregation of public domain data in terms of genome annotation is evident from the analysis of the membership of co-expression clusters in Additional file 13: Table S13. Each cluster clearly contains genes of known function, shows evidence of very strong GO enrichment, and as noted in similar array-based studies [2, 26] commonly contains the transcription factors that regulate the other members of the cluster. On that basis, it would be reasonable to provisionally assign the same GO terms to genes of unknown function, at least within the larger clusters. For example, the genes within cluster 1 that are not currently functionally annotated or assigned a clear orthologue are likely to be involved in some way in the cell cycle. Indeed, the provisional annotations of many of them shown in Additional file 13: Table S13 indicate this is very likely to be the case. Similarly, the genes we have identified that were enriched in innate and acquired immune cells are likely to be associated with heritable variation in disease resistance/susceptibility.
Detailed examination of individual clusters can provide significant biological insights. Cluster 8, enriched in trachea, and with the second highest expression in lung, was strongly enriched with GO terms associated with cilium, microtubule binding, motor activity and the actin cytoskeleton (Additional file 15: Table S15), and includes, for example, multiple members of the cilia and flagella-associated protein (CFAP), dynein regulatory complex (DRC) and other dynein-related gene families. Mutations in many of these genes have been associated with human ciliopathies . This cluster also contained the transcription factor FOXJ1, which is essential for the formation of motile cilia in mice . Provisional annotations of genes of unknown function in this cluster are consistent with the overall enrichment for genes associated with motility. The presence of the epithelial transcription factors ELF5 and PAX9 in this cluster suggests that both genes could have a role in regulation of this key gene set, providing a possible reason for the embryonic lethality of the knockouts of each gene [105, 106]. Interestingly, KIAA0586 (also known as TALPID3) is not found in cluster 8 but in a separate cluster – number 139 – containing only 13 genes. The TALPID3 protein encodes a centromeric component, and mutation affects the formation of primary, non-motile cilia and signaling by the morphogen sonic hedgehog [107, 108]. Many of the genes that are apparently co-regulated with TALPID3 have been associated in some way with regulatory functions of primary cilia, including CEP120 which, like KIAA0596, is mutated in human Joubert syndrome . Other members of the cluster may be candidate interactors with TALPID3.
The validity of the approach, and of the clusters generated, was established by comparing tissue- and function-specific clusters obtained by an alternate method of quantifying RNA expression levels, CAGE, using a public dataset of chicken embryo development. This showed that tissue-specific developmental gene expression can be detected using whole embryos (as we have previously shown for mouse ), and that the genes in the developmental stage clusters matched those found in the adult tissue atlas.
The clustering we have presented is based upon an arbitrary correlation threshold. For every gene of interest, it can be informative to identify its transcriptional companions. To this end, as we have done previously for human , pig , sheep  and mouse , we have made the current version of this atlas available as a searchable database using the gene annotation portal BioGPS  (http://www.biogps.org/chickenatlas), where one can utilise a simple “find correlated” function to identify genes with similar expression profiles. In turn, this resource allows a rapid comparative assessment of the expression of a gene of interest in mammals and birds and the extent to which functional information is likely to be transferable across species.
Expression profiles obtained from public RNA-seq datasets – despite being generated by different laboratories using different methodologies – can be made comparable to each other by randomly down-sampling to a common depth and then quantifying expression against a reference transcriptome. Using this method, we generated a comprehensive atlas of mRNA expression for the domestic chicken. The advantage of the aggregation method that we have applied is that it can be extended with new data from tissues and cell types we have not currently included. The larger the dataset, and the greater the transcriptional space sampled, the more stringent the correlations that will be generated and the more likely they are to produce new biological insights.
To obtain bone marrow-derived macrophages, nine chickens of approximately 8 weeks of age (3 female and 3 male Ross 308 broilers, and 3 female CSF1R-MacApple transgenic NOVOgen Brown layers) were euthanized by cervical dislocation and confirmed dead by decapitation. Likewise were euthanized 9 female and 14 male broiler chickens of an Aviagen pedigree line, each 5 weeks of age, to obtain the caecal tonsils. All animal work was conducted in accordance with guidelines of the Roslin Institute and the University of Edinburgh and carried out under the regulations of the Animals (Scientific Procedures) Act 1986 under Home Office project license PPL 60/4420. Approval was obtained from the Roslin Institute’s and the University of Edinburgh’s Protocols and Ethics Committees.
Macrophage cell culture and RNA isolation
Bone marrow-derived macrophage (BMDM) culture and challenge in vivo were performed as previously described . Chicken bone marrow was cultured for 7 days with 350 ng/μl chicken CSF1 on Sterilin plastic to differentiate BMDMs. Adherent cells were then transferred to tissue culture plastic and cells plated at 80% confluence. BMDMs were challenged with the addition of LPS at 100 ng/ml to culture medium and then harvested after 0 (null condition), and 24 h. Cells were harvested in TRIzol® (15,596,018; Thermo Fisher Scientific) and RNA extraction performed with the RNeasy Mini Kit (74,106; Qiagen Hilden, Germany) according to manufacturer’s instructions.
Collection of campylobacter-infected caecal tonsils
The 23 birds from which caecal tonsils were harvested were housed within a non-bio-secure environment referred to as a sib-test environment, intended to resemble broader commercial conditions. A detailed description of the environmental parameters and management practices can be found in . Briefly, birds were fed a standard (maize-based) feed ration in the form of a starter, grower and finisher diet in line with industry practice. All birds throughout the study received the same vaccinations as per a commercial regimen and were reared under the same management practices. Birds were naturally exposed to Campylobacter spp. under these conditions. Caeca and caecal tonsil samples were collected in RNAlater (AM7021; Thermo Fisher Scientific, Waltham, USA). Campylobacter load in caeca was determined by selective culture as previously described . Seven serial ten-fold dilutions of caecal content were prepared in phosphate-buffered saline and 100 μl plated to mCCDA (modified cefoperazone-deoxycholate agar) supplemented with cefoperazone (32 mg/L) and amphotericin B (10 mg/L; Oxoid), followed by incubation for 48 h under microaerophilic conditions (5% O2, 5% CO2, and 90% N2) at 41C. Dilutions were plated in duplicate and colonies with morphology typical of Campylobacter detected in all samples. RNA was extracted from the caecal tonsils using the RNeasy Mini Kit (74,106; Qiagen Hilden, Germany) according to manufacturer’s instructions. As chickens were exposed naturally rather than being explicitly challenged with Campylobacter, bacterial load varied considerably between individuals. Accordingly, tonsil samples were partitioned into two broad subsets: those from chickens whose caecum has high Campylobacter load (> = 10,000 CFU/g), and those with low Campylobacter load (< 10,000 CFU/g).
For both BMDM and caecal tonsil samples, library preparation was performed by Edinburgh Genomics using either the Illumina TruSeq total RNA library preparation protocol (Ilumina; Part: 15031048, Revision E) or the Illumina TruSeq mRNA (poly-A selected) library preparation protocol (Ilumina; Part: 15031047, Revision E). Total RNA (for BMDMs) and mRNA (for caecal tonsils) was, in both cases, sequenced by Edinburgh Genomics at a depth of > 40 million strand-specific 75 bp paired-end reads per sample, using an Illumina HiSeq 4000. The raw data is deposited in the European Nucleotide Archive under accessions PRJEB22373 (BMDMs) and PRJEB22580 (caecal tonsils).
Public RNA-seq datasets
Publicly accessible datasets used in this study are described in Additional file 1: Table S1. The NCBI BioProject and Sequence Read Archive (SRA) sample IDs for these datasets are given in Additional file 6: Table S6. All public datasets for this study are available via the SRA, a public repository for sequence data maintained by the International Nucleotide Sequence Database Collaboration (INSDC) and accessible from the websites of its constituent members: known as the SRA if via the National Center for Biotechnology Information (NCBI) (www.ncbi.nlm.nih.gov/sra), the DRA (DDBJ Read Archive) if via the DNA Data Bank of Japan (DDBJ) (http://trace.ddbj.nig.ac.jp/dra/), and the European Nucleotide Archive (ENA) if via the European Bioinformatics Institute (EBI) (www.ebi.ac.uk/ena) . For retrieving the raw files used in this study or for expanding this work with new datasets from novel tissues, note that data are directly accessible in fastq format from the ENA and DDBJ but only in a binary .sra format from the NCBI. Decompiling the latter into fastq files – using the fastq-dump tool within the SRA Toolkit (https://trace.ncbi.nlm.nih.gov/Traces/sra/?view=software) – is far slower than analysing fastq files with Kallisto, and so forms a bottleneck in the expression atlas creation pipeline. For this reason, obtaining fastq files in bulk from NCBI is not recommended unless necessary.
Defining a reference transcriptome and quantifying expression
Prior to expression level quantification, all RNA-seq datasets were randomly down-sampled to 10 million reads using seqtk (https://github.com/lh3/seqtk, downloaded 29th November 2016) with parameter -s 100 (to seed the random number generator). Expression level was then estimated, as transcripts per million (TPM), using the high-speed quantification tool Kallisto v0.43.1  and default parameters. For datasets comprising single-end reads, we used parameters -l 100 -s 10; estimates of the average fragment length and standard deviation of the fragment length, respectively. Kallisto quantifies expression at the transcript level by building an index of k-mers from a set of reference transcripts and then mapping the RNA-seq reads to it, matching k-mers generated from the reads with the k-mers present in the index. Transcript-level TPM estimates are then summarised to the gene level. A critical aspect of this method is in selecting an appropriate set of reference transcripts for which expression is quantified. An appropriate value of k for the index is also required because if k is too large relative to read length, there is a higher chance the k-mers of the reads will contain errors (as read quality decreases towards the 3′ end of reads ). If the reads generate erroneous k-mers, they will not match the k-mers of the index. We used a value of k = 21, which lies – approximately – between half the length of the shortest read and a third the length of the longest read.
As a reference transcriptome, we obtained from Ensembl v89 the set of GalGal5 protein-coding transcripts, parsing the batch release (ftp://ftp.ensembl.org/pub/release-89/fasta/gallus_gallus/cds/Gallus_gallus.Gallus_gallus-5.0.cds.all.fa.gz, accessed 21st June 2017) to retain only those transcripts with the ‘protein-coding’ biotype (n = 28,768 transcripts, representing 10,846 genes). To this was added the CDS of 28,466 NCBI mRNA RefSeqs that had neither been assigned Ensembl transcript IDs, nor whose sequence was already present in the Ensembl release (under any other identifier). To reduce the likelihood of spurious read mapping, CDS < 300 bp were excluded from analysis. Erroneous expression level estimates are more likely when fewer possible reads can be derived from a gene, i.e. if the CDS is short . While this approach arguably improves accuracy, it unavoidably excludes members of certain families, for instance the gallinacins , antimicrobial peptides known for their short chain lengths .
Although the Ensembl and NCBI sets of transcripts overlap, there are many unique entries in each. For example, RefSeqs XM_015294055 and XM_015294059 are both predicted transcripts of the macrophage-marker gene CD163 , although Ensembl refers to this gene only by the numerical ID ‘418303’. RefSeq records beginning with ‘XM’ are produced by the NCBI genome annotation pipeline and can lack transcript or protein homology support; by contrast, ‘NM’ records are validated . Consequently, neither of the CD163 RefSeqs are assigned Ensembl transcript IDs, and so they are excluded from the Ensembl batch release.
The RefSeq mRNA set also includes predictions of novel transcript sequences for existing Ensembl genes. For instance, the chicken BF1 gene (classical MHC class 1; Ensembl gene ID ENSGALG00000033932) has 7 transcripts (Ensembl v89), encoding proteins of length 228, 323, 345, 346, 350, 354 and 360 amino acids (aa). However, BF1 has only 3 associated mRNA RefSeqs, 1 validated and 2 predicted: NM_001044683, XM_015294995, and XM_015294996. These RefSeqs do not necessarily encode different proteins to those present in Ensembl – rather, the RefSeq mRNAs incorporate untranslated regions (UTRs) and so can encapsulate Ensembl CDS. For instance, the validated RefSeq mRNA NM_001044683 encodes the same 360aa protein as Ensembl CDS ENSGALT00000066783 (i.e. the same transcript model is independently available from both resources), but the RefSeq nucleotide sequence extends 17 bases upstream (the 5’ UTR) and 146 bases downstream (the 3’ UTR) of the coding ORF. By contrast, XM_015294995 encodes a putative 356aa peptide (XP_015150481) and XM_015294996 a 349aa peptide (XP_015150482), neither of which are available from Ensembl. As the XM_015294996 mRNA – an automated prediction – fully incorporates ENSGALT00000086848 (the CDS encoding the 228aa BF1 protein), we considered the sequence better supported by the Ensembl model, as Ensembl takes a conservative approach to annotation , and the predicted peptide spurious. By contrast, the XM_015294995 mRNA does not contain any existing Ensembl CDS and so encodes a protein absent from Ensembl.
Overall, we retained RefSeq ‘XM’ mRNAs only if they can be assigned to a gene not yet present in the Ensembl annotation, or, if that gene is present, they do not incorporate a CDS from any of that gene’s Ensembl transcript models. UTRs were trimmed from each RefSeq mRNA by excluding all sequence outside the longest ORF. This combined set of Ensembl and RefSeq transcripts constitutes a standardised RNA space against which expression can be quantified, as in .
After quantifying expression with this initial transcriptome, a revised transcriptome was created, excluding those transcripts whose average TPM was < 1 in all tissues (Additional file 5: Table S5), or which were only detectable in one tissue (as these may be artefacts of differential sequencing depth). Tissues whose distribution of TPM estimates does not comply with Zipf’s law (see below) were not counted. The revised transcriptome contains 28,276 Ensembl transcripts (representing 10,826 Ensembl genes) and 26,694 NCBI transcripts (which account for only 4665 existing Ensembl genes).
Compliance of RNA-seq datasets with Zipf’s law
In a correctly prepared RNA-seq dataset, a minority of genes will produce the majority of reads and so its distribution of gene-level TPM estimates should comply, to a reasonable approximation, with Zipf’s law (which states that the probability of an observation is inversely proportional to its rank). A custom Perl script was used to identify, per sample, the number of unique TPM values and the number of genes with a TPM at or exceeding this level. After excluding, for robustness, data from the first and last order of magnitude (as in ) and all values of TPM < 5 (which have a higher likelihood of transcriptional noise), the data was log-transformed and a linear regression model fitted using R v3.2.0 . Samples whose exponents deviated too greatly from − 1 (by ±20%, i.e. if the exponent is < − 0.8 or > − 1.2) were considered erroneous.
For each gene, we calculated a preferential expression measure (PEM) in a manner similar to . PEM relates the average expression of that gene in a given tissue to the average expression of that gene in all tissues. For each gene i, then for tissue ti, PEM(ti) = S-A, where S = expression of gene i in tissue ti, and A = arithmetic mean expression of gene i across the set of all tissues. Prior to calculation, all TPM values < 1 were considered to be 1, and a log2-transformation applied. This is to ensure that genes with expression indistinguishable from noise (TPM < 1) will have a PEM of 0. Each gene will have a distribution of PEM values, one for each tissue in the meta-datasets. Genes with higher PEM values for a given tissue are more tissue-specific in their expression profile.
Gene ontology (GO) term enrichment
GO term enrichment was assessed using the R package topGO , which utilises the ‘weight’ algorithm to account for the nested structure of the GO tree . topGO requires a reference set of GO terms, which was built manually from the GalGal5 set (obtained from Ensembl BioMart v89 ) and filtered to remove those terms with evidence codes NAS (non-traceable author statement) or ND (no biological data available), and those assigned to fewer than 10 genes in total. Significantly enriched GO terms (p < 0.05) are reported only if the observed number per tissue exceeds the expected by 2-fold or greater.
Unannotated genes in GalGal5 – those with only an Ensembl placeholder ID, rather than an HGNC name – are annotated by reference to the NCBI non-redundant (nr) peptide database v77 , with each annotation assigned a quality category of 1 to 8 (highest to lowest quality, respectively), as previously described . For each unannotated gene, we took the longest encoded peptide and obtained the set of blastp alignments  against NCBI nr, at a scoring threshold of p < = 1e− 25. These alignments are a set of possible gene descriptions, of which only one can be selected as the annotation of that gene. The lowest quality category, 8, is the blastp hit with the lowest E-value. All subsequent quality categories require higher-quality hits, which: (a) have a % identity within the aligned region of > = 90%, (b) have an alignment length > = 90% of the length of the query protein, (c) have an alignment length > = 50 amino acids, and (d) have no gaps. Hits to proteins labelled either ‘low quality’, ‘hypothetical’, ‘unnamed’, ‘uncharacterized’ or ‘putative’ are excluded, as are those having a third-party annotation (as these can be by inference and not experiment). Quality category 7 is the best-scoring (i.e. lowest E-value) of these higher quality hits. Category 6 is as above, but with at least one identifiable hit to the human proteome. Category 5 requires that the set of alignments span at least 4 different genera (excluding Gallus). At this point, if > = 75% of the alignments have the same description, the gene is named for the associated HGNC name (according to ftp://ftp.ebi.ac.uk/pub/databases/genenames/new/tsv/locus_types/gene_with_protein_product.txt, downloaded 24th August 2016). However, as NCBI nr aggregates multiple sources of data, gene descriptions have numerous synonyms and so it is not always possible to automatically assign an HGNC symbol. The highest quality categories, 1 to 4, not only meet the above criteria but have degrees of reciprocal % identity to the human proteome. The highest quality category, 1, is if there is also a near-perfect match to an existing, related, peptide (alignment length > = 90% of the length of a human protein). Other quality categories, in descending order, are: 2 (alignment length > = 75% of the length of a human protein), 3 (> = 50%), and 4 (< 50%). Human protein sequences were obtained from genebuild GRCh38.p8 (ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/001/405/GCF_000001405.34_GRCh38.p8/GCF_000001405.34_GRCh38.p8_protein.faa.gz, downloaded 30th August 2016).
Network analysis was performed using Graphia Professional (Kajeka Ltd., Edinburgh, UK), a commercial version of BioLayout Express3D [67, 68]. Graphia Professional determines the similarities between individual expression profiles by building a correlation matrix for both gene-to-gene and sample-to-sample comparisons. This matrix is then filtered to remove all correlations below a certain threshold (for the gene-to-gene comparison in the RNA-seq atlas, Pearson’s r < 0.8). A network graph is constructed by connecting nodes (genes) with edges (correlations above the threshold), and its local structure interpreted by applying the Markov clustering (MCL) algorithm  at an inflation value (which determines cluster granularity) of 2.2, consistent with a previous study .
Protein-protein interaction data was obtained from the IID (Integrated Interactions Database) version 2017–04 (http://iid.ophid.utoronto.ca/iid, accessed 25th July 2017) , a resource which combines computationally predicted PPIs with experimentally determined PPIs drawn from multiple databases. These include BIND (Biomolecular Interaction Network Database) , BioGRID (Biological General Repository for Interaction Datasets) , DIP (Database of Interacting Proteins) , HPRD (Human Protein Reference Database) , IntAct , I2D (Interologous Interaction Database) , InnateDB  and MINT (Molecular Interaction Database) . The format of the PPI data is as a list of UniProt IDs, with one of three evidence types for the interaction: ‘exp’ (experimentally determined in this species), ‘pred’ (an in silico prediction from one of four previous studies [136–139]) and ‘ortho’ (predicted by mapping experimentally determined PPIs from another species to orthologous protein pairs in this species). As chicken PPI data is unavailable, we obtained human PPIs from the IID, and considered only those PPIs that (a) involve genes that each have a one-to-one orthologue to the chicken with an orthology confidence score of 1 (using data from Ensembl Compara , a score of 1 indicates compliance with the gene tree), a reciprocal % gene identity of > = 75%, a whole genome alignment score of > = 75%, and a gene order conservation score of > = 75% (indicating a high degree of contiguity around the gene of interest), (b) have UniProt IDs that are unambiguously assigned to only one human gene ID (and thereby only one orthologous chicken gene ID), and (c) have PPI evidence type ‘exp’ or ‘pred’.
Availability of datasets
To test whether down-sampling quantitatively alters the expression profile of an RNA-seq dataset, we randomly down-sampled each of the 18 BMDM datasets (+/− LPS) to 10 million reads 100 times, using seqtk seeded with a random integer between 0 and 10,000. These sets of expression estimates are available as Dataset S1, hosted on the University of Edinburgh DataShare portal (https://doi.org/10.7488/ds/2137). The meta-atlas of chicken gene expression is available in full as Additional file 6: Table S6 and via the cross-species annotation portal BioGPS (http://biogps.org/dataset/BDS_00031/chicken-atlas/). To compare genes between species and to visualise expression profiles, BioGPS requires that each gene have an Entrez ID, although this is not the case for all genes in GalGal5. The expression profiles of those genes without Entrez IDs can be found in Additional file 6: Table S6.
Analysis of chicken developmental samples
The expression data derived from CAGE  were obtained from http://fantom.gsc.riken.jp/5/suppl/Lizio_et_al_2017/data; the expression file is named galGal5.cage_peak_tpm.osc.txt.gz and the annotation file galGal5.cage_peak_ann.txt. The annotation and expression files were emerged based on chromosomal location of the promoter. All promoters where no sample exceeded 10 tags per million (tagsPM) were excluded from the analysis. The expression data were then entered into Graphia Professional (as described above), using a correlation coefficient threshold of 0.75. Twenty-two thousand eight hundred thirty-nine nodes joined by 5,035,102 edges were entered into the analysis and clustered with an MCL inflation value of 2.2, resulting in 132 clusters of at least 10 nodes.
This work was supported by a Biotechnology and Biological Sciences Research Council (BBSRC) project grant (BB/M011925/1) to DAH and institute strategic program grants ‘Farm Animal Genomics’ (BBS/E/D/20211550) and ‘Transcriptomes, Networks and Systems’ (BBS/E/D/20211552). RNA-seq of the Campylobacter-infected chickens was supported by funding from the Rural and Environmental Science and Analytical Sciences Division of the Scottish Government, and from a BBSRC LINK project with Aviagen Ltd. (BB/J006815/1). Edinburgh Genomics is partly supported through core grants from the BBSRC (BB/J004243/1), Natural Environmental Research Council (R8/H10/56), and Medical Research Council (MR/K001744/1). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Availability of data and materials
The datasets generated during this study are available in the European Nucleotide Archive under accessions PRJEB22373 and PRJEB22580. All data analysed during this study are included in this published article (and its supplementary information files). The atlas of chicken gene expression is also available via the cross-species annotation portal BioGPS (http://biogps.org/dataset/BDS_00031/chicken-atlas/).
DAH coordinated the study. LF, AJM, and JOD performed macrophage cell culture and RNA extraction. AP, JS and MS funded, generated and provided RNA-seq data from the caecal tonsils of Campylobacter-infected birds. CW and CA prepared data for visualisation with BioGPS. SJB performed all bioinformatic analyses with the exception of the CAGE analysis. KMS performed the CAGE analysis. SJB and DAH wrote the manuscript. All authors read, contributed to, and approved the final manuscript.
All animals in this study were bred at the National Avian Research Facility at the Roslin Institute. Approval was obtained from the Protocols and Ethics Committee of the Roslin Institute and the Protocols and Ethics Committee of the University of Edinburgh. All animal work was carried out under the regulations of the Animals (Scientific Procedures) Act 1986 under Home Office project license PPL 60/4420.
Consent for publication
The authors declare that they have no competing interests.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Open AccessThis article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
- He F, Yoo S, Wang D, Kumari S, Gerstein M, Ware D, Maslov S. Large-scale atlas of microarray data reveals the distinct expression landscape of different tissues in Arabidopsis. Plant J. 2016;86(6):472–80.PubMedView ArticleGoogle Scholar
- Mabbott NA, Baillie JK, Brown H, Freeman TC, Hume DA. An expression atlas of human primary cells: inference of gene function from coexpression networks. BMC Genomics. 2013;14(1):1–13.View ArticleGoogle Scholar
- Doig TN, Hume DA, Theocharidis T, Goodlad JR, Gregory CD, Freeman TC. Coexpression analysis of large cancer datasets provides insight into the cellular phenotypes of the tumour microenvironment. BMC Genomics. 2013;14:469.PubMedPubMed CentralView ArticleGoogle Scholar
- Conesa A, Madrigal P, Tarazona S, Gomez-Cabrero D, Cervera A, McPherson A, Szcześniak MW, Gaffney DJ, Elo LL, Zhang X, et al. A survey of best practices for RNA-seq data analysis. Genome Biol. 2016;17:13.PubMedPubMed CentralView ArticleGoogle Scholar
- van Dijk EL, Jaszczyszyn Y, Thermes C. Library preparation methods for next-generation sequencing: tone down the bias. Exp Cell Res. 2014;322(1):12–20.PubMedView ArticleGoogle Scholar
- Chhangawala S, Rudy G, Mason CE, Rosenfeld JA. The impact of read length on quantification of differentially expressed genes and splice junction detection. Genome Biol. 2015;16(1):131.PubMedPubMed CentralView ArticleGoogle Scholar
- Sinha R, Lenser T, Jahn N, Gausmann U, Friedel S, Szafranski K, Huse K, Rosenstiel P, Hampe J, Schuster S, et al. TassDB2 - a comprehensive database of subtle alternative splicing events. BMC Bioinformatics. 2010;11(1):1–7.View ArticleGoogle Scholar
- Zhao S, Zhang Y, Gordon W, Quan J, Xi H, Du S, von Schack D, Zhang B. Comparison of stranded and non-stranded RNA-seq transcriptome profiling and investigation of gene overlap. BMC Genomics. 2015;16(1):675.PubMedPubMed CentralView ArticleGoogle Scholar
- Sultan M, Amstislavskiy V, Risch T, Schuette M, Dökel S, Ralser M, Balzereit D, Lehrach H, Yaspo M-L. Influence of RNA extraction methods and library selection schemes on RNA-seq data. BMC Genomics. 2014;15(1):1–13.View ArticleGoogle Scholar
- Quail MA, Smith M, Coupland P, Otto TD, Harris SR, Connor TR, Bertoni A, Swerdlow HP, Gu Y. A tale of three next generation sequencing platforms: comparison of ion torrent, Pacific biosciences and Illumina MiSeq sequencers. BMC Genomics. 2012;13(1):341.PubMedPubMed CentralView ArticleGoogle Scholar
- Li S, Tighe SW, Nicolet CM, Grove D, Levy S, Farmerie W, Viale A, Wright C, Schweitzer PA, Gao Y, et al. Multi-platform assessment of transcriptome profiling using RNA-seq in the ABRF next-generation sequencing study. Nat Biotech. 2014;32(9):915–25.View ArticleGoogle Scholar
- González E, Joly S. Impact of RNA-seq attributes on false positive rates in differential expression analysis of de novo assembled transcriptomes. BMC Research Notes. 2013;6(1):503.PubMedPubMed CentralView ArticleGoogle Scholar
- Adiconis X, Borges-Rivera D, Satija R, DeLuca DS, Busby MA, Berlin AM, Sivachenko A, Thompson DA, Wysoker A, Fennell T, et al. Comparative analysis of RNA sequencing methods for degraded or low-input samples. Nat Methods. 2013;10(7):623–9.PubMedPubMed CentralView ArticleGoogle Scholar
- Esteve-Codina A, Arpi O, Martinez-García M, Pineda E, Mallo M, Gut M, Carrato C, Rovira A, Lopez R, Tortosa A, et al. A comparison of RNA-Seq results from paired formalin-fixed paraffin-embedded and fresh-frozen glioblastoma tissue samples. PLoS One. 2017;12(1):e0170632.PubMedPubMed CentralView ArticleGoogle Scholar
- Gallego Romero I, Pai AA, Tung J, Gilad Y. RNA-seq: impact of RNA degradation on transcript quantification. BMC Biol. 2014;12(1):42.PubMedPubMed CentralView ArticleGoogle Scholar
- Seear PJ, Sweeney GE. Stability of RNA isolated from post-mortem tissues of Atlantic salmon (Salmo salar L.). Fish Physiol Biochem. 2008;34(1):19–24.PubMedView ArticleGoogle Scholar
- Opitz L, Salinas-Riester G, Grade M, Jung K, Jo P, Emons G, Ghadimi BM, Beißbarth T, Gaedcke J. Impact of RNA degradation on gene expression profiling. BMC Med Genet. 2010;3(1):36.Google Scholar
- Johnson BR, Atallah J, Plachetzki DC. The importance of tissue specificity for RNA-seq: highlighting the errors of composite structure extractions. BMC Genomics. 2013;14(1):586.PubMedPubMed CentralView ArticleGoogle Scholar
- Brawand D, Soumillon M, Necsulea A, Julien P, Csardi G, Harrigan P, Weier M, Liechti A, Aximu-Petri A, Kircher M, et al. The evolution of gene expression levels in mammalian organs. Nature. 2011;478(7369):343–8.PubMedView ArticleGoogle Scholar
- Lin S, Lin Y, Nery JR, Urich MA, Breschi A, Davis CA, Dobin A, Zaleski C, Beer MA, Chapman WC, et al. Comparison of the transcriptional landscapes between human and mouse tissues. Proc Natl Acad Sci U S A. 2014;111(48):17224–9.PubMedPubMed CentralView ArticleGoogle Scholar
- Merkin J, Russell C, Chen P, Burge CB. Evolutionary dynamics of gene and isoform regulation in Mammalian tissues. Science (New York, NY). 2012;338(6114):1593–9.View ArticleGoogle Scholar
- Sudmant PH, Alexis MS, Burge CB. Meta-analysis of RNA-seq expression data across species, tissues and studies. Genome Biol. 2015;16(1):287.PubMedPubMed CentralView ArticleGoogle Scholar
- Oliver S. Guilt-by-association goes global. Nature. 2000;403(6770):601–3.PubMedView ArticleGoogle Scholar
- Andersson R, Gebhard C, Miguel-Escalada I, Hoof I, Bornholdt J, Boyd M, Chen Y, Zhao X, Schmidl C, Suzuki T, et al. An atlas of active enhancers across human cell types and tissues. Nature. 2014;507(7493):455–61.PubMedPubMed CentralView ArticleGoogle Scholar
- Forrest AR, Kawaji H, Rehli M, Baillie JK, de Hoon MJ, Haberle V, Lassmann T, Kulakovskiy IV, Lizio M, Itoh M, et al. A promoter-level mammalian expression atlas. Nature. 2014;507(7493):462–70.PubMedView ArticleGoogle Scholar
- Freeman TC, Ivens A, Baillie JK, Beraldi D, Barnett MW, Dorward D, Downing A, Fairbairn L, Kapetanovic R, Raza S, et al. A gene expression atlas of the domestic pig. BMC Biol. 2012;10(1):1–22.View ArticleGoogle Scholar
- Clark EL, Bush SJ, McCulloch MEB, Farquhar IL, Young R, Lefevre L, Pridans C, Tsang HG, Wu C, Afrasiabi C, et al. A high resolution atlas of gene expression in the domestic sheep (Ovis aries). PLoS Genet. 2017;13(9):e1006997.PubMedPubMed CentralView ArticleGoogle Scholar
- Hume DA, Summers KM, Raza S, Baillie JK, Freeman TC. Functional clustering and lineage markers: insights into cellular differentiation and gene function from large-scale microarray studies of purified primary cell populations. Genomics. 2010;95(6):328–38.PubMedView ArticleGoogle Scholar
- Carpanini SM, Wishart TM, Gillingwater TH, Manson JC, Summers KM. Analysis of gene expression in the nervous system identifies key genes and novel candidates for health and disease. Neurogenetics. 2017;18(2):81–95.PubMedPubMed CentralView ArticleGoogle Scholar
- Eising E, Huisman SM, Mahfouz A, Vijfhuizen LS, Anttila V, Winsvold BS, Kurth T, Ikram MA, Freilinger T, Kaprio J, et al. Gene co-expression analysis identifies brain regions and cell types involved in migraine pathophysiology: a GWAS-based study using the Allen human brain atlas. Hum Genet. 2016;135(4):425–39.PubMedPubMed CentralView ArticleGoogle Scholar
- Stern CD. The chick; a great model system becomes even greater. Dev Cell. 2005;8(1):9–17.PubMedGoogle Scholar
- Intarapat S, Stern CD. Chick stem cells: current progress and future prospects. Stem Cell Res. 2013;11(3):1378–92.PubMedPubMed CentralView ArticleGoogle Scholar
- Balic A, Garcia-Morales C, Vervelde L, Gilhooley H, Sherman A, Garceau V, Gutowska MW, Burt DW, Kaiser P, Hume DA, et al. Visualisation of chicken macrophages using transgenic reporter genes: insights into the development of the avian macrophage lineage. Development. 2014;141(16):3255–65.PubMedPubMed CentralView ArticleGoogle Scholar
- Han JY, Lee HJ. Genome Editing Mediated by Primordial Germ Cell in Chicken. Methods Mol Biol (Clifton, NJ). 2017;1630:153–63.View ArticleGoogle Scholar
- Woodcock ME, Idoko-Akoh A, MJ MG. Gene editing in birds takes flight. Mamm Genome. 2017;28:315–23.PubMedPubMed CentralView ArticleGoogle Scholar
- Taylor L, Carlson DF, Nandi S, Sherman A, Fahrenkrug SC, McGrew MJ. Efficient TALEN-mediated gene targeting of chicken primordial germ cells. Development. 2017;144(5):928–34.PubMedPubMed CentralView ArticleGoogle Scholar
- Kranis A, Gheyas AA, Boschiero C, Turner F, Yu L, Smith S, Talbot R, Pirani A, Brew F, Kaiser P, et al. Development of a high density 600K SNP genotyping array for chicken. BMC Genomics. 2013;14:59.PubMedPubMed CentralView ArticleGoogle Scholar
- Cheng HH, Kaiser P, Lamont SJ. Integrated genomic approaches to enhance genetic resistance in chickens. Annu Rev Anim Biosci. 2013;1:239–60.PubMedView ArticleGoogle Scholar
- Kuo RI, Tseng E, Eory L, Paton IR, Archibald AL, Burt DW. Normalized long read RNA sequencing in chicken reveals transcriptome complexity similar to human. BMC Genomics. 2017;18(1):323.PubMedPubMed CentralView ArticleGoogle Scholar
- Kapetanovic R, Fairbairn L, Beraldi D, Sester DP, Archibald AL, Tuggle CK, Hume DA. Pig bone marrow-derived macrophages resemble human macrophages in their response to bacterial lipopolysaccharide. J Immunol. 2012;188(7):3382–94.PubMedView ArticleGoogle Scholar
- Smith J, Burt DW, the Avian RC. The Avian RNAseq consortium: a community effort to annotate the chicken genome. Cytogenet Genome Res. 2015;145(2):78–179.PubMedPubMed CentralView ArticleGoogle Scholar
- Langouet-Astrie CJ, Meinsen AL, Grunwald ER, Turner SD, Enke RA. RNA sequencing analysis of the developing chicken retina. Sci Data. 2016;3:160117.PubMedPubMed CentralView ArticleGoogle Scholar
- Piórkowska K, Żukowski K, Nowak J, Połtowicz K, Ropka-Molik K, Gurgul A. Genome-wide RNA-Seq analysis of breast muscles of two broiler chicken groups differing in shear force. Anim Genet. 2016;47(1):68–80.PubMedView ArticleGoogle Scholar
- Wu P, Ng CS, Yan J, Lai Y-C, Chen C-K, Lai Y-T, Wu S-M, Chen J-J, Luo W, Widelitz RB, et al. Topographical mapping of α- and β-keratins on developing chicken skin integuments: functional interaction and evolutionary perspectives. Proc Natl Acad Sci. 2015;112(49):E6770–9.PubMedView ArticleGoogle Scholar
- Resnyk CW, Chen C, Huang H, Wu CH, Simon J, Le Bihan-Duval E, Duclos MJ, Cogburn LA. RNA-Seq analysis of abdominal fat in genetically fat and lean chickens highlights a divergence in expression of genes controlling adiposity, hemostasis, and lipid metabolism. PLoS One. 2015;10(10):e0139549.PubMedPubMed CentralView ArticleGoogle Scholar
- Shen X, Bai X, Xu J, Zhou M, Xu H, Nie Q, Lu X, Zhang X. Transcriptome sequencing reveals genetic mechanisms underlying the transition between the laying and brooding phases and gene expression changes associated with divergent reproductive phenotypes in chickens. Mol Biol Rep. 2016;43(9):977–89.PubMedView ArticleGoogle Scholar
- Pritchett EM, Lamont SJ, Schmidt CJ. Transcriptomic changes throughout post-hatch development in Gallus gallus pituitary. J Mol Endocrinol. 2016;58(1):43–55.PubMedPubMed CentralView ArticleGoogle Scholar
- Van Goor A, Ashwell CM, Persia ME, Rothschild MF, Schmidt CJ, Lamont SJ. Unique genetic responses revealed in RNA-seq of the spleen of chickens stimulated with lipopolysaccharide and short-term heat. PLoS One. 2017;12(2):e0171414.PubMedPubMed CentralView ArticleGoogle Scholar
- Wang Y, Lupiani B, Reddy SM, Lamont SJ, Zhou H. RNA-seq analysis revealed novel genes and signaling pathway associated with disease resistance to avian influenza virus infection in chickens. Poult Sci. 2014;93(2):485–93.PubMedView ArticleGoogle Scholar
- Li Z, Ouyang H, Zheng M, Cai B, Han P, Abdalla BA, Nie Q, Zhang X. Integrated analysis of long non-coding RNAs (LncRNAs) and mRNA expression profiles reveals the potential role of LncRNAs in skeletal muscle development of the chicken. Front Physiol. 2016;7:687.PubMedGoogle Scholar
- Muret K, Klopp C, Wucher V, Esquerré D, Legeai F, Lecerf F, Désert C, Boutin M, Jehl F, Acloque H, et al. Long noncoding RNA repertoire in chicken liver and adipose tissue. Genetics, selection, evolution : GSE. 2017;49:6.PubMedView ArticleGoogle Scholar
- Roux P-F, Frésard L, Boutin M, Leroux S, Klopp C, Djari A, Esquerré D, Martin PGP, Zerjal T, Gourichon D, et al. The Extent of mRNA Editing Is Limited in Chicken Liver and Adipose, but Impacted by Tissular Context, Genotype, Age, and Feeding as Exemplified with a Conserved Edited Site in COG3. G3. 2016;6(2):321–35.View ArticleGoogle Scholar
- Lizio M, Deviatiiarov R, Nagai H, Galan L, Arner E, Itoh M, Lassmann T, Kasukawa T, Hasegawa A, Ros MA, et al. Systematic analysis of transcription start sites in avian development. PLoS Biol. 2017;15(9):e2002887.PubMedPubMed CentralView ArticleGoogle Scholar
- Deviatiiarov R, Lizio M, Gusev O. Application of a CAGE Method to an Avian Development Study. Methods Mol Biol (Clifton, NJ). 2017;1650:101–9.View ArticleGoogle Scholar
- Zeferino CP, Wells KD, Moura ASAMT, Rottinghaus GE, Ledoux DR. Changes in renal gene expression associated with induced ochratoxicosis in chickens: activation and deactivation of transcripts after varying durations of exposure. Poult Sci. 2017;96(6):1855–65.PubMedGoogle Scholar
- Han D, Zhang Y, Chen J, Hua G, Li J, Deng X, Deng X. Transcriptome analyses of differential gene expression in the bursa of Fabricius between silky fowl and white leghorn. Sci Rep. 2017;7:45959.PubMedPubMed CentralView ArticleGoogle Scholar
- X-d L, Zhang F, Shan H, Wang S-B, Chen P-Y. mRNA expression in different developmental stages of the chicken bursa of Fabricius. Poult Sci. 2016;95(8):1787–94.View ArticleGoogle Scholar
- Zhu G, Mao Y, Zhou W, Jiang Y. Dynamic changes in the follicular transcriptome and promoter DNA methylation pattern of steroidogenic genes in chicken follicles throughout the ovulation cycle. PLoS One. 2016;10(12):e0146028.View ArticleGoogle Scholar
- Bush SJ, McCulloch MEB, Summers KM, Hume DA, Clark EL. Integration of quantitated expression estimates from polyA-selected and rRNA-depleted RNA-seq libraries. BMC Bioinformatics. 2017;18(1):301.PubMedPubMed CentralView ArticleGoogle Scholar
- Bray NL, Pimentel H, Melsted P, Pachter L. Near-optimal probabilistic RNA-seq quantification. Nat Biotech. 2016;34(5):525–7.View ArticleGoogle Scholar
- Lu T, Costello CM, Croucher PJ, Hasler R, Deuschl G, Schreiber S. Can Zipf's law be adapted to normalize microarrays? BMC Bioinformatics. 2005;6:37.PubMedPubMed CentralView ArticleGoogle Scholar
- Furusawa C, Kaneko K. Zipf's law in gene expression. Phys Rev Lett. 2003;90(8):088102.PubMedView ArticleGoogle Scholar
- Tarazona S, García-Alcalde F, Dopazo J, Ferrer A, Conesa A. Differential expression in RNA-seq: a matter of depth. Genome Res. 2011;21(12):2213–23.PubMedPubMed CentralView ArticleGoogle Scholar
- Liu Y, Zhou J, White KP. RNA-seq differential expression studies: more sequence or more replication? Bioinformatics. 2014;30(3):301–4.PubMedView ArticleGoogle Scholar
- Huminiecki L, Lloyd A, Wolfe K. Congruence of tissue expression profiles from gene expression atlas, SAGEmap and TissueInfo databases. BMC Genomics. 2003;4(1):31.PubMedPubMed CentralView ArticleGoogle Scholar
- Glick B. Historical perspective: the bursa of Fabricius and its influence on B-cell development, past and present. Vet Immunol Immunopathol. 1991;30(1):3–12.PubMedView ArticleGoogle Scholar
- Freeman TC, Goldovsky L, Brosch M, van Dongen S, Maziere P, Grocock RJ, Freilich S, Thornton J, Enright AJ. Construction, visualisation, and clustering of transcription networks from microarray expression data. PLoS Comput Biol. 2007;3(10):2032–42.PubMedView ArticleGoogle Scholar
- Theocharidis A, van Dongen S, Enright AJ, Freeman TC. Network visualization and analysis of gene expression data using BioLayout express(3D). Nat Protoc. 2009;4(10):1535–50.PubMedView ArticleGoogle Scholar
- van Dongen S, Abreu-Goodger C. Using MCL to extract clusters from networks. Methods Mol Biol (Clifton, NJ). 2012;804:281–95.View ArticleGoogle Scholar
- Bar-Joseph Z, Siegfried Z, Brandeis M, Brors B, Lu Y, Eils R, Dynlacht BD, Simon I. Genome-wide transcriptional analysis of the human cell cycle identifies genes differentially regulated in normal and cancer cells. Proc Natl Acad Sci U S A. 2008;105(3):955–60.PubMedPubMed CentralView ArticleGoogle Scholar
- Wu D-D, Irwin DM, Zhang Y-P. Molecular evolution of the keratin associated protein gene family in mammals, role in the evolution of mammalian hair. BMC Evol Biol. 2008;8(1):241.PubMedPubMed CentralView ArticleGoogle Scholar
- Eckelhoefer HA, Rajapaksa TE, Wang J, Hamer M, Appleby NC, Ling J, Lo DD. Claudin-4: Functional Studies Beyond the Tight Junction. In: Turksen K, editor. Claudins: Methods and Protocols. Totowa: Humana Press; 2011. p. 115–28.View ArticleGoogle Scholar
- So A, Thorens B. Uric acid transport and disease. J Clin Invest. 2010;120(6):1791–9.PubMedPubMed CentralView ArticleGoogle Scholar
- Galvan I, Solano F. Bird integumentary Melanins: biosynthesis, forms, function and evolution. Int J Mol Sci. 2016;17(4):520.PubMedPubMed CentralView ArticleGoogle Scholar
- Hume DA, Summers KM, Rehli M. Transcriptional Regulation and Macrophage Differentiation. Microbiol Spectr. 2016;4(3):MCHD-0024-2015.Google Scholar
- Mass E, Ballesteros I, Farlik M, Halbritter F, Gunther P, Crozet L, Jacome-Galarza CE, Handler K, Klughammer J, Kobayashi Y, et al. Specification of tissue-resident macrophages during organogenesis. Science (New York, NY). 2016;353(6304) https://doi.org/10.1126/science.aaf4238.
- Aziz A, Soucie E, Sarrazin S, Sieweke MH. MafB/c-Maf deficiency enables self-renewal of differentiated functional macrophages. Science (New York, NY). 2009;326(5954):867–71.View ArticleGoogle Scholar
- Hume DA, Mabbott N, Raza S, Freeman TC. Can DCs be distinguished from macrophages by molecular signatures? Nat Immunol. 2013;14(3):187–9.PubMedView ArticleGoogle Scholar
- Joshi A, Pooley C, Freeman TC, Lennartsson A, Babina M, Schmidl C, Geijtenbeek T, Michoel T, Severin J, Itoh M, et al. Technical advance: transcription factor, promoter, and enhancer utilization in human myeloid cells. J Leukoc Biol. 2015;97(5):985–95.PubMedPubMed CentralView ArticleGoogle Scholar
- Rodriguez-Manzanet R, Meyers JH, Balasubramanian S, Slavik J, Kassam N, Dardalhon V, Greenfield EA, Anderson AC, Sobel RA, Hafler DA, et al. TIM-4 expressed on APCs induces T cell expansion and survival. J Immunol. 2008;180(7):4706.PubMedPubMed CentralView ArticleGoogle Scholar
- Pavlopoulos GA, Secrier M, Moschopoulos CN, Soldatos TG, Kossida S, Aerts J, Schneider R, Bagos PG. Using graph theory to analyze biological networks. BioData Min. 2011;4:10.PubMedPubMed CentralView ArticleGoogle Scholar
- Jansen R, Greenbaum D, Gerstein M. Relating whole-genome expression data with protein-protein interactions. Genome Res. 2002;12(1):37–46.PubMedPubMed CentralView ArticleGoogle Scholar
- Tornow S, Mewes HW. Functional modules by relating protein interaction networks and gene expression. Nucleic Acids Res. 2003;31(21):6283–9.PubMedPubMed CentralView ArticleGoogle Scholar
- Kovarik P, Stoiber D, Novy M, Decker T. Stat1 combines signals derived from IFN-gamma and LPS receptors during macrophage activation. EMBO J. 1998;17(13):3660–8.PubMedPubMed CentralView ArticleGoogle Scholar
- Kim JY, Song EH, Lee S, Lim JH, Choi JS, Koh IU, Song J, Kim WH. The induction of STAT1 gene by activating transcription factor 3 contributes to pancreatic beta-cell apoptosis and its dysfunction in streptozotocin-treated mice. Cell Signal. 2010;22(11):1669–80.PubMedView ArticleGoogle Scholar
- Celada A, Borras FE, Soler C, Lloberas J, Klemsz M, van Beveren C, McKercher S, Maki RA. The transcription factor PU.1 is involved in macrophage proliferation. J Exp Med. 1996;184(1):61–9.PubMedView ArticleGoogle Scholar
- Pazdrak K, Justement L, Alam R. Mechanism of inhibition of eosinophil activation by transforming growth factor-beta. Inhibition of Lyn, MAP, Jak2 kinases and STAT1 nuclear factor. J Immunol. 1995;155(9):4454–8.PubMedGoogle Scholar
- Frühbeck G. Intracellular signalling pathways activated by leptin. Biochem J. 2006;393(Pt 1):7–20.PubMedView ArticleGoogle Scholar
- Richardson ET, Shukla S, Nagy N, Boom WH, Beck RC, Zhou L, Landreth GE, Harding CV. ERK signaling is essential for macrophage development. PLoS One. 2015;10(10):e0140064.PubMedPubMed CentralView ArticleGoogle Scholar
- Song MM, Shuai K. The suppressor of cytokine signaling (SOCS) 1 and SOCS3 but not SOCS2 proteins inhibit interferon-mediated antiviral and antiproliferative activities. J Biol Chem. 1998;273(52):35056–62.PubMedView ArticleGoogle Scholar
- Su X, Yu Y, Zhong Y, Giannopoulou EG, Hu X, Liu H, Cross JR, Ratsch G, Rice CM, Ivashkiv LB. Interferon-gamma regulates cellular metabolism and mRNA translation to potentiate macrophage activation. Nat Immunol. 2015;16(8):838–49.PubMedPubMed CentralView ArticleGoogle Scholar
- Arner E, Daub CO, Vitting-Seerup K, Andersson R, Lilje B, Drablos F, Lennartsson A, Ronnerblad M, Hrydziuszko O, Vitezic M, et al. Transcribed enhancers lead waves of coordinated transcription in transitioning mammalian cells. Science (New York, NY). 2015;347(6225):1010–4.View ArticleGoogle Scholar
- Summers KM, Hume DA. Identification of the macrophage-specific promoter signature in FANTOM5 mouse embryo developmental time course data. J Leukoc Biol. 2017;102:1081–92.PubMedView ArticleGoogle Scholar
- Garceau V, Balic A, Garcia-Morales C, Sauter KA, McGrew MJ, Smith J, Vervelde L, Sherman A, Fuller TE, Oliphant T, et al. The development and maintenance of the mononuclear phagocyte system of the chick is controlled by signals from the macrophage colony-stimulating factor receptor. BMC Biol. 2015;13:12.PubMedPubMed CentralView ArticleGoogle Scholar
- Gautier EL, Shay T, Miller J, Greter M, Jakubzick C, Ivanov S, Helft J, Chow A, Elpek KG, Gordonov S, et al. Gene-expression profiles and transcriptional regulatory pathways that underlie the identity and diversity of mouse tissue macrophages. Nat Immunol. 2012;13(11):1118–28.PubMedPubMed CentralView ArticleGoogle Scholar
- Epelman S, Lavine KJ, Randolph GJ. Origin and functions of tissue macrophages. Immunity. 2014;41(1):21–35.PubMedPubMed CentralView ArticleGoogle Scholar
- Chang MK, Raggatt LJ, Alexander KA, Kuliwaba JS, Fazzalari NL, Schroder K, Maylin ER, Ripoll VM, Hume DA, Pettit AR. Osteal tissue macrophages are intercalated throughout human and mouse bone lining tissues and regulate osteoblast function in vitro and in vivo. J Immunol. 2008;181(2):1232–44.PubMedView ArticleGoogle Scholar
- Feng R, Desbordes SC, Xie H, Tillo ES, Pixley F, Stanley ER, Graf T. PU.1 and C/EBPα/β convert fibroblasts into macrophage-like cells. Proc Natl Acad Sci. 2008;105(16):6057–62.PubMedView ArticleGoogle Scholar
- Li X, Nair A, Wang S, Wang L. Quality control of RNA-seq experiments. Methods Mol Biol (Clifton, NJ). 2015;1269:137–46.View ArticleGoogle Scholar
- Hansen KD, Brenner SE, Dudoit S. Biases in Illumina transcriptome sequencing caused by random hexamer priming. Nucleic Acids Res. 2010;38(12):e131.PubMedPubMed CentralView ArticleGoogle Scholar
- van Gurp TP, McIntyre LM, Verhoeven KJF. Consistent errors in first strand cDNA due to random hexamer Mispriming. PLoS One. 2013;8(12):e85583.PubMedPubMed CentralView ArticleGoogle Scholar
- Risso D, Schwartz K, Sherlock G, Dudoit S. GC-content normalization for RNA-Seq data. BMC Bioinformatics. 2011;12(1):480.PubMedPubMed CentralView ArticleGoogle Scholar
- Reiter JF, Leroux MR. Genes and molecular pathways underpinning ciliopathies. Nat Rev Mol Cell Biol. 2017;18:533–47.PubMedView ArticlePubMed CentralGoogle Scholar
- Stauber M, Boldt K, Wrede C, Weidemann M, Kellner M, Schuster-Gossler K, Kuhnel MP, Hegermann J, Ueffing M, Gossler A. 1700012B09Rik, a FOXJ1 effector gene active in ciliated tissues of the mouse but not essential for motile ciliogenesis. Dev Biol. 2017;429:186–99.PubMedView ArticleGoogle Scholar
- Zhou J, Chehab R, Tkalcevic J, Naylor MJ, Harris J, Wilson TJ, Tsao S, Tellis I, Zavarsek S, Xu D, et al. Elf5 is essential for early embryogenesis and mammary gland development during pregnancy and lactation. EMBO J. 2005;24(3):635–44.PubMedPubMed CentralView ArticleGoogle Scholar
- Kist R, Greally E, Peters H. Derivation of a mouse model for conditional inactivation of Pax9. Genesis. 2007;45(7):460–4.PubMedView ArticleGoogle Scholar
- Bangs F, Antonio N, Thongnuek P, Welten M, Davey MG, Briscoe J, Tickle C. Generation of mice with functional inactivation of talpid3, a gene first identified in chicken. Development. 2011;138(15):3261–72.PubMedPubMed CentralView ArticleGoogle Scholar
- Yin Y, Bangs F, Paton IR, Prescott A, James J, Davey MG, Whitley P, Genikhovich G, Technau U, Burt DW, et al. The Talpid3 gene (KIAA0586) encodes a centrosomal protein that is essential for primary cilia formation. Development. 2009;136(4):655–64.PubMedPubMed CentralView ArticleGoogle Scholar
- Roosing S, Romani M, Isrie M, Rosti RO, Micalizzi A, Musaev D, Mazza T, Al-Gazali L, Altunoglu U, Boltshauser E, et al. Mutations in CEP120 cause Joubert syndrome as well as complex ciliopathy phenotypes. J Med Genet. 2016;53(9):608–15.PubMedPubMed CentralView ArticleGoogle Scholar
- Wu C, Jin X, Tsueng G, Afrasiabi C, Su AI. BioGPS: building your own mash-up of gene annotations and expression profiles. Nucleic Acids Res. 2016;44(D1):D313–6.PubMedView ArticleGoogle Scholar
- Garcia-Morales C, Nandi S, Zhao D, Sauter KA, Vervelde L, McBride D, Sang HM, Clinton M, Hume DA. Cell-autonomous sex differences in gene expression in chicken bone marrow-derived macrophages. J Immunol. 2015;194(5):2338–44.PubMedPubMed CentralView ArticleGoogle Scholar
- Bailey RA, Kranis A, Psifidi A, Watson KA, Rothwell L, Hocking PM, Kaiser P, Stevens MP, Avendano S (2018) Colonization of a commercial broiler line by Campylobacter is under limited genetic control and does not significantly impair performance or intestinal health, Poultry Science, pey295, https://doi.org/10.3382/ps/pey295.
- Psifidi A, Fife M, Howell J, Matika O, van Diemen PM, Kuo R, Smith J, Hocking PM, Salmon N, Jones MA, et al. The genomic architecture of resistance to campylobacter jejuni intestinal colonisation in chickens. BMC Genomics. 2016;17:293.PubMedPubMed CentralView ArticleGoogle Scholar
- Kodama Y, Shumway M, Leinonen R. The sequence read archive: explosive growth of sequencing data. Nucleic Acids Res. 2012;40(Database issue):D54–6.PubMedView ArticleGoogle Scholar
- Lynn DJ, Higgs R, Gaines S, Tierney J, James T, Lloyd AT, Fares MA, Mulcahy G, O'Farrelly C. Bioinformatic discovery and initial characterisation of nine novel antimicrobial peptide genes in the chicken. Immunogenetics. 2004;56(3):170–7.PubMedView ArticleGoogle Scholar
- Le C-F, Gudimella R, Razali R, Manikam R, Sekaran SD. Transcriptome analysis of Streptococcus pneumoniae treated with the designed antimicrobial peptides, DM3. Sci Rep. 2016;6:26828.PubMedPubMed CentralView ArticleGoogle Scholar
- Fabriek BO, Dijkstra CD, van den Berg TK. The macrophage scavenger receptor CD163. Immunobiology. 2005;210(2):153–60.PubMedView ArticleGoogle Scholar
- O'Leary NA, Wright MW, Brister JR, Ciufo S, Haddad D, McVeigh R, Rajput B, Robbertse B, Smith-White B, Ako-Adjei D, et al. Reference sequence (RefSeq) database at NCBI: current status, taxonomic expansion, and functional annotation. Nucleic Acids Res. 2016;44(D1):D733–45.PubMedView ArticleGoogle Scholar
- Curwen V, Eyras E, Andrews TD, Clarke L, Mongin E, Searle SMJ, Clamp M. The Ensembl automatic gene annotation system. Genome Res. 2004;14(5):942–50.PubMedPubMed CentralView ArticleGoogle Scholar
- Balwierz PJ, Carninci P, Daub CO, Kawai J, Hayashizaki Y, Van Belle W, Beisel C, van Nimwegen E. Methods for analyzing deep sequencing expression data: constructing the human and mouse promoterome with deepCAGE data. Genome Biol. 2009;10(7):R79.PubMedPubMed CentralView ArticleGoogle Scholar
- R: A Language and Environment for Statistical Computing [http://www.r-project.org]. Accessed 24 Aug 2016.
- topGO: Enrichment analysis for Gene Ontology [http://www.bioconductor.org/packages/release/bioc/html/topGO.html]. Accessed 24 Aug 2016.
- Alexa A, Rahnenführer J, Lengauer T. Improved scoring of functional groups from gene expression data by decorrelating GO graph structure. Bioinformatics. 2006;22(13):1600–7.PubMedView ArticleGoogle Scholar
- Kinsella RJ, Kahari A, Haider S, Zamora J, Proctor G, Spudich G, Almeida-King J, Staines D, Derwent P, Kerhornou A, et al. Ensembl BioMarts: a hub for data retrieval across taxonomic space. Database. 2011;2011:bar030.PubMedPubMed CentralView ArticleGoogle Scholar
- Pruitt KD, Tatusova T, Maglott DR. NCBI reference sequence (RefSeq): a curated non-redundant sequence database of genomes, transcripts and proteins. Nucleic Acids Res. 2005;33(Database Issue):D501–4.PubMedView ArticleGoogle Scholar
- Altschul SF, Madden TL, Schäffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 1997;25(17):3389–402.PubMedPubMed CentralView ArticleGoogle Scholar
- Kotlyar M, Pastrello C, Sheahan N, Jurisica I. Integrated interactions database: tissue-specific view of the human and model organism interactomes. Nucleic Acids Res. 2016;44(D1):D536–41.PubMedView ArticleGoogle Scholar
- Bader GD, Betel D, Hogue CWV. BIND: the biomolecular interaction network database. Nucleic Acids Res. 2003;31(1):248–50.PubMedPubMed CentralView ArticleGoogle Scholar
- Chatr-Aryamontri A, Breitkreutz BJ, Oughtred R, Boucher L, Heinicke S, Chen D, Stark C, Breitkreutz A, Kolas N, O'Donnell L, et al. The BioGRID interaction database: 2015 update. Nucleic Acids Res. 2015;43(Database issue):D470–8.PubMedView ArticleGoogle Scholar
- Salwinski L, Miller CS, Smith AJ, Pettit FK, Bowie JU, Eisenberg D. The database of interacting proteins: 2004 update. Nucleic Acids Res. 2004;32(Database issue):D449–51.PubMedPubMed CentralView ArticleGoogle Scholar
- Keshava Prasad TS, Goel R, Kandasamy K, Keerthikumar S, Kumar S, Mathivanan S, Telikicherla D, Raju R, Shafreen B, Venugopal A, et al. Human protein reference database--2009 update. Nucleic Acids Res. 2009;37(Database issue):D767–72.PubMedView ArticleGoogle Scholar
- Hermjakob H, Montecchi-Palazzi L, Lewington C, Mudali S, Kerrien S, Orchard S, Vingron M, Roechert B, Roepstorff P, Valencia A, et al. IntAct: an open source molecular interaction database. Nucleic Acids Res. 2004;32(Database issue):D452–5.PubMedPubMed CentralView ArticleGoogle Scholar
- Brown KR, Jurisica I. Unequal evolutionary conservation of human protein interactions in interologous networks. Genome Biol. 2007;8(5):R95.PubMedPubMed CentralView ArticleGoogle Scholar
- Lynn DJ, Winsor GL, Chan C, Richard N, Laird MR, Barsky A, Gardy JL, Roche FM, Chan TH, Shah N, et al. InnateDB: facilitating systems-level analyses of the mammalian innate immune response. Mol Syst Biol. 2008;4:218.PubMedPubMed CentralView ArticleGoogle Scholar
- Chatr-aryamontri A, Ceol A, Palazzi LM, Nardelli G, Schneider MV, Castagnoli L, Cesareni G. MINT: the molecular INTeraction database. Nucleic Acids Res. 2007;35(Database issue):D572–4.PubMedView ArticleGoogle Scholar
- Rhodes DR, Tomlins SA, Varambally S, Mahavisno V, Barrette T, Kalyana-Sundaram S, Ghosh D, Pandey A, Chinnaiyan AM. Probabilistic model of the human protein-protein interaction network. Nat Biotechnol. 2005;23(8):951–9.PubMedView ArticleGoogle Scholar
- Elefsinioti A, Sarac OS, Hegele A, Plake C, Hubner NC, Poser I, Sarov M, Hyman A, Mann M, Schroeder M, et al. Large-scale de novo prediction of physical protein-protein association. Mol Cell Proteomics : MCP. 2011;10(11):M111 010629.PubMedView ArticleGoogle Scholar
- Zhang QC, Petrey D, Deng L, Qiang L, Shi Y, Thu CA, Bisikirska B, Lefebvre C, Accili D, Hunter T, et al. Structure-based prediction of protein-protein interactions on a genome-wide scale. Nature. 2012;490(7421):556–60.PubMedPubMed CentralView ArticleGoogle Scholar
- Kotlyar M, Pastrello C, Pivetta F, Lo Sardo A, Cumbaa C, Li H, Naranian T, Niu Y, Ding Z, Vafaee F, et al. In silico prediction of physical protein interactions and characterization of interactome orphans. Nat Methods. 2015;12(1):79–84.PubMedView ArticleGoogle Scholar
- Vilella AJ, Severin J, Ureta-Vidal A, Heng L, Durbin R, Birney E. EnsemblCompara GeneTrees: complete, duplication-aware phylogenetic trees in vertebrates. Genome Res. 2009;19(2):327–35.PubMedPubMed CentralView ArticleGoogle Scholar
- Diaz-Perales A, Quesada V, Peinado JR, Ugalde AP, Alvarez J, Suarez MF, Gomis-Ruth FX, Lopez-Otin C. Identification and characterization of human archaemetzincin-1 and -2, two novel members of a family of metalloproteases widely distributed in archaea. J Biol Chem. 2005;280(34):30367–75.PubMedView ArticleGoogle Scholar
- Jiang TX, Tuan TL, Wu P, Widelitz RB, Chuong CM. From buds to follicles: matrix metalloproteinases in developmental tissue remodeling during feather morphogenesis. Differentiation. 2011;81(5):307–14.PubMedPubMed CentralView ArticleGoogle Scholar
- Takeda M, Obara N, Suzuki Y. Keratin filaments of epithelial and taste-bud cells in the circumvallate papillae of adult and developing mice. Cell Tissue Res. 1990;260(1):41–8.PubMedView ArticleGoogle Scholar
- Plowman GD, Green JM, McDonald VL, Neubauer MG, Disteche CM, Todaro GJ, Shoyab M. The amphiregulin gene encodes a novel epidermal growth factor-related protein with tumor-inhibitory activity. Mol Cell Biol. 1990;10(5):1969–81.PubMedPubMed CentralView ArticleGoogle Scholar
- Günzel D, Yu ASL. Claudins and the modulation of tight junction permeability. Physiol Rev. 2013;93(2):525–69.PubMedPubMed CentralView ArticleGoogle Scholar
- Quinn LM, Kilpatrick LM, Latham SE, Kalionis B. Homeobox genes DLX4 and HB24 are expressed in regions of epithelial-mesenchymal cell interaction in the adult human endometrium. Mol Hum Reprod. 1998;4(5):497–501.PubMedView ArticleGoogle Scholar
- Alibardi L, Holthaus KB, Sukseree S, Hermann M, Tschachler E, Eckhart L. Immunolocalization of a histidine-rich epidermal differentiation protein in the chicken supports the hypothesis of an evolutionary developmental Link between the embryonic subperiderm and feather barbs and barbules. PLoS One. 2016;11(12):e0167789.PubMedPubMed CentralView ArticleGoogle Scholar
- Lopes Ricardo J, Johnson James D, Toomey Matthew B, Ferreira Mafalda S, Araujo Pedro M, Melo-Ferreira J, Andersson L, Hill Geoffrey E, Corbo Joseph C, Carneiro M. Genetic basis for red coloration in birds. Curr Biol. 2016;26(11):1427–34.PubMedPubMed CentralView ArticleGoogle Scholar
- Strasser B, Mlitz V, Hermann M, Rice RH, Eigenheer RA, Alibardi L, Tschachler E, Eckhart L. Evolutionary origin and diversification of epidermal barrier proteins in amniotes. Mol Biol Evol. 2014;31(12):3194–205.PubMedPubMed CentralView ArticleGoogle Scholar
- Holmes RS. Vertebrate patatin-like phospholipase domain-containing protein 4 (PNPLA4) genes and proteins: a gene with a role in retinol metabolism. 3. Biotech. 2012;2(4):277–86.Google Scholar
- Long AC, Bomser JA, Grzybowski DM, Chandler HL. All-trans retinoic acid regulates Cx43 expression, gap junction communication and differentiation in primary Lens epithelial cells. Curr Eye Res. 2010;35(8):670–9.PubMedView ArticleGoogle Scholar
- Wasmeier C, Romao M, Plowright L, Bennett DC, Raposo G, Seabra MC. Rab38 and Rab32 control post-Golgi trafficking of melanogenic enzymes. J Cell Biol. 2006;175(2):271–81.PubMedPubMed CentralView ArticleGoogle Scholar
- Coppola U, Annona G, D’Aniello S, Ristoratore F. Rab32 and Rab38 genes in chordate pigmentation: an evolutionary perspective. BMC Evol Biol. 2016;16(1):26.PubMedPubMed CentralView ArticleGoogle Scholar
- Wang F, Feng Y, Li P, Wang K, Feng L, Liu Y-F, Huang H, Guo Y-B, Mao Q-S, Xue W-J. RASSF10 is an epigenetically inactivated tumor suppressor and independent prognostic factor in hepatocellular carcinoma. Oncotarget. 2016;7(4):4279–97.PubMedGoogle Scholar
- Lee S-A, Belyaeva OV, Kedishvili NY. Biochemical characterization of human epidermal retinol dehydrogenase 2. Chem Biol Interact. 2009;178(1):182–7.PubMedView ArticleGoogle Scholar
- Johnson NC. XG: the forgotten blood group system. Immunohematology. 2011;27(2):68–71.PubMedGoogle Scholar