There are thousands of strongly conserved noncoding elements (CNEs) in vertebrate genomes, and their functions remain largely unknown. Without biologically relevant criteria for prioritizing them, selecting particular CNE sequences to study can be haphazard. To address this problem, we have developed cneViewer -- a database and webtool that systematizes information on noncoding DNA elements in zebrafish that are strongly conserved with human. A key feature is the ability to search for CNEs that may be relevant to tissue-specific gene regulation, based on known expression patterns of nearby genes. cneViewer provides this and other organizing features that facilitate experimental design and CNE analysis.
Contact: chuangj [at] bc [dot] edu
1. First, open the Anatomy Tree pane by clicking on the bar on the far left. Choose the (ZFIN-based) stages and anatomical structures you would like to focus on. To include genes expressed in an anatomical structure, set its select box to 'green'. To exclude those genes, set the box to 'red'.
2. Choose genes whose nearby CNEs you want to display in the Gene Selection pane.
3. View and manage information on individual CNEs in the display pane at right. Data can be sorted by clicking on a column header. Links to sequence and location data are provided for each CNE and the relevant gene, as well as primer sequences for cloning.
4. (optional) Use the Tools menu in the Gene Selection pane to add or remove specific genes by name, or to choose a specific list of CNEs to display. You can also filter the displayed CNEs for those containing particular subsequences, or by sequence conservation, distance from the gene, or conserved synteny in zebrafish and human. The Tools menu provides a batch data download feature as well.
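As an illustration of the include/exclude selection in step 1, the following minimal sketch treats each anatomical structure as a set of expressed genes. It assumes that 'green' structures are combined as a union and that genes from 'red' structures are then subtracted; the page does not state the exact rule, and the function name, structure names, and gene names are hypothetical.

```python
def select_genes(expression, include, exclude):
    """Return genes expressed in any 'green' structure and in no 'red' structure.

    expression: dict mapping anatomy name -> set of gene names (hypothetical data).
    include / exclude: lists of anatomy names marked green / red.
    Assumption: green sets are unioned, then red sets are subtracted.
    """
    included = set().union(*(expression.get(a, set()) for a in include))
    excluded = set().union(*(expression.get(a, set()) for a in exclude))
    return sorted(included - excluded)


# Placeholder annotations, not real ZFIN data.
example_expression = {
    "structure_1": {"geneA", "geneB"},
    "structure_2": {"geneB", "geneC"},
}
selected = select_genes(example_expression, ["structure_1"], ["structure_2"])
```

Here `selected` contains only `geneA`, because `geneB` is also annotated to the excluded structure.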
Q: Where do the anatomy-specific expression data for the genes come from?
A: All anatomical data are from the tissue-specific expression annotations compiled by ZFIN, the zebrafish model organism resource.
Q: How are "syntenic CNEs" defined?
A: Syntenic CNEs are defined to be those for which the zebrafish CNE sequence is within 500 kb of at least one zebrafish gene such that the corresponding human CNE and corresponding human gene are also within 500 kb of each other.
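The definition above can be sketched as a small check. This is only an illustration: it assumes distance is measured as the gap between feature edges on the same chromosome (the page does not say how distance is measured), and the `Feature` type and function names are hypothetical.

```python
from dataclasses import dataclass

MAX_DIST = 500_000  # 500 kb cutoff from the definition above


@dataclass(frozen=True)
class Feature:
    chrom: str
    start: int
    end: int


def distance(a, b):
    """Gap in bp between two features; infinite if on different chromosomes.

    Assumption: edge-to-edge distance, with overlapping features at distance 0.
    """
    if a.chrom != b.chrom:
        return float("inf")
    return max(0, max(a.start, b.start) - min(a.end, b.end))


def is_syntenic(zf_cne, hs_cne, gene_pairs, max_dist=MAX_DIST):
    """True if at least one (zebrafish gene, human gene) pair lies within
    max_dist of the zebrafish CNE and the human CNE, respectively.

    gene_pairs: iterable of (zebrafish_gene, human_gene) Feature pairs that
    are assumed to correspond to each other.
    """
    return any(
        distance(zf_cne, zf_gene) <= max_dist
        and distance(hs_cne, hs_gene) <= max_dist
        for zf_gene, hs_gene in gene_pairs
    )
```

For example, a zebrafish CNE 200 kb from a gene whose human counterpart is 900 kb from the human CNE would not count as syntenic under this sketch.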
Q: What is the update schedule for cneViewer?
A: cneViewer is currently built on the Zv7 (danRer5) and hg18 genome builds. cneViewer will be updated three times a year, and whenever major new genome releases come out.
Q: What is the correct threshold to use to predict conserved noncoding elements?
A: Zebrafish and human have diverged enough that non-functional sequences are not expected to resemble each other any more than randomly generated sequences would (naively, 25% identity). The minimum length and identity in cneViewer are 50 bp and 50% identity, and at this level there is a strong probability that any given CNE is functional. Typical parameters used in the literature are ~70% identity over at least ~100 bp, which is a very conservative criterion. The choice of exact threshold is a complicated question and an active subject of research. It involves consideration of the structure of the functional elements encoded within CNEs, the orthologous sequences in all available species, lineage-specific effects, and uncertainties in the alignment. All our CNEs have links to some of the more sophisticated attempts to answer this question (e.g. phastCons, ECRbrowser, Ancora). However, there are still considerable uncertainties in all such methods. cneViewer implements user-specified length and identity thresholds because in practice these are relatively standard, and they impose few assumptions about the underlying data.
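A length and identity threshold of this kind can be sketched as a simple filter. The snippet below is illustrative only: it assumes identity is computed over a gapless, equal-length alignment, and the function names and example records are hypothetical rather than taken from cneViewer.

```python
def percent_identity(seq_a, seq_b):
    """Percent of matching positions in two aligned, equal-length sequences."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(1 for x, y in zip(seq_a.upper(), seq_b.upper()) if x == y)
    return 100.0 * matches / len(seq_a)


def passes_threshold(length_bp, identity, min_length=50, min_identity=50.0):
    """Defaults follow the 50 bp / 50% identity minimum mentioned above."""
    return length_bp >= min_length and identity >= min_identity


# Made-up records for illustration only.
example_cnes = [
    {"id": "cne_a", "length": 62, "identity": 55.0},
    {"id": "cne_b", "length": 140, "identity": 82.5},
    {"id": "cne_c", "length": 45, "identity": 90.0},
]

# The more conservative literature-style cutoff (~100 bp, ~70% identity).
strict = [
    c["id"]
    for c in example_cnes
    if passes_threshold(c["length"], c["identity"], min_length=100, min_identity=70.0)
]
```

With the stricter cutoff only `cne_b` is retained, while the default 50 bp / 50% setting would keep `cne_a` as well.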
Q: I would like to filter CNEs further using some other criteria. Is there a way for me to do this and still view the data in cneViewer?
A: Yes. The simplest way to do this is to download the CNE file using the options in the Tools menu, and then to apply the filter locally on your computer. CNE sets can be re-uploaded to cneViewer by specifying a list of IDs, using the Tools menu. For example, suppose one wished to filter out all CNEs overlapping annotated repeat sequences in the human genome. One could download the CNE file and convert it to BED format using the fields specifying the human location. One could then upload this BED file to the genome browser tool Galaxy (at Penn State) and remove all CNEs intersecting with known human repeats available in Galaxy. The results could then be uploaded back to cneViewer based on the IDs of the CNEs that passed the filter.
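The conversion step in this workflow might look like the sketch below. The column names (`cne_id`, `human_chrom`, `human_start`, `human_end`) are hypothetical placeholders, so check the header of the file you actually download. The sketch also assumes the downloaded coordinates are 1-based and inclusive, which is an assumption; BED uses 0-based, half-open intervals.

```python
import csv
import io


def cne_tsv_to_bed(tsv_text, id_col="cne_id", chrom_col="human_chrom",
                   start_col="human_start", end_col="human_end", one_based=True):
    """Convert tab-delimited CNE records to 4-column BED text.

    All column names are hypothetical. If one_based is True, the start
    coordinate is shifted by -1 to match BED's 0-based convention.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    lines = []
    for row in reader:
        start = int(row[start_col]) - (1 if one_based else 0)
        lines.append("\t".join([row[chrom_col], str(start), row[end_col], row[id_col]]))
    return "\n".join(lines)


def ids_from_bed(bed_text):
    """Recover CNE IDs (4th BED column) from a filtered BED file."""
    return [line.split("\t")[3] for line in bed_text.splitlines() if line.strip()]


# Made-up example input.
example_tsv = (
    "cne_id\thuman_chrom\thuman_start\thuman_end\n"
    "cne_1\tchr2\t101\t200\n"
    "cne_2\tchr7\t5001\t5080\n"
)
bed = cne_tsv_to_bed(example_tsv)
```

After filtering the BED file externally, `ids_from_bed` yields the list of surviving IDs, which can be joined with spaces or commas for re-upload through the Tools menu.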
Q: The size of some text is too small/large.
A: We have tested the appearance of cneViewer in several browsers (Firefox, Safari, IE) on multiple operating systems (XP, Vista, Mac, Linux). If the view looks unusual on your screen, you can adjust font sizes using standard browser commands (ctrl +, ctrl -, command +, command -).
Q: What are "duplicate CNEs"?
A: Because cneViewer reports CNEs in the 5' to 3' direction with respect to the annotated gene, the database contains both forward and reverse entries for some CNEs. This occurs when there are genes on both sides of a CNE, both within the cutoff distance and in opposite orientations. Users who wish to avoid such duplications in downloaded data can restrict the output to just one entry per CNE in the Download CNE Data tool.
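For data that has already been downloaded, the same de-duplication could be done locally. The sketch below keeps the first entry seen for each genomic location; the field names (`chrom`, `start`, `end`, `gene`, `strand`) and the choice to key on location are assumptions, not the tool's documented behavior.

```python
def drop_duplicate_cnes(records):
    """Keep one record per CNE location, preserving input order.

    records: list of dicts with hypothetical keys 'chrom', 'start', 'end'.
    Forward and reverse entries for the same CNE share these coordinates,
    so only the first one encountered is kept.
    """
    seen = set()
    unique = []
    for rec in records:
        key = (rec["chrom"], rec["start"], rec["end"])
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique


# Made-up records: the first two describe the same CNE relative to two genes.
example_records = [
    {"chrom": "chr3", "start": 100, "end": 180, "gene": "geneA", "strand": "+"},
    {"chrom": "chr3", "start": 100, "end": 180, "gene": "geneB", "strand": "-"},
    {"chrom": "chr3", "start": 900, "end": 990, "gene": "geneB", "strand": "-"},
]
deduplicated = drop_duplicate_cnes(example_records)
```

Note that keeping the first entry discards the association with the second gene, so this is only appropriate when the CNE itself, rather than the CNE-gene pairing, is the unit of interest.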
Use this tool to download a tab-delimited file containing CNEs from the genes you have selected.
The left column contains all known genes (and their aliases). Genes in the right column are shown regardless of the anatomy selected.
Enter a list of CNE IDs, separated by spaces or commas.
Search for motifs in the CNEs currently shown in the GeneList. CNEs with the motif will be highlighted with a red striped background.
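A comparable motif search can be run locally on downloaded CNE sequences. The sketch below looks for exact, case-insensitive matches on the given strand only; whether the web tool also searches the reverse strand or supports degenerate bases is not stated here, and the function names and example sequences are hypothetical.

```python
import re


def motif_positions(sequence, motif):
    """0-based start positions of every exact match, including overlapping ones."""
    # A zero-width lookahead lets overlapping occurrences be reported.
    pattern = re.compile("(?=" + re.escape(motif.upper()) + ")")
    return [m.start() for m in pattern.finditer(sequence.upper())]


def cnes_with_motif(cne_sequences, motif):
    """Return IDs of CNEs whose sequence contains the motif at least once.

    cne_sequences: dict mapping CNE ID -> DNA sequence (made-up examples below).
    """
    return [cid for cid, seq in cne_sequences.items() if motif_positions(seq, motif)]


example_sequences = {
    "cne_1": "ttACGTggACGTaa",
    "cne_2": "GGGGCCCC",
}
hits = cnes_with_motif(example_sequences, "ACGT")
```

In this example only `cne_1` contains the motif, at positions 2 and 8.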
You can filter CNEs by their conserved sequence identity, their distance from a gene, their minimum length, or whether or not they are syntenic.