ARID gene name proposal (version 2)

changes in v2

1. ARID_NJMJ is changed to NjmjARID to better illustrate the early (pre-plant) divergence from the ARIDxxx genes.

2. The Mrf and Rbp1 gene families are no longer grouped together. Although these two families are likely closely related and represent a duplication after the divergence of nematodes and insects from the ancestor of the hemichordates, this does not seem sufficient to warrant grouping them. Further, an alternate hypothesis, that both families were present at the time of the other ARID families but one was lost in the nematode/insect line, cannot as yet be disproven.

The names AridTud (arid tudor) and AridMrf are proposed for these two gene families under scheme 3.

3. Added the ARID genes from pufferfish and zebrafish to a figure showing the metazoan ARID genes.

___________

There are certainly advantages to having a uniform naming convention for all of the ARID domain containing genes, particularly if this naming scheme reflects aspects of the evolutionary relationships between these gene families.

[We are proposing a modest change to the submitted proposal in grouping the Mrf1 and Mrf2 genes with the Rbp1 and Bcaa genes to reflect their descent from a common ancestral gene. Otherwise, these proposals are identical in grouping the ARID subfamilies to the previous proposals. ***eliminated in v 2]

In addition, we are presenting two naming systems that diverge from the proposed Arid# system. The first of these recognizes a fundamental division between the ARIDs that contain the JmjC domain and those that do not. This division is easily seen in the fungal genomes, where the ARID-containing genes can be divided into two distinct classes: the Swi1-like genes and the Rbp2-like genes. The other, the third proposal overall, retains short elements of the current gene names instead of simply adopting ARID#. This preserves continuity with the previous literature and makes the names somewhat easier to remember.

the evolution of the ARID family of genes

The ARID domain can be found in green plants, slime molds, and fungi. At least one ARID gene family (the Rbbp2 family) can be clearly traced from plants to metazoans by the conserved order of multiple conserved domains. In other ARID families, the evolutionary relationship between the plant ARIDs and the fungal and metazoan ARIDs is less clear.

The chart below illustrates the metazoan ARID genes identified in the genomic and EST databases. For mosquito, tunicate, and fish, only the ARID domain has been analyzed; the predicted gene is shown for clarity. The ARID domain alone provides sufficient resolution to identify the gene, with each gene family clustering appropriately in a neighbor-joining phylogeny. Note that the fish and tunicate genomes are currently incomplete.
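For readers unfamiliar with the clustering step mentioned above, the following is a minimal sketch of the standard neighbor-joining procedure on a pairwise distance matrix. It is not the software used for the chart. The function name, the taxon labels a to e, and the toy distances are illustrative assumptions and are not ARID measurements.

```python
def neighbor_joining(labels, dist):
    """Run neighbor joining on a symmetric distance matrix.

    Returns (joins, remaining): joins is a list of
    (left, right, left_branch_length, right_branch_length) in the order
    the pairs were merged; remaining holds the last two unjoined nodes.
    """
    nodes = list(labels)
    d = {}
    for i, a in enumerate(labels):
        for j, b in enumerate(labels):
            d[a, b] = float(dist[i][j])
    joins = []
    while len(nodes) > 2:
        n = len(nodes)
        # Net divergence of each node from all others.
        r = {a: sum(d[a, k] for k in nodes) for a in nodes}
        # Choose the pair that minimizes the Q criterion.
        best = None
        for i in range(n):
            for j in range(i + 1, n):
                a, b = nodes[i], nodes[j]
                q = (n - 2) * d[a, b] - r[a] - r[b]
                if best is None or q < best[0]:
                    best = (q, a, b)
        _, a, b = best
        # Branch lengths from the joined pair to their new parent node.
        len_a = 0.5 * d[a, b] + (r[a] - r[b]) / (2 * (n - 2))
        len_b = d[a, b] - len_a
        u = "(" + a + "," + b + ")"
        d[u, u] = 0.0
        # Distances from the new node to every remaining node.
        for k in nodes:
            if k != a and k != b:
                d[u, k] = d[k, u] = 0.5 * (d[a, k] + d[b, k] - d[a, b])
        nodes = [k for k in nodes if k != a and k != b] + [u]
        joins.append((a, b, len_a, len_b))
    return joins, nodes


# Toy input only: five abstract taxa with made-up distances.
toy_labels = ["a", "b", "c", "d", "e"]
toy_dist = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]
joins, remaining = neighbor_joining(toy_labels, toy_dist)
```

In practice the distances would come from an alignment of the ARID domains, and an established phylogenetics package would normally be used rather than hand-written code.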


In fungi (S. cerevisiae and S. pombe), we can divide the ARID-containing proteins into two groups. The Swi1 and Rsc9p proteins, which are both components of large heterogeneous multiprotein complexes that remodel chromatin, form one group, and the Rbbp2-like proteins form the second. It seems likely that these yeast genes have undergone duplication as well as domain loss in numerous instances. This is particularly evident in the Rbbp2 family, where the overall domain order is identical between the metazoan and the plant genes, whereas several yeast genes have lost several domains since the divergence of the yeasts.

I have included the Gasc1-like protein family in this analysis because of its relationship with the Jumonji-containing ARID proteins. The Jumonji C domain is found in many proteins; the Jumonji N domain, however, is found only in proteins that also contain Jumonji C, so these proteins form a subset of the Jumonji C proteins. This subset can be further divided into two families: those that contain the ARID domain and C5HC2 zinc fingers (Rbp2, Jumonji) and those that do not (Gasc1-like proteins). Both of these families are represented in the fungi and metazoans.
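The division described above can be written as a small decision rule over the set of domains found in a protein. This is only a sketch of the logic in the text. The function name, the domain labels used as strings, and the returned category names are assumptions chosen for illustration.

```python
def classify_jumonji(domains):
    """Classify a protein by its domain content, following the text.

    domains: iterable of domain labels, e.g. "JmjN", "JmjC", "ARID", "C5HC2".
    The labels and category strings are illustrative, not database identifiers.
    """
    found = set(domains)
    if "JmjC" not in found:
        # JmjN is reported only alongside JmjC, so JmjN alone is unexpected.
        return "unexpected: JmjN without JmjC" if "JmjN" in found else "no JmjC domain"
    if "JmjN" not in found:
        return "JmjC only"
    if "ARID" in found and "C5HC2" in found:
        return "ARID-containing JmjN/JmjC family (Rbp2, Jumonji)"
    if "ARID" not in found and "C5HC2" not in found:
        return "Gasc1-like family"
    # One of ARID / C5HC2 without the other is not covered by the text.
    return "unclassified"


# Hypothetical domain lists, for illustration only.
example_rbp2_like = ["JmjN", "ARID", "JmjC", "C5HC2"]
example_gasc1_like = ["JmjN", "JmjC"]
```

A protein with only one of the ARID or C5HC2 domains is deliberately left unclassified, since the text does not say how such a case should be handled.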

Examination of the exon structure of the non-Jumonji-containing genes reveals a well-maintained genomic structure in each of the four separate gene families within the human genes. In some cases (the Bright, Swi1, and Aridnew genes), some intron positions are also conserved in the worm and fly genes, although in many cases the introns have been partially or totally lost in worm and fly. Examination of the plant genes also reveals a well-conserved structure within the ARID domain. Further, a splice site that is present in two families of the plant genes (ARID-HMG and ARID-zf) is also present in 3 of the 4 non-Jmj-containing metazoan ARID families. Overall, the exon structure supports an evolutionary relationship between the plant ARID-HMG and ARID-zf family genes and the Bright, Swi1, Aridnew, and Rbp1 families of genes, and suggests that these genes descend from a common ancestor.

The genomic structure strongly supports the descent of the Rbp1 and Mrf1 families of genes from a common ancestor. Examples of both an Rbp1-like and an Mrf1-like ARID have been identified in Ciona, so this duplication must have occurred before the divergence of the tunicates from the other chordates. The absence of an Mrf-like gene in insects and nematodes suggests that this duplication likely occurred after the divergence of these groups from the lineage that would become the hemichordates.
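The presence/absence reasoning above can be laid out as a small table. This is a sketch only. The Mrf column and the tunicate row follow the text; the presence of an Rbp1-like gene in nematode and insect, and of both families in human, is an assumption made here for illustration. The function names are hypothetical.

```python
# Presence (True) or absence (False) of each family per lineage.
# Mrf absence in nematode/insect and both families in tunicate are from the text;
# the remaining entries are assumptions for this sketch.
presence = {
    "nematode": {"Rbp1": True, "Mrf": False},
    "insect": {"Rbp1": True, "Mrf": False},
    "tunicate": {"Rbp1": True, "Mrf": True},
    "human": {"Rbp1": True, "Mrf": True},
}


def lineages_with_both(table, family_a, family_b):
    """Lineages carrying both families; the duplication predates their common ancestor."""
    return sorted(name for name, fams in table.items() if fams[family_a] and fams[family_b])


def lineages_missing(table, family):
    """Lineages lacking a family; consistent with a later duplication or with a loss."""
    return sorted(name for name, fams in table.items() if not fams[family])


both = lineages_with_both(presence, "Rbp1", "Mrf")
missing_mrf = lineages_missing(presence, "Mrf")
```

Absence alone cannot separate a duplication after the nematode/insect divergence from an earlier duplication followed by loss in that line, which is why the alternate hypothesis noted in the v2 changes remains open.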

The exon structure of the ARID domain in all ARID family genes is shown below. The metazoan genes are at the top of the chart, and the worm, fly, and human genes are indicated symbolically on the left. The Mrf1 and Mrf2 gene structures are grouped with the related Rbp1 and BCAA1 genes.