Role of HECT ubiquitin protein ligases in Arabidopsis thaliana

Ubiquitination is a posttranslational modification of proteins in eukaryotes that plays an important role in the growth and development of organisms. Protein ubiquitination is an enzymatic cascade involving three enzymes. The homologous to E6-AP carboxy terminus ubiquitin-protein ligases (HECT E3s) form an important family of ubiquitin-protein ligases. All members of the family have a HECT domain of approximately 350 amino acids at the C-terminus. However, studies on plant HECT E3s, including their structural features, the predicted function of the HECT domain, and their regulatory mechanisms, are very limited. In this paper, we analyze the Arabidopsis thaliana HECT family genes, including their gene structure and functional domains, summarize their limited known functions in protein degradation, transcriptional regulation, epigenetic regulation and other processes, and finally speculate on their roles in plant morphology, aging and responses to environmental stress.

Review Article
Wei Lan#, Weibo Ma# and Ying Miao*
Center for Molecular Cell and Systems Biology, College of Life Sciences, Fujian Agriculture and Forestry University, China
#These authors contributed equally
*Address for Correspondence: Ying Miao, Center for Molecular Cell and Systems Biology, College of Life Sciences, Fujian Agriculture and Forestry University, Email: ymiao@fafu.edu.cn
Submitted: 07 March 2018; Approved: 19 March 2018; Published: 20 March 2018
Copyright: 2018 Lan W, et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


Introduction
Proteins are the basic components of cells and the main carriers of life activities. To adapt effectively to changing external environments, proteins undergo constant turnover during the cell life cycle. This turnover is crucial for the development and function of the cell. Protein degradation plays an irreplaceable role in biological activities, including the removal of damaged or denatured proteins, the degradation of foreign proteins into amino acids, intracellular recycling and the maintenance of cellular homeostasis [1,2]. Intracellular proteins are degraded mainly through two pathways, the lysosomal pathway and the ubiquitin proteasome system [3,4]. The lysosomal pathway mainly degrades long-lived proteins and proteins in aged or defective organelles, and autophagy can simultaneously degrade multiple proteins, or even whole organelles [5]. In contrast, the ubiquitination pathway mainly degrades individual short-lived and misfolded proteins. Ubiquitination is therefore an important post-translational modification of proteins that is involved in a variety of essential functions in all eukaryotes, such as protein degradation, endocytosis, signal transduction and transcriptional regulation [6][7][8]. In Arabidopsis, over 1400 genes have been found to encode components of the Ub/26S proteasome pathway, making it one of the finest regulatory mechanisms in plants [9,10].
The ubiquitination pathway transfers activated ubiquitin to the corresponding target protein through three key enzymes [11][12][13][14]. First, free ubiquitin is activated by the E1 enzyme in an ATP-dependent step. Subsequently, the activated ubiquitin is transferred from the ubiquitin-E1 intermediate to E2, forming a ubiquitin-E2 intermediate. Finally, the E3 enzyme recognizes the specific protein substrate to be degraded and transfers the ubiquitin molecule from the E2-ubiquitin intermediate to it. Ubiquitin is a globular, thermostable protein composed of 76 amino acid residues [10]. It forms different polyubiquitin chains through its seven lysine residues [8,15], and the biological functions of polyubiquitin chains formed at different sites also differ. The target protein is usually recognized and degraded by the 26S proteasome when it is tagged with a Lys48-linked polyubiquitin chain [16,18]. A recent report shows that ubiquitinated proteins can also be degraded by the proteasome through Lys63- and Lys11-linked chains, and that Lys11-labeled substrates can also participate specifically in endoplasmic reticulum-associated degradation (ERAD) [19]. Monoubiquitination at multiple sites of a target protein has also been observed, but its function remains unclear.
As the specificity of ubiquitination is mainly determined by ubiquitin-protein ligases (E3s), one organism usually contains a large number of E3s [20,21]. The human genome, for example, encodes hundreds of E3s but only a few E1s and E2s [22]. Many studies have been conducted on E3s in plants, for example in Arabidopsis and rice [23]. In plants, there are several main types of E3s: the RING type, the HECT type and the U-box type [24][25][26][27]. Among them, the RING type E3s mediate the direct transfer of ubiquitin from the E2-ubiquitin intermediate to the target protein (Figure 1b). HECT E3s, in contrast, participate directly in the ubiquitin transfer process: they first accept ubiquitin from the E2, and the activated ubiquitin is then transferred to a specific Lys residue in the corresponding target protein [8,13] (Figure 1a). HECT E3s are present in all eukaryotes; there are 5 HECT E3s in the S. cerevisiae genome and 28 members in the human genome [28,29]. E3 ubiquitin ligases are key enzymes for substrate degradation: they selectively recognize the proteins to be degraded and attach ubiquitin to them. Accumulating evidence suggests that E3s play a vital role in eukaryotic ubiquitination. The function of HECT E3s has been studied in depth in animals, where they have been found to play a major role in disease-related processes [8,30]. At present, studies on E3s in plants focus mainly on RING and U-box ubiquitin ligases, which have been found to be involved in many aspects of plant biology, abiotic stress responses and plant disease resistance [31][32][33]. However, there are relatively few studies on plant HECT ubiquitin protein ligases. In this paper, the basic information, gene structure and domain architecture of the HECT E3 family genes are analyzed, and the functions of HECT proteins in Arabidopsis thaliana are discussed.
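The difference between the two transfer routes described above can be written down as a small sketch. It is only an illustration of the order in which ubiquitin is handed on, not a biochemical model; the function name `ubiquitin_path` and the carrier labels are hypothetical and chosen for this example.

```python
from typing import List


def ubiquitin_path(e3_type: str) -> List[str]:
    """Return the ordered carriers that hold ubiquitin for one E3 type."""
    # Steps shared by both routes: ATP-dependent activation by E1,
    # then transfer of the activated ubiquitin to E2.
    path = ["E1", "E2"]
    if e3_type == "HECT":
        # HECT E3s take part in the transfer themselves: ubiquitin is
        # passed from the E2 to the E3 before it reaches the substrate.
        path.append("HECT E3")
    elif e3_type != "RING":
        # Other E3 types are outside the scope of this illustration.
        raise ValueError(f"unsupported E3 type: {e3_type}")
    # RING E3s mediate transfer from the E2-ubiquitin intermediate
    # straight to the target, so no E3 carrier step is added for them.
    path.append("target protein")
    return path


if __name__ == "__main__":
    for kind in ("RING", "HECT"):
        print(kind, "->", " -> ".join(ubiquitin_path(kind)))
```

In this sketch the HECT route has one more carrier step than the RING route, which is the point made by Figure 1a versus Figure 1b.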

Survey methodology
Since Scheffner et al. described in 1995 that protein ubiquitination involves an E1-E2-E3 ubiquitin thioester cascade [13], there has been a surge of publications on the function of E3s in animals, where they have been found to play a major role in disease-related processes [8,30]. In the sections below, we mainly discuss the plant HECT E3 family: the basic information, gene structure and domain architecture of its genes, its functions in epigenetic regulation and other processes, and its speculated roles in plant morphology, aging and responses to environmental stress. We add some insights from older publications that were not previously summarized, and propose new research directions. For new studies, we used standard search methods (e.g., Web of Science, Google Scholar) to identify more than 10 papers published since 2015, as well as two of our own publications, that made new discoveries specifically related to the HECT E3 family, along with other citations that have an important bearing on the broader topics relevant to this review.

HECT ubiquitin protein ligase
HECT E3s are characterized by a C-terminal HECT domain of approximately 350 amino acids and are widespread in eukaryotes [34]. This domain accepts ubiquitin and catalyzes its transfer to the target protein [8,35]. At present, the functions of HECT E3s have been studied in more detail in animals and microorganisms. Studies in animals have shown that they play a key role in protein degradation and in DNA damage-related processes. In recent years, human HECT E3s have also been found to be closely linked to the occurrence of various diseases such as cancer [8,36,37].
This unique HECT domain is of great importance to HECT-type E3s. Studies have shown that the HECT domain contains the catalytic center of HECT E3s [38,39] and consists of a longer N-lobe that contains the E2 binding site and a shorter C-lobe that carries the active-site cysteine residue [35,40,41]. The two lobes are connected by a short flexible hinge, forming an inverted L-shaped structure [42]. The spatial structure of the HECT domain has recently been analyzed. Curiously, it shows that the binding site of the E2 UbcH7 is 41 Å away from the active-site cysteine residue of E6-AP [42]. Such a large distance would make it difficult to complete the transfer of ubiquitin, which suggests that a conformational change must occur during the thioester transfer of ubiquitin between E2 and E3 [8,42]. Further details remain to be addressed in the future.

Phylogenetic classification of HECT E3s
The HECT domain has a long amino acid sequence that can provide much information for systematic analysis [34]. Phylogenetic analyses of HECT proteins across organisms have been carried out [34,43]. HECT proteins exist in organisms from bacteria to higher plants and humans. To investigate the evolutionary history of Arabidopsis thaliana HECT proteins among cruciferous plants, we constructed a phylogenetic tree of Brassicaceae HECT proteins using the Maximum Parsimony method (Figure 2). The Arabidopsis thaliana UPL protein sequences were used as queries to search the available plant genomes in the NCBI database [44]. The plants included are Arabidopsis thaliana (7 protein sequences), Arabis alpina (4 protein sequences), Brassica napus (17 protein sequences), Brassica rapa (14 protein sequences), Camelina sativa (26 protein sequences), Capsella rubella (6 protein sequences) and Raphanus sativus (8 protein sequences). Full-length amino acid sequences were aligned using Clustal Omega. The MEGA 7.0 software was used to build a neighbor-joining phylogenetic tree using the JTT model and 1000 bootstrap replicates [45]. The HECT proteins of Camelina sativa and Capsella rubella appear to be the most closely related to those of Arabidopsis.
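As a quick check on the size of the dataset behind Figure 2, the per-species sequence counts listed above can be tallied. The sketch below only restates those counts; the names `SEQUENCES_PER_SPECIES`, `total_sequences` and `largest_contributor` are hypothetical helpers introduced for this illustration.

```python
from typing import Dict

# Number of HECT protein sequences per species, as listed in the text.
SEQUENCES_PER_SPECIES: Dict[str, int] = {
    "Arabidopsis thaliana": 7,
    "Arabis alpina": 4,
    "Brassica napus": 17,
    "Brassica rapa": 14,
    "Camelina sativa": 26,
    "Capsella rubella": 6,
    "Raphanus sativus": 8,
}


def total_sequences(counts: Dict[str, int]) -> int:
    """Total number of sequences entering the alignment."""
    return sum(counts.values())


def largest_contributor(counts: Dict[str, int]) -> str:
    """Species that contributes the most sequences."""
    return max(counts, key=counts.get)


if __name__ == "__main__":
    print(total_sequences(SEQUENCES_PER_SPECIES))      # 82
    print(largest_contributor(SEQUENCES_PER_SPECIES))  # Camelina sativa
```

Under these counts the tree is built from 82 sequences across seven species, with Camelina sativa contributing the largest share.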
Further, we analyzed the data for the seven genes encoding HECT ubiquitin protein ligases from TAIR (http://www.arabidopsis.org/). Table 1 lists the length and molecular weight of the proteins encoded by the seven UPL genes. In Arabidopsis thaliana, chromosomes 1, 3 and 4 each contain two HECT genes, chromosome 5 contains only one HECT gene, and chromosome 2 contains none (Table 1).
The Arabidopsis thaliana genome contains seven genes encoding different members of the HECT E3s (called UPL1-UPL7) [10,40]. Based on exon/intron positions, protein-specific features, the sequence similarity of the HECT domain and the protein structure, the UPL proteins can be subdivided into four subfamilies [40]. According to phylogenetic analysis, plant UPLs can be split into six major subfamilies, I to VI [34]. In Arabidopsis thaliana, subfamily I includes the two genes UPL3 and UPL4, subfamily II includes UPL7, subfamily III includes UPL6, subfamily V includes UPL1 and UPL2 (as well as UPL8, which is not present in Arabidopsis thaliana), and UPL5 belongs to subfamily VI; subfamily IV does not exist in angiosperms.
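The six-subfamily assignment of the Arabidopsis UPL genes can be summarized as a simple lookup table. This is a minimal sketch that only encodes the assignment stated above; `SUBFAMILY_MEMBERS` and `subfamily_of` are hypothetical names used for illustration.

```python
from typing import Dict, List

# Arabidopsis thaliana UPL genes by phylogenetic subfamily, as stated in the
# text. Subfamily IV is kept as an empty entry because it is absent from
# angiosperms; UPL8 is omitted because it is not present in Arabidopsis.
SUBFAMILY_MEMBERS: Dict[str, List[str]] = {
    "I": ["UPL3", "UPL4"],
    "II": ["UPL7"],
    "III": ["UPL6"],
    "IV": [],
    "V": ["UPL1", "UPL2"],
    "VI": ["UPL5"],
}


def subfamily_of(gene: str) -> str:
    """Return the subfamily (I to VI) that contains the given UPL gene."""
    for subfamily, genes in SUBFAMILY_MEMBERS.items():
        if gene in genes:
            return subfamily
    raise KeyError(f"{gene} is not an Arabidopsis thaliana UPL gene")


if __name__ == "__main__":
    all_genes = sorted(g for genes in SUBFAMILY_MEMBERS.values() for g in genes)
    print(all_genes)             # the seven genes UPL1 to UPL7
    print(subfamily_of("UPL5"))  # VI
```

Flattening the table recovers exactly the seven genes UPL1 to UPL7, which matches the gene count given for the Arabidopsis genome.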

Structure of HECT genes in Arabidopsis thaliana
To further analyze the Arabidopsis thaliana UPLs, we compared the complete genomic and coding sequences of the seven AtUPLs using the GSDS analysis software [46]. The number of introns in the AtUPLs varies, and most AtUPLs contain more than ten introns (Figure 3). The analysis showed that the average exon length of subfamily V is longer than that of the other four subfamilies. UPLs in the same subfamily are very similar, differing mainly in the length of their exons and introns.

Domain architecture of HECT E3s in Arabidopsis thaliana
To understand the function of the HECT ubiquitin ligases in Arabidopsis thaliana, we examined the domains they contain in addition to the HECT domain. The AtUPL protein sequences were analyzed using the InterPro and Pfam databases [47,48]. We found domains such as the Armadillo-repeat domain (ARM), IQ domains, the Ubiquitin-associated domain (UBA), the Ub-interacting motif (UIM), and several domains of unknown function upstream of the HECT domain (UDF) [40,49,50] (Figure 4).
Further analysis showed that the subfamily members usually contain other characteristic protein domains in addition to the HECT domain. Subfamily I includes the UPL3 and UPL4 genes, both of which encode proteins that contain Armadillo repeats. UPL3 and UPL4 each encode a protein of about 200 kDa, and the two protein sequences share 54% similarity [40]. The proteins encoded by UPL7 (subfamily II) and UPL6 (subfamily III) have an IQ domain [40,51]. The proteins encoded by the subfamily V members UPL1 and UPL2 have a UBA domain and ARM repeats, and also contain three domains of unknown function. UPL5, which belongs to subfamily VI in Arabidopsis thaliana, encodes a protein that contains ubiquitin domains. Analysis of subfamily V shows that the UPL1 and UPL2 proteins contain essentially identical domains, suggesting that the two genes may have arisen by duplication of a UPL1/2 gene [34]. Further studies revealed that UPL1 and UPL2 each encode an approximately 405-kDa HECT E3 in Arabidopsis thaliana and demonstrated their ligase activity in vitro [40,52].
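The domain composition described in this section can be captured in a small table that makes it easy to ask which UPL proteins share a given domain. The sketch below encodes only the assignments stated in the paragraph above; the dictionary `DOMAINS`, the function `proteins_with` and the domain labels are hypothetical names for illustration.

```python
from typing import Dict, List, Set

# Domains per Arabidopsis UPL protein, as described in the text. Every member
# carries the C-terminal HECT domain. "unknown-function" stands for the
# domains of unknown function found upstream of the HECT domain.
DOMAINS: Dict[str, Set[str]] = {
    "UPL1": {"HECT", "UBA", "ARM", "unknown-function"},
    "UPL2": {"HECT", "UBA", "ARM", "unknown-function"},
    "UPL3": {"HECT", "ARM"},
    "UPL4": {"HECT", "ARM"},
    "UPL5": {"HECT", "ubiquitin"},
    "UPL6": {"HECT", "IQ"},
    "UPL7": {"HECT", "IQ"},
}


def proteins_with(domain: str) -> List[str]:
    """Return the UPL proteins that contain the given domain, sorted by name."""
    return sorted(name for name, found in DOMAINS.items() if domain in found)


if __name__ == "__main__":
    print(proteins_with("ARM"))  # ['UPL1', 'UPL2', 'UPL3', 'UPL4']
    print(proteins_with("IQ"))   # ['UPL6', 'UPL7']
    print(DOMAINS["UPL1"] == DOMAINS["UPL2"])  # True
```

The last comparison mirrors the observation that UPL1 and UPL2 contain essentially identical domains.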

The functions of HECT ubiquitin protein ligases family
As is known, ubiquitin can form many types of ubiquitin chains through its seven Lys residues; on the other hand, the corresponding substrate can be monoubiquitinated or polyubiquitinated. Depending on the manner in which ubiquitin is transferred, it may be attached to one or more Lys residues in the target protein.
There is growing evidence that the different types of ubiquitination and ubiquitin chains also determine the fate of the corresponding proteins. The ubiquitinated protein may be degraded by the 26S proteasome, participate in transcriptional regulation, or be involved in chromatin remodeling, etc.
Some studies show that HECT E3s can mediate protein polyubiquitination in many different ways. HuE6AP was found to assemble Lys48-specific chains through its Cys residue, whereas HuKIAA10 can build Lys48- and Lys29-linked chains [53]. In addition, the HECT domain of ScRsp5 (the Saccharomyces cerevisiae orthologue of NEDD4) was found to form Lys63-linked chains [54]. This indicates that HECT E3s have diverse roles in organisms. On the other hand, most AtUPL genes are expressed at high levels in almost all tissues, indicating that they may have a role similar to that of housekeeping genes [34]. To date, HECT E3s in Arabidopsis thaliana are known to be involved in several biological processes (Figure 5).

Role of HECT ubiquitin ligases family in protein degradation
The earliest and most numerous studies on ubiquitin focused on its function in degradation by the 26S proteasome. Proteins that are degraded by the 26S proteasome typically carry a Lys48-linked polyubiquitin tag [55,56]. In addition, recent findings have shown that polyubiquitin chains linked through any of the Lys residues except Lys63 can also target specific proteins for degradation by the proteasome [19]. Accumulating evidence suggests that most members of the HECT E3s are involved in protein degradation. For instance, HuE6AP, encoded by the UBE3A gene, specifically synthesizes unanchored Lys48-linked chains [53] and is involved in the proteasomal degradation of amplified in breast cancer 1 (AIB1) [56]. HuE6AP can also promote the ubiquitination and proteasomal degradation of the promyelocytic leukemia (PML) tumor suppressor. Because PML is crucial for the formation of PML nuclear bodies (NBs), its degradation affects NB formation [57]. HuHUWE1 is also a HECT ubiquitin ligase, which is thought to target Mcl1, p53 and c-Myc for ubiquitination and consequent degradation [58-60]. A few years later, HuHUWE1 was found to also ubiquitinate the N-Myc oncoprotein through Lys48-mediated linkages and target it for destruction by the proteasome [61].
Arabidopsis HECT E3 members are also known to play roles in protein degradation (Figure 5). Although there is no direct evidence that UPL3 ubiquitinates GL3 and EGL3, two bHLH transcription factors that positively regulate trichome development and flavonoid biosynthesis in Arabidopsis, UPL3 has been found to mediate the degradation of these two transcription factors [62]. UPL5 has been reported to use its leucine zipper domain to interact with the transcription factor WRKY53 and to ubiquitinate it for degradation by the proteasome [63].

Role of HECT ubiquitin ligases family in transcription regulation
Transcription is regulated at two key levels: transcription factors and epigenetics. The latter regulates gene transcription by altering chromatin structure and gene accessibility through DNA methylation, histone methylation, histone ubiquitination, etc. Recent studies show that ubiquitination is involved in transcription regulation by degrading transcription factors and chromatin-modifying proteins or by ubiquitinating histones. One example is the RING E3 AtJMJ24, which regulates transcription by targeting and ubiquitinating the plant-specific DNA methyltransferase CHROMOMETHYLASE 3 (CMT3), marking it for proteasomal degradation in Arabidopsis [64]. Another example of transcription regulation by E3s is the multi-domain E3 UHRF2, which can stabilize the acetyltransferase AtTIP60 and regulate H3K9ac and H3K14ac via its RING finger domain [65]. In addition, PARAQUAT TOLERANCE3 (PQT3), which encodes an E3 ubiquitin ligase, recognizes the protein methyltransferase PRMT4b (PROTEIN METHYLTRANSFERASE4b) and targets it for degradation via the 26S proteasome [66]. PRMT4b, in turn, can upregulate through histone methylation the expression of APX1 and GPX1, which encode two key enzymes against oxidative stress. Interestingly, new functions for Lys29-linked chains in epigenetics have also been found, following the identification of the Lys29/Lys33 specificity of the deubiquitinase TRABID. TRABID accumulates on Lys29/Lys33-linked polyubiquitin chains in the cell [67,68] and regulates epigenetic processes by targeting the histone demethylase Jmjd2d, which it appears to deubiquitinate and stabilize, so that Jmjd2d can act on interleukin gene promoters to release repression [69].
However, functional research on HECT E3s in transcription regulation still focuses mainly on the degradation of transcription factors or other factors. For example, ScRsp5 regulates transcription by binding to Rpo21 (also known as Rpb1), the large subunit of RNA polymerase (Pol) II [70,71]. HuITCH, a member of the Nedd4 family, controls the levels of its substrate JUNB, which regulates the transcription of the interleukin 4 (IL4) gene [72]. UPL5 can degrade WRKY53 to regulate Arabidopsis leaf senescence, and WRKY53 has also been found to play different roles upstream of other WRKY factors (Figure 5). Interestingly, strong YFP fluorescence signals were detected experimentally in the nucleus, showing that UPL3 interacts with GL3 or EGL3 [62], which suggests that UPL3 may play a role in transcription regulation by interacting with GL3 or EGL3.

Other functions of HECT ubiquitin ligases family
In addition to protein degradation and transcriptional regulation, ubiquitination has other functions, including DNA repair, cell-cycle progression and signal transduction. Studies also show that unanchored ubiquitin and ubiquitin chains, with or without modifications, can function as second messengers in cells.
However, the functions and mechanisms of the Arabidopsis HECT E3 members remain largely unknown. In upl3 mutants, the DNA content of some nuclei reaches 64C, which indicates that UPL3 can limit the number of rounds of DNA replication, but how it does so is not known. Previous studies also found that UPL3 and UPL4 synergistically regulate gibberellin signaling [73]. Current results also show that UPL3 and UPL4 are highly expressed in senescent leaves and that senescence-related genes such as AHK3, DLS1, ATG5 and NEET are upregulated in the upl3 and upl4 mutants, respectively [74]. The mechanisms behind all of these phenomena are not yet known.

Conclusion and Speculation
The ubiquitin-proteasome pathway is a major route for the degradation of short-lived proteins in eukaryotes and plays a key role in the regulation of protein levels and gene expression [6,8]. Although the E3s determine most of the substrate specificity of this pathway, the targets of these enzymes in plants and how they are selected for ubiquitination remain unclear. HECT E3s are an important part of the ubiquitination machinery. Although great progress has been made on HECT E3s since the discovery of E6-AP [35], there are still many unanswered questions, especially in plants.
The UPL genes of the same subfamily share many similarities, such as gene structure and the number of exons and introns (Figure 3). In Arabidopsis thaliana, other domains in the N-terminal region of the AtUPLs, including ARM, UBA and UIM, may be involved in interactions with specific substrates (Figure 4). The UBA domain is common in a variety of proteins that are involved in Ub metabolism and may therefore be important for Ub binding [49]. The UIM domain, a binding site for polyubiquitin chains, is also found in other proteins that interact with Ub [50,75]. Recent research has found that ARM repeat proteins play a major role in many abiotic stress responses and reproductive development processes [76,77]. The detailed mechanisms and functions of these domains remain to be demonstrated in the future.
The study of HECT domains is crucial for further understanding of UPLs in plants. UPL3 has been shown to be related to trichome development in Arabidopsis thaliana [40]. Analysis of expression patterns showed that most of the UPLs are highly expressed in cauline leaves [78,79]. We therefore speculate that, during Arabidopsis thaliana senescence, the HECT genes expressed in senescent leaves may be involved in the ubiquitination of certain related proteins, thereby regulating leaf senescence. Although only UPL5 has been reported to regulate WRKY53 protein degradation and WRKY53 expression and to affect leaf senescence [63], other UPLs are also highly expressed in senescent leaves according to expression data from the Bio-Analytic Resource (http://www.bar.utoronto.ca/). Therefore, studying the function of HECT genes in the regulation of Arabidopsis plant aging will become a new research direction.