Recently, Dr. Yu Xue’s group at the Huazhong University of Science and Technology in Wuhan, China, developed a database named the Eukaryotic Writers, Erasers and Readers protein of Histone Acetylation and Methylation system Database (WERAM). WERAM is a comprehensive database that integrates information on the writers, erasers, and readers of histone acetylation and methylation: writers are the enzymes that catalyze acetylation and methylation, erasers are the enzymes that remove these marks, and readers are the proteins that recognize and interact with the acetylated and methylated sites. The database holds nucleotide and protein information on ~20,000 acetylation and methylation regulators in 148 eukaryotic species.
Xue et al. started their project by searching the scientific literature published since 2011. From this literature, they manually curated 248 acetylation regulators and 336 methylation regulators, for a total of 584 experimentally identified acetylation and methylation regulators in 8 species (H. sapiens, M. musculus, R. norvegicus, D. melanogaster, C. elegans, A. thaliana, S. pombe, and S. cerevisiae). They also gathered approximately 900 site-specific relations between histones and histone regulators in the 8 species. Protein domain information for these regulators was accessed through the UniProt database.
“We developed the WERAM database mainly using three proteome databases,” explained Dr. Xue, “including Ensembl for human and animals, Ensembl Plants, and Ensembl Fungi. These databases were well organized to provide non-redundant data sets of nucleotide and protein sequences.”
Based on experimental evidence, the acetylation regulators were classified into 15 families (8 histone acetyltransferase (HAT), 5 histone deacetylase (HDAC), and 2 acetyl-reader families), while the methylation regulators were classified into 32 families (10 histone methyltransferase (HMT), 8 histone demethylase (HDM), and 14 methyl-reader families).
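As a quick cross-check of the family counts above, the tallies can be written out and summed. This is only an illustrative sketch: the dictionary layout and the function name are our own and are not part of WERAM.

```python
# Family counts per role, as reported in the text above.
# The nested-dict layout is an illustrative assumption, not WERAM's schema.
FAMILY_COUNTS = {
    "acetylation": {"HAT": 8, "HDAC": 5, "acetyl-reader": 2},
    "methylation": {"HMT": 10, "HDM": 8, "methyl-reader": 14},
}


def families_per_modification(counts):
    """Sum the family counts of writers, erasers and readers per modification."""
    return {mod: sum(roles.values()) for mod, roles in counts.items()}


totals = families_per_modification(FAMILY_COUNTS)
total_families = sum(totals.values())
print(totals, total_families)
```

The per-modification totals match the 15 and 32 families quoted above.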
Beyond the experimentally identified regulators, they aligned the catalytic domain sequences of different members of each family and constructed hidden Markov models (HMMs) to computationally predict and identify other histone regulators genome-wide. They developed HMM profiles for 13 acetylation families and 30 methylation families. For the 4 families without HMM profiles, they performed searches based on orthology. Altogether, they identified and integrated 20,033 histone regulators, including 1,337 HATs, 2,504 HDACs, 3,901 acetyl-readers, 4,409 HMTs, 1,610 HDMs, and 10,949 methyl-readers, from 148 eukaryotic species: 68 animals, 39 plants, and 41 fungi.
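To give a feel for profile-based searching, here is a minimal sketch in Python. It is not a profile HMM and not the authors' pipeline: it builds a simpler position-specific scoring matrix (log-odds per alignment column, with no insert or delete states) and slides it along a candidate sequence. The aligned "domain" sequences, the candidate sequences, the uniform background, and the function names are all invented for illustration.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_profile(aligned, pseudocount=1.0):
    """Build per-column log-odds scores from equal-length, gap-free sequences."""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all aligned sequences must have the same length")
    background = 1.0 / len(AMINO_ACIDS)  # uniform background: an assumption
    total = len(aligned) + pseudocount * len(AMINO_ACIDS)
    profile = []
    for col in range(length):
        counts = Counter(seq[col] for seq in aligned)
        profile.append({
            aa: math.log(((counts[aa] + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        })
    return profile


def best_hit(profile, sequence):
    """Slide the profile along a sequence; return (best score, start index)."""
    width = len(profile)
    best = (float("-inf"), -1)
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        # Residues outside the 20 standard amino acids contribute a neutral 0.
        score = sum(profile[i].get(aa, 0.0) for i, aa in enumerate(window))
        best = max(best, (score, start))
    return best


# Toy alignment of a made-up six-residue "domain" (not real catalytic domains).
toy_alignment = ["GHCWEL", "GHCFEL", "GHSWEL"]
profile = build_profile(toy_alignment)

score, start = best_hit(profile, "MKTAGHCWELRRP")
unrelated_score, _ = best_hit(profile, "MKTAAAAAAAAAP")
print(start, score > 0, unrelated_score < 0)
```

A real profile HMM additionally models insertions and deletions and assigns statistical significance to hits, which is why dedicated HMM software is normally used for genome-wide searches of this kind.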
This comprehensive tool can be accessed freely at WERAM. Researchers can search this user-friendly database by species or by classification to obtain information on nucleotide and protein sequences. For example, to look up SET1, an HMT enzyme, choose “browse by classification”, select “histone methylation”, then pick “HMT”, followed by “SET1”; the resulting page shows family information across the 148 eukaryotic organisms. There are other existing databases for histone modifications and regulators, including HHMD, HIstome, dbHiMo, and HistoneDB; however, these have mainly focused on humans or fungi. To the best of our knowledge, this is the first integrative database for acetylation and methylation regulators in a wide range of eukaryotes.
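The browse path described above (modification, then role, then family, optionally narrowed by species) can be pictured as a filter over annotated records. The sketch below is hypothetical throughout: the record fields, the `browse` function, the placeholder identifiers, and the `EXAMPLE_FAMILY` name are ours, and it does not reflect WERAM's actual schema or any programmatic interface.

```python
from typing import NamedTuple


class Regulator(NamedTuple):
    identifier: str    # placeholder ID, not a real WERAM accession
    species: str
    modification: str  # e.g. "histone methylation"
    role: str          # e.g. "HMT"
    family: str        # e.g. "SET1"


# Placeholder records for illustration only; these are not WERAM entries.
RECORDS = [
    Regulator("EXAMPLE_0001", "H. sapiens", "histone methylation", "HMT", "SET1"),
    Regulator("EXAMPLE_0002", "S. cerevisiae", "histone methylation", "HMT", "SET1"),
    Regulator("EXAMPLE_0003", "H. sapiens", "histone acetylation", "HDAC", "EXAMPLE_FAMILY"),
]


def browse(records, modification=None, role=None, family=None, species=None):
    """Return the records matching every filter that was supplied."""
    wanted = {
        "modification": modification,
        "role": role,
        "family": family,
        "species": species,
    }
    return [
        rec for rec in records
        if all(value is None or getattr(rec, key) == value
               for key, value in wanted.items())
    ]


# Mirror the example in the text: histone methylation -> HMT -> SET1.
set1_members = browse(RECORDS, "histone methylation", "HMT", "SET1")
human_entries = browse(RECORDS, species="H. sapiens")
print([rec.species for rec in set1_members])
```

Leaving a filter as `None` corresponds to not narrowing on that level, so the same function covers both browsing by classification and browsing by species.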
Dr. Xue shared with us that their future plans for expansion of the database include “adding multiple layers of information for integrated data [such as] genomic variations, gene expression, epigenetic data, protein expression, post-translational modifications, and associated diseases if available.”
The group has also developed other tools, some of which cover non-histone proteins as well: GPS-PAIL, which predicts acetyltransferase-specific acetylation sites and for which they also collected several hundred acetylation sites in non-histone proteins, and GPS-MSP, which distinguishes different types of methylation modifications in proteins.
Referring to the implications of WERAM, Dr. Xue says that “WERAM provides a comprehensive resource for histone regulators and site-specific modifications. Using the WERAM database, site-specific modification antibodies can be developed for ChIP-Seq analysis to demonstrate the functional impact of histone modifications. Also, some other technologies, such as Cas9, can be used to fuse with histone regulators, to regulate gene expression in a precise manner.”