Websites

  • PreDREM is a collection of regulatory DNA motifs and motif modules predicted in DNase I hypersensitive sites (DHSs) from 349 human cell or tissue samples using SIOMICS. The database contains the motifs predicted in each of the 349 datasets, as well as 3894 non-redundant motifs obtained by clustering the motifs discovered across the 349 datasets.
  • MERCED is a server for the systematic discovery of cis-regulatory elements (CREs) in C. reinhardtii. Large-scale genomic data for microalgal species have recently become available, which enables the development of efficient computational methods to systematically identify CREs and characterize their roles in microalgal gene regulation. We performed in silico CRE identification at the whole-genome level in C. reinhardtii using a comparative-genomics-based method. Both the CRE identification results and the software are available on this server.
  • GreenFuel is a knowledge dissemination website about microalgae, the next biofuel producer. The website mainly focuses on microalgae engineering and microalgae gene regulation.
  • Epigenetics is an educational website about the study of epigenetics. Epigenetics quite literally means "above the genome" because it explains how a cell's phenotype can be altered without any modification to the DNA sequence. The main topics discussed are gene regulation, epigenetic mechanisms, inheritance, and disease.
  • RSMIRT is a robust set of miRNA transcription start sites (TSSs), carefully extracted from recently published studies.

Tools

  • BaFulow is a novel algorithm for reconstructing haplotypes of bacterial populations that have low mutation frequency. Given a set of short reads corresponding to a bacterial population containing different haplotypes, BaFulow is capable of distinguishing haplotypes from each other and provides their sequence and coverage depth.
  • D-miRT is a deep learning-based tool for computational prediction of condition-specific miRNA transcription start sites.
  • PETModule is software developed to find enhancer-target gene (ETG) pairs through a motif-module-based approach.
  • MDPS is a tool to characterize position-wise pairing patterns of miRNA–target interactions based on Markovian models. 2020.
  • TarPmiR is software for predicting miRNA target sites from CLASH (cross-linking, ligation, and sequencing of hybrids) data.
  • MBMC is software for binning metagenomic reads from environmental shotgun sequencing projects.
  • SIOMICS is software developed for de novo identification of motifs in large sequence datasets, such as those from ChIP-seq experiments.
  • FFN (Finding Features for Nucleosomes) is a pattern discovery and scoring algorithm to identify feature patterns that are differentially enriched in nucleosome-forming sequences and nucleosome-depletion sequences. Source code download.
  • rRNAFilter is a novel taxonomy-independent approach that accurately and rapidly filters rRNA sequences from metatranscriptomes without the aid of reference rRNA databases. Its accuracy is comparable to that of current similarity-based methods, and it is much faster than those methods. 2016.
  • miRModule is software for the systematic discovery of miRNA modules from a set of predefined miRNA target sites.
  • ChIPModule is a software tool for the systematic discovery of transcription factors and their cofactors from ChIP-seq data. Given a ChIP-seq dataset and the motifs of a large number of transcription factors, ChIPModule can efficiently identify groups of motifs whose instances significantly co-occur in the ChIP-seq peak regions. By discovering groups of co-occurring motifs, ChIPModule enables the systematic study of cofactors in ChIP-seq peak regions. Windows version download, Unix version download, manual, result_E2F1, result_ESR1.
  • activeTF finds a set of coordinately activated transcription factors (TFs) from a given gene expression dataset.
  • CODENSE (short for Mining Coherent Dense Subgraphs) is a software package for mining coherent dense subgraphs from multiple biological networks. By reducing the problem of identifying coherent dense subgraphs across n graphs to the problem of identifying dense subgraphs in two special graphs, the summary graph and the second-order graph, CODENSE can efficiently mine frequent coherent dense subgraphs across large numbers of massive graphs.
  • MODES is short for Mining Overlapping DENSE Subgraphs. MODES is developed based on HCS (Mining Highly Connected Subgraphs) (Hartuv & Shamir, 2000), with two new features: (1) MODES is more efficient in identifying dense subgraphs; and more importantly, (2) MODES can discover overlapping subgraphs.
  • MOPAT (Motif Pair Tree) identifies CRMs through the identification of motif modules, that is, groups of motifs co-occurring in multiple CRMs. It can identify ‘orthologous’ CRMs without multiple alignments. It can also find CRMs given a large number of known motifs. Unix version download, Cygwin version download.
  • Tree Gibbs Sampler is software for identifying motifs by simultaneously using the motif overrepresentation property and the motif evolutionary conservation property. It identifies motifs without depending on pre-aligned orthologous sequences, which makes it useful for extracting regulatory elements in multiple genomes of both closely related and distant species. Windows version download, Unix version download.
  • EPIP is software for predicting cell-specific interactions between enhancers and gene promoters from a given list of enhancer and gene locations. Given a list of enhancers and genes for any of 8 well-known cell lines (GM12878, HELA, HMEC, HUVEC, IMR90, K562, KBM7 and NHEK), EPIP calculates 14 epigenetic features separately for each enhancer region and each gene promoter region, plus 3 additional mutual features. Based on these features, EPIP decides whether an enhancer-promoter pair interacts in that cell line. EPIP can also predict enhancer-promoter interactions (EPIs) in a new cell line if any of the following 14 epigenetic data types is provided for that cell line: H3k4me1, H3k4me2, H3k4me3, H3k9ac, H3k27ac, H3k27me3, H3k36me3, H3k79me2, H4k20me1, Ctcf, DNaseI, Pol2, Rad21, and Smc3.
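
The EPIP entry above describes 14 epigenetic features for the enhancer, 14 for the promoter, and 3 mutual features per candidate pair, which would give 31 values in total. The following is only a minimal sketch of how such a per-pair feature vector might be assembled. It is not EPIP's actual code or API: the function name build_pair_features, the dictionary-based input layout, and the ordering of the features are all assumptions made for illustration, and the source does not say what the 3 mutual features are.

```python
from typing import Dict, List, Sequence

# The 14 epigenetic data types named in the EPIP description.
EPIGENETIC_MARKS: List[str] = [
    "H3k4me1", "H3k4me2", "H3k4me3", "H3k9ac", "H3k27ac", "H3k27me3",
    "H3k36me3", "H3k79me2", "H4k20me1", "Ctcf", "DNaseI", "Pol2",
    "Rad21", "Smc3",
]


def build_pair_features(
    enhancer: Dict[str, float],
    promoter: Dict[str, float],
    mutual: Sequence[float],
) -> List[float]:
    """Concatenate enhancer, promoter, and mutual features for one pair.

    Hypothetical helper: EPIP's real feature layout may differ.
    """
    if len(mutual) != 3:
        raise ValueError("expected exactly 3 mutual features")
    missing = [m for m in EPIGENETIC_MARKS
               if m not in enhancer or m not in promoter]
    if missing:
        raise ValueError(f"missing signal values for: {missing}")
    # Fixed ordering: 14 enhancer values, 14 promoter values, 3 mutual values.
    return (
        [enhancer[m] for m in EPIGENETIC_MARKS]
        + [promoter[m] for m in EPIGENETIC_MARKS]
        + list(mutual)
    )


# Placeholder signal values, not real data.
enhancer_signal = {mark: 1.0 for mark in EPIGENETIC_MARKS}
promoter_signal = {mark: 2.0 for mark in EPIGENETIC_MARKS}
mutual_values = [0.1, 0.2, 0.3]  # meaning unspecified in the description

features = build_pair_features(enhancer_signal, promoter_signal, mutual_values)
print(len(features))
```

A vector of this shape could then be passed to any classifier that labels a pair as interacting or non-interacting; the choice of classifier is outside what the description above states.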