Abstract
Motivation: Understanding the factors that determine nucleosome formation is important for understanding gene regulation and cellular function. Recent developments in next-generation sequencing (NGS) techniques have made genome-wide nucleosome occupancy data available. Based on these data, various features, including a number of DNA sequence features and structural features, have been reported to be predictive of nucleosome formation. However, the contributions of various DNA features to nucleosome formation in the same or different species remain unclear.
Results: We compiled 779 features and developed a pattern discovery and scoring method, FFN (Finding Features for Nucleosomes), to identify features and feature combinations that influence nucleosome formation and inhibition. Applying FFN to nucleosome occupancy data in yeast and human, we identified many feature combinations that are differentially enriched in nucleosome-forming and nucleosome-depleted sequences, many of which are common to the two species. We found that both sequence and structural features are important in nucleosome occupancy prediction. We also discovered that, even for the same feature combination, variations in feature values can result in different predictive power for nucleosome formation. We conclude that the predictive power of these feature patterns for nucleosome occupancy is not only sequence-dependent but also related to nucleosome location and gene transcription levels.