Fig. 3 | BMC Plant Biology

Fig. 3

From: A k-mer grammar analysis to uncover maize regulatory architecture


Low-complexity regions do not provide relevant information for discriminating regulatory regions. a Base-pair-level annotation of the first 1000 base pairs of the long intron in the maize gene ga2ox1 using sequence complexity (Entropy), scores from the “bag-of-k-mers” models (Full and Filtered), and regulatory probabilities (Probability) from the “vector-k-mers” model. Sequence complexity and “bag-of-k-mers” scores were calculated using a sliding window of size k advanced in 1-bp steps. Regulatory probabilities were calculated using a sliding window of size 3*k, advanced in 1-bp steps, to evaluate the co-occurrence of groups of 3 k-mers. The evaluated region includes the KN1 ChIP-seq peaks identified from two biological replicates in developing ears (the center of the peak for each replicate is indicated with a vertical dotted line). b Across all models tested on the unbalanced KN1 holdout set, performance measured as the area under the precision-recall (PR) curve is best for the “bag-of-k-mers” models at k=8. c The PR curve for the k=8 “bag-of-k-mers” model with low-complexity k-mers filtered out shows performance similar to that of the full 8-mer vocabulary across all the regulatory regions at the different decision thresholds.
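
The per-base complexity track in panel a can be illustrated with a minimal sketch. It assumes that sequence complexity is the Shannon entropy of nucleotide frequencies inside each window; the caption does not state the exact formula, so this is an assumption. The function names and the toy sequence are hypothetical and are not taken from the article.

```python
import math
from collections import Counter


def shannon_entropy(window: str) -> float:
    """Shannon entropy (in bits) of the character frequencies in a window.

    Assumption: the caption's "Entropy" track is of this form; the article
    may use a different complexity measure.
    """
    n = len(window)
    counts = Counter(window)
    return sum(-(c / n) * math.log2(c / n) for c in counts.values())


def sliding_entropy(seq: str, size: int, step: int = 1) -> list[float]:
    """Entropy of every window of length `size`, advanced `step` bases at a time."""
    return [
        shannon_entropy(seq[i:i + size])
        for i in range(0, len(seq) - size + 1, step)
    ]


# Hypothetical toy sequence: a homopolymer run followed by a mixed stretch.
seq = "AAAAAAAAACGTACGT"
k = 8
scores = sliding_entropy(seq, k)  # one score per window start position
```

A window made of a single repeated base scores 0 bits, while a window with all four bases in equal proportion scores 2 bits, which is why low-complexity k-mers sit at the bottom of such a track. Passing `3 * k` as `size` gives the wider window the caption describes for the regulatory probabilities, although the probabilities themselves come from the “vector-k-mers” model and not from entropy.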
