Fig. 6 | BMC Plant Biology

Fig. 6

From: A k-mer grammar analysis to uncover maize regulatory architecture

Local Sequence Context Defines Distinctive Groups of k-mers Between Regulatory and Random Regions. a Schematic of aligning flanking contexts versus contrasting local sequence composition, as implemented in the “vector-k-mer” models, in which k-mers that share a similar context are represented by close vectors (vk-mers) in a geometric space. b The vector spaces obtained from core promoters (Vregulatory) and from their corresponding control (Vrandom) define two different groups of closest k-mers (by cosine similarity) to the ‘CTATATA’ vector (vCTATATA). The k-mers in the group closest in Vregulatory, compared with those in the group formed in Vrandom, are more similar in sequence (shorter edit distance) and have on average more positive k-mer scores from the equivalent “bag-of-k-mers” model. This implies a semantic-like relationship between those k-mers in regulatory sequences that is absent in random regions. c The k-mers closest in the Vregulatory space (blue solid lines) have positional preferences similar to those of CTATATA (black dotted line), in the region expected for the TATA element. d In contrast, the k-mers closest in the Vrandom space (red solid lines) do not share the positional constraints of CTATATA (black dotted line) and show no positional preference relative to the TSSs
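
The comparison described in panel b (nearest k-mers by cosine similarity, then sequence similarity by edit distance) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the two-dimensional vectors in `toy_vectors` are invented rather than taken from the Vregulatory or Vrandom models, and the helper names `closest_kmers`, `edit_distance`, and `mean_edit_distance` are hypothetical, not the authors' code.

```python
import numpy as np


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors."""
    u, v = np.asarray(u, dtype=float), np.asarray(v, dtype=float)
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))


def closest_kmers(query, vectors, n=2):
    """Return the n k-mers whose vectors are most cosine-similar to the query's."""
    q = vectors[query]
    scored = [(k, cosine_similarity(q, v)) for k, v in vectors.items() if k != query]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:n]


def edit_distance(a, b):
    """Levenshtein distance via row-by-row dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def mean_edit_distance(query, neighbours):
    """Average edit distance from the query k-mer to each neighbour."""
    return sum(edit_distance(query, k) for k in neighbours) / len(neighbours)


# Invented toy embedding (NOT data from the paper), used only to run the functions.
toy_vectors = {
    "CTATATA": [1.0, 0.0],
    "TATATAA": [0.9, 0.1],
    "CTATAAA": [0.8, 0.3],
    "GGCCGGC": [-1.0, 0.2],
    "ACGTGCA": [0.0, 1.0],
}

neighbours = [k for k, _ in closest_kmers("CTATATA", toy_vectors, n=2)]
avg_distance = mean_edit_distance("CTATATA", neighbours)
```

Running the same two steps on a regulatory and a control vector space, and comparing the resulting mean edit distances, is one way to express the contrast the caption draws; the paper's actual procedure may differ in its details.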
