Hi! I’m Peter!


  • Zimin Words and Bifixes

    One of my earliest contributions to the On-Line Encyclopedia of Integer Sequences (OEIS) was a family of sequences counting the number of words that begin (or don’t begin) with a palindrome:

    • Let \(f_k(n)\) be the number of strings of length \(n\) over a \(k\)-letter alphabet that begin with a nontrivial palindrome, for various values of \(k\).
    • Let \(g_k(n)\) be the number of strings of length \(n\) over a \(k\)-letter alphabet that do not begin with a nontrivial palindrome.
    • Number of binary strings of length \(n\) that begin with an odd-length palindrome. (A254128)

    (If I had known better, I would have published fewer sequences in favor of a table, and I would have requested contiguous blocks of A-numbers.)

    I must have written some Python code to compute some small terms of this sequence, and I knew that \(g_k(n) = k^n - f_k(n)\), but I remember being in my friend Q’s bedroom when the recursion for \(f_k(n)\) hit me: $$f_k(n) = kf_k(n-1) + k^{\lceil n/2 \rceil} - f_k\big(\lceil \frac n 2 \rceil \big)$$
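
    As a sanity check on the recursion, here is a minimal Ruby sketch that compares it against a brute-force count for small cases. The method names `f`, `palindrome_prefix?`, and `brute_force` are my own for illustration, and the base case \(f_k(0) = f_k(1) = 0\) is an assumption (a "nontrivial" palindrome is taken to mean length at least 2).

```ruby
# f(k, n): strings of length n over a k-letter alphabet that begin with a
# palindrome of length >= 2, computed with the recursion quoted above.
# Assumed base case: f(k, 0) = f(k, 1) = 0.
def f(k, n, memo = {})
  return 0 if n < 2

  memo[[k, n]] ||= begin
    half = (n + 1) / 2 # integer form of ceil(n / 2)
    k * f(k, n - 1, memo) + k**half - f(k, half, memo)
  end
end

# True when some prefix of length >= 2 reads the same in both directions.
def palindrome_prefix?(word)
  (2..word.length).any? { |len| word[0, len] == word[0, len].reverse }
end

# Counts the same strings by enumerating all k^n words.
def brute_force(k, n)
  (0...k).to_a.repeated_permutation(n).count { |word| palindrome_prefix?(word) }
end

(0..8).each do |n|
  puts "n=#{n}: recursion #{f(2, n)}, brute force #{brute_force(2, n)}"
end
```

    The brute-force count grows like \(k^n\), so it is only practical for small \(n\); the recursion needs only about \(\log_2 n\) extra lookups per term.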

    “Bifix-free” words

    One sequence that I didn’t add to the OEIS was the “Number of binary strings of length n that begin with an even-length palindrome”—that’s because this was already in the Encyclopedia under a different name:

    A094536: Number of binary words of length n that are not “bifix-free”.

    0, 0, 2, 4, 10, 20, 44, 88, 182, 364, 740, 1480, 2980, 5960, …

    A “bifix” is a nonempty string that is both a proper prefix and a suffix of a word, so a “bifix-free” word is one in which no proper prefix equals the suffix of the same length. More concretely, if the word is \(\alpha_1\alpha_2 \dots \alpha_n\), then \((\alpha_1, \alpha_2, \dots, \alpha_k) \neq (\alpha_{n-k+1},\alpha_{n-k+2},\dots,\alpha_n)\) for all \(1 \leq k < n\).

    The number of binary words of length \(n\) that begin with an even-length palindrome equals the number of binary words of length \(n\) that have a bifix, because there is a bijection between the two sets. In particular, find the shortest nonempty even-length palindromic prefix, cut it in half, and stick the first half at the end of the word, backward. I’ve asked for a better bijection on Math Stack Exchange, so if you have any ideas, please share them with me!
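
    The equality of the two counts is easy to check by brute force for small \(n\). The following Ruby sketch is only an illustration (the helper names are mine, not from the OEIS entry); it counts both kinds of binary words and compares them with the A094536 terms listed above.

```ruby
# True when the word starts with a palindrome of even length >= 2.
def even_palindrome_prefix?(word)
  (2..word.length).step(2).any? { |len| word[0, len] == word[0, len].reverse }
end

# True when some proper nonempty prefix equals the suffix of the same length.
def bifix?(word)
  (1...word.length).any? { |len| word[0, len] == word[-len, len] }
end

# Enumerates all binary words of length n as arrays of 0s and 1s.
def binary_words(n)
  [0, 1].repeated_permutation(n)
end

def even_palindrome_count(n)
  binary_words(n).count { |word| even_palindrome_prefix?(word) }
end

def bifix_count(n)
  binary_words(n).count { |word| bifix?(word) }
end

(0..10).each do |n|
  puts "n=#{n}: even-palindrome prefix #{even_palindrome_count(n)}, bifix #{bifix_count(n)}"
end
```

    This only confirms that the counts agree; it says nothing about which word is paired with which under the bijection.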

    In 2019–2020, Daniel Gabric and Jeffrey Shallit wrote a closely related paper called Borders, Palindrome Prefixes, and Square Prefixes.

    Zimin words

    A Zimin word can be defined recursively, but I think it’s most suggestive to see some examples:

    • \(Z_1 = A\)
    • \(Z_2 = ABA\)
    • \(Z_3 = ABACABA\)
    • \(Z_n = Z_{n-1} X Z_{n-1}\)

    All Zimin words \(Z_n\) are examples of “unavoidable patterns”, because every sufficiently long string with letters in any finite alphabet contains a substring that matches the \(Z_n\) pattern.

    For example, the word \(0100010010111000100111000111001\) contains a substring that matches the Zimin word \(Z_3\). Namely, let \(A = 100\), \(B = 0\), and \(C = 1011\), visualized here with each \(A\) emboldened: \( 0(\mathbf{100}\,0\,\mathbf{100}\,1011\,\mathbf{100}\,0\,\mathbf{100})111000111001\).

    Binary solo from Bret at 1:26.

    I’ve written a Ruby script that generates a random string of length 29 and uses a regular expression to find the first instance of a substring matching the pattern \(Z_3 = ABACABA\). You can run it on TIO, the impressive (and free!) tool from Dennis Mitchell.

    # Randomly generates a binary string of length 29.
    random_string = 29.times.map { [0,1].sample }.join("")
    p random_string
    # Finds the first Zimin word ABACABA
    p random_string.scan(/(.+)(.+)\1(.+)\1\2\1/)[0]
    # Pattern:             A   B   A C   A B A
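
    Checking every binary word of length 29 against \(Z_3\) is too much for a quick script, but the same regular-expression idea can exhaustively check the much smaller pattern \(Z_2 = ABA\). This is a sketch under my own naming (`Z2`, `avoiders`), not code from the paper mentioned in this post.

```ruby
# Z_2 = ABA with nonempty A and B, matched anywhere inside the word.
Z2 = /(.+).+\1/

# All binary words of length n that contain no substring matching ABA.
def avoiders(n)
  %w[0 1].repeated_permutation(n).map(&:join).reject { |word| word.match?(Z2) }
end

(1..5).each { |n| puts "length #{n}: #{avoiders(n).inspect}" }
```

    Running this shows two avoiding words of length 4 (0011 and 1100) and none of length 5, so every binary word of length 5 contains the pattern ABA.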

    Why 29? Because all binary words of length 29 contain the pattern \(Z_3 = ABACABA\). However, Joshua Cooper and Danny Rorabaugh’s paper provides 48 words of length 28 that avoid that pattern (these and their reversals):



    The Zimin Word \(Z_2 = ABA\) and Bifixes

    The number of words of length \(n\) that match the pattern ABA is equal to the number of words that begin with an odd-length palindrome. Analogously, the number of words with a bifix is equal to the number of words that begin with an even-length palindrome. The number of words matching ABA and the number of words with a bifix agree when \(n\) is odd.
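
    Here is a small brute-force Ruby sketch of these three counts over the binary alphabet. The names are hypothetical, "matches ABA" is read as "the whole word equals ABA for some nonempty A and B", and odd-length palindromes are taken to have length at least 3.

```ruby
# Whole word has the form ABA with nonempty A and B.
ABA = /\A(.+).+\1\z/

def binary_strings(n)
  %w[0 1].repeated_permutation(n).map(&:join)
end

# True when the word starts with a palindrome of odd length >= 3.
def odd_palindrome_prefix?(word)
  (3..word.length).step(2).any? { |len| word[0, len] == word[0, len].reverse }
end

# True when some proper nonempty prefix equals the suffix of the same length.
def bifix?(word)
  (1...word.length).any? { |len| word[0, len] == word[-len, len] }
end

def aba_count(n)
  binary_strings(n).count { |word| word.match?(ABA) }
end

def odd_palindrome_count(n)
  binary_strings(n).count { |word| odd_palindrome_prefix?(word) }
end

def bifix_count(n)
  binary_strings(n).count { |word| bifix?(word) }
end

(3..9).each do |n|
  puts "n=#{n}: ABA #{aba_count(n)}, odd palindrome #{odd_palindrome_count(n)}, bifix #{bifix_count(n)}"
end
```

    For even \(n\) the ABA and bifix columns differ, because a word of the form \(AA\) has a bifix but leaves no room for a nonempty \(B\).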

    I’ve added OEIS sequences A342510–A342512, which relate to how numbers viewed as binary strings avoid (or fail to avoid) Zimin words. I asked users to implement this on Code Golf Stack Exchange.

  • My Favorite Sequences: A263135

    This is the fourth installment of My Favorite Sequences. This post discusses sequence A263135, which counts penny-to-penny connections among \(n\) pennies on the vertices of a hexagonal grid. I published this sequence in October 2015 when I was thinking about hexagonal-grid analogs to the “Not Equal” grid. The square-grid analog of this sequence is A123663.

    A263135: Placing Pennies

    The sequences A047932 and A263135 are about placing pennies on a hexagonal grid in a way that maximizes the number of penny-to-penny contacts, which occurs when you place the pennies in a spiral. A047932 counts the contacts when the pennies are placed on the faces of the grid; A263135 counts the contacts when the pennies are placed on the vertices.

    Pattern of placing pennies in A047932.
    Pattern of placing pennies in A263135.

    While spiral shapes maximize the number of penny-to-penny contacts, there are sometimes non-spiral shapes that have the same number of contacts. For example, in the case of the square grid, there are \(A100092(n)\) ways to lay down \(n\) pennies with the maximum number of connections. Problem 108 in my Open Problems Collection asks about generalizing this OEIS sequence to other settings such as the hexagonal grid.

    A047932(11) = 21
    A263135(22) = 27

    Comparing contacts

    Notice that the “face” pennies in A047932 can have a maximum of six neighbors, while the “vertex” pennies in A263135 can have a maximum of three. In the limit, most pennies are “interior” pennies with the maximum number of contacts, so \(A047932(n) \sim 3n\) and \(A263135(n) \sim \frac32n\).

    Looking at the comparative growth rates, it is natural to ask how the number of connections of \(n\) face pennies compares to the number of connections of \(2n\) vertex pennies. In October 2015 I made a conjecture on the OEIS that this difference grew like sequence A216256.

    Conjecture: For \(n > 0\), \[A263135(2n) - A047932(n) = \lceil\sqrt{3n - 3/4} - 1/2\rceil = A216256(n).\]

    I believe that the sequence A216256 on the right-hand side is the same as the sequence “\(n\) appears \(\displaystyle\left\lfloor \frac{2n+1}{3} \right\rfloor\) times,” but I’d have to crack open my Concrete Mathematics book to prove it.
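
    Short of a proof, the two descriptions can at least be compared numerically. The Ruby sketch below is an illustration with made-up method names. To avoid floating-point rounding it rewrites \(\lceil\sqrt{3n - 3/4} - 1/2\rceil\) as the smallest integer \(m \geq 0\) with \(m^2 + m + 1 \geq 3n\), which follows from squaring \(m + 1/2 \geq \sqrt{3n - 3/4}\).

```ruby
# ceil(sqrt(3n - 3/4) - 1/2), computed in exact integer arithmetic as the
# smallest m >= 0 with m^2 + m + 1 >= 3n.
def ceiling_formula(n)
  m = 0
  m += 1 while m * m + m + 1 < 3 * n
  m
end

# First `count` terms of the sequence in which m appears floor((2m + 1) / 3) times.
def run_length_sequence(count)
  terms = []
  m = 1
  while terms.length < count
    ((2 * m + 1) / 3).times { terms << m }
    m += 1
  end
  terms.first(count)
end

limit = 1000
formula_terms = (1..limit).map { |n| ceiling_formula(n) }
puts formula_terms.first(11).inspect
puts formula_terms == run_length_sequence(limit)
```

    Agreement on the first 1000 terms is evidence rather than a proof, and it says nothing about the conjecture relating A263135 and A047932 itself.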

    This is Problem 20 in my Open Problems Collection, and I’ve placed a small $5 bounty on this conjecture, so if you have an idea of how to prove it, let me know in exchange for a latte! I’ve asked about this in my Math Stack Exchange question “Circle-to-circle contacts on the hexagonal grid”, so feel free to answer there or let me know on Twitter, @PeterKagey.

  • My Favorite Sequences: “Not Equal” Grid

    This is the third installment in a recurring series, My Favorite Sequences. This post discusses OEIS sequence A278299, a sequence that took over two years to compute enough terms to add to the OEIS with confidence that it was distinct.

    This sequence is discussed in Problem #23 of my Open Problems Collection, which asks for the smallest polyomino (by number of cells) whose cells you can color with \(n\) different colors such that any two different colors are adjacent somewhere in the polyomino. As illustrated below, when there are \(n=5\) colors (say, green, brown, blue, purple, and magenta) there is a \(13\)-cell polyomino which has a green cell adjacent to a blue cell and a purple cell adjacent to a brown cell and so on for every color combination. This is the smallest polyomino with the \(5\)-coloring property.

    Five colors of blocks, where any two different colors of blocks are adjacent somewhere in the polyomino.

    The Genesis: Unequal Chains

    The summer after my third undergraduate year, I decided to switch my major to Math and still try to graduate on time. Due to degree requirements, I had to go back and take some lower-division classes that I was a bit over-prepared for. One of these classes, and surely my favorite, was Bill Bogley’s linear algebra class, where I half-way paid attention and half-way mused about other things.

    Bill wrote something simple on the board that sparked inspiration for me: $$a \neq b \neq c \neq a.$$ He wrote this to indicate that \(a\), \(b\), and \(c\) were all distinct, and this got me thinking: if we have to write a string of four variables in order to say that three variables are distinct, how many would we have to write down to say that four variables were distinct? It turns out that \(8\) will do the trick, with one redundancy: $$a\neq b \neq c \neq d \neq b \color{red}{\neq} c \neq a \neq d.$$ Five variables? \(11\): $$a_1 \neq a_2 \neq a_3 \neq a_4 \neq a_5 \neq a_3 \neq a_1 \neq a_4 \neq a_2 \neq a_5 \neq a_1.$$ What about \(n\) variables?

    My colleague and the then-President of the OSU Math Club, Tommy Pitts, made quick work of this problem. He pointed out that “not equal” is a symmetric, non-transitive, non-reflexive relation. This means that we can model this with a complete graph on \(n\) vertices, where each edge is a relation. Then the number of variables needed in the expression is the number of edges in the complete graph, plus the minimum number of Eulerian paths that we can split the graph into. Searching for this in the OEIS yields sequence A053439. $$A053439^*(n) = \begin{cases} \binom{n}{2} + 1 & n \text{ is odd} \\ \binom{n}{2} + \frac n 2 & n \text{ is even}\end{cases}$$
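
    To make the counting concrete, here is a minimal Ruby sketch (with hypothetical method names) that evaluates the formula above and checks whether a given chain of "not equal" signs really relates every pair of its variables.

```ruby
# Number of variables needed in a chain on n distinct variables,
# following the case formula above.
def chain_length(n)
  pairs = n * (n - 1) / 2
  n.odd? ? pairs + 1 : pairs + n / 2
end

# True when every pair of distinct variables in the chain appears
# side by side somewhere in it.
def all_pairs_adjacent?(chain)
  needed = chain.uniq.combination(2).map(&:sort)
  seen = chain.each_cons(2).map(&:sort)
  (needed - seen).empty?
end

three = %w[a b c a]
four  = %w[a b c d b c a d]
five  = [1, 2, 3, 4, 5, 3, 1, 4, 2, 5, 1]

[three, four, five].each do |chain|
  n = chain.uniq.length
  puts "n=#{n}: length #{chain.length}, formula #{chain_length(n)}, valid #{all_pairs_adjacent?(chain)}"
end
```

    The checker only verifies that a chain is valid; it does not show that no shorter chain exists, which is what the Eulerian path argument provides.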

    A Generalization: Unequal Chainmail

    This was around 2014, at which time I was writing letters to my friend Alec Jones whenever I—rather frequently!—stumbled upon a new math problem that interested me. In the exchange of letters, he suggested a 2D version of this puzzle. Write the \(n\) variables in the square grid, and say that two variables are unequal if they’re adjacent.

    While Tommy solved the 1D version of the problem quickly, the 2D version was much more stubborn! However, we were able to make some progress. We found some upper bounds (e.g. the 1D solution) and some lower bounds, and we were able to prove that some small configurations were optimal. Finally, in November 2016, we had ten terms: enough to prove that this sequence was not in the OEIS. We added it as A278299.

    \(a(n)\) is the tile count of the smallest polyomino with an \(n\)-coloring such that every color is adjacent to every other distinct color at least once.

    OEIS sequence A278299.

    (In May 2019, Alec’s student Ryan Lee found the \(11\)th term: \(A278299(11) = 34\). \(A278299(12)\) is still unknown.)

    A screenshot from my game illustrating the largest known term: \(A278299(14) = 56\). Every number is connected to every other number. The red edges refer to redundant connections.

    We found these terms by establishing some lower bounds (as explained below) and then implementing a JavaScript game (which you can play here) with a Ruby on Rails backend to allow people to submit their hand-crafted attempts. Each solution was a constructive proof of an upper bound, so when a user submitted a solution that matched the lower bound, we were able to confirm that term of the sequence.

    (One heuristic for making minimal configurations is to start with the construction in OEIS sequence A260643 and add cells as necessary in an ad hoc fashion.)

    Lower bounds

    There are a few different ways of proving lower bounds.

    • We know that there need to be at least \(\binom{n}{2}\) relations, one between each pair of variables. OEIS sequence A123663 gives the “number of shared edges in a spiral of n unit squares,” so a solution needs at least as many cells as the smallest spiral with \(\binom{n}{2}\) shared edges, which gives the lower bound $$A039823(n) = \left\lceil \frac{n^2+n+2}{4}\right\rceil$$
    • Every number needs to be in contact with at least \(n-1\) other numbers, and each occurrence can be in contact with at most \(4\) others. So each number needs to occur at least \(\lceil \frac{n-1}{4}\rceil\) times, for a total of \(n\lceil \frac{n-1}{4}\rceil\) occurrences. This bound is usually weaker than the above bound.
    • For the cases of \(n = 5\) and \(n=9\), the lower bounds were proved using ad hoc methods, by looking at how many cells would need to have a given number of neighbors.
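
    The first two bounds are easy to tabulate. The Ruby sketch below is only illustrative: the method names are mine, and it assumes the closed form \(2m - \lceil 2\sqrt{m}\rceil\) for the number of shared edges in a spiral of \(m\) squares, which is the formula I recall from the A123663 entry and should be checked there.

```ruby
# Assumed closed form for A123663: shared edges in a spiral of m unit squares,
# 2m - ceil(2 * sqrt(m)), computed with integer arithmetic.
def spiral_shared_edges(m)
  s = Integer.sqrt(4 * m)
  s += 1 if s * s < 4 * m # s is now ceil(2 * sqrt(m))
  2 * m - s
end

# Smallest number of cells whose spiral has at least C(n, 2) shared edges.
def edge_bound(n)
  pairs = n * (n - 1) / 2
  m = 1
  m += 1 while spiral_shared_edges(m) < pairs
  m
end

# ceil((n^2 + n + 2) / 4), the closed form given in the first bullet.
def closed_form_bound(n)
  (n * n + n + 2 + 3) / 4
end

# n * ceil((n - 1) / 4), the bound from the second bullet.
def occurrence_bound(n)
  n * ((n - 1 + 3) / 4)
end

(2..12).each do |n|
  puts "n=#{n}: edge bound #{edge_bound(n)}, closed form #{closed_form_bound(n)}, occurrence bound #{occurrence_bound(n)}"
end
```

    For \(n = 11\) the edge bound is 34, which matches the value of \(A278299(11)\) mentioned above, so that term meets its lower bound exactly.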

    Upper Bounds

    Besides the upper bound that comes from the 1-dimensional version of the problem, the only upper bounds that I know of come from hand-crafted submissions to the JavaScript game on my website.

    Do you have any ideas for an explicit and efficient algorithm for constructing such solutions? If so, let me know on Twitter @PeterKagey.


    The lower and upper bounds show that this is asymptotically bounded between \(n^2/4\) and \(n^2/2\). It’s possible that this doesn’t have a limit at all, but it would be interesting to bound the liminf and limsup further. My intuition is that \(n^2/4\) is the right answer. Can you prove or disprove this?


    • We could play this game on the triangular grid, or in the 3-dimensional cubic grid. Do you have ideas of other graphs that you could do this on?
    • This game came from Tommy’s analysis of looking at “not equal to” as a symmetric, non-reflexive, non-transitive relation. Can you do a similar analysis on other kinds of relations?
    • Is there a good way of defining what it means for two solutions to be the same? For a given number of variables, how many essentially different solutions exist? (Related: Open problem #108.)
    • What if we think of left-right connections as being different from up-down connections, and want both? Or what if we want each variable \(x\) to be neighbors with another \(x\)?

    If you have ideas about these questions or any questions of your own, please share them with me by leaving a comment or letting me know on Twitter, @PeterKagey!