Talk:Thue–Morse sequence

A fact from Thue–Morse sequence appeared on Wikipedia's Main Page in the Did you know column on 28 February 2004. The text of the entry was as follows:

Did you know... that the Thue-Morse sequence has applications in number theory, combinatorics, differential geometry and chess?

A record of the entry may be seen at Wikipedia:Recent additions/2004/February.

Wikipedia

Unclear code, my implementation

To me, the pseudocode was not very clear, especially considering indexOfHighestOneBit was neither implemented nor explained.

Here is my implementation in Python, which uses the "When counting in binary, the digit sum modulo 2 is the Thue–Morse sequence" fact. itertools.count() just counts 0, 1, 2, 3... forever (this could also be written as a while loop to make it into pseudocode). bit_count() calculates the Hamming weight, and & 1 is like taking "modulo 2"/parity.

 import itertools
 def thue_morse():
   for n in itertools.count():
       yield n.bit_count() & 1
 for b in thue_morse():
     print(b, end=", ")

2605:A601:A621:3500:9C07:ED35:CE4A:8532 (talk) 04:50, 3 August 2023 (UTC)[reply]

(History section and John Colosimo)

Hi,

(That's my 1st wikipedia entry so - my apologies if something is wrong...)

In the 'History' section, user 'colosimo' states:

'Also, the sequence was rediscovered in 2000 by 8th grade student John Colosimo, who used it as the perfect order in which participants in turn-taking games could most fairly take turns. It especially applies in games where the players move along a path to reach a desired goal, eliminating any advantage potentially gained by either player by taking the first turn.'

An interesting application for the Thue-Morse Sequence. Congratulations. However, I searched the web for an evidence of this statement without finding one. There are may other examples of people who 'happened' to re-discovered the Thue-Morse Sequence, so I suggest either to give an evidence (publication) that John Colosimo indeed re-discovered the Thue-Morse Sequence, or to delete the section.

Best regards

--80.171.178.236 16:23, 20 December 2006 (UTC) Janosch[reply]

\lim _{n\rightarrow \infty }a(n)

is indeed balanced, though one must concede that most games consist of finite turns, and of those, most are predisposed to granting an unfair advantage to one side or the other, even when attempting to offset that advantage with Thue-Morse, due to the fact that the number of turns tends to most often fall in a fairly narrow band (which can be trivially proven by observing that in a game which typically finishes in a single turn, the first player will have a significant advantage!).

Even in situations where the number of turns is distributed in such a way that it offsets this advantage, many games still give undue benefit to the first player (for example, in Monopoly, if the best properties in the game were the light-blue properties—they aren't—which are available within a single move of the starting point, the first player would have the opportunity to land on such a property and claim it, whereas the second player would have a reduced chance to claim a light-blue property because one of the properties on which he might land has already been claimed). Nonetheless, the described reasoning is much the same as I employed when I discovered Thue-Morse at a very young age. It's worth noting that the first term of Thue-Morse must be discarded when you are attempting to use it to "balance" in this manner. Jouster (whisper) 18:56, 28 November 2007 (UTC)[reply]

I believe this sequence is well known to very many people having the slightest degree of Obsessive Compulsive Disorder. That is because it seems like the most "fair" sequence of things. For a person with OCD, it is often important that if he first turns his head to the right, then he must also turn it to the left. But because having the head turned first is more valuable than last, he must then turn the head to the left again. Then to the right, since the number must be equal. And then the generalization is easy to find.

"It especially applies in games where the players move along a path to reach a desired goal, eliminating any advantage potentially gained by either player by taking the first turn." - Provided that advantages are equally distributed throughout the game, and that the path is infinite... In most games, none of these premises apply, but certainly, in many games these can be taken as approximations. Then the Thue-Morse sequence may balance out most of the advantage. And even when the premises do not apply, the sequence may reduce the unfairness. Consider the situation in which the children are to divide into teams for an informal soccer match. First, the two top players are appointed for choosing. Then they choose one player at at time, following the Thue-Morse sequence. This will balance out part of the advantage of choosing first. Elias Hasle —Preceding unsigned comment added by 77.40.128.194 (talk) 09:47, 19 October 2009 (UTC)[reply]

I think my addition addresses several of the points made above. It basically provides the mathematical underpinning for the observations of "8th grade student John Colosimo." And it dates to my own OCD behavior governing the order in which I stepped over sidewalk cracks starting around age 9 (in 1959). Robert Richman 01:36, 16 January 2012 (UTC) — Preceding unsigned comment added by Rmrichman (talk • contribs)

Image:Morse-Thue sequence.gif

Just FYI.--80.136.165.54 22:06, 8 July 2007 (UTC)[reply]

BTW: "Since Thue published in German, his work was ignored at first" is nonsense, Einstein also published in German. At that time, you could not just ignore Hilbert's publications because they were in German, or Poincaré's because they were French. (Actually, there is still quite a lot of mathematics being published in French.)--80.136.184.224 09:09, 9 July 2007 (UTC)[reply]

Spaces

This article contains the worst explanation of a sequence I have ever seen. The beginning of the article shows the first few terms of the sequence with spaces in between some of the bits, making it seem like the first element is 0, the next is 10, etc. when in fact each element of the sequence is just one bit. I have no clue why someone thought this would make the definition comprehensible. I had to read the article several times to figure out what was meant. I demand this be rewritten.— Preceding unsigned comment added by 76.183.88.237 (talk) 04:36, 5 April 2012 (UTC)[reply]

The clue is in the words "binary sequence" in the first sentence, which links to bitstream. However, since using spaces to divide up the sequence into blocks seems to have confused you, I have removed them. Gandalf61 (talk) 08:18, 5 April 2012 (UTC)[reply]

Article conflates three separate things

The article uses the term "Thue-Morse sequence" to mean at least three distinct things:

a) the sequence of finite strings 0, 01, 0110, 01101001, 0110100110010110, . . .

b) the successive overlaying of each of those finite strings on the first half of the next, to make one (semi-)infinite string 0110100110010110 . . .

c) the sequence of binary digits occurring in b).

It would be a very good idea if these were disambiguated by being very clear in the article just which one of these is being referred to in any given statement.

The article with its current confusion is very sloppy and hence inappropriate for an encyclopedia article.

I hope someone more knowledgeable than I will fix this.Daqu (talk) 08:23, 7 July 2014 (UTC)[reply]

acoustic representation

acoustic representation of the Thue-Morse sequence as a series of square waves

i added this audio to the article which was removed with the comment "can't see that as being very useful, and it's somewhat unclear how the sequence was being converted to sound anyway". as for the usefulness, i personally wondered what it sounded like and i think it has some musical potential (though, maybe in a less raw form..). as for the second point: 0 is represented by silence, 1 by a square wave. the commons-description shows the method i used. --sofias. (talk) 21:22, 8 April 2019 (UTC)[reply]

From Wikipedia:Image use policy#Adding images to articles, "The purpose of an image is to increase readers' understanding of the article's subject matter, usually by directly depicting people, things, activities, and concepts described in the article. ...". First, I realize this is talking about images and not audio clips, but there doesn't seem to be the same level of policy/guideline detail set out for audio, maybe because it tends to serve a different purpose in articles. But this case isn't about adding a clip of a song to an article about a song, or something like that, so I think the point that the above quote is making is still worth applying. On a side note, without bothering to import this thing and actually look at the waveform, I can't be sure, but there should be periods of sound of equal length with those of silence, and to me, the tones sound longer than the silences. I could be wrong, but there's zero documentation for the utility you pointed to, and that's kind of a problem if it can't be verified. Anyway, that's all kind of beside the point, because even if it were correct, I can't see that it's useful to include. –Deacon Vorbis (carbon • videos) 22:04, 8 April 2019 (UTC)[reply]

Off-topic, but if you want to discuss here, please please have the common courtesy to use capitalization. This isn't just an informal chat room or something, and it does make reading a lot easier. Thanks, –Deacon Vorbis (carbon • videos) 22:11, 8 April 2019 (UTC)[reply]

slower version

I think it gives an impression about what an aperiodic yet highly regular sequence "feels" like, it's a sonification. The periods representing an isolated "1" are as long as the periods representing an isolated "0, the periods representing "11" and "00" are twice as long, accordingly. Perhaps the speed is too fast to pick this up clearly, i made a slower version. I think you are the first person who expressed this preference for capitalization to me. For me it is visual noise, but i try to remember your preference and hope i do it decently. --sofias. (talk) 23:43, 8 April 2019 (UTC)[reply]

Thue-Morse sequence curves

If $T_{k}=1$ : turn right 90°. Then go forward. Can anyone prove that the instructions I gave draw

?

A map of how frequently a point is visited is given here (larger dot = greater frequency)

— Preceding unsigned comment added by EZ132 (talk • contribs) 21:04, 14 August 2019 (UTC)[reply]

More than two? Code?

This article has info on extensions to sharing fairly between more than two and Python code. Is it worth including? --Paddy (talk) 12:07, 11 September 2019 (UTC)[reply]

That isn't an article so much as a blog post, and so it seems to fail to be a WP:RS. –Deacon Vorbis (carbon • videos) 12:28, 11 September 2019 (UTC)[reply]

renaming the page Prouhet–Thue–Morse

Prouhet was the first to study it. Shouldn't this page name rely on the full 3 names, i.e., Prouhet–Thue–Morse, as it is already the case in France ? ( cf fr.wikipedia ) — Preceding unsigned comment added by 2A01:E0A:225:9A60:80A3:4C2D:8276:F28 (talk) 12:00, 21 January 2022 (UTC)[reply]

Original research?

article might have original research, and I am actually concerned that the writing might be contrived to direct credit to particular researchers.Rich (talk) 21:46, 27 April 2022 (UTC)[reply]

thue_morse_bits() code

We can construct the sequence in forward rather than reverse order.

def thue_morse_bits(n):
    """Return a string of the first 2**n bits of the Thue-Morse sequence."""
    bits = 0
    for i in range(n):
        bits = ((bits + 1) << (1 << i)) - (bits + 1)
    return f"{bits:0{1<<n}b}"

[thue_morse_bits(n) for n in range(6)]

Should we do that? This makes use of the fact that bits can be complemented with (1 << (1 << i)) - 1 - bits. — $Q$ uantling (talk | contribs) 17:46, 25 September 2024 (UTC)[reply]

We should have at most one implementation and I lean towards having zero, especially as without sources it appears to be WP:OR. I think the other order is more natural so I would prefer not to replace it with this one. —David Eppstein (talk) 18:05, 25 September 2024 (UTC)[reply]

I think of it as WP:CALC rather than WP:OR, but without a good source we haven't verified that it is noteworthy, so we get to the same answer if we don't have consensus: don't do it. Let's see if I can find some worthy source. —

Q

uantling (talk | contribs) 18:13, 25 September 2024 (UTC)[reply]

First source I would search is Arndt's Matters Computational, already used for the other code. If a bit-hacking trick isn't in that source, it probably doesn't have a published source. —David Eppstein (talk) 19:56, 25 September 2024 (UTC)[reply]

Thank you, Arndt is a great addition to my knowledge base. I'll look through it.

Although not justifying our thue_morse_bits program itself, but addressing the reverse vs. forward list of bits: there's something called the parity number, which is a decimal point ("bit point"??) in front of the Thue–Morse sequence of bits, with sequence bits appearing left to right. So, a forward result returned by thue_morse_bits divided by an appropriate power of 2 is a rational approximation of that parity number. That's a small argument in favor of returning the forward pattern. —

Q

uantling (talk | contribs) 21:41, 25 September 2024 (UTC)[reply]

Okay, I admit that this is WP:OR or fails WP:NOTE, until we find some sort of source, but I have it down to four simple operations per iteration for the forward order.

def thue_morse_bits(n):
    """Return an int of the first 2**n bits of the Thue-Morse sequence."""
    bits = 0
    for i in range(n):
        bits = ~bits
        bits = bits - (bits << (1 << i))
    return bits

[f"{thue_morse_bits(n):0{1<<n}b}" for n in range(6)]

—

Q

uantling (talk | contribs) 17:29, 26 September 2024 (UTC)[reply]

I did not find a source. :-( Do we have consensus that this forward-string-producing approach is nonetheless okay under WP:CALC? @David Eppstein: Based upon the above conversation, I am thinking that we do not have consensus, but I don't want to go home without first double checking. —

Q

uantling (talk | contribs) 14:58, 4 October 2024 (UTC)[reply]

I think the code is interesting but that it is original research. WP:CALC is for simple numerical calculations like converting pounds to kilograms. It could reasonably apply to producing simple examples of the output of an algorithm. For instance, if we had a source for this algorithm, we could use CALC to justify saying what its output is for n=1,2,3,4. I do not think CALC applies to the description of algorithms themselves. —David Eppstein (talk) 20:54, 4 October 2024 (UTC)[reply]

The WP:CALC section includes 'Mathematical literacy may be necessary to follow a "routine" calculation, particularly for articles on mathematics or in the hard sciences.' which I take to mean that items that are easy for folks with some literacy to understand and verify qualify for this exception. If somehow that changes your "no" to a "yes", please let me know. But having already double checked, I will interpret your silence as a "no". Thank you —

Q

uantling (talk | contribs) 21:08, 4 October 2024 (UTC)[reply]

Is the "fast" implementation fast?

As far as I can tell, the given "fast" implementation isn't notably fast. The correct measure of the length $n$ of a binary sequence is the number of bits of that sequence. By this metric, the given implementation produces a sequence of length $n$ in $O(n^{2})$ time: the traversal implied by $n\oplus (n-1)$ (written as (n ^ (n - 1)) in the code) is $O(n)$ , not constant. The "doubling" Python implementation given later is also $O(n^{2})$ , I think, so is no slower.

Somebody check me on this, of course.

Given this, I suggest moving both implementations into a new Implementation section and calling it a day. Bart Massey (talk) 16:59, 31 March 2025 (UTC)[reply]

If you are measuring things in bit complexity, the "fast" implementation takes time Theta(n log n) while the "doubling" implementation takes time Theta(n^2). The point you are missing is that to generate the first n items in the sequence, the fast implementation uses numbers whose value is at most n while the doubling implementation uses numbers with n bits (value more like 2^n). —David Eppstein (talk) 17:18, 31 March 2025 (UTC)[reply]

I'm confused by this analysis: there's a Gauss's sum thing going on here, where

\sum _{i=1..n}{i}=O(n^{2})

. See also my benchmarking comment. Bart Massey (talk) 18:15, 31 March 2025 (UTC)[reply]

Sorry, misunderstood the "fast" code. The given function computes only a single bit of the sequence, so it looks like

O(n^{3})

to generate a sequence of length

n

using this method.

This is consistent with my benchmarking https://gist.github.com/BartMassey/c2680711842103faa0ae884aa06a5868 which shows about 10s for the "fast" Thue-Morse and negligible time for "doubling" Thue-Morse when n = 1M. That benchmark code has some small bug, but it is all I have time for now. Bart Massey (talk) 17:39, 31 March 2025 (UTC)[reply]

~~1M is in the range where the difference between log n word size for the "fast" method and n word size for the "doubling" method is insignificant. —David Eppstein (talk) 17:51, 31 March 2025 (UTC)~~[reply]

To produce the

n

-th bit requires a computation on

lg(n)

bits because that is how many bits it takes to represent the number

n

. I compute that the cumulative runtime to compute the first

n

bits is therefore

\sum O (lg n) = O (n lg n)

. —

Q

uantling (talk | contribs) 17:59, 31 March 2025 (UTC)[reply]

I think I missed something in my previous analysis. The "fast" method performs O(n) bit operations on words of O(log n) bits each, and the "doubling" method performs O(log n) bit operations on words of O(n) bits each. But the doubling method is actually only O(n) in bit complexity, not O(n log n), because the word size it needs increases in a geometric series and it only does O(1) bit operations on the largest word size.

Where these methods really differ is in space usage. The "fast" method also uses only a bounded number of machine words for space. The "doubling" method uses enough words to store n bits. My expectation would be that in a language where performance comparisons are meaningful (not Python) and for big enough n for memory hierarchy issues to matter, the "fast" method will be much faster because it does not need to access main memory to store all those bits. Try n=2^40 or so, implemented in C? It should be possible with the fast method but start bogging down with the doubling method —David Eppstein (talk) 18:29, 31 March 2025 (UTC)[reply]

The slow method — mod-2 counting the number of set bits in the binary expansion of

n

, for each

n

independently — is

O (n lg n)

time and

O (lg n)

space (to hold

n

), which is the same as the fast method (except for the multiplicative constant). I guess this just shows that big-

O

analysis can hide a lot. —

Q

uantling (talk | contribs) 18:59, 31 March 2025 (UTC)[reply]

This suggests that there is an O(n) bit complexity O(log n) space method: just do the naive binary increment algorithm and keep track of the parity of how many bits there are. But of course we would need published sources. —David Eppstein (talk) 19:06, 31 March 2025 (UTC)[reply]

In the discussion of complexity, it is important to keep in mind that

n

is conventionally the size of an integer in bits. So it is not possible to compute the

n

-element sequence in less space than

\Omega (n)

by any offline method.

An online method (that returns a single specified bit of the sequence) could conceivably return a single bit of the sequence in less space. The "fast" algorithm given is online, but I don't think it is especially space-efficient: computing bit

i

of the sequence requires

\theta (i)

space for its intermediate computation as far as I can tell. It also seems to require

\theta (i^{2})

time due to the for loop and exclusive-or.

The "fast" algorithm could be modified to be a

\theta (n^{2})

time

\theta (n)

space offline algorithm for computing a whole sequence of length

n

by replacing the yield with an adjustment of the value variable: I don't see how to do better.

The "doubling" method uses at most

O(n)

"extra" space and time: one computes to the next power of 2 and then truncates. Thus, doubling appears after some thought to be a

\theta (n\lg n)

time

\theta (n)

space offline algorithm: each doubling requires linear time in the size of the sequence to be doubled.

Let me check this reasoning with a colleague and get back to you all. Do take a look at my benchmark if you want to see how this plays out in Python.

Thanks much for the feedback! Bart Massey (talk) 21:00, 31 March 2025 (UTC)[reply]

@Bart Massey I am having trouble following your logic; it could be me or it could be your logic. I don't see any quadratic (or worse) runtime or space requirements.

If we have to have space to store all

n

output bits then storage will be

Ω(n)

bits. If we don't have to store them all simultaneously, but do have to store the input variable

n

then storage will be

Ω(lg n)

bits. If we don't have to pay to store the input, we might have to store a pointer into the input bit array as a loop variable, which would require

Ω(lg lg n)

bits — but maybe we can avoid the looping variable by simply marching along a Turing Machine tape.

Runtime for the computation of any one output bit (without relying on having computed others) is going to be

Ω(lg n)

operations on bits because every bit of the binary representation of

n

needs to be examined. However, if we are storing all outputs simultaneously then we can reuse what we have already computed, and double the length of the output in time

O (n)

; this results in a total runtime of

O (n) + O (n /2) + O (n /4) + O (n /8) + ... = O (n)

. —

Q

uantling (talk | contribs) 21:23, 31 March 2025 (UTC)[reply]

You are right that any offline algorithm must use

\Omega (n)

bits to store the output. Your analysis of the online problem also looks good. However, the given algorithm for that problem (the "fast" algorithm) does seem to me to use

\theta (i)

bits and

\theta (i^{2})

time to compute the

i

-th bit.

You are also right that I botched the runtime analysis of the "doubling" algorithm: your bound of

O(n)

time is correct (and I think tight). Thanks much for the more careful analysis. Bart Massey (talk) 04:46, 1 April 2025 (UTC)[reply]