An algorithm for shuffling playlists

written by Ruud van Asseldonk
published 22 July, 2023

It is a common observation that for shuffling playlists, humans don’t want an actual shuffle, because that might play the same artist twice in a row, which doesn’t feel random. This topic has been discussed by others, including famously by Spotify, which in turn references Martin Fiedler’s algorithm. However, neither algorithm is optimal, in the sense that they might play an artist consecutively in cases where this was possible to avoid. I’ve been thinking about this for Musium, the music player that I am building. In this post I want to outline an algorithm that is optimal in the above sense.

Notation

In this post a playlist is an ordered list of tracks. For example, AAABBC is a playlist with six tracks: three by artist A, then two by artist B, and then one by artist C. The shuffling algorithm returns a permutation of this list; it specifies the order of the artists. At this point we treat tracks by the same artist as indistinguishable. We’ll address that later.

Interleaving two artists

Suppose we have tracks from just two artists, A and B, where A has more tracks than B. Then we can divide A’s tracks into equally sized groups, and insert B’s tracks in between them. This way we never play B consecutively, we break up as many of A’s consecutive plays as we can, and we minimize the longest span of A’s in a row.

Let’s state that formally. We are going to shuffle a playlist with two artists, A with n tracks, and B with m tracks:

Partition on artist, so we have list x of length n and list y of length m, with n ≥ m.
Split x into m + 1 equal parts. If m + 1 does not divide n, the parts that get one more element can be selected randomly.
Interleave the m + 1 parts of x with y’s elements. If m = n, we just interleave the two lists (we can split x into at most m parts), but we can flip a coin about which list goes first.

The resulting permutation is optimal in the following sense:

Artist B will have no consecutive tracks, and if possible A will not either.
More generally, for any positive integer k, the number of occurrences of k consecutive tracks by the same artist is minimal.

We can apply this interleaving more generally, we don’t need to limit ourselves to inputs x and y that consist of only one artist each. The procedure can interleave any two lists x and y. We will need this later, so let’s call the procedure interleave. It interleaves the smaller list into the larger one.

Multiple artists

Now that we can shuffle two artists, we can extend the idea to more artists. Let’s look at an example first. Say we have four tracks by artist A, two by B, and one by C. Then the three optimal shuffles are ABABACA, ABACABA, and ACABABA. Among B and C we have some freedom, but we need all the B’s and C’s together to go in between the A’s. What we did is to first interleave B and C, and then we interleaved the result with A.

A promising approach then, is to incrementally build up the result from an empty list. At every step, we interleave an artist with the least number of tracks with the result, until we incorporated all artists. This works well in most cases, and I originally thought that this algorithm would produce optimal shuffles. However, when I set out to prove that, I discovered a counterexample.

Take 2 tracks of artist A, 4 tracks of B, and 4 tracks of C. If we interleave A and B first, then A partitions the B’s into three groups, two of size one, and one of size two. For example BBABAB. This list has length 6, so next we interleave the C’s into it, but the four C’s are not quite enough to only create groups of size 1, there will be one group of size 2, and it might contain a BB. We do have enough C’s to break up all the occurences of BB, but we need to be more careful about where we use them. We can fix the counterexample by interleaving B and C first, and then interleaving A into it. But it turns out that there exist counterexamples that cannot be fixed by interleaving in a different order. For example, 4, 8, and 10 tracks. No matter which two artists we interleave first, it is not possible for interleave to guarantee an optimal shuffle.

To tackle the counterexample we can define intersperse which merges two lists x and y of length n and m. It proceeds as follows:

Break x up into spans at every place where an artist plays twice.
While we have fewer than m + 1 spans, break up spans randomly until we have m + 1.
Interleave the spans with elements of y.

Note that in general, step 1 may produce more than m + 1 spans, and then the procedure doesn’t work. But we are only going to use intersperse in situations where this does not happen.

Incremental merging for optimal shuffles

Now we can put the pieces together and fix the issue that plagued our first attempt at incremental interleaving. Define the merge-shuffle procedure as follows:

Partition on artist, and order the partitions by ascending size. Break ties randomly.
Initialize r to an empty list.
Take the next partition and merge it into r. Say the partition to merge has length n, and r has length m. When n ≥ m, use interleave. When n < m, use intersperse.
Repeat until we merged all partitions, r is the result.

Why is intersperse safe to use when n < m? This happens because r can have at most n consecutive tracks, and that property holds at every step of the procedure. The proof below makes it a bit clearer why this happens, but for the purpose of implementing the algorithm, we can take it as a fact that intersperse will not fail.

This algorithm produces optimal shuffles, which we will formalize later. The only case where it places the same artist consecutively is when this is impossible to avoid, because we don’t have enough other tracks to break up the consecutive plays. This happens when the last step is an interleave and n > m + 1, for example in AABA.

Extension to albums

In a situation like AABA, we cannot avoid playing artist A multiple times in a row. But if two of A’s tracks are from a different album, then we can at least avoid playing tracks from the same album in a row. The process is the same as before, but instead of partitioning on artist we partition on album. The merge-shuffle preserves the relative order of tracks in the partitions that it merges, so to shuffle a playlist, we can use merge-shuffle at two levels:

Partition the playlist into albums.
Shuffle every album individually using a regular shuffle.
For every album artist, use merge-shuffle to merge albums into a partition per artist.
Use merge-shuffle to merge the artist partitions into a final shuffled playlist.

Optimality proof

Let’s first formalize what we are trying to prove.

Definition: For a positive integer k, the k-badness of a playlist is the number of times that the same artist occurs k times in a row. For example, the 2-badness of AAABBC is 3: AA occurs at index 0 and 1, and BB occurs at index 3.

Definition: Let x and y be permutations of the same playlist. We say that x is better than y if both of the following hold:

There exists a positive integer k for which x has a lower k-badness than y.
For every positive integer k, x has no greater k-badness than y.

This defines a partial order on playlists. Note that this is not a total order! For example, AAABAABAABAAB has a lower 3-badness than AAABAAABABABA, but a higher 2-badness.

Definition: A permutation of a playlist is called an optimal shuffle if there exists no better permutation. For example, AAABB is not an optimal shuffle, because its permutation ABBAA has a 2-badness of 2, lower than AAABB’s 2-badness of 3, and it also has a lower 3-badness. An example of a shuffle that is optimal is BBCB. It has a 2-badness of 1, and of its four permutations (CBBB, BCBB, BBCB, and BBBC), none achieve a lower 2-badness.

Theorem: The merge-shuffle algorithm returns optimal shuffles. Moreover, the 2-badness of the result is at most n – 1, where n is the track count for the artist that occurs most often.
Proof: The proof will be by induction on s, the number of artists.

Base case: With a single artist, there is nothing to interleave or intersperse, and the result has a 2-badness of n – 1, where n is the size of the input.

Induction step: Let the number of artists s be given. Assume that for fewer than s artists, our algorithm produces optimal shuffles. Say we have n tracks by A, and m tracks by different artists (not necessarily all distinct). Assume that no other artist occurs more than n times. Let x be an optimal shuffle of the m tracks with a 2-badness of at most n – 1. We can distinguish three cases:

When n > m, we interleave the tracks from x between the A’s. We break up the list of A’s into spans of size k = ⌈n / (m + 1)⌉ and possibly of size k – 1. It is not possible to build a permutation with lower k-badness without breaking the list into more than m + 1 spans. This also holds for the 2-badness, 3-badness, … up to k, so the permutation is optimal. Moreover, the 2-badness is less than n – 1: a list of all A’s would have 2-badness n – 1, and interleaving tracks from x can only reduce this further.
When n = m we can interleave A’s with tracks from x. The result has a k-badness of zero for all k ≥ 2, which is optimal.
When n < m, we use intersperse. The A’s are used in between spans, so there is no badness due to A. The 2-badness of x is at most n – 1, but we have n A’s, which is enough to eliminate all 2-badness entirely. Therefore the resulting shuffle is optimal.

This concludes the proof. ∎

Completeness

We now have an algorithm that generates optimal shuffles, for some very specific definition of optimal. We might call this soundness: the algorithm never generates shuffles that are not optimal. However, we haven’t shown what we might call completeness or surjectivity: for a given playlist, can the algorithm output every possible optimal shuffle? And if the answer is yes, do each of the shuffles have equal probability of being generated, or is the algorithm biased in some way?

It turns out that merge-shuffle is not complete. For example, for the input AAABBC it would only output ABACAB or its mirror image BACABA, but not ACABAB, even though all of them are optimal under the current definition.

More even shuffles

An arguably desirable property of shuffles, is to have an artist’s tracks occur roughly uniformly throughout the playlist. The current definition of optimal penalizes consecutive tracks by the same artist, but as long as that does not happen, it doesn’t care where in the playlist tracks occur. For instance, ABABCDCD and ABCDABCD are both optimal, even though artists are clustered in the former and spaced more evenly in the latter. To capture this, we might extend the definition of badness to negative integers, where the (–k)-badness of a playlist is the number of times that the complement of an artist occurs k times in a row.

Let’s look at an example. Consider A in ABABCDCD. We can highlight its complement as ABABCDCD, the complement of B as ABABCDCD, etc. These two complements have a 2-badness of 4 and 3 respectively, and by symmetry we also get 4 and 3 from the complements ABABCDCD and ABABCDCD of C and D. Therefore the (–2)-badness of ABABCDCD is 14. Through similar reasoning we can count that the (–3)-badness is 10, the (–4)-badness is 6, and the (–5)-badness is 2.

Now let’s look at ABCDABCD. Here the complement of A is ABCDABCD, the complement of B is ABCDABCD, the complement of C is ABCDABCD, and the complement of D is ABCDABCD. The (–2)-badness again comes out to 14, but the (–3)-badness is reduced to 6, and (–4)-badness does not occur at all. If we extend our definition of better and optimal permutations to include negative values of k, then ABCDABCD would be a better permutation than ABABCDCD.

Even though we saw already that our merge-shuffle avoids generating some uneven shuffles, it is by no means sound under this new extended definition of optimal. For instance, it would output permutations worse than ABCDABCD. To design an algorithm that generates optimal shuffles for the extended definition of optimal would be an interesting topic for future work. Especially if the algorithm can be made complete and unbiased, in the sense that it outputs all possible optimal shuffles with equal probability.

In practice

I implemented the merge-shuffle algorithm in Musium. It works reasonably well in practice, although I find the shuffles are sometimes lacking variation when I re-shuffle. This is no surprise because for many inputs, the order of the artists is deterministic. There are ways to alleviate this by sacrificing some evenness, for instance by allowing to split the larger list into only m parts rather than m + 1 when the smaller list has m tracks. I also wonder if it would help to always merge the two smallest lists instead of always merging into the intermediate result. Perhaps the result would no longer be optimal according to the definition in this post, but although this definition is good for nerd-sniping, it is not so obvious that it captures what humans care about well. Arriving at a good shuffling algorithm will require some more experimentation.

Aside from the algorithm itself, I implemented a fuzzer that searches for counterexamples to the proof in this post. After implementing the full algorithm outlined in this post, the fuzzer has not found any counterexamples. The value of this is twofold: it increases my confidence that the implementation is correct, but it also increases my confidence that the proof is correct.

Performance-wise the algorithm is fast enough to be a non-issue. Although the complexity in the number of track moves is worst-case quadratic (achieved when all artists are distinct, because we build n lists of size n/2 on average) in practice shuffling even thousands of tracks takes mere microseconds, so unless you try to shuffle your entire music library, this is unlikely to be a problem.

Conclusion

In this post I gave a new algorithm for shuffling playlists. This algorithm avoids playing the same artist twice in a row when this is possible to avoid, unlike other algorithms such as the one introduced by Spotify. I also proved this, which was a valuable exercise, because it uncovered a flaw in an earlier version of the algorithm. The further I got writing this post though, the clearer it became to me how ad-hoc the algorithm really is, and the more I used the shuffle myself, the less clear it became to me that the current definition of optimal is a good one. I could keep on refining things for a long time, but then I would never publish anything, so this is where I draw the line for now. The merge-shuffle works well enough in practice. Maybe I’ll revisit it some other time.