Optimal Team Size for an Exploration-Exploitation Problem


[Image: collaboration painting]

Not every team project is the same, but I want to argue that most of them have two phases. First comes a moment of creative insight that reframes the problem or opens a new path. After that, the team moves into execution: applying, testing, writing, and polishing, often in parallel.

Consider a high-risk, high-reward research effort in which everyone on the team must publish a paper, but no one can do so until someone unlocks the big idea that makes the project possible.

This two-stage structure mirrors the classic distinction in organizational learning: exploration versus exploitation.

Here I want to explore, with a simple mathematical model, the scaling properties of a task involving these two processes.

A Simple Model

Suppose you have a group of \( N \) researchers working together. The goal is for each of them to produce something, for example, a publication. However, this is a hard task: to produce a publication they need a good new idea. Any publication therefore comes only after a first phase of innovation, followed by putting the new idea or technique into practice. In other words, the team must first perform exploration and then exploitation. We assume that the exploration phase (innovation) is the most difficult part of the project, and that the exploitation phase (assembling results and publishing, for instance) can only start after a successful innovation.

This raises the question: what is the optimal group size for finishing all steps of the project?

Assuming that everyone in the group has the same basic skill, we can model the time until one person has the innovative breakthrough as an exponential random variable with rate \( 1/d_1 \) (mean \( d_1 \)), so the probability density for one researcher to innovate at time \( t \) is:

\[ f(t) = \frac{1}{d_1} e^{-t/d_1} \]

Similarly, we model the time it takes each researcher to produce their own publishable result after the innovation as an exponential random variable with rate \( 1/d_2 \):

\[ g(t) = \frac{1}{d_2} e^{-t/d_2} \]
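As a quick numerical sanity check, here is a minimal sketch in Python (assuming NumPy; the values \( d_1 = 100 \) and \( d_2 = 1 \) are purely illustrative) that draws samples from both distributions and confirms their means:

```python
import numpy as np

rng = np.random.default_rng(seed=0)

d1, d2 = 100.0, 1.0   # illustrative means: innovation is much slower than execution
n_samples = 100_000

# NumPy's exponential is parameterized by the mean (scale), matching d1 and d2
innovation_times = rng.exponential(scale=d1, size=n_samples)
execution_times = rng.exponential(scale=d2, size=n_samples)

print(innovation_times.mean())  # ~100, the mean of f(t)
print(execution_times.mean())   # ~1,   the mean of g(t)
```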

The expected total time to finish the project is the sum of two contributions: \( T_C \), the time until the first breakthrough, and \( T_L \), the time for the last researcher to finish the exploitation phase.

Therefore, \( T_C \) is the minimum of \( N \) random variables drawn from the PDF \( f(t) \): \( T_C = \min \{t_1, t_2, t_3, \ldots, t_N\} \).

The PDF \( f_{\min} \) for the minimum of \( N \) i.i.d. variables is given by

\[ f_{\min}(t) = N f(t) F(t)^{N-1} = \frac{N}{d_1} e^{-t/d_1} \left(e^{-t/d_1} \right)^{N-1} \]

where \( F(t) = \int_t^\infty f(t') dt' \) is the survival probability. The factor \( N \) accounts for the fact that any of the researchers can be the first, and the last term requires that the remaining \( N-1 \) random samples are all greater than \( t \).

Thus, the expected time until the first innovation is \( \mathbb{E}[T_C] = \int_0^\infty t' f_{\min}(t') \, dt' = \frac{d_1}{N} \): the minimum of \( N \) i.i.d. exponentials with mean \( d_1 \) is itself exponential, with mean \( d_1/N \).
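A quick Monte Carlo check of this result (a sketch with arbitrary illustrative parameters, not part of the original derivation):

```python
import numpy as np

rng = np.random.default_rng(seed=1)

d1, N = 100.0, 10
trials = 100_000

# Each row is one project: N researchers each drawing an innovation time from f(t).
# T_C is the time of the first breakthrough, i.e. the row-wise minimum.
t_c = rng.exponential(scale=d1, size=(trials, N)).min(axis=1)

print(t_c.mean())  # ~ d1 / N = 10
```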

Now, for \( T_L \) we need to wait for all of the remaining \( N-1 \) people to finish the task. We therefore have to calculate the maximum of \( N-1 \) random variables drawn from the PDF \( g(t) \): \( T_L = \max \{t_1, t_2, t_3, \ldots, t_{N-1}\} \).

The PDF for the maximum of \( N-1 \) i.i.d. exponential variables with mean \( d_2 \) is:

\[ f_{\max}(t) = (N-1) g(t) \big(1-G(t) \big)^{N-2} = \frac{N-1}{d_2} e^{-t/d_2} \left(1 - e^{-t/d_2} \right)^{N-2} \]

Here the logic is similar to before. You have the probability density of obtaining the value \( t \) once, multiplied by the probability that each of the remaining \( N-2 \) samples is smaller than \( t \), given by the term \( 1 - G(t) \) with \( G(t) = \int_t^\infty g(t') dt' \) the survival probability, and any of the \( N-1 \) researchers can be the last to finish.

Then the expected time for the last person to finish is

\[ \mathbb{E}[T_L(N)] = \int_0^\infty t \cdot f_{\max}(t) dt \approx d_2 (\ln(N - 1) + \gamma) \]

where \(\gamma\) is the Euler–Mascheroni constant.¹
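Both the exact harmonic-number expression from the footnote and the logarithmic approximation can be checked against simulation; the sketch below assumes NumPy and arbitrary illustrative parameters:

```python
import numpy as np

rng = np.random.default_rng(seed=2)

d2, N = 1.0, 50
trials = 100_000

# T_L is the row-wise maximum of N-1 exploitation times drawn from g(t)
t_l = rng.exponential(scale=d2, size=(trials, N - 1)).max(axis=1)

exact = d2 * sum(1.0 / i for i in range(1, N))   # d2 * H_{N-1} (footnote 1)
approx = d2 * (np.log(N - 1) + np.euler_gamma)   # d2 * (ln(N-1) + gamma)

print(t_l.mean(), exact, approx)  # all three agree to within a few percent
```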

So the expected total time is:

\[ \mathbb{E}[T_{\text{total}}(N)] \;=\; \underbrace{\frac{d_1}{N}}_{\text{faster with more people}}\; +\; \underbrace{d_2 \big(\ln(N - 1) + \gamma\big)}_{\text{slower with more people}} \]

The first term decreases with \(N\): more people speed up discovery. The second term grows (slowly) with \(N\): more papers must be finished, and the last one sets the final timeline. Their trade‑off creates a non‑monotonic dependence² on team size, implying an optimal \(N\) that balances faster discovery against longer execution tails. Setting the derivative with respect to \( N \) to zero gives \( d_1/N^2 = d_2/(N-1) \), so for large \( N \) the team size that minimizes the total time is \( N_{\text{opt}} \approx d_1/d_2 \), where we assumed that \( d_1 \gg d_2 \) since innovation/exploration is the most difficult process.
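To see the trade-off concretely, here is a short sketch that evaluates \( \mathbb{E}[T_{\text{total}}](N) \) over a range of team sizes and locates the minimum numerically (again with the illustrative values \( d_1 = 100 \), \( d_2 = 1 \)):

```python
import numpy as np

d1, d2 = 100.0, 1.0   # innovation is assumed much harder: d1 >> d2

N = np.arange(2, 1001)  # need at least 2 people for the N-1 terms to make sense
total = d1 / N + d2 * (np.log(N - 1) + np.euler_gamma)

n_opt = N[np.argmin(total)]
print(n_opt, d1 / d2)  # numeric optimum vs. the d1/d2 estimate
```

With these values the numerical optimum lands at \( N = 99 \), consistent with the \( d_1/d_2 = 100 \) estimate.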

The inspiration for this post actually comes from a different application: it is a result of two works on the folding of microstructures. If you are curious, you can check Communications Physics (2020) and The European Physical Journal E (2021).


¹ The exact solution is actually \(\mathbb{E}[T_L]=d_2\sum_{i=1}^{N-1}1/i\); the logarithmic approximation is valid for large \( N \). ↩︎

² Different models (\( f \) and \( g \)) will lead to different dependences on \( N \) for each process; however, it is possible to prove that the total time will always be non-monotonic. ↩︎

© 2025 Hygor Piaget