
\section{Expected Value of the Length of a Random Divisor}%29
\markboth{Articles}{Expected Value of the Length of a Random Divisor}

\vspace{4.2cm}

\subsection{Introduction and Preliminaries}
An understanding of sequences and skill in their manipulation are essential
to a practical and thorough knowledge of mathematics.
In this paper, we will explore a particular probabilistic sequence called the {\it random
divisor sequence}.

{\bf Definition 1.}
A random divisor sequence, or an RDS, is a sequence of positive
integers
$\{a_1,a_2,\ldots ,a_{k-1},a_k=1\}$
with first term $a_1$ and last term $a_k=1$ that satisfies the following conditions:

1. $a_1$ is a positive integer.

2. $a_{m+1}$ is a positive divisor of $a_m$ chosen uniformly at random (each
divisor equally likely) for every positive integer $m$, given that $a_m \ne 1$.

3. If $a_k=1$ for some positive integer $k$, then the RDS terminates
at $a_k$ (there are no terms after $a_k$).

This sequence, never previously formally defined, is a generalization of a
sequence introduced in USAMTS 2/3/17 [1] by Matthew Crawford.

{\bf Example 1.}
Possible RDSs include
$\{1\}$, $\{125,5,5,1\}$ and $\{24,24,6,2,1\}$.

It is clear from the definition of an RDS that, given a fixed first term, some
RDSs are more likely to occur than others.
For example, consider an RDS that begins with 2.
The term following a 2 has a $\dfrac{1}{2}$
chance of being 2 and
a $\dfrac{1}{2}$ chance of being 1.
Since any RDS stops when a term becomes 1, given the
fixed first term 2, the RDS $\{2,1\}$ has a $\dfrac{1}{2}$
probability of occurring, whereas
$\{2,2,2,2,2,2,2,1\}$ has a $\dfrac{1}{2^7}=\dfrac{1}{128}$
probability of occurrence.
So, among RDSs starting with 2, $\{2,1\}$ is more likely to occur than
$\{2,2,2,2,2,2,2,1\}$.
This understanding of probability is important in understanding the expected value
of the length of a sequence, given a fixed first element.
We will thus give three definitions that rigorously develop these concepts.
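Before formalizing, these probabilities can be checked empirically. The sketch below (Python, an illustration added here rather than part of the original article; the function name is ours) draws many RDSs starting at 2 and estimates the frequency of $\{2,1\}$ and the average length, which should be near $\dfrac{1}{2}$ and near $E(2)=3$, the value derived later in the paper.

```python
import random

def random_divisor_sequence(a1):
    """Draw one RDS: repeatedly replace the last term by a uniformly
    random positive divisor, stopping once a term equals 1."""
    seq = [a1]
    while seq[-1] != 1:
        n = seq[-1]
        divisors = [d for d in range(1, n + 1) if n % d == 0]
        seq.append(random.choice(divisors))  # each divisor equally likely
    return seq

random.seed(0)  # reproducible run
trials = [random_divisor_sequence(2) for _ in range(20000)]
frac_21 = sum(t == [2, 1] for t in trials) / len(trials)
avg_len = sum(len(t) for t in trials) / len(trials)
```

With 20{,}000 trials the estimates settle to within a few percent of the exact values.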

{\bf Definition 2.}
The probability of occurrence of an RDS $\{a_1,\ldots ,a_k=1\}$
is defined as the probability that, given a fixed first term $a_1$,
the RDS $\{a_1,\ldots ,a_k=1\}$
will appear.

{\bf Definition 3.}
The length of an RDS
$\{a_1,\ldots ,a_k=1\}$ is $k$,
the number of terms in the sequence.

{\bf Definition 4.}
Let $p_1,p_2,p_3,\ldots $ be the probabilities of occurrence of the possible RDSs
with fixed first term
$n$, with $L_1,L_2,L_3,\ldots $
being the corresponding lengths of those sequences.
$E(n)$ is then defined as the expected value of
the length of the sequence starting with $n$, that is:
$$E(n)=\sum_i L_ip_i.$$

We now present a recursive formula for finding $E(n)$ for any positive integer
$n$ greater than 1.
This recursive formula will then be used to derive a more
computationally efficient explicit formula for $E(p^\alpha )$,
where $p$ is a prime and $\alpha $ is a nonnegative integer.

\subsection{A General Recursive Algorithm for Finding $E(n)$}

Let us examine a random divisor sequence starting with first term $n$.
We now present a formula in Theorem 1 below that avoids the cumbersome definition
of $E(n)$ above and forms a clever way of writing $E(n)$ recursively.
This formula was discussed briefly in [2], but we will now give a more general form
of it.
Our proof of this formula below is also significantly more rigorous than
the proof found in [2].

{\bf Theorem 1.}
{\it Let $n$ be a positive integer greater than $1$.
Then
$$E(n)=1+\dfrac{1}{\tau(n)}\sum_{d|n}E(d),$$
where $\tau (n)$ represents the familiar number-theoretic function counting the
positive divisors of $n$, and the sum runs over the positive divisors $d$ of $n$.
}

{\bf Proof.}
Let
$1=d_1<d_2<\cdots <d_{\tau(n)-1}<d_{\tau(n)}=n$
be the $\tau(n)$ positive divisors of $n$ in increasing order,
with corresponding expected values
$E(d_1),E(d_2),\ldots ,E(d_{\tau(n)-1}),E(d_{\tau(n)})$.
Now after the first term $n$, there will be a second term $m$, since the first term is not 1 and
the RDS thus does not terminate at $n$ (this is why we require $n > 1$).
Let the RDSs beginning with the fixed first term $m$ have probabilities of occurrence
$q_1,q_2,q_3,\ldots $ with $J_1,J_2,J_3,\ldots $
being the corresponding lengths of these RDSs.
The corresponding RDSs beginning with the fixed first term $n$ and
fixed second term $m$ thus have probabilities of occurrence
$q_1,q_2,q_3,\ldots $ with $J_1+1$, $J_2+1$, $J_3+1,\ldots $
being the corresponding lengths of those sequences.
This is true because prepending the extra first term $n$ only
changes the length of the RDS, not its probability of occurrence,
which in our case depends solely on $m$.
So, by Definition 4,
$$E(m)=\sum_i J_iq_i.$$

We will now write $E(m)$ differently.
Since $m$ is one of $d_1,\ldots ,d_{\tau(n)}$,
each with probability
$\dfrac{1}{\tau(n)}$,
$E(m)$ is the average of
$E(d_1),\ldots ,E(d_{\tau(n)})$, or
$$E(m)=\dfrac{1}{\tau(n)}\sum_{i=1}^{\tau(n)}E(d_i)=\dfrac{1}{\tau(n)}\sum_{d|n}E(d).$$

The second equality holds since
$d_1,d_2,\ldots ,d_{\tau(n)-1},d_{\tau(n)}$
are the positive divisors of $n$.

Again, by Definition 4,
$$E(n)=\sum_i (J_i+1)q_i=\sum_i (J_iq_i+q_i)
=\sum_i J_iq_i+\sum_i q_i=E(m)+\sum_i q_i.$$

But
$$E(m)=\dfrac{1}{\tau(n)}\sum_{d|n}E(d)
\quad\mbox{and}\quad
\sum_i q_i=1,$$
because the probabilities of occurrence of all RDSs with first
term $m$ sum to 1.
Thus, substituting back into the expression for $E(n)$,
we have
$$E(n)=1+\dfrac{1}{\tau(n)}\sum_{d|n}E(d).$$

This is what we wanted to prove.
Using this formula, we can easily obtain $E(n)$ recursively.
To show this, we let the prime factorization of $n$ be
$n=\displaystyle\prod_{i=1}^k p_i^{\alpha _i}$,
with $p_1,p_2,\ldots ,p_k$ distinct primes and $\alpha _1,\alpha _2,\ldots ,\alpha _k$
positive integers.
Note that $E(1)=1$,
as an RDS terminates when it has a term equal to 1.
We can thus solve for
$E(p_i)$, $1\le i\le k$, since
$E(p_i)=1+\dfrac{1}{2}(E(p_i)+E(1))=1+\dfrac{1}{2}(E(p_i)+1)$,
which gives $E(p_i)=3$.
Knowing each $E(p_i)$, we can next solve for $E(d)$ for every positive divisor $d$
of $n$ that is a product of two primes, then three primes, and so on.
Continuing on in the same manner, we can find $E(n)$ recursively.
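This recursive procedure is easy to mechanize. The sketch below (Python with exact rational arithmetic; an illustration of ours, not code from the article) rearranges Theorem 1 so that $E(n)$ appears only once, $E(n)=\bigl(\tau(n)+\sum_{d|n,\,d<n}E(d)\bigr)/(\tau(n)-1)$, and memoizes the results.

```python
from fractions import Fraction
from functools import lru_cache

@lru_cache(maxsize=None)
def E(n):
    """Expected RDS length via Theorem 1, rearranged so E(n) occurs once:
        E(n) = (tau(n) + sum_{d|n, d<n} E(d)) / (tau(n) - 1)  for n > 1,
    with E(1) = 1.  Exact arithmetic via fractions.Fraction."""
    if n == 1:
        return Fraction(1)
    proper = [d for d in range(1, n) if n % d == 0]  # divisors d < n
    tau = len(proper) + 1                            # count n itself
    return (tau + sum(E(d) for d in proper)) / (tau - 1)
```

For instance, `E(75)` evaluates to `Fraction(121, 30)`, matching Example 2 below.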


We present the following example:

{\bf Example 2.}
Find $E(75)$.

{\bf Solution.}
$E(75) = 1 + \dfrac{1}{6}(E(1) + E(3) + E(5) + E(15) + E(25) + E(75))$.
\begin{align*}
& E(1) = 1\\
& E(3) = 1 + \dfrac{1}{2}(E(1) + E(3)) = 1 + \dfrac{1}{2}(E(3) + 1) \Rightarrow E(3) = 3\\
& E(5) = 1 + \dfrac{1}{2}(E(1) + E(5)) = 1 + \dfrac{1}{2}(E(5) + 1) \Rightarrow E(5) = 3\\
& E(15) = 1 + \dfrac{1}{4}(E(1) + E(3) + E(5) + E(15)) \Rightarrow E(15) = \dfrac{11}{3}\\
& E(25) = 1+ \dfrac{1}{3}(E(1)+E(5)+E(25)) = 1+ \dfrac{1}{3}(1+3+E(25)) \Rightarrow E(25) = \dfrac{7}{2}.
\end{align*}

Thus
\begin{align*}
E(75)
&= 1 +\dfrac{1}{6}(E(1) + E(3) + E(5) + E(15) + E(25) + E(75))\\
&=1+\dfrac{1}{6}\left(1+3+3+\dfrac{11}{3}+\dfrac{7}{2}+E(75)\right)
\Rightarrow E(75)=\dfrac{121}{30}.
\end{align*}

However, using the formula in Theorem 1 to find $E(n)$ is computationally
inefficient.
As it is recursive, we would have to find
$$E(d_2),E(d_3),\ldots ,E(d_{\tau(n)-1}),E(d_{\tau(n)})$$
in order to compute $E(n)$ (the value of $E(d_1)=E(1)=1$ is known already).
Thus, we must perform $\tau(n)-1$ computations to find $E(n)$.
To begin addressing this shortcoming, we develop a computationally efficient explicit
formula for the fundamental case $E(p^\alpha )$,
where $p$ is a prime and $\alpha $ is a nonnegative integer.

\subsection{An Explicit Formula for $E(p^\alpha )$}

We now concern ourselves with finding $E(n)$ for a most fundamental class of
numbers: powers of primes.
The following formerly unproven result is useful
for its computational efficiency.

{\bf Theorem 2.}
{\it Let $p$ be a prime and $\alpha $ be a nonnegative integer.
Then
$$E(p^\alpha )=
\left\{\begin{array}{lll}
1 & \mbox{if} & \alpha =0\\
2+\displaystyle\sum_{i=1}^\alpha \dfrac{1}{i} & \mbox{if} & \alpha >0.
\end{array}\right.$$

}

{\bf Proof.}
{\bf Case 1.}
$\alpha =0$.
If $\alpha =0$, then $E(p^\alpha )=E(p^0)=E(1)=1$
by the work above, which proves the first case.

{\bf Case 2.}
$\alpha =1$.
For $\alpha =1$,
we will apply our recursive formula for $E(n)$.
Note that $\tau(p)=2$, since $p$ has only the positive divisors 1 and $p$.

Thus,
$$E(p)=1+\dfrac{1}{2}(E(1)+E(p))\Rightarrow
\dfrac{E(p)}{2}=1+\dfrac{E(1)}{2}\Rightarrow
E(p)=2+E(1)=3=2+\sum_{i=1}^1 \dfrac{1}{i},$$
as desired.

{\bf Case 3.}
$\alpha >1$.
By the previous section, we know that
$$E(n)=1+\dfrac{1}{\tau(n)}\sum_{d|n}E(d).$$

Consider the number $p^\alpha $, with $p$ a prime and $\alpha $ a positive integer.
It has the positive divisors
$1,p,p^2,\ldots ,p^\alpha $, so $\tau(p^\alpha )=\alpha +1$.
Using our formula for $E(n)$ developed in Theorem 1, we see that, for a prime $p$ and
$\alpha >0$,
$$E(p^\alpha )=1+\dfrac{1}{\alpha +1}(1+E(p)+E(p^2)+\ldots +E(p^{\alpha -2})
+E(p^{\alpha -1})+E(p^\alpha )).$$

Separating out the $E(p^\alpha )$ term and solving for it, it becomes clear that
$$E(p^\alpha )=1+\dfrac{E(p^\alpha )}{\alpha +1}+\dfrac{1}{\alpha +1}
(1+E(p)+E(p^2)+\ldots +E(p^{\alpha -1}))$$
$$\dfrac{\alpha E(p^\alpha )}{\alpha +1}=1+\dfrac{1}{\alpha +1}(1+E(p)+E(p^2)+\ldots +
E(p^{\alpha -1}))$$
$$E(p^\alpha )=\dfrac{1}{\alpha }+1+\dfrac{1}{\alpha }(1+E(p)+E(p^2)+\ldots +E(p^{\alpha -1})).$$

But
$$E(p^{\alpha -1})=1+\dfrac{1}{\alpha }(1+E(p)+E(p^2)+\ldots +E(p^{\alpha -2})
+E(p^{\alpha -1})),$$
because the positive divisors of
$p^{\alpha -1}$ are $1,p,p^2,\ldots ,p^{\alpha -1}$ and $\tau(p^{\alpha -1})=\alpha $.
Substituting the value of $E(p^{\alpha -1})$ into our expression for $E(p^\alpha )$,
we see that for
$\alpha >1$, $E(p^\alpha )=E(p^{\alpha -1})+\dfrac{1}{\alpha }$.
This is clearly a cleaner recursion than the one previously given.
However, we still have not obtained an explicit formula.
But if we actually apply our recursive formula, we see the following pattern
for $\alpha >1$:
\begin{align*}
& E(p^2)=E(p)+\dfrac{1}{2}=3+\dfrac{1}{2}\\
& E(p^3)=E(p^2)+\dfrac{1}{3}=3+\dfrac{1}{2}+\dfrac{1}{3}\\
& E(p^4)=E(p^3)+\dfrac{1}{4}=3+\dfrac{1}{2}+\dfrac{1}{3}+\dfrac{1}{4}\\
& \qquad\vdots \\
& E(p^\alpha )=3+\displaystyle\sum_{i=2}^\alpha \dfrac{1}{i}=2+\displaystyle\sum_{i=1}^\alpha \dfrac{1}{i},
\mbox{ by shifting indices}.
\end{align*}

To rigorously prove this assertion of the value of $E(p^\alpha )$, we turn to induction.
The following lemma and its proof are essentially a formalization of the
pattern identified above.

{\bf Lemma 1.}
{\it If $E(p)=3$ and $E(p^\alpha )=E(p^{\alpha -1})+\dfrac{1}{\alpha }$, then
$$E(p^\alpha )=2+\sum_{i=1}^\alpha \dfrac{1}{i},$$
for $\alpha >1$.

}

{\bf Proof.}
For $\alpha =2$,
$E(p^2)=E(p)+\dfrac{1}{2}=3+\dfrac{1}{2}=2+\displaystyle\sum_{i=1}^2 \dfrac{1}{i}$,
as desired.

Now suppose that for
$\alpha -1>1$, $E(p^{\alpha -1})=2+\displaystyle\sum_{i=1}^{\alpha -1}\dfrac{1}{i}$.
Then
$$E(p^\alpha )=E(p^{\alpha -1})+\dfrac{1}{\alpha }
=2+\sum_{i=1}^{\alpha -1}\dfrac{1}{i}+\dfrac{1}{\alpha }
=2+\sum_{i=1}^{\alpha }\dfrac{1}{i},$$
and we are done.
Since Case 2 gives $E(p)=3$ and the recursion above gives
$E(p^\alpha )=E(p^{\alpha -1})+\dfrac{1}{\alpha }$, Lemma 1 also proves Case 3.
The completion of the proof of this final case also completes the proof of Theorem 2.
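Theorem 2 can be checked against the general recursion of Theorem 1. The sketch below (Python; our illustration, re-deriving the recursive solver so the snippet is self-contained) compares $2+H_\alpha$ with the value the recursion produces for several primes and exponents.

```python
from fractions import Fraction
from functools import lru_cache

@lru_cache(maxsize=None)
def E(n):
    """Recursive solver from Theorem 1, rearranged so E(n) occurs once."""
    if n == 1:
        return Fraction(1)
    proper = [d for d in range(1, n) if n % d == 0]
    tau = len(proper) + 1
    return (tau + sum(E(d) for d in proper)) / (tau - 1)

def E_prime_power(alpha):
    """Theorem 2: E(p^0) = 1 and E(p^alpha) = 2 + H_alpha for alpha > 0."""
    if alpha == 0:
        return Fraction(1)
    return 2 + sum(Fraction(1, i) for i in range(1, alpha + 1))
```

Note that the explicit value depends only on $\alpha$, not on the prime $p$, exactly as Theorem 2 asserts.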

It is interesting to note that as $\alpha $ increases without bound,
the difference between successive terms of the sequence $E(p^\alpha )$ monotonically
decreases, yet $E(p^\alpha )$ diverges.
This, of course, is a simple consequence
of applying the integral test to the harmonic series.
That
$E\left(\displaystyle\prod_{i=1}^k p_i^{\alpha _i}\right)$,
with $p_i$ prime and $\alpha _i$ a nonnegative integer for each $i$ such
that $1\le i\le k$, likewise diverges as $\alpha _i\to \infty $ can be seen as follows.

Fix a prime $p$ and define the map $\pi $ from integers to powers of $p$ by letting
$\pi (n)$ be the highest power of $p$ that divides $n$.
For a term $a_i$ in an RDS, $\pi (a_{i+1})$
will be a divisor of $\pi (a_i)$, and each divisor of $\pi (a_i)$ is equally likely to occur.
Thus applying $\pi $ term-by-term to an RDS for $a_1$ gives an RDS
for $\pi (a_1)$ (except that it will not necessarily be truncated at the first
occurrence of 1).
Thus
$E(n)\ge E(\pi (n))$.
Hence for $n=\displaystyle\prod_{i=1}^k p_i^{\alpha _i}$ we have
$$E\left(\prod_{i=1}^k p_i^{\alpha _i}\right)\ge \max_{1\le i\le k}E(p_i^{\alpha _i}),$$
which diverges.

Similar arguments to the ones given would show that for distinct primes
$p_1,\ldots ,p_r$, $r\ge 1$,
$$E(p_1p_2\cdots p_r)=1+\sum_{k=1}^r \dbinom{r}{k}\dfrac{(-1)^{k-1}2^k}{2^k-1}
=2+\sum_{m=1}^\infty \left(1-(1-2^{-m})^r\right).$$

Specifically, one can check that both sides of the equality satisfy the recursion
$f_0=1$ and
$$f_r=1+\dfrac{1}{2^r}\sum_{k=0}^r \dbinom{r}{k}f_k.$$
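Both the finite closed form and the recursion above are easy to verify numerically. The sketch below (Python; our illustration, with the Theorem 1 solver repeated so the snippet stands alone) computes the closed form, solves the recursion $f_r$ for its implicit $k=r$ term, and compares both against $E$ evaluated at products of distinct primes.

```python
from fractions import Fraction
from functools import lru_cache
from math import comb

@lru_cache(maxsize=None)
def E(n):
    """Recursive solver from Theorem 1, rearranged so E(n) occurs once."""
    if n == 1:
        return Fraction(1)
    proper = [d for d in range(1, n) if n % d == 0]
    tau = len(proper) + 1
    return (tau + sum(E(d) for d in proper)) / (tau - 1)

def E_squarefree(r):
    """Finite closed form for E(p_1 p_2 ... p_r), r distinct primes."""
    return 1 + sum(comb(r, k) * Fraction((-1) ** (k - 1) * 2 ** k, 2 ** k - 1)
                   for k in range(1, r + 1))

def f(r):
    """f_0 = 1, f_r = 1 + 2^{-r} sum_{k=0}^{r} C(r,k) f_k, solved for the
    f_r that appears on both sides (the k = r term)."""
    if r == 0:
        return Fraction(1)
    s = sum(comb(r, k) * f(k) for k in range(r))  # terms with k < r
    return (2 ** r + s) / (2 ** r - 1)
```

For example, $r=2$ gives $\frac{11}{3}$, agreeing with $E(15)$ from Example 2 (and with $E(6)$, since the value depends only on the number of distinct primes).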

More sophisticated arguments give for $n=\displaystyle\prod_{i=1}^r p_i^{\alpha _i}$
that
\begin{align*}
E(n)
& =1+\sum_{k=0}^\infty \left(1-\prod_{i=1}^r \sum_{s=0}^{\alpha _i}(-1)^s
\dbinom{\alpha _i}{s}(s+1)^{-k}\right)\\
& =1-\sum_{d|n,\,d\ne 1}\dfrac{\tau(d)}{\tau(d)-1}\prod_{i=1}^r
(-1)^{v_{p_i}(d)}
\dbinom{\alpha _i}{v_{p_i}(d)},
\end{align*}
where $v_p(m)$ is the number of times the prime $p$ divides $m$ and $\tau (m)$ is the
number of divisors of $m$.
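The second, finite closed form can be checked directly against the Theorem 1 recursion. The sketch below (Python; an illustration of ours, with hypothetical helper names such as `E_closed`) implements it with exact fractions.

```python
from fractions import Fraction
from functools import lru_cache
from math import comb

@lru_cache(maxsize=None)
def E(n):
    """Recursive solver from Theorem 1, rearranged so E(n) occurs once."""
    if n == 1:
        return Fraction(1)
    proper = [d for d in range(1, n) if n % d == 0]
    tau = len(proper) + 1
    return (tau + sum(E(d) for d in proper)) / (tau - 1)

def factor(n):
    """Trial-division factorization: returns {prime: exponent}."""
    out, p = {}, 2
    while p * p <= n:
        while n % p == 0:
            out[p] = out.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        out[n] = out.get(n, 0) + 1
    return out

def E_closed(n):
    """Finite closed form: E(n) = 1 - sum over divisors d != 1 of
    tau(d)/(tau(d)-1) * prod_i (-1)^{v_{p_i}(d)} C(alpha_i, v_{p_i}(d))."""
    alphas = factor(n)
    total = Fraction(0)
    for d in range(2, n + 1):
        if n % d:
            continue
        vd = factor(d)
        tau_d = 1
        for e in vd.values():
            tau_d *= e + 1
        term = Fraction(tau_d, tau_d - 1)
        for p, a in alphas.items():
            v = vd.get(p, 0)
            term *= (-1) ** v * comb(a, v)
        total += term
    return 1 - total
```

In particular, `E_closed(75)` again yields $\dfrac{121}{30}$, in agreement with Example 2.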

{\bf Proof.}
We first consider the case of an RDS beginning at
$a_1=p^\alpha $.
We want to derive a formula for the probability that
$a_k=p^\beta $.
Let $M$ be the $(\alpha +1)\times (\alpha +1)$
matrix with $(i,j)$-entry $1/(j+1)$ for $0\le i\le j\le \alpha $ and zero otherwise.
This is the transition matrix for the random walk corresponding to an RDS.
If $v_k$ is the column vector whose $\beta $-entry is the probability that
$a_k=p^\beta $, $0\le \beta \le \alpha $, then
$v_{k+1}=Mv_k$.
Hence $v_k=M^{k-1}v_1$.

Let $u_s$ be the column vector with $\beta $-entry $(-1)^{s-\beta }\dbinom{s}{\beta }$
(hence zero entries for $\beta >s$).
Then the $\beta $ entry of $Mu_s$ is
$$\sum_{r=\beta }^s \dfrac{(-1)^{s-r}}{r+1}\dbinom{s}{r}
=\dfrac{1}{s+1}\sum_{r=\beta }^s (-1)^{(s+1)-(r+1)}\dbinom{s+1}{r+1}
=\dfrac{(-1)^{s-\beta }}{s+1}\dbinom{s}{\beta }.$$

That is, $u_s$ is an eigenvector of $M$ with eigenvalue $1/(s+1)$.
Also note that the $\beta $ entry of $\displaystyle\sum_{s=0}^\alpha \dbinom{\alpha }{s}u_s$
is
$$\sum_{s=\beta }^\alpha (-1)^{s-\beta }\dbinom{\alpha }{s}\dbinom{s}{\beta }
=\dbinom{\alpha }{\beta }\sum_{s=\beta }^\alpha (-1)^{s-\beta }
\dbinom{\alpha -\beta }{s-\beta }
=\left\{\begin{array}{ll}
0, & \beta <\alpha \\
1, & \beta =\alpha .
\end{array}\right.$$

That is, $v_1=\displaystyle\sum_{s=0}^\alpha \dbinom{\alpha }{s}u_s$.
Hence we have
$$v_k=\sum_{s=0}^\alpha \dbinom{\alpha }{s}(s+1)^{1-k}u_s$$
and
$$\mathrm{Prob}(a_k=p^\beta )=\sum_{s=\beta }^\alpha (-1)^{s-\beta }
\dbinom{\alpha }{s}\dbinom{s}{\beta }(s+1)^{1-k}
=\dbinom{\alpha }{\beta }\sum_{s=\beta }^\alpha
\dfrac{(-1)^{s-\beta }}{(s+1)^{k-1}}\dbinom{\alpha -\beta }{s-\beta }.$$


One can avoid matrices and prove this formula by induction by writing down
fairly obvious recursions, but it is a little ugly.
In particular, let $X$ be the length of the RDS, that is, the
number of steps until the RDS terminates.
Then $X\le k$ if and only if $a_k=1=p^0$.
Hence
$$\mathrm{Prob}(X\le k)=\sum_{s=0}^\alpha \dfrac{(-1)^s}{(s+1)^{k-1}}\dbinom{\alpha }{s}.$$
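This distribution formula can be verified by iterating the transition matrix $M$ directly. The sketch below (Python with exact fractions; our illustration, function names hypothetical) computes $\mathrm{Prob}(a_k=1)$ both ways for an RDS starting at $p^\alpha$.

```python
from fractions import Fraction
from math import comb

def prob_X_le_k(alpha, k):
    """Prob(a_k = 1) for an RDS starting at p^alpha, by iterating the
    transition matrix M with M[i][j] = 1/(j+1) for 0 <= i <= j <= alpha."""
    v = [Fraction(0)] * (alpha + 1)   # v[b] = Prob(current term = p^b)
    v[alpha] = Fraction(1)            # a_1 = p^alpha
    for _ in range(k - 1):            # v_k = M^{k-1} v_1
        v = [sum((v[j] / (j + 1) for j in range(i, alpha + 1)), Fraction(0))
             for i in range(alpha + 1)]
    return v[0]

def prob_formula(alpha, k):
    """Closed form: sum_{s=0}^{alpha} (-1)^s C(alpha,s) / (s+1)^{k-1}."""
    return sum(Fraction((-1) ** s * comb(alpha, s), (s + 1) ** (k - 1))
               for s in range(alpha + 1))
```

For $\alpha=1$, $k=2$ both give $\dfrac{1}{2}$, the probability computed in the introduction for an RDS from 2 ending immediately.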

Recall that if $Y$ is any non-negative integer valued random variable, then
\begin{align*}
\mathbb{E}(Y)
& =\sum_{k=0}^\infty k\cdot \mathrm{Prob}(Y=k)
=\sum_{k=0}^\infty \sum_{m=1}^k \mathrm{Prob}(Y=k)\\
& =\sum_{m=1}^\infty \sum_{k=m}^\infty \mathrm{Prob}(Y=k)
=\sum_{m=1}^\infty \mathrm{Prob}(Y\ge m).
\end{align*}

Let $n=\displaystyle\prod_{i=1}^r p_i^{\alpha _i}$
and let $X$ be the number of steps until the RDS beginning
at $n$ terminates.
Let $X_i$ be the time until the RDS has no factor of $p_i$.
As described in the remark above, looking at only the factors of $p_i$ in an RDS beginning
at $n$ gives an RDS beginning at $p_i^{\alpha _i}$.
Thus the $X_i$ are independent and satisfy
$$\mathrm{Prob}(X_i\le k)=\sum_{s=0}^{\alpha _i}\dfrac{(-1)^s}{(s+1)^{k-1}}\dbinom{\alpha _i}{s}.$$

Since $X=\max\limits_i X_i$,
we have
$X\le k$ if and only if $X_i\le k$ for all $i$, and hence
$$\mathrm{Prob}(X\le k)=\prod_{i=1}^r \mathrm{Prob}(X_i\le k)
=\prod_{i=1}^r \sum_{s=0}^{\alpha _i}\dfrac{(-1)^s}{(s+1)^{k-1}}\dbinom{\alpha _i}{s}.$$

Hence (letting $m=k-2$)
\begin{align*}
E(n)=E(X)
& =\sum_{k=1}^\infty \mathrm{Prob}(X\ge k)=1+\sum_{k=2}^\infty (1-\mathrm{Prob}(X\le k-1))\\
& =1+\sum_{m=0}^\infty (1-\mathrm{Prob}(X\le m+1)).
\end{align*}


Plugging in the formula above for $\mathrm{Prob}(X\le m+1)$ gives the first formula.
The second formula is obtained from the first by expanding the product and summing
the resulting geometric series.

\subsection{Conclusion}
In this paper, we first formally defined the random divisor sequence.
We found a general formula that can be used to solve recursively for
$E(n)$ for integers $n > 1$.
Then we found an explicit formula for $E(p^\alpha )$, where
$p$ is a prime and $\alpha $ is a nonnegative integer, in terms of $\alpha $, greatly improving
the computational efficiency of finding $E(p^\alpha )$.
The techniques used in finding an explicit formula for $E(p^\alpha )$ might also help in
finding an explicit formula for
$E(n)=E\left(\displaystyle\prod_{i=1}^k p_i^{\alpha _i}\right)$,
where $\displaystyle\prod_{i=1}^k p_i^{\alpha _i}$
is the prime factorization of $n$,
in terms of
$\alpha _1,\alpha _2,\ldots ,\alpha _k$.
Such explicit formulas currently appear difficult to
generate because of the complexity of the recursion used in Theorem 1.

\section*{Bibliography}
\begin{itemize}
\item[{[1]}]
Crawford, Matthew (2005, December 11), United States of America
Mathematical Talent Search (USAMTS), Round 3 Problems, Year 17.

\item[{[2]}]
http://www.usamts.org/Tests/USAMTSProblems\_17\_3.pdf

\item[{[3]}]
N. Sato (2006, February 19), United States of America Mathematical
Talent Search, Solutions to Problem 2/3/17.
\end{itemize}

\bigskip
\hfill
{\Large Saurabh Pandey}

%%%%%%%

% Source: https://www.cnblogs.com/Eufisky/p/7802106.html