Applying Eigenvalues to the Fibonacci Problem

http://scottsievert.github.io/blog/2015/01/31/the-mysterious-eigenvalue/

The Fibonacci problem is a well-known mathematical problem that models population growth and was conceived in the 1200s. Leonardo of Pisa, aka Fibonacci, decided to use a recursive equation: x_n = x_{n-1} + x_{n-2} with the seed values x_0 = 0 and x_1 = 1. Implementing this recursive function is straightforward:1

def fib(n):
    if n==0: return 0
    if n==1: return 1
    else: return fib(n-1) + fib(n-2)

Since the Fibonacci sequence was conceived to model population growth, it seems like there should be a simple equation that grows roughly exponentially. Plus, this recursive implementation is expensive in both time and memory.2 The cost of this function doesn’t seem worthwhile. To see the surprising formula that we end up with, we need to define our Fibonacci problem in a matrix language.3

\begin{bmatrix} x_n \\ x_{n-1} \end{bmatrix} = \mathbf{x}_n = A \mathbf{x}_{n-1} = \begin{bmatrix} 1 & 1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} x_{n-1} \\ x_{n-2} \end{bmatrix}

Giving each of those matrices and vectors a name, and recognizing the fact that \mathbf{x}_{n-1} follows the same formula as \mathbf{x}_n, allows us to write

\mathbf{x}_n = A \mathbf{x}_{n-1} = A \cdot A \cdots A \cdot \mathbf{x}_0 = A^n \mathbf{x}_0

where we have used A^n to mean n matrix multiplications. The corresponding implementation looks something like this:

import numpy as np

def fib(n):
    A   = np.asmatrix('1 1; 1 0')
    x_0 = np.asmatrix('1; 0')    # [x_1; x_0]
    x_n = np.linalg.matrix_power(A, n).dot(x_0)
    return int(x_n[1, 0])
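To see why A^n \mathbf{x}_0 works, here is a quick step-by-step sanity check (a sketch, assuming numpy is installed): each multiplication by A advances the recurrence one step.

```python
import numpy as np

# A advances the state [x_n; x_{n-1}] to [x_{n+1}; x_n]
A = np.array([[1, 1], [1, 0]])
x = np.array([1, 0])   # [x_1; x_0]

fibs = [0, 1]
for _ in range(8):
    x = A @ x          # one multiplication = one step of the recurrence
    fibs.append(int(x[0]))

print(fibs)  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```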

While this isn’t recursive, there are still n−1 unnecessary matrix multiplications (fewer with repeated squaring, as numpy’s matrix_power does, but still matrix-sized work). These are expensive time-wise, and it seems like there should be a simple formula involving n. Since populations grow exponentially, we would expect this formula to involve scalars raised to the nth power. A simple equation like this could be implemented many times faster than the recursive implementation!

The trick to doing this rests on the mysterious and intimidating eigenvalues and eigenvectors. These are just a nice way to view the same data, but they have a lot of mystery behind them. Most simply, for a matrix A they obey the equation

Ax=λx

for different eigenvalues λ_i and eigenvectors \mathbf{x}_i. Through the way matrix multiplication is defined, we can represent all of these cases at once. This rests on the fact that multiplying on the right by the diagonal matrix Λ just scales each column \mathbf{x}_i by λ_i. The column-wise definition of matrix multiplication makes it clear that this represents every case where the equation above holds.

A \begin{bmatrix} \mathbf{x}_1 & \mathbf{x}_2 \end{bmatrix} = \begin{bmatrix} \mathbf{x}_1 & \mathbf{x}_2 \end{bmatrix} \begin{bmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{bmatrix}

Or compacting the vectors xi into a matrix called X and the diagonal matrix of λi’s into Λ, we find that

AX=XΛ
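For the Fibonacci matrix we can check this relation numerically (a sketch using numpy’s `eig`; note that numpy chooses the ordering and scaling of the eigenvector columns):

```python
import numpy as np

A = np.array([[1.0, 1.0], [1.0, 0.0]])
eigvals, X = np.linalg.eig(A)   # columns of X are the eigenvectors
Lam = np.diag(eigvals)          # Λ: eigenvalues on the diagonal

# A X = X Λ holds up to floating-point error
print(np.allclose(A @ X, X @ Lam))  # True
```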

Because the Fibonacci eigenvector matrix is invertible,4

A = X \Lambda X^{-1}

And then, because a matrix and its inverse cancel,

A^n = X \Lambda X^{-1} \cdot X \Lambda X^{-1} \cdots X \Lambda X^{-1} = X \Lambda^n X^{-1}

Λ^n is a simple computation because Λ is a diagonal matrix: every element on the diagonal is just raised to the nth power. That means the expensive matrix multiplication only happens twice now. This is a powerful speed boost, and we can calculate the result by substituting for A^n:

\mathbf{x}_n = X \Lambda^n X^{-1} \mathbf{x}_0
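The whole diagonalization pipeline can be sketched directly (assuming numpy; `fib_diag` is just an illustrative name, and `round` absorbs floating-point error):

```python
import numpy as np

A = np.array([[1.0, 1.0], [1.0, 0.0]])
eigvals, X = np.linalg.eig(A)
x0 = np.array([1.0, 0.0])            # [x_1; x_0]

def fib_diag(n):
    # x_n = X Λ^n X^{-1} x_0; Λ^n is just elementwise powers of the eigenvalues
    xn = X @ np.diag(eigvals**n) @ np.linalg.inv(X) @ x0
    return round(xn[1])

print([fib_diag(n) for n in range(10)])  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```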

For this Fibonacci matrix, we find that \Lambda = \mathrm{diag}\left(\frac{1+\sqrt{5}}{2}, \frac{1-\sqrt{5}}{2}\right) = \mathrm{diag}(\lambda_1, \lambda_2). We could define our Fibonacci function to carry out this matrix multiplication, but these matrices are simple: Λ is diagonal and \mathbf{x}_0 = [1; 0]. So, carrying out this fairly simple computation gives

x_n = \frac{1}{\sqrt{5}} \left( \lambda_1^n - \lambda_2^n \right) \approx \frac{1}{\sqrt{5}} \cdot 1.618034^n

We would not expect this equation to give an integer. It involves powers of two irrational numbers, a division by another irrational number, and even the golden ratio ϕ ≈ 1.618! However, it gives exactly the Fibonacci numbers – you can check yourself!

This means we can define our function rather simply:

from math import sqrt

def fib(n):
    lambda1 = (1 + sqrt(5))/2
    lambda2 = (1 - sqrt(5))/2
    # round away floating-point error; the exact value is an integer
    return round((lambda1**n - lambda2**n) / sqrt(5))

def fib_approx(n):
    # for practical range, percent error < 10^-6
    return 1.618034**n / sqrt(5)

As one would expect, this implementation is fast. We see speedups of roughly 1000× for n = 25: milliseconds vs. microseconds. This is fairly typical when mathematics is applied to a seemingly straightforward problem. There are often large benefits to making the implementation slightly more cryptic!
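A rough way to reproduce the comparison (a sketch; `fib_rec` and `fib_closed` are illustrative names, and the exact timings depend on your machine):

```python
import timeit
from math import sqrt

def fib_rec(n):
    if n < 2:
        return n
    return fib_rec(n - 1) + fib_rec(n - 2)

def fib_closed(n):
    l1, l2 = (1 + sqrt(5)) / 2, (1 - sqrt(5)) / 2
    return round((l1**n - l2**n) / sqrt(5))

t_rec = timeit.timeit(lambda: fib_rec(25), number=10)
t_closed = timeit.timeit(lambda: fib_closed(25), number=10)
print(f"recursive: {t_rec:.4f} s, closed form: {t_closed:.6f} s")
```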

I’ve found that mathematics5 becomes fascinating, especially in higher-level college courses, and can often yield surprising results. I mean, look at this blog post. We went from an expensive recursive equation to a simple and fast equation that only involves scalars. This derivation is one I enjoy, and I especially enjoy the simplicity of the final result. This is part of the reason why I’m going to grad school for highly mathematical signal processing. Real world benefits + neat theory = <3.

  1. The complete implementation can be found on Github.

  2. Yes, in some languages some compilers are smart enough to get rid of recursion for some functions.

  3. I’m assuming you have taken a course that deals with matrices.

  4. This happens when a matrix is diagonalizable.

  5. Not math. Courses beyond calculus deserve a different name.

 Jan 31st, 2015  math

