Friday, January 18, 2013

Relations on Pokemon

Pokémon are really interesting creatures. They gain experience, and then use this experience to evolve. Different precursor Pokémon evolve into different evolved forms. We can represent these relationships with an evolution diagram that connects precursor Pokémon to their evolutions with arrows. I like pictures, but I'm bad at art, so all pictures in this blog post are from the Bulbapedia.1


The diagram above shows that Growlithe evolves into Arcanine, and that Staryu involves Starmie. Diagrams like this make us believe that everything is right in the world; everyone has their proper place in the order of things. We could make our diagram bigger to involve more Pokémon  Ideally, if all of the Pokémon that can evolve were on the left of our diagram, and all the Pokémon that result from evolutions were on the right of our diagram, we might be able to make a diagram with all the Pokémon in the universe! In that case, every time we caught a Pokémon, we could look it up in our diagram. Precursor Pokémon would show up on the left side, so we could figure out exactly what they evolve into, and evolved Pokémon would up on the right side of our diagram, we could figure out exactly what they evolved from! Wouldn't this be exciting? Before we make the full diagram, though, let's try a few more smaller ones as practice. For simplicity, let's only consider Generation 1 Pokémon so we don't get overwhelmed.


Oh no! Something has gone terribly, terribly wrong. Our beautiful plan of figuring out exactly what each precursor Pokémon evolves into is ruined. When my Eevee was discussing its career plans with me, I showed it this diagram, and it got really confused. In fact, it hurt itself in its confusion! This made me very upset. Let's try again.


Oops! Looks like we messed up again. Nothing evolves into Aerodactyl. When my Aerodactyl saw me making this diagram, it realized that if it ever had children, they wouldn't be cute little precursor Pokémon like the little Pichu babies the neighbor's Raichu just had, but instead they would be just as bony and ugly as itself. Then it threw a tantrum and destroyed half of my kitchen. Now I have to eat take-out Weepinbell sprouts with synthetic Gloom sauce from Gary's Garden, a low-quality "health food" joint run by a former Pokémon trainer who has fallen on bad times. My stomach is sad.

It turns out that I should never have dreamed about a universal Pokémon evolution diagram with every precursor Pokémon pointing to a single evolved form. Let me explain why, so you never have to suffer through what I've suffered through.

First, let's introduce the concept of a binary relation. In mathematics, a binary relation is defined as a set of pairs. In this context, the word "set" also has a very specific meaning - it's a bag that contains certain things and only those things. A set is often described as a list of things enclosed in curly braces. For example, the empty set contains nothing between the braces; sometimes it's also represented by a special symbol:


{}. In the context of evolution in Pokémon, the binary relation we are interested in is pairs of Pokémon where the first Pokémon involves into the second. We can represent these as ordered pairs like the ones that we use to denote coordinates in a graph. Thus, the first diagram we drew contained the following pair:


The full meaning of the first diagram can be represented as a relation, or a set of these pairs:

As you can see, the pictures are much nicer. Set notation can get quite cumbersome. Nevertheless, it helps for some purposes, so let's also describe the second diagram in this fashion:


The problem that we noticed in the diagram is now visible in a different fashion: the name Eevee is repeated multiple times in the first elements of different pairs. In mathematical terms, the problem with this relation is that it is not a function. A function is a relation that maps every input to a single output. In a functional evolution relation, a precursor Pokémon can map to two evolutions only if those evolutions are also the exact same Pokémon. In other words, the same precursor Pokémon cannot map to any two different evolutions in a functional relation.

Maybe we can fix this by reversing the order of the diagram? Then, it would look like this:


This is a functional relation, because every evolution maps to exactly one precursor. However, it still confuses my Eevee, because it doesn't know which evolution it is the precursor of! What's the problem now? The relation now looks like this:


The problem now is that Eevee is repeated multiple times in the second elements of different pairs. In mathematical terms, this means that the relation is not injective. An injective precursor relation is one in which two evolved forms can map to the same precursor only if they are also the same Pokémon. In other words, no two different evolved forms can map to the same precursor.

Finally, we can represent the third diagram (with Aerodactyl) as a relation in set notation:


Although there is a Pokémon in the third diagram, there are no pairs of Pokémon, so the relation is equivalent to the empty set. That's not the reason that the diagram is a problem, though. If we added Aerodactyl to the first diagram, the new diagram would look like:


The set theory notation for this relation is still:


Now we see that the real problem with this relation is that it doesn't say anything at all about Aerodactyl, even though Aerodactyl is in the diagram. In mathematical terms, this means that the relation is not surjective. A surjective relation is one in which every evolved form of a Pokémon is contained in the map because it is paired with some precursor.

Now that we know about functionality, injectivity and surjectivity, we can see that the first diagram was a function that was both injective and surjective. These kinds of relations are called bijective functions, or bijections. It turns out that if a relation is a bijection, that means that the set of things on the left side of the relation diagram (the domain) is exactly the same size as the set of things on the right side (the range). This is true even for infinitely large sets: if we show that there exists a bijection between two infinitely large sets, then these sets are infinitely large with the same order of infinity. This means that there are some countably infinite sets, meaning that there exists a bijection between those sets and the set of natural, counting numbers, and some nondenumerably (uncountably) infinite sets, for which no such bijection exists. Mind-boggling, right? Anyway, we can't make a full Pokémon evolution diagram that has the properties we wanted at the beginning, because the evolution relation is not bijective on the set of all Pokémon, not even on just Generation 1 Pokémon. So don't even bother. You'll just end up eating take-out Weepinbell sprouts with synthetic Gloom sauce from Gary's Garden.

1. http://bulbapedia.bulbagarden.net/wiki/Main_Page

Thursday, January 17, 2013

Optimization Theory: A Primer (Part 3)


Sanity Check

In the previous installment of this multi-part blog post, we saw a lot of theoretical math with some fairly complicated notation. Although there were pictures and explanations to help make the math as intuitive as possible, there were so many steps involved that it is difficult to come away feeling comfortable with the powerful, abstract equations and identities that we just derived. Therefore, it is worthwhile to reassure ourselves that the math actually makes sense. In order to do this, we can solve a very simple toy problem that we already know the answer to: in a Euclidean geometry, what is the shortest distance between two points? The answer, of course, is a straight line: as illustrated in the picture below,12 travel along a straight line makes the distance traveled equivalent to the displacement between two points, whereas other paths add unnecessary travel distance.


Since we noted earlier that distance is the integral of arc length, we can use this problem to test the validity of the calculus of variations. In part 1, we showed:


So the shortest distance from A to B minimizes the integral:

Recall the Euler-Lagrange equation:

This equation describes an optimal solution q(x). In our work in part 1, we described a path using a depth variable y(x). To make our notation consistent, let us assume that y(x) = q(x), so y(x) now describes the depth profile of the shortest path. Then, we can restate the Euler-Lagrange equation in our context as:


In this case, although the integrand does not depend on x, it is actually easier to solve the Euler-Lagrange equation directly than it is to apply the Beltrami identity. Therefore, let us proceed directly:


By integrating both sides, we can show that:

is true for some constant C. We can multiply both sides by the radical term:

We can square both sides:

We can factor out the variable:

And then we can rearrange this expression to show that:

By squaring both sides earlier, we created a false solution, so only one of the two roots (positive or negative) is the correct slope. The sign of the slope depends on whether the starting point A is above or below the endpoint B. The exact value of the slope is, however, totally besides the point; we have no idea what the constant C is. All that matters is that we have expressed the slope of the curve, y', completely in terms of constants (1 and some unknown constant C), so the slope itself must be constant, and the optimal path must be a straight line. In terms of calculus, since our expression for y' consists solely of constants, y' does not depend on x:


Integrate both sides, and call the constant m:

Integrate both sides again, and call the constant b this time:

Thus our optimal curve is described by the classic slope-intercept form equation for a line in the Cartesian plane. By solving the Euler-Lagrange equation, we demonstrated that the shortest path between two points is a straight line. We already know this is true from basic principles of geometry, so we have verified that the calculus of variations produces a sensible answer to the question we asked. Now that we understand how the solution process works in practice for a simple problem, we can comfortably use the methods derived in the previous post to solve more difficult problems. Let's proceed to the step we've all been waiting for: the solution to the brachistochrone problem!

The Solution

Unlike in the toy problem above, where y(x) described the depth problem of the shortest path, we are now looking for the fastest path from A to B. Therefore, we have to incorporate the results from part 1 that we derived using physics. Recall our integral for travel time in the brachistochrone problem:


The integrand here is more complex, so it pays to apply the Beltrami identity:

Again, to make our notation consistent, let's rephrase this to describe the path in terms of the depth y:

The function of interest is the integrand:

We can apply the quotient rule and chain rule to differentiate F:

Rearrange a little:

Now we can cancel some of the radicals containing gravity:


Thus:

Factor out the radical containing gravity:

Manipulate the terms in parentheses to achieve a common denominator:

The y' terms in the numerator cancel:

Rearrange slightly:

Square both sides:

C is some arbitrary constant from the Beltrami identity, so the value of the constants is not relevant at this point. We are just trying to find some curve where the right side of the above equation is constant. Therefore, let us simplify the expression by  by introducing a new constant to consolidate all the old constants:


Rearrange again:

Combine terms in the radical:

Use separation of variables to solve for x:


Now we need to perform a complicated trigonometric substitution:

Apply a trigonometric identity to simplify the integration step:

Substituting these into our integral expression for x:

Cancel the constant k from inside the radical:

Apply another trigonometric identity:

Cancel the cosines and consolidate:

Reapply a familiar trigonometric identity:

Apply another trigonometric identity:

Substituting this into our integrand:

Thus we have produced a set of parametric equations:

We have solved the brachistochrone problem! Thanks to Caltech13 for notes on some of the less intuitive substitution steps. The above expressions for x and y are a complete description of the optimal path that the rollercoaster must take to get from A to B as quickly as possible using the acceleration of gravity. The only detail we ignored were the integration constants that would be determined in the context of a problem with specific numbers by the exact x and y coordinates of points A and B. The shape of the curve is the same, though; it's merely the offset that changes with these constants. Curiously, the shape described by these equations is equivalent to an upside-down rotation of a section of the path traced out by a point on the edge of a wheel as it rolls along a flat surface, also called a cycloid.14 See the picture below from BCIT:15


A great description of the origin of the brachistochrone problem as a challenge posed by Jakob Bernoulli, and its relation to the rivalries between great mathematicians of the era, can be found at the University of ST. Andrews Website.16

Acknowledgements

It turns out that optimization theory is useful for far more than just rocket science. In particular, it is heavily used in medical image analysis, especially of MRI images, in biomedical research. Special thanks to Professor Paul Yushkevich at the University of Pennsylvania for first introducing me to the field of optimization theory in this context.

15. http://commons.bcit.ca/math/entertainment/coaster/index.html
16. http://www-history.mcs.st-and.ac.uk/HistTopics/Brachistochrone.html