HAiKQDM:MCfGTtTQCaCQM postmortem

Last week I attended a conference at the Perimeter Institute for Theoretical Physics, titled “Hopf Algebras in Kitaev’s Quantum Double Models:  Mathematical Connections from Gauge Theory to Topological Quantum Computing and Categorical Quantum Mechanics”.  My flight from LAX (Los Angeles) to YYZ (Pearson-Toronto) was delayed due to a maintenance issue, leaving me sitting in the plane on the tarmac for about an hour and a half before takeoff, right close to midnight.  My flight to Oxford for the QPL conference (Quantum Physics and Logic) two years ago had a similarly long delay… Hmmmn.  Despite this conference being fairly small—40 people on the list of participants—there were several people present that I met at that QPL conference (Ross Duncan, Stefano Gogioso, and Pawel Sobocinski), and a couple more that I met while I was a grad student at Riverside (Tobias Fritz and Derek Wise).  It turns out Ross and Stefano are sharing an apartment on the floor above my room. Continue reading

We now interrupt your regularly scheduled radio silence…

It has been nearly two years since my last post here, and several things of note have occurred or are about to occur.  My Ph.D was conferred last December.  My dissertation, Categories in Control:  Applied PROPs, is available on the arXiv and via eScholarship.  In the meantime I have been teaching as adjunct faculty at Victor Valley College in the Mathematics and Physics departments, keeping an eye on conferences of interest and postdoc positions.

Since I last wrote, there are two new countries’ stamps on my passport, and in a few days I’ll finally be adding one from a country that shares a land border with the U.S.  Less cryptically, I will be spending next week in Canada, attending a conference on Hopf Algebras and other things in a title that is longer than this description at the Perimeter Institute.  I would like to take this opportunity to thank Tobias Fritz for bringing the conference to my attention and Daniel Gottesman for providing the Perimeter Institute invitation.  Saying the invitation covers practically all the expenses is putting it mildly.

I have been in courtship with a postdoc position since about the beginning of June with the Air Force Research Laboratory in Dayton, Ohio.  There is a close connection there with a group at Penn led by Daniel Koditschek.  I have been invited to give a talk and spend a couple of days at Penn in early September so we can get to know each other better, both personally and mathematically.

A little bit farther in the future, I am looking with interest at the upcoming AMS sectional meeting in November at UC Riverside and the upcoming Joint Math Meeting in January in San Diego.  I still need to submit an abstract for the Applied Category Theory session at the sectional meeting.  I plan to simply attend the JMM—it will be 14 years since the last time I went to a JMM, when it was last in Phoenix.  It will be nice to have a fresh look from a more mathematically mature perspective.

AMS talk 10/24/15

About two months ago I gave a talk at the AMS Fall Western Sectional Meeting.  While I could have safely rehashed my QPL talk, I decided to push forward instead.  That may or may not have been the best idea – the results were certainly less polished.  On the other hand, I was able to describe controllability and observability of a control system in terms of string diagrams.  This is something that was painfully missing from my QPL talk’s results in July.  Seeing the nontrivial constants in the string diagrams the quantum folk were using provided the key insight, and I wanted to capitalize on it as soon as possible.

The punchline of the AMS talk is that the duality between controllability and observability noticed by Kalman in the late 50s and early 60s can be expressed in terms of a PROP, which is a kind of symmetric monoidal category.  In particular, this PROP includes a subPROP of finite-dimensional vector spaces and linear relations, which is basically what Paweł Sobociński deals with here under the name of Interacting Hopf monoids.  Okay, so the actual punchline is that the duality Kalman noticed six and a half decades ago between controllability and observability? it’s simply time-reversed bizarro duality.

Bizarro is Sobociński’s term (seen in episode 7 of his blog), but I’m kind of partial to it.

Grading exams (a modest^H^H^H^Hck proposal)

Exams are some of the most time-consuming parts of being an instructor.  They have to be written, proctored, and finally graded.  Given a class of n students, it seems reasonable to expect that writing a free-response exam would be a O(1) time commitment, proctoring would also be a O(1) time commitment, while grading would be a O(n) time commitment.  Maybe grading speeds up as you get familiar with the particular set of mistakes your class makes, so I might believe something like O(\frac{n}{\log n}) instead of O(n).  In any case, asymptotically, grading takes a much greater share of time consumed than the other portions of the process combined.

With this in mind, I propose the following as a way of reducing the amount of necessary time spent in the grading process, while still giving a reasonable estimate of student understanding.  At least for a game theory class.

Game theory final exam (proposed)

(1) (100 points) Your score on this exam will be based entirely on this one question.  If x is your answer and no one in this class answers with a greater number, 100-x will be your score.  Otherwise, x will be your score.
x = _____

It is interesting to think about the possible strategies one may devise, which, for the student taking the exam, may depend on the class size and/or the student’s grade going into the final exam.  While it is fun to ponder, I will leave that as food for thought for the time being save that discussion for the comments.  Hopefully it will be clear that this isn’t entirely a serious proposal for an actual basis for grading a class, as forcing the entire class into a final exam that somewhat resembles the prisoner’s dilemma would be somewhat cruel.  There are some variations that may or may not be as evil:

Variation 1 (Unlimited collaboration)

The students are told in advance what the exam problem will be.  Thus, they are free to collaborate with as many of their classmates as they would like in order to come up with a strategy.  However, they cannot see what answers are actually submitted by their classmates.  This variation might be more evil.

Variation 2 (Limited collaboration)

The students are paired up and are allowed to discuss strategy only with their partner.  They may be able to see what their partner submits, but they have no information about what anyone else does.

Variation 3 (Multitrack)

The students are given the option to take one of two exams.  The first choice is the proposed exam above.  The wording could be modified slightly to restrict from the entire class to the subset of the class that chooses that choice.  The second choice is a standard final exam.  For the student who thinks, “I just need 50% on the exam to get the grade I want.”, it seems plausible for that student to take the first choice and write x=50 for a guaranteed score and a personal 3 hour savings.

Other thoughts

I thought about this game while grading some calc exams.  It turned out there were quite a few people who incorrectly solved the problem I was grading, yet according to my rubric, their score was exactly the answer they wrote down.  Purely a coincidence, but it sparked the thought that led me to this game.  I showed this game to a few people yesterday, and the first two variations directly came from some of those discussions.  One person I showed immediately declared, “I don’t want to play this game!”.  Between the game and the variations, the multitrack variation seems the most reasonable to even consider actually giving.  Concerning the title of this post – computer printouts (in some contexts) used to display ^H when the user hit the backspace key.  Thus, it is meant to indicate the previous characters are being stricken out.

QPL 2015

On July 17 I gave a talk at Oxford in the Quantum Physics and Logic 2015 conference.  It was recently released on the Oxford Quantum Group youtube channel.  There were a bunch of other really cool talks that week.  In my talk I refer to Pawel Sobocinski’s Graphical Linear Algebra tutorial on Monday and Tuesday, as well as Sean Tull’s talk on Categories of relations as models of quantum theory.  There are benefits to speaking near the end of the conference.

CSUSB colloquium

I’m giving a talk tomorrow afternoon at the CSUSB Mathematics department.  I will explain some of the stuff I’ve been working on, then shift gears to show how the general machinery can be applied to another mathematical landscape: knots.  Sticking purely with knots is a bit restrictive, so I almost immediately skip ahead to the open version of knots, i.e. tangles.  Luckily John Armstrong posted an article on arXiv in 2005, where he showed (among other very nice things) that tangle groups really are tangle invariants.

While I probably will not mention it tomorrow, another landscape that is being opened up (in the same sense that tangles open up knots) is Markov processes.  Blake Pollard is doing exciting work on this front, and the same general machinery I’ll be talking about should have some interesting things to say about Markov processes as well.

Associator

After more delays that I would like to admit, I finally got around to making this. Perhaps some explanation of what’s going on in the picture is in order. We begin with a category with (all) pushouts, starting with some objects and morphisms arranged as chained spans. A pushout is used to compose the two spans on the left. This gives a new span, L \rightarrow \Gamma'' \Gamma' \leftarrow J, which in turn can be composed with the span to its right via pushout. This is just one way that we can compose the chain of three spans. If we had built up from right to left instead of left to right, we would have \Gamma'' (\Gamma' \Gamma) at the top instead of (\Gamma'' \Gamma') \Gamma.

The associative property is really nice, so it would be good to show those two ways of composing spans are “the same” in some appropriate sense. Spoiler: that sense is “up to isomorphism.” Instead of directly finding them isomorphic to each other, it is easier to show each is isomorphic to some other thing – the colimit of the entire diagram we started with. This colimit will here be called an associator; it is something that both compositions of all three spans will be isomorphic to. I’m just going to sweep under the rug the question of whether that colimit will exist and just assume it does in this post.

So we’ve got the groundwork set, now we need to take advantage of some universal properties to give unique maps that make sub-diagrams commute. The first universal property taken advantage of is that of the first pushout. The associator has arrows going to it from \Gamma'' and \Gamma', so there will exist a unique morphism, \Gamma'' \Gamma' \rightarrow \Gamma'' \Gamma' \Gamma. such that everything commutes. Going up a step and using the arrow we just made, we can play the same game to get a unique morphism (\Gamma'' \Gamma') \Gamma \rightarrow \Gamma'' \Gamma' \Gamma that still makes everything commute.

To go in the reverse direction, we need morphisms from each of the Gammas to (\Gamma'' \Gamma') \Gamma. We already have one from \Gamma, and the other two can be built by composing morphisms. We can join two arrows together, tail to tip, to get a single arrow regardless of what category we are in. Now the universal property of the associator kicks in, giving a unique morphism \Gamma'' \Gamma' \Gamma \rightarrow (\Gamma'' \Gamma') \Gamma, but there’s a catch. We only know it makes some of the diagram commute. We have to put in a bit more effort to show the arrows involved in the pushouts will commute, too. The first step of this isn’t too bad — the three arrows that don’t point directly at \Gamma'' \Gamma' \Gamma will commute because the morphisms from \Gamma'' and \Gamma' to (\Gamma'' \Gamma') \Gamma commute with them by construction.

The universal pushout maps are a bit trickier, but since both legs of the pushouts commute with everything else, the universal maps will commute as well. Without the picture, that sentence might be a bit hard to understand. Look at the picture. They say those things are worth a thousand words. As parts of the diagram are shown to commute with the blue arrow on top, those part turn dark green. The arrows that already commute with everything get highlighted when they are used to show other arrows commute with everything. That should clarify what the picture is doing. Once everything commutes, the arrows turn black again, and we are left with unique maps in both directions at the top of the diagram that commute with each other. Long story short, this gives an isomorphism between the composition of spans by pushouts and the composition of spans “all at once” by associator. Watching the animation in a mirror gives the other way of composing spans by pushouts as isomorphic to this as well.

So this picture together with its mirror image shows (\Gamma'' \Gamma') \Gamma \simeq \Gamma'' (\Gamma' \Gamma), which is really the best we could possibly hope for when pushouts themselves are only unique up to isomorphism.

Classical Mechanics formulations (Part 2)

Last time I mentioned looking at two functions, M(p,\dot p) and J(\dot q,\dot p) that would ideally carry all the data needed to solve certain types of problems in classical mechanics. In the case of M, I conjectured that equations analogous to the Lagrange equations ought to exist and hold, namely,
\displaystyle \frac{d}{dt}\frac{\partial M}{\partial \dot p_i} = \frac{\partial M}{\partial p_i}.
If we got lucky, we could recover q and \dot q from those partials. While my approach on that front (prior to writing this post) has been… heuristic at best, I have found evidence that indicates M = L with the coordinates transformed. At least in some situations. In my attempt to present this evidence yesterday afternoon without complete notes, I was unable to recapture this evidence on the fly. There is probably a good lesson in that, even if nothing else is gained from this exercise.

In the comments for Part 1, Dr. Baez suggested looking at the one-dimensional Lagrangian L(q,\dot q) = \frac{1}{2}m\dot q\cdot\dot q - V(q), for which the equations of motion are well known. Acting on this suggestion over the weekend, during a bout of insomnia while running a fever, I wove together the evidence I found so elusive yesterday with a skein of half-baked yarn. I probably could mix more metaphors in there if I tried, but the point is, the following paragraph may need to be taken with a liberal grain of salt.

At first I was uncertain how to approach an ‘easy’ V(q), like a gravitational potential, so I took a simple spring potential, V(q) = \frac{1}{2}kq\cdot q. So that’s already a point in favor of feverish-insomnia me over at-the-whiteboard-without-notes me, who couldn’t even write down the potential correctly. The equation of motion is m\ddot q = -kq, from which we can find q(t) = A\sin(\sqrt{\frac{k}{m}}t+B). That can be simplified by choosing units where A=1 and a suitable time translation for which B=0, but I did not originally make those simplifications, so I won’t here. \dot q(t) = A\sqrt{\frac{k}{m}}\cos(\sqrt{\frac{k}{m}}t+B) = \frac{p}{m}, and \ddot q(t) = -A\frac{k}{m}\sin(\sqrt{\frac{k}{m}}t+B) = \frac{\dot p}{m}. Noticing those two outside equalities, \dot q = \frac{p}{m}, and \ddot q = \frac{\dot p}{m}, along with the equation of motion itself, m\ddot q = -kq, right off the bat would have saved me from having to do any of these contortions in t. There goes that point I gave to feverish-insomnia me.

In particular, L(q,\dot q) = \frac{1}{2}(m\dot q\cdot\dot q - kq\cdot q) can be rewritten by converting q to \ddot q via the equation of motion, and from there \dot q can be written in terms of p, so \ddot q can be written in terms of \dot p. Presto:
\displaystyle L = \frac{p\cdot p}{2m} - \frac{\dot p\cdot\dot p}{2k}
without any reference to t. But what happens when we treat this as M and write the Lagrange-analogy equations?
\displaystyle \frac{d}{dt}\frac{\partial L}{\partial \dot p} = \frac{d}{dt}\left(-\frac{\dot p}{k}\right) = \frac{d}{dt}(q), because -kq = m\ddot q = \dot p. Similarly,
\displaystyle \frac{\partial L}{\partial p} = \frac{p}{m} = \dot q,
and indeed, \frac{dq}{dt} = \dot q.

So that worked out for a simple spring. And it wasn’t just tenuous dreamstuff it was made of, after all. Bolstered by that success, I continued to the ‘simpler’ case of constant gravity. Here, V(q) = mgq, where g is the acceleration due to gravity, and is constant. Again, I took the cheap route of writing things in terms of t, but this time I was unable to fold all the ts back into an expression with only p,\dot p as variables. So that’s the bad news for this one. The good news is that if you take L(p,\dot p,q) and apply the partials, you do get \frac{dq}{dt} = \dot q again. Unfortunately, this requires more data than the original formulations, as the Hamiltonian approach gets by with a proper subset of those coordinate variables. I had thought I had gotten away from requiring q, but looking at what I wrote in my delirious state, that really was a phantasm of night.

It would appear this pseudo Lagrangian approach can work for certain situations, most likely ones whose equations of motion are second order homogeneous differential equations such that q can be solved for in terms of \dot q, \ddot q, which can be converted to p,\dot p. This seems likely at least for the given kinetic energy that was used here. A DHO (damped harmonic oscillator) might be worth looking at in the future, as those provide a generic linear second order homogeneous differential equation of motion.

Classical Mechanics formulations (Part 1)

Given fixed starting position and time, q_0 and t_0, and fixed ending position and time, q_1 and t_1, when a particle travels along a path of ‘least’ action, it obeys the Euler-Lagrange equations for some Lagrangian, L:
\displaystyle \frac{d}{dt}\frac{\partial L}{\partial \dot{q}^i} = \frac{\partial L}{\partial q^i}
which is a fancy way of saying the time derivative of p_i (momentum) is equal to F_i (force). The natural variables for this Lagrangian approach are q (position) and \dot{q} (velocity), from which p and F are built.

Given the right conditions, we can recast this formulation into the Hamiltonian approach, where q^i and p_i are the natural variables, and several nice things happen. By ‘right conditions’ I mean conditions that would allow us to return to the Lagrangian approach. One of the nice things that happen is position and momentum can be seen to be conjugate to each other in a way that is impossible for position and velocity. Indeed, Hamilton’s equations illustrate this conjugacy nicely:
\displaystyle \frac{dp_i}{dt} = -\frac{\partial H}{\partial q^i}\\ \frac{dq^i}{dt} = \frac{\partial H}{\partial p_i}
where H(q^i,p_i) is the total energy function.

If p and q are so similar, could we not recast this yet again, but in terms of p and \dot p = F, much like the Lagrangian approach was in terms of q and \dot q? Or if we are feeling zealous with reformulations, why not consider mechanics recast in terms of \dot q and F, like a derived Hamiltonian approach? These are some questions that occurred to me near the end of last quarter in Dr. Baez’ Classical Mechanics (Math 241) course, when he went over how to transition from the Lagrangian approach to the Hamiltonian approach.

At first blush, these ‘new’ approaches seem to give us less information than the old ones. To wit, in Newtonian mechanics, p = m \dot q, so any information about the absolute position appears to be lost to a constant of integration, especially for the \dot q, F approach. But it is worse than this. For each particle being considered, the constant of integration may be different. So not only do we lose absolute position, we also lose relative position. This limits our considerations to situations where potential energy is zero, unless something very nice happens that would allow us to recover q.

The nice thing about mathematics versus physics, is that I can cast aside the difficulties of whether or not things are physically relevant, as long as they make sense mathematically. So I will set aside any complaints about possible non-utility and forge ahead. I have not actually looked that far ahead yet, but I suspect the first ‘new’ approach will be somewhat similar to the Lagrangian approach, via the analogy between q and p. That is, I suspect there will be some function, M, analogous to the Lagrangian, L, such that:
\displaystyle \frac{d}{dt}\frac{\partial M}{\partial \dot{p}_i} = \frac{\partial M}{\partial p_i}
where \frac{\partial M}{\partial \dot{p}_i} and \frac{\partial M}{\partial p_i} will be interesting quantities, perhaps even position and velocity.

The second ‘new’ approach will almost certainly be lossy, but I suspect it will follow a pattern similar to the Hamilton equations. For convenience, I will write v:=\dot q for these:
\displaystyle \frac{dF_i}{dt} = -\frac{\partial J}{\partial v^i}\\ \frac{dv^i}{dt} = \frac{\partial J}{\partial F_i}
where J is something analogous to total energy, and should have units of power / time.

So I leave, for now, with some things to ponder, and some guesses as to the direction they will lead.

All in a day’s work

A friend of mine is teaching engineering-type stuff at another university, and he relayed a question to me, which I think he said came from his students.  He asked me if I could prove why a certain transformation works.  After a false start in interpretation, wherein I proved that it works, which he was already convinced of and for which didn’t need any further corroboration, I think I understand the spirit of his question well enough to provide a (hopefully) satisfactory answer.  And since I will be writing the answer up anyway, I may as well blog it up here.

The problem:  You have two coordinate frames, each Cartesian, with one frame rotated and displaced relative to the other.  Let R be the matrix that describes the rotation, and let v be the vector that gives the displacement of the origin.  A point, p, with coordinates given in the second frame (as a column vector) can be expressed in the first frame by the transformation p_1 = Rp + v.  If you picked the wrong direction to rotate or translate, replace R with R^{-1} or v with -v.  Regardless, this transformation is annoyingly affine.

The solution:  Augment R with v as a new column, and row filled with zeros in all entries except the last, which will be 1.  Let’s call this augmented matrix T.  Append a 1 to the bottom of p, too.  Let’s call that p'.  Now Tp' will also have a 1 in the bottom entry, and p_1 can be read off by ignoring that extra 1.  The question of whether this will work is left as an exercise to the reader; it is not difficult to convince oneself it will always work.

The puzzle:  Why does adding a dimension like this convert our transformation from one that is affine to one that is linear?

An aside, as to the engineering significance of the problem, suppose you have several rods connected by rotating joints.  If you want to know the position of the end of the last rod, relative to the base of the first rod, this kind of transformation, possibly composed several times, would be a way to determine that position.  The potential for composing the transformation several times is very good reason why it is so nice that the affine transformation can be converted to a linear one.

The justification:  Generalize the problem to the case where the direction of the displacement is fixed, but the magnitude is not.  That is, p_1 = Rp +\lambda v, where \lambda \in \mathbb{R}.  The reason for this generalization is that it makes the affine transformation we are interested in be a special case of this transformation, with \lambda=1, and just as importantly, this transformation is a linear combination of linear transformations!  The value of \lambda is independent of the rotation, which means we have the side effect of increasing the dimension of the transformation by 1, as seen in the solution.

A calculation analogous to the exercise above shows for this generalization, the matrix representing it is R, augmented by v.  This is not a square matrix, but square matrices are rather convenient, since the vector space in the codomain is the same as in the domain.  We can make it square by adding a row.  This row will be filled with zeros except the last column, which will be 1, in order to make the determinant 1.  This is exactly T.  Now let’s take a look at why p' is what it is.  In order to translate by \lambda v, \lambda is appended to p instead of 1.  But as we have already seen, the transformation we are actually interested in is when \lambda=1.  Therefore the linearization ought to have the form prescribed by the solution.