问题描述:

Category theory and abstract algebra deal with the way functions can be combined with other functions. Complexity theory deals with how hard a function is to compute. It's weird to me that I haven't seen anyone combine these fields of study, since they seem like such natural pairs. Has anyone done this before?

As a motivating example, let's take a look at monoids. It's well known that if an operation is a monoid, then we can parallelize the operation.

For example in Haskell, we can trivially define that addition is a monoid over the integers like this:

`instance Monoid Int where`

mempty = 0

mappend = (+)

Now if we want to compute the sum of 0 to 999, we could do it sequentially like:

`foldl1' (+) [0..999]`

or we could do it in parallel

`mconcat [0..999] -- for simplicity of the code, I'm ignoring that this doesn't *actually* run in parallel`

But parallelizing this monoid makes sense only because mappend runs in constant time. What if this weren't the case? Lists, for example, are monoids where mappend does not run inconstant time (or space!). I'm guessing this is why there is no default parallel mconcat function in Haskell. The best implementation depends on the complexity of the monoid.

It seems like there should be a convenient way to describe the differences between these two monoids. We should then be able to annotate our code with these differences and have programs automatically choose the best algorithms to use depending on a monoid's complexity.

I do not claim to be an expert on these subjects, but I might be able to shed some light on this idea.

Let's first consider category theory. Category theory is a study of mathematical structure at a very high level. The notion of a category is very general, and a great many mathematical objects form categories. Category theory was first considered to be incredibly pure and abstract, but as these things often go in mathematics, it turned out to have many uses in applied subjects, such as computer science and even quantum mechanics. Monads turned out to be of great use in reasoning about the semantics of functional programs, which are generally expressed denotationally (and hence do not force any kind of order to computation, just the result). From this it was also realised that the monad is also a very good *design pattern* for writing functional programs, and its usefulness has led to it being very prominent in the design of Haskell (ie the do notation etc). Functors, Applicatives, Monoids, all followed somewhat later as objects less powerful than monads but therefore also more applicative (no pun intended!).

However, category theory itself is studies these kind of structures in a much more general way, which has turned out to be relevant in many areas of maths (and physics etc). As a non-expert it is not immediately clear how much of this might relate to complexity theory, but let's have a go.

Complexity theory is concerned with the *feasibility* of computation. Turing and others had showed that there were some functions that were not computable in general (for example the halting problem, the busy beaver problem etc), but the question of how *easy* a particular computation was in principle is a harder question in general. As you are probably aware, algorithms (which can be represented as Turing machines) can be put into complexity classes based on their asymptotic running time. There are a great many complexity classes that have been identified (see The Complexity Zoo) but comparatively little is known about the structure of these classes. The famous P = NP problem shows how difficult it is to reason about complexity.

From intuition about the nature of complexity classes and how difficult it is to prove relationships between them, I'd have thought it would be tricky to establish categories within complexity classes. Obviously the set of Turing machines forms a category, but the set of machines in O(n)? or the set of machines in P? This might well be a good research direction for complexity experts, and then again it may not! Personally I couldn't say without a lot more work.

Now lets consider your example of complexity within a Monoid, and parallelisation strategies. If it seems like this second section has little to do with the first, that is because I think these are very different concepts. First is the mathematics of categories and complexity, secondly there is the specifics of parallelising algorithms within certain design patterns.

If we know a certain type is a Monoid, what can we reason about the complexity of working with it? This is the class definition from `Data.Monoid`

```
class Monoid a where
mempty :: a
mappend :: a -> a -> a
mconcat :: [a] -> a
mconcat = foldr mappend mempty
```

Of course, we cannot say anything about the complexity because all we know is the type, as you surmised. The documentation has this to say about the default implementation of `mconcat`

within Data.Monoid:

```
-- ^ Fold a list using the monoid.
-- For most types, the default definition for 'mconcat' will be
-- used, but the function is included in the class definition so
-- that an optimized version can be provided for specific types.
```

The point I am trying to make is that `mconcat`

cannot necessarily be generalised from the complexity of the other operations. It would be difficult to prove in many cases that some, faster algorithm doesn't exist. `mconcat`

can be implemented manually in such a case.

You also mention automatically defining parallel strategies. Haskell certainly allows various different strategies to be defined, of which many useful ones are already in `Control.Parallel.Strategies`

. For example, parList applies a strategy in parallel to every element of a list:

```
parList :: Strategy a -> Strategy [a]
parList strat [] = ()
parList strat (x:xs) = strat x `par` (parList strat xs)
```

from this a parallel map function can be defined.

```
parMap :: Strategy b -> (a -> b) -> [a] -> [b]
parMap strat f xs = map f xs `using` parList strat
using :: a -> Strategy a -> a
using x s = s x `seq` x
```

Note that this allows for separation of implementation and parallelisation. I do not think that this kind of notion could be automated easily, especially just from annotations describing complexity of individual algorithms.

In summary then, it is possible that category theory might be a useful tool for future complexity research. However I do not think that this is likely to lead to automatic generation of parallelisation strategies.

At a superficial level, complexity theorists already do category theory-esque things, they just phrase it in their own language. For instance, the complexity class NP is a category where objects are NP languages and morphisms are polynomial-time reductions. Then a complete problem is an initial object in this category, and all NP-complete problems are isomorphic. It's not very reasonable, as in Vic's answer, to consider Turing machines as objects. This is mostly because of how finnicky Turing machines can be (they are many different models of Turing machines with quite different time and space complexities for the same problems). But the thesis idea of category theory, that the main interest in a category is in the morphisms, tells us that the morphisms should be algorithms.

Another view is that of hierarchies. Complexity classes form a poset category under inclusion, and there are some nice limit objects like P, PSPACE, NC, AC, etc.

Of course, it's unclear how the categorical perspective helps us prove theorems in complexity theory, and whether solving the problems via category theory would be any easier than solving the original problems. For example, I would consider the existence of a nontrivial functor from the poset category of complexity classes to, say, the category of groups to be a wildly transformative result. It would have the potential to separate complexity classes, which in this field is prohibitively difficult as a matter of course.

Quick googling shows a paper:

Safe Recursion Revisited: Categorical Semantics and Type Systems for Lower Complexity

I also remember works by Japaridze on polynomial time arithmetics, see http://arxiv.org/abs/0902.2969

I think you can start from there and walk by references.