So I've been on a bender about Python not having proper functional piping. I just can't believe it isn't built in. Every language should have this. It cleans up your code so much it's unbelievable. Fear not, however, I have built one for Python :D. Now I just need to figure out how to create pip packages. Anyway, let's do a quick walkthrough of what it is, how it works, etc.
Let's start with what your code probably looks like right now if you are coding vanilla Python.
result = [ t['name'] for t in d['categories'] if t['id'] == 58 ]
Alright, that's not too bad. BUT! What if you need to do something more complex, such as, for all of your annotations, mapping the category id to a category name that lives in a dictionary while keeping the image id? Now you have a filter and a map and sort of a zip thing going on. So you think, let's try to make this a bit more functional, let's try this…
result = list(map(lambda x: x['name'], list(filter(lambda x: x['id'] == 58, d['categories']))))
Ouch, that's kinda ugly and hard to read. And it's only 2 items in the pipe. OK, so let's go ahead and turn it into something more readable.
result = list(filter(lambda x: x['id'] == 58, d['categories']))
result = list(map(lambda x: x['name'], result))
Not bad, but I think we can do better. Option 1 (the list comprehension) really is the best of these so far, but it's not very flexible. It also doesn't follow the concept of functional composition, where small functions are chained so that each one's output feeds the next. Functional composition enables some really powerful things. So let's take a look at the optimal option.
d['categories'] |Filter| (lambda x: x['id'] == 58) |Map| (lambda x: x['name'])
Now we have something that is easy to read, understand, compose, pipe, modify, etc. It will also be much easier to perform a map on the annotations and add the filter/map from categories after it. It's a simple pipeline now. But where did |Filter| and |Map| come from?
There is a bit of setup code, and here it is…
from functools import partial

class Infix(object):
    def __init__(self, func):
        self.func = func

    # bound | right: supply the right-hand operand and run the function
    def __or__(self, other):
        return self.func(other)

    # left | Infix: bind the left-hand operand, return a new Infix
    def __ror__(self, other):
        return Infix(partial(self.func, other))

    # plain call syntax still works: Filter(data, func)
    def __call__(self, v1, v2):
        return self.func(v1, v2)

@Infix
def Filter(data, func):
    return list(filter(func, data))

@Infix
def Map(data, func):
    return list(map(func, data))
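To see why the pipe syntax works, it helps to spell out what Python does with each `|`. The sketch below is a trimmed copy of the setup code plus some made-up sample data (the `d` dictionary and its values are assumptions for illustration only). It runs the one-liner and then the same steps written out as explicit method calls.

```python
from functools import partial

class Infix:
    """Wrap a two-argument function so it can be written as  left |op| right."""
    def __init__(self, func):
        self.func = func

    def __ror__(self, left):
        # Step 1: `left | op` binds the left operand and returns a new Infix.
        return Infix(partial(self.func, left))

    def __or__(self, right):
        # Step 2: `bound | right` supplies the right operand and runs the function.
        return self.func(right)

@Infix
def Filter(data, func):
    return list(filter(func, data))

@Infix
def Map(data, func):
    return list(map(func, data))

# Made-up sample data, shaped like the dictionary used in the examples above.
d = {'categories': [{'id': 58, 'name': 'hot dog'}, {'id': 59, 'name': 'pizza'}]}

# The piped one-liner ...
piped = d['categories'] |Filter| (lambda x: x['id'] == 58) |Map| (lambda x: x['name'])

# ... is evaluated left to right, roughly as these explicit steps:
bound = Filter.__ror__(d['categories'])        # d['categories'] | Filter
kept = bound.__or__(lambda x: x['id'] == 58)   # ... | (lambda ...)
names = Map.__ror__(kept).__or__(lambda x: x['name'])

# Same answer as the plain list comprehension.
comprehension = [t['name'] for t in d['categories'] if t['id'] == 58]
print(piped, names, comprehension)
```

A list doesn't know how to `|` itself with an `Infix`, so Python falls back to the right-hand operand's `__ror__`, and that fallback is the whole trick.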
So now we just need to sprinkle this on a few higher-order functions and we have a very robust data manipulation setup, similar to R's dplyr and magrittr combo but with regular ol' Python.
Finally, let's show off the slightly more complex scenario…
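As a rough idea of what "sprinkling" could look like, here is a sketch that wraps two more standard functions the same way. The names `SortBy` and `Reduce` and the `prices` data are hypothetical, made up for this example; only `sorted` and `functools.reduce` are real Python.

```python
from functools import partial, reduce

class Infix:
    def __init__(self, func):
        self.func = func
    def __ror__(self, left):
        return Infix(partial(self.func, left))
    def __or__(self, right):
        return self.func(right)

@Infix
def Map(data, func):
    return list(map(func, data))

# Hypothetical extra operators, built exactly like Filter and Map.
@Infix
def SortBy(data, key):
    # Returns a new sorted list; the input is left untouched.
    return sorted(data, key=key)

@Infix
def Reduce(data, func):
    # Folds the list down to a single value, left to right.
    return reduce(func, data)

# Made-up sample data for illustration.
prices = [
    {'item': 'tea', 'cost': 3},
    {'item': 'cake', 'cost': 5},
    {'item': 'jam', 'cost': 4},
]

cheapest_first = prices |SortBy| (lambda p: p['cost']) |Map| (lambda p: p['item'])
total = prices |Map| (lambda p: p['cost']) |Reduce| (lambda a, b: a + b)

print(cheapest_first)  # ['tea', 'jam', 'cake']
print(total)           # 12
```

Each operator takes exactly one right-hand operand, so anything needing extra arguments (an initial value for the reduce, say) would have to be packed into that single operand or given its own wrapper.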
d['annotations'] |Map| (lambda x: (
    x['image_id'],
    d['categories'] |Filter| (lambda z: z['id'] == x['category_id'])
                    |Map| (lambda z: z['name'])
                    |ItemAt| 0
))
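So here we have a nice, concise version of a more complex problem: transforming our annotations into a simple list of tuples of image id and category name by joining the two together. (|ItemAt| isn't in the setup code above; it's one more Infix-wrapped function that picks the element at the given index.) Not to mention that this style of coding treats everything as immutable, since each step returns a new list instead of modifying its input, and that makes the work easier to distribute.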
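Since |ItemAt| doesn't appear in the setup code, here is a minimal sketch of how it might be defined, together with a runnable version of the join. The `ItemAt` body is an assumption based on its name, and the sample `d` dictionary is made up for illustration.

```python
from functools import partial

class Infix:
    def __init__(self, func):
        self.func = func
    def __ror__(self, left):
        return Infix(partial(self.func, left))
    def __or__(self, right):
        return self.func(right)

@Infix
def Filter(data, func):
    return list(filter(func, data))

@Infix
def Map(data, func):
    return list(map(func, data))

# Assumed definition: return the element at the given index.
@Infix
def ItemAt(data, index):
    return data[index]

# Made-up sample data for illustration.
d = {
    'categories': [{'id': 58, 'name': 'hot dog'}, {'id': 59, 'name': 'pizza'}],
    'annotations': [
        {'image_id': 1, 'category_id': 59},
        {'image_id': 2, 'category_id': 58},
    ],
}

# For every annotation, look up its category name and keep the image id.
pairs = d['annotations'] |Map| (lambda x: (
    x['image_id'],
    d['categories'] |Filter| (lambda z: z['id'] == x['category_id'])
                    |Map| (lambda z: z['name'])
                    |ItemAt| 0,
))

print(pairs)  # [(1, 'pizza'), (2, 'hot dog')]
```

One thing to watch: with this definition, `|ItemAt| 0` raises an `IndexError` when an annotation points at a category id that doesn't exist, so real data may want a safer variant.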