Analyzing the Pandas Series Apply Method

I saw the Pandas.apply method and started thinking about this pattern and if it’d be useful to implement on other objects. Here’s a brief blog about the pattern.

The general pattern outside of framework or language specifics is:

Apply an anonymous function to a value or an iterable

Pandas Example

In the case of Pandas.apply it let’s you apply a function to each item in a iterable, which, in the case of Pandas, is each item of a single Series in a DataFrame. Series implements the apply method.

import pandas as pd

df = pd.DataFrame({'number': [1, 2, 3, 4]})

series = df['number']

print(type(series), '\n')

is_even = series.apply(lambda n: n % 2 == 0)

print(is_even)

### OUTPUT ###
# <class 'pandas.core.series.Series'>

# 0    False
# 1     True
# 2    False
# 3     True
# Name: number, dtype: bool

Python object instance example

Can this pattern then be used with Python base class instances inheriting from object? Yes, here’s what that would look like.

class Person(object):
    def __init__(self, name, age):
        self.name = name
        self.age = age
        
    def apply(self, func):
        return func(self)
    
person = Person('Bob', 60)

person.apply(lambda p: "{p.name} is {p.age}".format(p=p))

### OUTPUT ###
# 'Bob is 60'

Defining the apply method allows the lambda to be immediately returned, instead of having to assign it to a reference, then invoke it.

f = lambda p: "{p.name} is {p.age}".format(p=p)
f(person)

### OUTPUT ###
# 'Bob is 60'

Python iterable example

Okay, so let’s use the apply pattern with a Python iterable and see what it looks like.

This code snippet subclasses the Python list standard type, which may or may not be kosher. The point is that the type being subclassed just has to implement the iterable interface. This could be anything, a Django QuerySet for example.

class PersonList(list):
    def apply(self, func):
        return [func(x) for x in self.__iter__()]

person_list = PersonList()

person = Person('Bob', 60)
person2 = Person('Jerry', 55)

person_list.append(person)
person_list.append(person2)

person_list.apply(lambda p: "{p.name} is {p.age}".format(p=p))

### OUTPUT ###
# ['Bob is 60', 'Jerry is 55']

Let’s do the same using Python’s map builtin function.

[x for x in map(lambda p: "{p.name} is {p.age}".format(p=p),
                [person, person2])]

### OUTPUT ###
# ['Bob is 60', 'Jerry is 55']

Python dict example for iter and items

Python iterable example #2 - dict

class EmployeeDict(dict):
    def apply(self, func):
        return [func(x) for x in self.__iter__()]
    
    def apply_items(self, func):
        return [func(k,v) for k,v in self.items()]

employee_dict = EmployeeDict({'manager': person, 'clerk': person2})

employee_dict.apply(lambda x: x)

### OUTPUT ###
# ['manager', 'clerk']

Use apply pattern with dict items

employee_dict.apply_items(lambda k,v: "The {}'s name is {}".format(k, v.name))

### OUTPUT ###
# ["The manager's name is Bob", "The clerk's name is Jerry"]

Summary and Note on `Pandas.apply`

The Pandas.apply method does a lot more than apply a lambda function. Here is the full method signature and link to the source.

It makes sense to do a lot more, because if your class has to defined an extra method apply when you could just be calling the builtin Python map function, then it should be worth it.