Vectorization Methods

Pandas vectorized methods

Concepts covered

1. Lambda Refresher
• Lambda functions are small anonymous functions defined inline in Python
• lambda x: x >= 1 takes an input x and returns the result of x >= 1, a boolean (True or False)
2. map()
• Creates a new Series by applying a function (here a lambda) to each element
• Called on a Series (for example, a single DataFrame column) and returns a new Series
3. applymap()
• Creates a new DataFrame by applying a function (here a lambda) to each element
• Called on a DataFrame and returns a new DataFrame (deprecated since pandas 2.1 in favor of DataFrame.map(), which does the same thing)
4. df.apply(numpy.mean)
• Gets the mean of every column in a DataFrame
• Gives the same result as df.mean() when the columns are numeric
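As a quick refresher on the first concept, here is a minimal sketch showing that a lambda behaves like an ordinary named function. The names is_at_least_one and is_at_least_one_def are made up for illustration and do not appear elsewhere in these notes.

```python
# A lambda bound to a name (hypothetical name, for illustration only)
is_at_least_one = lambda x: x >= 1

# The equivalent function written with def
def is_at_least_one_def(x):
    return x >= 1

# Both return a boolean for each input
results = [is_at_least_one(v) for v in [0, 1, 2]]
results_def = [is_at_least_one_def(v) for v in [0, 1, 2]]
print(results)      # [False, True, True]
print(results_def)  # [False, True, True]
```

In practice the lambda is usually passed straight into map() or applymap() rather than bound to a name.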
In [50]:
import pandas as pd
import numpy as np

In [51]:
# columns
columns = ['one', 'two']

In [52]:
# index
index = ['a', 'b', 'c', 'd']

In [53]:
# lists
one = [1, 2, 3, 4]
two = [1, 2, 3, 4]

In [54]:
# dictionary
d = {
'one': one,
'two': two
}

In [55]:
# DataFrame
df = pd.DataFrame(d, columns=columns, index=index)

In [56]:
df

Out[56]:
   one  two
a    1    1
b    2    2
c    3    3
d    4    4
In [58]:
# mean of every single column in df
df.apply(np.mean)

Out[58]:
one    2.5
two    2.5
dtype: float64
In [61]:
# the built-in pandas method gives the same result
df.mean()

Out[61]:
one    2.5
two    2.5
dtype: float64
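The two outputs above look identical. A small sketch of how one might confirm that programmatically, rebuilding the same df so the snippet stands alone (the variable names via_apply, via_method and same are illustrative assumptions):

```python
import numpy as np
import pandas as pd

# Same data as the DataFrame built above
df = pd.DataFrame({'one': [1, 2, 3, 4], 'two': [1, 2, 3, 4]},
                  index=['a', 'b', 'c', 'd'])

via_apply = df.apply(np.mean)   # function applied to each column
via_method = df.mean()          # built-in column-wise mean

# Both are Series indexed by column name; compare labels and values
same = (list(via_apply.index) == list(via_method.index)
        and np.allclose(via_apply.to_numpy(), via_method.to_numpy()))
print(same)
```

For a plain column mean, df.mean() is the more direct choice; apply() becomes useful when no built-in method covers the function you need.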
In [64]:
# .map() on a particular column (Series)
# goes through every value in the column and evaluates whether it's >= 1
df['one'].map(lambda x: x >= 1)

Out[64]:
a    True
b    True
c    True
d    True
Name: one, dtype: bool
In [65]:
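# .applymap() on the whole DataFrame
# evaluates every element of every column (pandas 2.1+ prefers df.map())
df.applymap(lambda x: x >= 1)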

Out[65]:
    one   two
a  True  True
b  True  True
c  True  True
d  True  True
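Because DataFrame.applymap() is deprecated from pandas 2.1 onward in favor of DataFrame.map(), the following sketch picks whichever method is available and compares the result with the plain vectorized comparison df >= 1. The names elementwise, mask and vectorized are illustrative assumptions, and the hasattr check is just one possible way to stay compatible across versions.

```python
import pandas as pd

# Same data as the DataFrame built above
df = pd.DataFrame({'one': [1, 2, 3, 4], 'two': [1, 2, 3, 4]},
                  index=['a', 'b', 'c', 'd'])

# DataFrame.map exists in pandas 2.1+; older versions only have applymap
elementwise = df.map if hasattr(df, 'map') else df.applymap
mask = elementwise(lambda x: x >= 1)

# The same test written as a vectorized comparison, with no lambda at all
vectorized = df >= 1

# Every cell of the two results should agree
print((mask == vectorized).all().all())
```

For simple comparisons and arithmetic, the vectorized form is usually shorter and faster; element-wise mapping is mainly worth it when the logic cannot be expressed with built-in operators or methods.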