Vectorization Methods

Pandas vectorized methods

Concepts covered

1. Lambda Refresher
• lambda functions are small, anonymous functions defined inline in Python
• lambda x: x >= 1 takes an input x and returns x >= 1, a boolean that is either True or False
2. map()
• creates a new Series by applying a function (here, a lambda) to each element
• is called on a Series and returns a new Series
3. applymap()
• creates a new DataFrame by applying a function (here, a lambda) to each element
• is called on a DataFrame and returns a new DataFrame; deprecated since pandas 2.1 in favor of DataFrame.map(), which does the same thing
4. df.apply(numpy.mean)
• Gets the mean of every column in a DataFrame
• Gives the same result as df.mean() when the columns are numeric
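
As a quick refresher on item 1, here is a minimal sketch comparing a lambda with the equivalent named function. The names at_least_one and at_least_one_def are made up for this illustration.

```python
# A lambda bound to a name (hypothetical name, for illustration only)
at_least_one = lambda x: x >= 1

# The equivalent function written with def
def at_least_one_def(x):
    return x >= 1

print(at_least_one(3))      # True
print(at_least_one(0))      # False
print(at_least_one_def(3))  # True
```

In practice the lambda is usually passed straight to .map() or .applymap() rather than bound to a name.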
In :
import pandas as pd
import numpy as np

In :
# columns
columns = ['one', 'two']

In :
# index
index = ['a', 'b', 'c', 'd']

In :
# lists
one = [1, 2, 3, 4]
two = [1, 2, 3, 4]

In :
# dictionary
d = {
'one': one,
'two': two
}

In :
# DataFrame
df = pd.DataFrame(d, columns=columns, index=index)

In :
df

Out:
   one  two
a    1    1
b    2    2
c    3    3
d    4    4
In :
# mean of every single column in df
df.apply(np.mean)

Out:
one    2.5
two    2.5
dtype: float64
In :
# you can use a pandas command too
df.mean()

Out:
one    2.5
two    2.5
dtype: float64
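
df.apply() works column by column by default. As a small sketch (rebuilding the same DataFrame so the snippet runs on its own), passing axis=1 applies the function to each row instead. The variable names col_means and row_means are chosen here for illustration. Some pandas 2.x releases may emit a FutureWarning when a NumPy function is passed to apply; the result is the same.

```python
import numpy as np
import pandas as pd

# Same data as the DataFrame built above
df = pd.DataFrame(
    {'one': [1, 2, 3, 4], 'two': [1, 2, 3, 4]},
    index=['a', 'b', 'c', 'd'],
)

# Default axis=0: one mean per column
col_means = df.apply(np.mean)

# axis=1: one mean per row
row_means = df.apply(np.mean, axis=1)

print(col_means)
print(row_means)
```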
In :
# .map() on particular columns (Series)
# goes through every value in the column and evaluates whether it's >= 1
df['one'].map(lambda x: x >= 1)

Out:
a    True
b    True
c    True
d    True
Name: one, dtype: bool
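
For a simple comparison like this one, the operator can also be applied to the whole Series at once, with no lambda involved. The sketch below assumes the same values as column 'one' and checks that both approaches agree; mapped and vectorized are illustrative names.

```python
import pandas as pd

# Same values as df['one'] above
s = pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'], name='one')

# Element by element, through a Python lambda
mapped = s.map(lambda x: x >= 1)

# Vectorized: the comparison is applied to the whole Series at once
vectorized = s >= 1

# Both give the same boolean values
print((mapped == vectorized).all())
```

The vectorized form is usually the faster choice on large Series because it avoids calling a Python function once per element. .map() remains useful when the logic cannot be written as a plain operator.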
In :
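# .applymap() on the whole DataFrame
# goes through every value in df and evaluates whether it's >= 1
df.applymap(lambda x: x >= 1)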

Out:
    one   two
a  True  True
b  True  True
c  True  True
d  True  True
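
Since pandas 2.1, applymap() is deprecated and DataFrame.map() is the replacement. The sketch below picks whichever method the installed pandas provides; the elementwise and result names are made up for this example.

```python
import pandas as pd

# Same data as the DataFrame built above
df = pd.DataFrame(
    {'one': [1, 2, 3, 4], 'two': [1, 2, 3, 4]},
    index=['a', 'b', 'c', 'd'],
)

# DataFrame.map exists from pandas 2.1 on; older versions only have applymap
if hasattr(pd.DataFrame, 'map'):
    elementwise = df.map
else:
    elementwise = df.applymap

# Apply the lambda to every element of the DataFrame
result = elementwise(lambda x: x >= 1)
print(result)
```

For this particular check, df >= 1 gives the same values without a lambda.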