Calculate mean and median

## Measures of Central Tendency

*Describing a distribution using measures of center*

- Mode
  - Value (on the x-axis) at which the frequency is highest
  - Other cases
    - May be a range of values (a bin) that occurred with the highest frequency
    - No mode for uniform distributions
    - May have multiple modes
    - May be a categorical mode
      - x-axis: categories (plain and peanut)
      - y-axis: frequencies (plain = 60,000, peanut = 10,000)
      - Mode = plain (a value on the x-axis)
      - 60,000 and 10,000 are frequencies, not modes
  - Not every score in the dataset affects the mode
    - [2, 2, 3, 4, 100]
    - The mode stays 2 even if we add a large number such as 10000
  - Mode changes from sample to sample
    - May not be the same as the population's mode
  - Mode changes with bin size
  - There is no equation for calculating the mode
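The mode properties listed above can be checked with a short sketch using only the Python standard library. The `candy` and `values` lists and the `modal_bin` helper are hypothetical, made up for illustration; only `[2, 2, 3, 4, 100]` comes from the notes.

```python
from collections import Counter
from statistics import multimode  # available in Python 3.8+

scores = [2, 2, 3, 4, 100]
mode_before = multimode(scores)            # most frequent value(s)
mode_after = multimode(scores + [10000])   # adding a large score leaves the mode alone

# Categorical mode: the mode is the category, not its frequency.
# Hypothetical scaled-down counts (6 plain, 1 peanut).
candy = ["plain"] * 6 + ["peanut"] * 1
categorical_mode = multimode(candy)

# When every value occurs equally often, all values tie,
# so there is no single meaningful mode.
tied_modes = multimode([1, 2, 3, 4])


def modal_bin(values, width):
    """Return (start, end) of the fixed-width bin that holds the most values."""
    counts = Counter((v // width) * width for v in values)
    start, _ = counts.most_common(1)[0]
    return (start, start + width)


# Hypothetical data chosen so that the modal bin moves when the bin width changes.
values = [1, 2, 3, 6, 7, 8, 9, 12]
bin_width_5 = modal_bin(values, 5)
bin_width_4 = modal_bin(values, 4)
```

With these numbers the modal bin is `(5, 10)` for a width of 5 but `(0, 4)` for a width of 4, which illustrates why the mode of binned data depends on the bin size.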

- Median
  - Middle value of the sorted data when the number of values is odd
  - Mean of the two middle values of the sorted data when the number of values is even
  - Properties
    - Largely unaffected by outliers
    - Does not take every score in the distribution into account
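The odd and even cases of the median can be sketched in plain Python as follows. The `median` function here is an illustrative helper, not the pandas implementation, and it assumes a non-empty list of numbers.

```python
def median(values):
    """Median of a non-empty list of numbers."""
    if not values:
        raise ValueError("median of an empty list is undefined")
    ordered = sorted(values)      # the median is defined on sorted data
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]       # odd count: the single middle value
    # even count: mean of the two middle values
    return (ordered[mid - 1] + ordered[mid]) / 2


odd_median = median([2, 2, 3, 4, 100])           # five values
even_median = median([2, 2, 3, 4, 100, 10000])   # six values, one extreme outlier
```

Adding the outlier 10000 only moves the median from 3 to 3.5, because the median depends on the position of values in the sorted order rather than on their magnitude.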

- Mean
  - Average: the sum of all values divided by the number of values
  - Properties
    - All scores of a distribution affect the mean
    - Mean can be represented by a formula
    - Many samples from the same population would have similar means
    - Mean is affected by outliers
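The formula for the mean is $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$, where $n$ is the number of scores. A minimal sketch of that formula, reusing the `[2, 2, 3, 4, 100]` example from the mode notes, shows how strongly a single outlier pulls the mean. The `mean` helper is written here for illustration and assumes a non-empty list.

```python
def mean(values):
    """Arithmetic mean: sum of the values divided by how many there are."""
    return sum(values) / len(values)


scores = [2, 2, 3, 4, 100]
mean_with_outlier = mean(scores)          # every score, including 100, contributes
mean_without_outlier = mean(scores[:-1])  # same data with the outlier 100 dropped
```

Dropping the single value 100 moves the mean from 22.2 to 2.75, while the mode (2) would not change at all.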

**Calculating measures of central tendency in Pandas**

In [55]:

```
import pandas as pd
```

In [56]:

```
url = './fb_data.csv'
data = pd.read_csv(url, header=None)
```
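Since `./fb_data.csv` is not reproduced here, the effect of `header=None` can be sketched with a small in-memory stand-in. The `csv_text` contents below are hypothetical and are not the real file.

```python
import io

import pandas as pd

# Hypothetical stand-in for the CSV file: one number per line, no header row.
csv_text = "10\n20\n30\n"

# header=None keeps the first line as data and labels the columns 0, 1, 2, ...
no_header = pd.read_csv(io.StringIO(csv_text), header=None)
no_header_columns = list(no_header.columns)
no_header_rows = len(no_header)

# Without header=None, the first line is consumed as the column name.
default_header = pd.read_csv(io.StringIO(csv_text))
default_columns = list(default_header.columns)
default_rows = len(default_header)
```

With `header=None` all three values stay in the data and the single column is labelled `0`. With the default, the first value becomes the column name `"10"` and only two rows remain, which would skew every statistic computed afterwards.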

In [64]:

```
data
```

In [58]:

```
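# sorted(data) would return only the sorted column labels, not the rows.
# Sort the rows by the first column (labelled 0 because header=None) instead.
data.sort_values(by=0)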
```

In [59]:

```
# Since this is a pandas DataFrame, we can use mean() and median() methods
type(data)
```

In [60]:

```
data.mean()
```

In [61]:

```
data.median()
```

In [63]:

```
# This data is uniformly distributed, so mode() returns every value that ties for the highest frequency
data.mode()
```
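Because the contents of `fb_data.csv` are not shown, here is a small self-contained sketch of the same three calls on hypothetical data. The `sample` and `flat` objects are made up for illustration and are not taken from the file.

```python
import pandas as pd

# Hypothetical single-column data, labelled 0 as it would be with header=None.
sample = pd.DataFrame({0: [2, 2, 3, 4, 100]})

col_mean = sample[0].mean()            # pulled upward by the outlier 100
col_median = sample[0].median()        # middle value of the sorted column
col_modes = sample[0].mode().tolist()  # mode() returns a Series, since ties are possible

# When every value occurs exactly once, all values tie for the highest
# frequency, so mode() returns all of them.
flat = pd.Series([1, 2, 3, 4])
flat_modes = flat.mode().tolist()
```

Calling `mean()`, `median()`, or `mode()` on the whole DataFrame, as in the cells above, applies the same computation to each column separately.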
