Selecting multiple rows and columns in pandas
Selecting Multiple Rows and Columns

This introduction to pandas is derived from Data School's pandas Q&A with my own notes and code.

Selecting multiple rows and columns from a pandas DataFrame

  • .loc
  • .iloc
  • .ix
In [1]:
import pandas as pd
In [3]:
url = 'http://bit.ly/uforeports'
ufo = pd.read_csv(url)
In [5]:
# show first 3 shows
ufo.head(3)
Out[5]:
City Colors Reported Shape Reported State Time
0 Ithaca NaN TRIANGLE NY 6/1/1930 22:00
1 Willingboro NaN OTHER NJ 6/30/1930 20:00
2 Holyoke NaN OVAL CO 2/15/1931 14:00

.loc usage
This is a really powerful and flexible method

In [6]:
# .loc DataFrame method
# filtering rows and selecting columns by label

# format
# ufo.loc[rows, columns]

# row 0, all columns
ufo.loc[0, :]
Out[6]:
City                       Ithaca
Colors Reported               NaN
Shape Reported           TRIANGLE
State                          NY
Time               6/1/1930 22:00
Name: 0, dtype: object
In [10]:
# rows 0, 1, 2
# all columns

ufo.loc[[0, 1, 2], :]

# more efficient code
ufo.loc[0:2, :]
Out[10]:
City Colors Reported Shape Reported State Time
0 Ithaca NaN TRIANGLE NY 6/1/1930 22:00
1 Willingboro NaN OTHER NJ 6/30/1930 20:00
2 Holyoke NaN OVAL CO 2/15/1931 14:00
In [12]:
# if you leave off ", :" pandas would assume it's there
# but you should leave it there to improve code readability
ufo.loc[0:2]
Out[12]:
City Colors Reported Shape Reported State Time
0 Ithaca NaN TRIANGLE NY 6/1/1930 22:00
1 Willingboro NaN OTHER NJ 6/30/1930 20:00
2 Holyoke NaN OVAL CO 2/15/1931 14:00
In [13]:
# all rows
# column: City
ufo.loc[:, 'City']
Out[13]:
0                      Ithaca
1                 Willingboro
2                     Holyoke
3                     Abilene
4        New York Worlds Fair
5                 Valley City
6                 Crater Lake
7                        Alma
8                     Eklutna
9                     Hubbard
10                    Fontana
11                   Waterloo
12                     Belton
13                     Keokuk
14                  Ludington
15                Forest Home
16                Los Angeles
17                  Hapeville
18                     Oneida
19                 Bering Sea
20                   Nebraska
21                        NaN
22                        NaN
23                  Owensboro
24                 Wilderness
25                  San Diego
26                 Wilderness
27                     Clovis
28                 Los Alamos
29               Ft. Duschene
                 ...         
18211                 Holyoke
18212                  Carson
18213                Pasadena
18214                  Austin
18215                El Campo
18216            Garden Grove
18217           Berthoud Pass
18218              Sisterdale
18219            Garden Grove
18220             Shasta Lake
18221                Franklin
18222          Albrightsville
18223              Greenville
18224                 Eufaula
18225             Simi Valley
18226           San Francisco
18227           San Francisco
18228              Kingsville
18229                 Chicago
18230             Pismo Beach
18231             Pismo Beach
18232                    Lodi
18233               Anchorage
18234                Capitola
18235          Fountain Hills
18236              Grant Park
18237             Spirit Lake
18238             Eagle River
18239             Eagle River
18240                    Ybor
Name: City, dtype: object
In [15]:
# all rows
# column: City, State
ufo.loc[:, ['City', 'State']]

# similar code for City through State
ufo.loc[:, 'City':'State']
Out[15]:
City Colors Reported Shape Reported State
0 Ithaca NaN TRIANGLE NY
1 Willingboro NaN OTHER NJ
2 Holyoke NaN OVAL CO
3 Abilene NaN DISK KS
4 New York Worlds Fair NaN LIGHT NY
5 Valley City NaN DISK ND
6 Crater Lake NaN CIRCLE CA
7 Alma NaN DISK MI
8 Eklutna NaN CIGAR AK
9 Hubbard NaN CYLINDER OR
10 Fontana NaN LIGHT CA
11 Waterloo NaN FIREBALL AL
12 Belton RED SPHERE SC
13 Keokuk NaN OVAL IA
14 Ludington NaN DISK MI
15 Forest Home NaN CIRCLE CA
16 Los Angeles NaN NaN CA
17 Hapeville NaN NaN GA
18 Oneida NaN RECTANGLE TN
19 Bering Sea RED OTHER AK
20 Nebraska NaN DISK NE
21 NaN NaN NaN LA
22 NaN NaN LIGHT LA
23 Owensboro NaN RECTANGLE KY
24 Wilderness NaN DISK WV
25 San Diego NaN CIGAR CA
26 Wilderness NaN DISK WV
27 Clovis NaN DISK NM
28 Los Alamos NaN DISK NM
29 Ft. Duschene NaN DISK UT
... ... ... ... ...
18211 Holyoke NaN DIAMOND MA
18212 Carson NaN DISK CA
18213 Pasadena GREEN FIREBALL CA
18214 Austin NaN FORMATION TX
18215 El Campo NaN OTHER TX
18216 Garden Grove ORANGE LIGHT CA
18217 Berthoud Pass NaN TRIANGLE CO
18218 Sisterdale NaN DIAMOND TX
18219 Garden Grove NaN CHEVRON CA
18220 Shasta Lake BLUE DISK CA
18221 Franklin NaN DISK NH
18222 Albrightsville NaN OTHER PA
18223 Greenville NaN NaN SC
18224 Eufaula NaN DISK OK
18225 Simi Valley NaN FORMATION CA
18226 San Francisco NaN FORMATION CA
18227 San Francisco NaN TRIANGLE CA
18228 Kingsville NaN LIGHT TX
18229 Chicago NaN DISK IL
18230 Pismo Beach NaN OVAL CA
18231 Pismo Beach NaN OVAL CA
18232 Lodi NaN NaN WI
18233 Anchorage RED VARIOUS AK
18234 Capitola NaN TRIANGLE CA
18235 Fountain Hills NaN NaN AZ
18236 Grant Park NaN TRIANGLE IL
18237 Spirit Lake NaN DISK IA
18238 Eagle River NaN NaN WI
18239 Eagle River RED LIGHT WI
18240 Ybor NaN OVAL FL

18241 rows × 4 columns

In [17]:
# multiple rows and multiple columns
ufo.loc[0:2, 'City':'State']
Out[17]:
City Colors Reported Shape Reported State
0 Ithaca NaN TRIANGLE NY
1 Willingboro NaN OTHER NJ
2 Holyoke NaN OVAL CO
In [18]:
# filter using City=='Oakland'
ufo[ufo.City=='Oakland']
Out[18]:
City Colors Reported Shape Reported State Time
1694 Oakland NaN CIGAR CA 7/21/1968 14:00
2144 Oakland NaN DISK CA 8/19/1971 0:00
4686 Oakland NaN LIGHT MD 6/1/1982 0:00
7293 Oakland NaN LIGHT CA 3/28/1994 17:00
8488 Oakland NaN NaN CA 8/10/1995 21:45
8768 Oakland NaN NaN CA 10/10/1995 22:40
10816 Oakland NaN LIGHT OR 10/1/1997 21:30
10948 Oakland NaN DISK CA 11/14/1997 19:55
11045 Oakland NaN TRIANGLE CA 12/10/1997 1:30
12322 Oakland NaN FIREBALL CA 10/9/1998 19:40
12941 Oakland NaN CYLINDER CA 1/23/1999 21:30
16803 Oakland NaN TRIANGLE MD 7/4/2000 23:00
17322 Oakland NaN CYLINDER CA 9/1/2000 21:35
In [20]:
# easier-to-read code
# here you specify the rows and columns you want
# ufo.loc[rows, columns]

ufo.loc[ufo.City=='Oakland', :]
Out[20]:
City Colors Reported Shape Reported State Time
1694 Oakland NaN CIGAR CA 7/21/1968 14:00
2144 Oakland NaN DISK CA 8/19/1971 0:00
4686 Oakland NaN LIGHT MD 6/1/1982 0:00
7293 Oakland NaN LIGHT CA 3/28/1994 17:00
8488 Oakland NaN NaN CA 8/10/1995 21:45
8768 Oakland NaN NaN CA 10/10/1995 22:40
10816 Oakland NaN LIGHT OR 10/1/1997 21:30
10948 Oakland NaN DISK CA 11/14/1997 19:55
11045 Oakland NaN TRIANGLE CA 12/10/1997 1:30
12322 Oakland NaN FIREBALL CA 10/9/1998 19:40
12941 Oakland NaN CYLINDER CA 1/23/1999 21:30
16803 Oakland NaN TRIANGLE MD 7/4/2000 23:00
17322 Oakland NaN CYLINDER CA 9/1/2000 21:35
In [21]:
# again, specifying the rows and columns you want
# this would be the best way to do it compared to chain indexing 
ufo.loc[ufo.City=='Oakland', 'State']
Out[21]:
1694     CA
2144     CA
4686     MD
7293     CA
8488     CA
8768     CA
10816    OR
10948    CA
11045    CA
12322    CA
12941    CA
16803    MD
17322    CA
Name: State, dtype: object
In [24]:
# chain indexing 
# there may be issues in some cases
# try not to use this
ufo[ufo.City=='Oakland'].State
Out[24]:
1694     CA
2144     CA
4686     MD
7293     CA
8488     CA
8768     CA
10816    OR
10948    CA
11045    CA
12322    CA
12941    CA
16803    MD
17322    CA
Name: State, dtype: object

iloc usage

In [25]:
ufo.iloc[:, [0, 3]]
Out[25]:
City State
0 Ithaca NY
1 Willingboro NJ
2 Holyoke CO
3 Abilene KS
4 New York Worlds Fair NY
5 Valley City ND
6 Crater Lake CA
7 Alma MI
8 Eklutna AK
9 Hubbard OR
10 Fontana CA
11 Waterloo AL
12 Belton SC
13 Keokuk IA
14 Ludington MI
15 Forest Home CA
16 Los Angeles CA
17 Hapeville GA
18 Oneida TN
19 Bering Sea AK
20 Nebraska NE
21 NaN LA
22 NaN LA
23 Owensboro KY
24 Wilderness WV
25 San Diego CA
26 Wilderness WV
27 Clovis NM
28 Los Alamos NM
29 Ft. Duschene UT
... ... ...
18211 Holyoke MA
18212 Carson CA
18213 Pasadena CA
18214 Austin TX
18215 El Campo TX
18216 Garden Grove CA
18217 Berthoud Pass CO
18218 Sisterdale TX
18219 Garden Grove CA
18220 Shasta Lake CA
18221 Franklin NH
18222 Albrightsville PA
18223 Greenville SC
18224 Eufaula OK
18225 Simi Valley CA
18226 San Francisco CA
18227 San Francisco CA
18228 Kingsville TX
18229 Chicago IL
18230 Pismo Beach CA
18231 Pismo Beach CA
18232 Lodi WI
18233 Anchorage AK
18234 Capitola CA
18235 Fountain Hills AZ
18236 Grant Park IL
18237 Spirit Lake IA
18238 Eagle River WI
18239 Eagle River WI
18240 Ybor FL

18241 rows × 2 columns

In [28]:
# iloc excludes 4 (compared to loc where it includes 4)
# iloc includes 0
ufo.iloc[:, 0:4]
Out[28]:
City Colors Reported Shape Reported State
0 Ithaca NaN TRIANGLE NY
1 Willingboro NaN OTHER NJ
2 Holyoke NaN OVAL CO
3 Abilene NaN DISK KS
4 New York Worlds Fair NaN LIGHT NY
5 Valley City NaN DISK ND
6 Crater Lake NaN CIRCLE CA
7 Alma NaN DISK MI
8 Eklutna NaN CIGAR AK
9 Hubbard NaN CYLINDER OR
10 Fontana NaN LIGHT CA
11 Waterloo NaN FIREBALL AL
12 Belton RED SPHERE SC
13 Keokuk NaN OVAL IA
14 Ludington NaN DISK MI
15 Forest Home NaN CIRCLE CA
16 Los Angeles NaN NaN CA
17 Hapeville NaN NaN GA
18 Oneida NaN RECTANGLE TN
19 Bering Sea RED OTHER AK
20 Nebraska NaN DISK NE
21 NaN NaN NaN LA
22 NaN NaN LIGHT LA
23 Owensboro NaN RECTANGLE KY
24 Wilderness NaN DISK WV
25 San Diego NaN CIGAR CA
26 Wilderness NaN DISK WV
27 Clovis NaN DISK NM
28 Los Alamos NaN DISK NM
29 Ft. Duschene NaN DISK UT
... ... ... ... ...
18211 Holyoke NaN DIAMOND MA
18212 Carson NaN DISK CA
18213 Pasadena GREEN FIREBALL CA
18214 Austin NaN FORMATION TX
18215 El Campo NaN OTHER TX
18216 Garden Grove ORANGE LIGHT CA
18217 Berthoud Pass NaN TRIANGLE CO
18218 Sisterdale NaN DIAMOND TX
18219 Garden Grove NaN CHEVRON CA
18220 Shasta Lake BLUE DISK CA
18221 Franklin NaN DISK NH
18222 Albrightsville NaN OTHER PA
18223 Greenville NaN NaN SC
18224 Eufaula NaN DISK OK
18225 Simi Valley NaN FORMATION CA
18226 San Francisco NaN FORMATION CA
18227 San Francisco NaN TRIANGLE CA
18228 Kingsville NaN LIGHT TX
18229 Chicago NaN DISK IL
18230 Pismo Beach NaN OVAL CA
18231 Pismo Beach NaN OVAL CA
18232 Lodi NaN NaN WI
18233 Anchorage RED VARIOUS AK
18234 Capitola NaN TRIANGLE CA
18235 Fountain Hills NaN NaN AZ
18236 Grant Park NaN TRIANGLE IL
18237 Spirit Lake NaN DISK IA
18238 Eagle River NaN NaN WI
18239 Eagle River RED LIGHT WI
18240 Ybor NaN OVAL FL

18241 rows × 4 columns

In [31]:
# this is the major difference
# exclusive of 3
ufo.iloc[0:3, :]
Out[31]:
City Colors Reported Shape Reported State Time
0 Ithaca NaN TRIANGLE NY 6/1/1930 22:00
1 Willingboro NaN OTHER NJ 6/30/1930 20:00
2 Holyoke NaN OVAL CO 2/15/1931 14:00
In [38]:
# non-explicit code
ufo[['City', 'State']]

# explicit code
ufo.loc[:, ['City', 'State']]
Out[38]:
City State
0 Ithaca NY
1 Willingboro NJ
2 Holyoke CO
3 Abilene KS
4 New York Worlds Fair NY
5 Valley City ND
6 Crater Lake CA
7 Alma MI
8 Eklutna AK
9 Hubbard OR
10 Fontana CA
11 Waterloo AL
12 Belton SC
13 Keokuk IA
14 Ludington MI
15 Forest Home CA
16 Los Angeles CA
17 Hapeville GA
18 Oneida TN
19 Bering Sea AK
20 Nebraska NE
21 NaN LA
22 NaN LA
23 Owensboro KY
24 Wilderness WV
25 San Diego CA
26 Wilderness WV
27 Clovis NM
28 Los Alamos NM
29 Ft. Duschene UT
... ... ...
18211 Holyoke MA
18212 Carson CA
18213 Pasadena CA
18214 Austin TX
18215 El Campo TX
18216 Garden Grove CA
18217 Berthoud Pass CO
18218 Sisterdale TX
18219 Garden Grove CA
18220 Shasta Lake CA
18221 Franklin NH
18222 Albrightsville PA
18223 Greenville SC
18224 Eufaula OK
18225 Simi Valley CA
18226 San Francisco CA
18227 San Francisco CA
18228 Kingsville TX
18229 Chicago IL
18230 Pismo Beach CA
18231 Pismo Beach CA
18232 Lodi WI
18233 Anchorage AK
18234 Capitola CA
18235 Fountain Hills AZ
18236 Grant Park IL
18237 Spirit Lake IA
18238 Eagle River WI
18239 Eagle River WI
18240 Ybor FL

18241 rows × 2 columns

In [40]:
# ambiguous code again, are we referring to rows or columns?
ufo[0:2]

# use iloc!
ufo.iloc[0:2, :]
Out[40]:
City Colors Reported Shape Reported State Time
0 Ithaca NaN TRIANGLE NY 6/1/1930 22:00
1 Willingboro NaN OTHER NJ 6/30/1930 20:00

.ix usage
Mix labels and integers when using selection.

In [41]:
drinks_url = 'http://bit.ly/drinksbycountry'
drinks = pd.read_csv(drinks_url, index_col='country')
In [42]:
drinks.head()
Out[42]:
beer_servings spirit_servings wine_servings total_litres_of_pure_alcohol continent
country
Afghanistan 0 0 0 0.0 Asia
Albania 89 132 54 4.9 Europe
Algeria 25 0 14 0.7 Africa
Andorra 245 138 312 12.4 Europe
Angola 217 57 45 5.9 Africa
In [43]:
drinks.ix['Albania', 0]
Out[43]:
89
In [44]:
drinks.ix[1, 'beer_servings']
Out[44]:
89
In [46]:
# for .ix, columns are exclusive of 2
drinks.ix['Albania':'Andorra', 0:2]
Out[46]:
beer_servings spirit_servings
country
Albania 89 132
Algeria 25 0
Andorra 245 138
In [48]:
# for rows, .ix is inclusive from start to end
# for columns, .ix is exclusive of end but inclusive of start
ufo.ix[0:2, 0:2]
Out[48]:
City Colors Reported
0 Ithaca NaN
1 Willingboro NaN
2 Holyoke NaN
Tags: pandas