Python sets (2.7.x)

Imagine that you have an online shop, and from your sales data you want to find all the countries you have ever sold and shipped a product to. You don’t care how many sales you made to each country, just that you have sold something there at least once.

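This is a good case for a set rather than a list: a set discards duplicates automatically, and checking whether a country is already present takes roughly constant time on average, whereas a list has to be scanned from the start each time.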
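
A minimal sketch of the shop example, assuming the sales data is a list of (order id, country) tuples. The `sales` records and their layout are invented for illustration. print() is called with a single argument so the snippet behaves the same on Python 2.7 and Python 3.

```python
# Hypothetical sales records: (order_id, country) pairs.
sales = [
    (1001, "France"),
    (1002, "Japan"),
    (1003, "France"),
    (1004, "Brazil"),
    (1005, "Japan"),
]

# Adding each country to a set discards repeats automatically.
countries = set()
for order_id, country in sales:
    countries.add(country)

print(len(countries))      # 3
print(sorted(countries))   # ['Brazil', 'France', 'Japan']
```

Five sales collapse to three countries without any explicit duplicate check.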

 

Sets

 

Sets can be described as follows:

  • Each element in a set is unique.
  • The elements are unordered within the set.
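
The two properties above can be checked directly. This is a small sketch; the `letters` name is only illustrative.

```python
# Duplicates in the input collapse to a single element.
letters = set(["a", "b", "a", "c", "b"])
print(len(letters))                          # 3

# Order is not part of a set's identity, so these compare equal.
print({"a", "b", "c"} == {"c", "a", "b"})    # True

# Sets have no positions, so indexing is not supported.
try:
    letters[0]
    indexable = True
except TypeError:
    indexable = False
print(indexable)                             # False
```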

 

Initialising a set

 

An empty set can be initialised by:

my_set = set()

A pre-populated set can be initialised by:

my_set = set(["one", 5, "hello"])

Or, using the set literal syntax (the preferred method):

my_set = {"one", 5, "hello"}

 

Note from the examples above that a set, like a list, can hold elements of different data types; they do not all have to be the same type. Unlike a list, however, every element must be hashable, so mutable objects such as lists cannot be stored in a set.
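
Two pitfalls when initialising sets are worth a quick sketch: empty braces do not create a set, and unhashable elements are rejected. The variable names here are illustrative only.

```python
# Empty braces create a dictionary, not a set.
empty_braces = {}
empty_set = set()
print(type(empty_braces) is dict)   # True
print(type(empty_set) is set)       # True

# Elements must be hashable: a tuple is accepted, a list is not.
mixed = {"one", 5, (1, 2)}
try:
    bad = {"one", [1, 2]}
    list_allowed = True
except TypeError:
    list_allowed = False
print(list_allowed)                 # False
```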

 

add

 

You can add an element to a set by using add():

Example Usage Script
my_set = set()
my_set.add("hello")
print my_set
Output
set(['hello'])
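
Because elements are unique, adding a value that is already present leaves the set unchanged. A short sketch, written so that it runs on both Python 2.7 and 3:

```python
my_set = set()
my_set.add("hello")
my_set.add("hello")        # already present, so nothing changes
my_set.add(5)

print(len(my_set))         # 2
print("hello" in my_set)   # True
```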

 

remove

 

You can remove an element from a set by using remove():

Example Usage Script
my_set = {"bop", "bit", 5}
my_set.remove("bop")
print my_set
Output
set(['bit', 5])
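
One detail to be aware of: remove() raises a KeyError if the element is not in the set. If a missing element should be ignored, the built-in discard() method can be used instead. A sketch of the difference (the "missing" value is just a placeholder):

```python
my_set = {"bop", "bit", 5}

# remove() raises KeyError when the element is absent.
try:
    my_set.remove("missing")
    raised = False
except KeyError:
    raised = True
print(raised)                  # True

# discard() removes the element if present and is silent otherwise.
my_set.discard("missing")
my_set.discard("bop")
print(my_set == {"bit", 5})    # True
```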

 

union

 

If you have two sets and want to create a new set with all the elements from both sets, you can use union():

Example Usage Script
set_one = {"hello", 12, 7}
set_two = {"apple", "hello", 7, 18}
 
set_union = set_one.union(set_two)
print set_union
Output
set([18, 'apple', 7, 12, 'hello'])
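
The same result is available through the | operator, and union() also accepts more than one argument. A brief sketch; `set_three` is an extra set invented for the example:

```python
set_one = {"hello", 12, 7}
set_two = {"apple", "hello", 7, 18}
set_three = {"pear", 12}   # hypothetical third set

# The | operator is equivalent to union() for two sets.
print((set_one | set_two) == set_one.union(set_two))   # True

# union() accepts several arguments at once.
all_items = set_one.union(set_two, set_three)
print(len(all_items))                                  # 6

# The original sets are left unchanged.
print(set_one == {"hello", 12, 7})                     # True
```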

 

intersection

 

If you have two sets and want to find the elements that are in both sets, you can use intersection():

Example Usage Script
set_one = {"hello", 12, 7}
set_two = {"apple", "hello", 7, 18}
 
set_intersection = set_one.intersection(set_two)
print set_intersection
Output
set(['hello', 7])
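
The & operator is the shorthand for intersection(), with one difference: the method accepts any iterable, while the operator needs a set on both sides. A sketch of both forms:

```python
set_one = {"hello", 12, 7}
set_two = {"apple", "hello", 7, 18}

# The & operator gives the same result as intersection().
print((set_one & set_two) == {"hello", 7})      # True

# The method form accepts any iterable, such as a list.
common = set_one.intersection(["hello", 99])
print(common == {"hello"})                      # True

# The operator form requires both operands to be sets.
try:
    set_one & ["hello", 99]
    operator_ok = True
except TypeError:
    operator_ok = False
print(operator_ok)                              # False
```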

 

difference

 

If you have two sets and want a new set containing the elements of the first set that do not appear in the second, you can use difference(), which leaves both original sets unchanged:

Example Usage Script
set_one = {"hello", 12, 7}
set_two = {"apple", "hello", 7, 18}
 
set_difference = set_one.difference(set_two)
print set_difference
Output
set([12])
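
Unlike union and intersection, the result of a difference depends on which set comes first. A short sketch using the - operator, which is equivalent to difference() for two sets:

```python
set_one = {"hello", 12, 7}
set_two = {"apple", "hello", 7, 18}

one_minus_two = set_one - set_two   # same as set_one.difference(set_two)
two_minus_one = set_two - set_one   # operands swapped

print(one_minus_two == {12})             # True
print(two_minus_one == {"apple", 18})    # True

# Neither original set has been modified.
print(len(set_one))                      # 3
```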

 

symmetric difference

 

If you have two sets and want to find the elements that appear in exactly one of them (in either set, but not in both), you can use symmetric_difference():

Example Usage Script
set_one = {"hello", 12, 7}
set_two = {"apple", "hello", 7, 18}
 
set_symmetric_difference = set_one.symmetric_difference(set_two)
print set_symmetric_difference
Output
set([18, 12, 'apple'])
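
One way to think about the symmetric difference is as the union of the two sets with their intersection taken out. The sketch below checks that equivalence using the ^ operator; the `by_hand` name is illustrative only.

```python
set_one = {"hello", 12, 7}
set_two = {"apple", "hello", 7, 18}

sym = set_one ^ set_two   # same as set_one.symmetric_difference(set_two)

# Equivalent construction: everything in either set, minus what they share.
by_hand = (set_one | set_two) - (set_one & set_two)
print(sym == by_hand)                  # True

# Unlike difference(), the order of the operands does not matter.
print(sym == (set_two ^ set_one))      # True
print(sym == {12, "apple", 18})        # True
```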