Imagine that you have an online shop, and from your sales data you want to find all the countries you have ever sold and shipped a product to. You don’t care how many sales you made to each country, just that you have sold something there at least once.
This is a good example of where using sets would be easier and faster than a solution that uses lists.
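As a rough sketch of the shop example, assuming Python 3 and assuming the sales data is a list of (order id, country) tuples (the `sales` list and its values are made up for illustration), a set keeps one entry per country no matter how many sales were made there:

```python
# Hypothetical sales records: (order_id, destination_country).
sales = [
    (1001, "France"),
    (1002, "Japan"),
    (1003, "France"),
    (1004, "Brazil"),
    (1005, "Japan"),
]

# List-based approach: check for duplicates by hand before appending.
countries_list = []
for _, country in sales:
    if country not in countries_list:  # scans the whole list each time
        countries_list.append(country)

# Set-based approach: duplicates are discarded automatically.
countries_set = {country for _, country in sales}

# sorted() is used only to get a stable order for display.
print(sorted(countries_set))  # ['Brazil', 'France', 'Japan']
```

Both approaches find the same three countries, but the set version needs no explicit duplicate check.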
Sets
Sets can be described as follows:
- Each element in a set is unique.
- The elements are unordered within the set.
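The two properties above can be checked with a short sketch (assuming Python 3; the variable names are illustrative only):

```python
# Duplicates collapse: five items go in, three distinct elements remain.
letters = set(["a", "b", "a", "c", "b"])
print(len(letters))  # 3

# Order is not part of a set, so these two sets are equal.
print({1, 2, 3} == {3, 2, 1})  # True

# Because there is no order, a set cannot be indexed by position.
try:
    letters[0]
    indexing_failed = False
except TypeError:
    indexing_failed = True
print(indexing_failed)  # True
```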
Initialising a set
An empty set can be initialised by:
    my_set = set()
A pre-populated set can be initialised by:
    my_set = set(["one", 5, "hello"])
Or (preferred method):
    my_set = {"one", 5, "hello"}
Note from the examples above that a set, like a list, can hold elements of different data types; they do not all have to be the same type.
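Two related details, not spelled out above, are easy to trip over when initialising sets: empty curly braces create a dictionary rather than a set, and set elements must be hashable, so a list cannot be an element. A minimal sketch, assuming Python 3:

```python
# Empty braces give a dict, not a set.
empty_dict = {}
empty_set = set()
print(type(empty_dict).__name__)  # dict
print(type(empty_set).__name__)   # set

# A list is mutable and unhashable, so it cannot be a set element.
try:
    bad = {[1, 2], "x"}
    unhashable_error = ""
except TypeError as err:
    unhashable_error = str(err)
print(unhashable_error != "")  # True

# A tuple is hashable and works fine alongside other types.
good = {(1, 2), "x", 5}
print(len(good))  # 3
```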
add
You can add an element to a set by using add():
    my_set = set()
    my_set.add("hello")
    print(my_set)
Output:
    {'hello'}
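Because every element is unique, adding a value that is already present leaves the set unchanged. The sketch below (assuming Python 3) shows this, along with update(), which adds several elements at once from any iterable:

```python
my_set = {"hello"}

# "hello" is already in the set, so this call has no effect.
my_set.add("hello")
print(len(my_set))  # 1

# update() adds each item of an iterable; the duplicate is ignored.
my_set.update(["world", "hello", 5])
print(len(my_set))  # 3
```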
remove
You can remove an element from a set by using remove():
    my_set = {"bop", "bit", 5}
    my_set.remove("bop")
    print(my_set)
Output (element order may vary):
    {'bit', 5}
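One thing to watch for: remove() raises a KeyError if the element is not in the set. If that case should be ignored silently, discard() can be used instead. A short sketch, assuming Python 3:

```python
my_set = {"bop", "bit", 5}

# remove() fails loudly when the element is missing.
try:
    my_set.remove("missing")
    remove_raised = False
except KeyError:
    remove_raised = True
print(remove_raised)  # True

# discard() does nothing when the element is missing...
my_set.discard("missing")

# ...and removes the element when it is present.
my_set.discard("bop")
print(len(my_set))  # 2
```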
union
If you have two sets and want to create a new set with all the elements from both sets, you can use union():
    set_one = {"hello", 12, 7}
    set_two = {"apple", "hello", 7, 18}
    set_union = set_one.union(set_two)
    print(set_union)
Output (element order may vary):
    {18, 'apple', 7, 12, 'hello'}
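As a supplementary sketch (assuming Python 3), the `|` operator gives the same result as union(), and neither form modifies the original sets. The union() method also accepts any iterable, such as a list:

```python
set_one = {"hello", 12, 7}
set_two = {"apple", "hello", 7, 18}

# The | operator is equivalent to union() when both operands are sets.
set_union = set_one | set_two
print(set_union == set_one.union(set_two))  # True

# The original sets are left untouched.
print(len(set_one), len(set_two))  # 3 4

# union() also takes a plain list (or any other iterable).
from_list = set_one.union(["apple", 18])
print(len(from_list))  # 5
```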
intersection
If you have two sets and want to find the elements that are in both sets, you can use intersection():
    set_one = {"hello", 12, 7}
    set_two = {"apple", "hello", 7, 18}
    set_intersection = set_one.intersection(set_two)
    print(set_intersection)
Output (element order may vary):
    {'hello', 7}
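A few properties of intersection can be sketched as follows (assuming Python 3): the `&` operator is shorthand for it, the order of the two sets does not matter, and isdisjoint() reports whether two sets share no elements at all:

```python
set_one = {"hello", 12, 7}
set_two = {"apple", "hello", 7, 18}

# & is the operator form of intersection().
common = set_one & set_two
print(common == {"hello", 7})  # True

# Swapping the operands gives the same result.
print(set_one.intersection(set_two) == set_two.intersection(set_one))  # True

# isdisjoint() is True when the intersection would be empty.
print(set_one.isdisjoint({"pear", 99}))  # True
print(set_one.isdisjoint(set_two))       # False
```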
difference
If you have two sets and want a new set containing the elements of the first set that do not appear in the second set, you can use difference():
    set_one = {"hello", 12, 7}
    set_two = {"apple", "hello", 7, 18}
    set_difference = set_one.difference(set_two)
    print(set_difference)
Output:
    {12}
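Unlike union and intersection, the result of difference depends on which set comes first. The sketch below (assuming Python 3) shows this using the `-` operator, which is the operator form of difference():

```python
set_one = {"hello", 12, 7}
set_two = {"apple", "hello", 7, 18}

# Elements of set_one that are not in set_two.
one_minus_two = set_one - set_two
print(one_minus_two == {12})  # True

# Swapping the operands gives a different answer.
two_minus_one = set_two - set_one
print(two_minus_one == {"apple", 18})  # True

# set_one itself is not modified; a new set is returned.
print(len(set_one))  # 3
```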
symmetric difference
If you have two sets and want to find the elements that appear in exactly one of the two sets, but not in both, you can use symmetric_difference():
    set_one = {"hello", 12, 7}
    set_two = {"apple", "hello", 7, 18}
    set_symmetric_difference = set_one.symmetric_difference(set_two)
    print(set_symmetric_difference)
Output (element order may vary):
    {18, 12, 'apple'}
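One way to think about the symmetric difference is as the union of the two sets minus their intersection. A brief sketch (assuming Python 3) checks this and shows the equivalent `^` operator:

```python
set_one = {"hello", 12, 7}
set_two = {"apple", "hello", 7, 18}

# ^ is the operator form of symmetric_difference().
sym = set_one ^ set_two

# Built by hand: everything in either set, minus what is in both.
by_hand = (set_one | set_two) - (set_one & set_two)

print(sym == by_hand)               # True
print(sym == {12, "apple", 18})     # True
```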