Imagine that you have an online shop, and from your sales data you want to find all the countries you have ever sold and shipped a product to. You don’t care how many sales you made to each country, just that you have sold something there at least once.
This is a good example of where using sets would be easier and faster than a solution that uses lists.
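As a minimal sketch of the idea, assume the sales data is a list of (order ID, country) pairs. The `sales` records and both function names below are made up for illustration. The list version has to scan its result list for every sale, while the set version relies on the set discarding duplicates.

```python
# Hypothetical sales records: (order_id, destination_country)
sales = [
    (1001, "France"),
    (1002, "Japan"),
    (1003, "France"),
    (1004, "Brazil"),
    (1005, "Japan"),
]


def countries_with_list(records):
    """Collect unique countries in a list; each membership test scans the list."""
    countries = []
    for _, country in records:
        if country not in countries:
            countries.append(country)
    return countries


def countries_with_set(records):
    """Collect unique countries in a set; duplicates are dropped automatically."""
    return {country for _, country in records}


print(countries_with_list(sales))         # ['France', 'Japan', 'Brazil']
print(sorted(countries_with_set(sales)))  # ['Brazil', 'France', 'Japan']
```

The set result is sorted before printing only to make the output predictable.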
Sets
Sets can be described as follows:
- Each element in a set is unique.
- The elements are unordered within the set, so the order in which they are printed may vary.
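The two properties above can be checked with a short sketch; the variable name `numbers` is only illustrative.

```python
# Duplicates are collapsed when the set is built.
numbers = {3, 1, 3, 2, 1}
print(len(numbers))  # 3

# Order plays no part in equality.
print({1, 2, 3} == {3, 2, 1})  # True

# Sets do not support indexing, because elements have no defined position.
try:
    numbers[0]
except TypeError:
    print("indexing a set raises TypeError")
```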
Initialising a set
An empty set can be initialised by:
my_set = set()
A pre-populated set can be initialised by:
my_set = set(["one", 5, "hello"])
Or, using a set literal (the preferred method):
my_set = {"one", 5, "hello"}
As the examples above show, a set, like a list, can hold elements of different data types (i.e. they do not all have to be the same data type). Unlike a list, every element of a set must be hashable, so mutable values such as lists cannot be elements.
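A few details of initialisation are easy to trip over, so here is a brief sketch of them. The variable names are illustrative only.

```python
empty_dict = {}    # curly braces on their own create a dictionary
empty_set = set()  # set() is how to get an empty set
print(type(empty_dict).__name__)  # dict
print(type(empty_set).__name__)   # set

# set() accepts any iterable; duplicates are removed.
letters = set("hello")
print(sorted(letters))  # ['e', 'h', 'l', 'o']

# Unhashable values such as lists cannot be elements of a set.
try:
    bad = {"one", [1, 2]}
except TypeError:
    print("a list cannot be a set element")
```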
add
You can add an element to a set by using add():
my_set = set()
my_set.add("hello")
print(my_set)
{'hello'}
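Because elements are unique, adding a value that is already present leaves the set unchanged. A small sketch:

```python
my_set = {"hello"}
my_set.add("hello")  # already present, so nothing changes
my_set.add("world")  # new element, so the set grows
print(len(my_set))        # 2
print("world" in my_set)  # True
```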
remove
You can remove an element from a set by using remove():
my_set = {"bop", "bit", 5}
my_set.remove("bop")
print(my_set)
{'bit', 5}
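One thing to watch for: remove() raises a KeyError if the element is not in the set. The built-in discard() method is an alternative that does nothing in that case. A short sketch of both behaviours:

```python
my_set = {"bop", "bit", 5}

try:
    my_set.remove("missing")
except KeyError:
    print("remove() raised KeyError")

my_set.discard("missing")  # no error, and the set is unchanged
my_set.discard("bop")      # removes "bop", just as remove() would
print(len(my_set))  # 2
```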
union
If you have two sets and want to create a new set with all the elements from both sets, you can use union():
set_one = {"hello", 12, 7}
set_two = {"apple", "hello", 7, 18}
set_union = set_one.union(set_two)
print(set_union)
{18, 'apple', 7, 12, 'hello'}
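The same operation can also be written with the | operator. The sketch below compares the two forms and shows that the original sets are left untouched; `with_list` is an illustrative name.

```python
set_one = {"hello", 12, 7}
set_two = {"apple", "hello", 7, 18}

print((set_one | set_two) == set_one.union(set_two))  # True
print(len(set_one))  # 3, the original set is unchanged

# union() accepts any iterable, while | requires both operands to be sets.
with_list = set_one.union(["apple", 18])
print(with_list == (set_one | set_two))  # True
```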
intersection
If you have two sets and want to find the elements that are in both sets, you can use intersection():
set_one = {"hello", 12, 7}
set_two = {"apple", "hello", 7, 18}
set_intersection = set_one.intersection(set_two)
print(set_intersection)
{'hello', 7}
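The & operator is an equivalent spelling, and isdisjoint() answers the related question of whether two sets share anything at all. A brief sketch, with `common` as an illustrative name:

```python
set_one = {"hello", 12, 7}
set_two = {"apple", "hello", 7, 18}

common = set_one & set_two
print(common == {"hello", 7})  # True

# isdisjoint() reports whether the intersection would be empty.
print(set_one.isdisjoint(set_two))      # False
print(set_one.isdisjoint({"pear", 3}))  # True
```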
difference
If you have two sets and want a new set containing only the elements of the first set that do not appear in the second, you can use difference(). The first set itself is not modified:
set_one = {"hello", 12, 7}
set_two = {"apple", "hello", 7, 18}
set_difference = set_one.difference(set_two)
print(set_difference)
{12}
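Unlike union and intersection, the result of difference depends on which set comes first. The sketch below uses the equivalent - operator to show both directions; the names `only_in_one` and `only_in_two` are illustrative.

```python
set_one = {"hello", 12, 7}
set_two = {"apple", "hello", 7, 18}

only_in_one = set_one - set_two
only_in_two = set_two - set_one
print(only_in_one)                   # {12}
print(only_in_two == {"apple", 18})  # True
print(set_one == {"hello", 12, 7})   # True, set_one itself is not modified
```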
symmetric difference
If you have two sets and want to find the elements that appear in exactly one of them, you can use symmetric_difference():
set_one = {"hello", 12, 7}
set_two = {"apple", "hello", 7, 18}
set_symmetric_difference = set_one.symmetric_difference(set_two)
print(set_symmetric_difference)
{18, 12, 'apple'}
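One way to think about the symmetric difference is as the union with the intersection taken out, or as the two one-way differences combined. The sketch below checks both views using the ^ operator; `sym` is an illustrative name.

```python
set_one = {"hello", 12, 7}
set_two = {"apple", "hello", 7, 18}

sym = set_one ^ set_two
print(sym == {12, "apple", 18})                          # True
print(sym == (set_one | set_two) - (set_one & set_two))  # True
print(sym == (set_one - set_two) | (set_two - set_one))  # True
```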