Source: Python Morsels

Description

Return all unique items in an iterable. Return a lazy iterator (generator); work with non-hashable types; optimize for both hashable and non-hashable types.

1
2
3
4
5
6
7
8
>>> list(uniques_only([1, 2, 2, 1, 1, 3, 2, 1]))
[1, 2, 3]
>>> nums = [1, -3, 2, 3, -1]
>>> squares = (n**2 for n in nums)
>>> list(uniques_only(squares))
[1, 9, 4]
>>> list(uniques_only([['a', 'b'], ['a', 'c'], ['a', 'b']]))
[['a', 'b'], ['a', 'c']]

Notes

Solution involves using set() to optimize lookup of hashable items, and a list to accumulate them if you need to preserve order (because a set is unordered).

My solution

My first thought was to convert the input list to a set() to eliminate duplicates. However, sets are unordered and the provided tests required ordered output. So I used a list to accumulate values, checking the list of accumulated values to avoid adding the same value twice. Checking the entire list of accumulated values would make this slow for sizable input iterables, but that’s as far as I got on a busy day.

1
2
3
4
5
6
def uniques_only(iter):
    result = []
    for item in iter:
        if item not in result:
            result.append(item)
            yield item

Provided solutions

One possibility is to add accumulated values to a set. Sets use hashes for lookup, so this will likely be faster than a list for large inputs. Separately, a list is used to accumulate the output, because sets are unordered.

However, this solution will not work with non-hashable types.

1
2
3
4
5
6
7
8
def uniques_only(iterable):
    seen = set()
    items = []
    for item in iterable:
        if item not in seen:
            items.append(item)
            seen.add(item)
    return items

An optimized solution for both hashable and non-hashable types would require tracking both separately (hashables in a set, non-hashables in a list), and either (1) checking each item to see if it is an instance of Hashable (use isinstance for this) or (2) more Pythonically, letting an exception determine what happens: try to add it to a set, and if that throws a TypeError then add it to a list.