Get unique characters in a Python string

How Do You Get Unique Characters in a String? Python Basics Explained

Retrieving the unique characters from a Python string is a very common operation you might have to implement in your code.

To get unique characters in a Python string, keep in mind that a Python string is a sequence of characters. If you want to remove duplicates from the string, you can use the built-in set() function. If you only want the characters that appear exactly once in the string, you can use collections.Counter and a list comprehension.

To make things clear there are two distinct scenarios here:

  • Getting all characters in a string after removing duplicates.
  • Retrieving characters in a string that are unique. In other words characters that only appear once in the string.

Let’s see how to do this with code!

How Do You Get Unique Characters From a String in Python?

There are multiple ways to get unique characters in a Python string.

In this section I will show you the quickest approach to write so you can solve this in your code and continue working on your project.

The goal here is to get all the characters in the string without including duplicates.

We will use the following principle…

A Python set is an unordered collection that doesn’t contain duplicate elements.

Let’s take the following string as an example:

>>> word = "London"

Firstly we will convert the string into a set using the built-in set() function.

>>> set(word)
{'d', 'L', 'o', 'n'}

As you can see we got back a set, and given that a set cannot contain duplicate elements, the letter ‘o’ is only present once. A set is unordered, so the order of the letters on your computer may differ from the one shown here.

Exactly what we want!

Now, if you want to get a string that contains all characters without duplicates you can use the string join method to create that string.

>>> "".join(set(word))
'dLon'

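And if you want to make sure you only get back lowercase letters, you can also use the string lower() method. Apply it to the word before creating the set, so that an uppercase letter and its lowercase form are treated as the same character.

>>> "".join(set(word.lower()))
'dlon'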

Makes sense?
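
Because a set has no defined order, the joined string can come out differently from one run to the next. If you need a reproducible result, one option is to sort the characters before joining them. The sketch below assumes alphabetical order is acceptable for your use case; the function name distinct_characters_sorted is only illustrative.

```python
def distinct_characters_sorted(text):
    """Return the distinct lowercase characters of text in alphabetical order."""
    # set() drops duplicates, sorted() returns a list in a deterministic order
    return "".join(sorted(set(text.lower())))


word = "London"
print(distinct_characters_sorted(word))  # dlno
```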

How to Get Unique Characters in a String and Preserve Their Order

Previously we have seen how to remove duplicate characters in a string, but using a set we couldn’t preserve the order of the characters.

If you also want to preserve the order of the characters we can do the following:

  • create an empty string that will contain the unique characters. We will call this variable unique_characters.
  • use a for loop that goes through each character of the initial string.
  • concatenate a character to the string unique_characters if the character doesn’t already exist in that string.

word = "London"

unique_characters = ""

for character in word:
    # Compare the lowercase form so that "L" and "l" count as the same character
    if character.lower() not in unique_characters:
        unique_characters += character.lower()

print("The list of unique characters is: {}".format(unique_characters))

Notice a few things you might find useful if you are just getting started with Python:

  • we have used not in to find out if a character is not part of the unique_characters string.
  • the + operator is used to concatenate a character to the unique_characters string.
  • to print the final message we have used the string format method.

And here is the output of our code:

The list of unique characters is: lond

That’s cool, the unique letters now appear in the same order as in the original string.
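
As a more compact alternative to the loop above, here is a minimal sketch that relies on dictionaries preserving insertion order (guaranteed from Python 3.7). dict.fromkeys() keeps the first occurrence of each key and ignores later ones. The function name distinct_characters_in_order is just an example.

```python
def distinct_characters_in_order(text):
    """Return the distinct lowercase characters of text in first-seen order."""
    # dict.fromkeys() keeps one key per character, in order of first appearance
    return "".join(dict.fromkeys(text.lower()))


print(distinct_characters_in_order("London"))  # lond
```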

How to Find Unique Ordered Characters in a String using a List and the String Join Method

We can obtain the same result from the previous section by using a Python list and the string join method.

Let’s see how the previous code changes…

We will make the following changes:

  • The unique_characters variable becomes a list instead of being a string.
  • Considering that we have to add elements to the unique_characters list we will use the list append() method instead of the + concatenation operator.
  • In order to create the final string of unique characters we will use the string join method and we will pass the unique_characters list to it.

Here is the updated code…

word = "London"

unique_characters = []

for character in word:
    # Compare the lowercase form so that "L" and "l" count as the same character
    if character.lower() not in unique_characters:
        unique_characters.append(character.lower())

print("The list of unique characters is: {}".format("".join(unique_characters)))

The output doesn’t change:

The list of unique characters is: lond

The new code works, but have a look at this.

To append a new character to our list we can either use the list append() method or the + concatenation operator:

>>> unique_characters = []
>>> character = 'a'
>>> unique_characters.append(character)
>>> unique_characters
['a']
>>> character = 'b'
>>> unique_characters += character
>>> unique_characters
['a', 'b']

Can you see that the effect of both on the list is the same? That holds here because each string we add is a single character.

Replace the following line in the code above:

unique_characters.append(character.lower())

With code that uses the concatenation operation:

unique_characters += character.lower()

And verify that the output of the code is the same.
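
Keep in mind that append() and += only give the same result for single characters. As a small illustration (the variable names are made up for this example), this is what happens with a two-character string:

```python
with_append = []
with_append.append("ab")   # adds the whole string as a single element

with_plus_equals = []
with_plus_equals += "ab"   # extends the list with each character of the string

print(with_append)         # ['ab']
print(with_plus_equals)    # ['a', 'b']
```

For this reason append() is usually the clearer choice when you want to add exactly one element to a list.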

Find Distinct Characters and Their Count in a Python String

This is a slightly different type of question…

Given a Python string we want to know which characters are unique in that string, in other words which characters appear only once.

We could do it using a for loop, but before doing that I want to show you a quick solution to this problem that uses Counter, a dictionary subclass that is part of the collections module.

Here is what we get back when we pass a string to collections.Counter.

>>> from collections import Counter
>>> word = "london"
>>> Counter(word)
Counter({'o': 2, 'n': 2, 'l': 1, 'd': 1})

We get back a dictionary where the characters in the string are the keys and the number of occurrences of each character in the string are the values.

To check which characters are unique in a string we have to get all the keys that have a value equal to 1.

We will use a list comprehension to do that.

>>> counts = Counter(word)
>>> [key for key in counts if counts[key] == 1]
['l', 'd']

Try this on your computer if it’s not immediately clear.
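
The same idea can be wrapped in a small reusable function. This is only a sketch: the function name characters_appearing_once is hypothetical, and it iterates over items() so that each character and its count are read together.

```python
from collections import Counter


def characters_appearing_once(text):
    """Return the characters that occur exactly once in text, in first-seen order."""
    counts = Counter(text)  # build the counts a single time
    return [character for character, count in counts.items() if count == 1]


print(characters_appearing_once("london"))       # ['l', 'd']
print(characters_appearing_once("mississippi"))  # ['m']
```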

Using a For Loop to Find Unique Characters in a String

The last exercise we will do in this tutorial is to use a for loop instead of a list comprehension to get the same result from the previous section.

We will use a for loop to check which characters are unique in a string.

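word = "London"

unique_characters = []
seen_characters = set()

for character in word.lower():
    if character not in seen_characters:
        # First occurrence: treat the character as unique for now
        seen_characters.add(character)
        unique_characters.append(character)
    elif character in unique_characters:
        # The character appeared before, so it is not unique
        unique_characters.remove(character)

print("The list of unique characters is: {}".format(unique_characters))

In the for loop we check if we have already seen a specific character, using the seen_characters set to keep track of every character found so far.

The first time we find a character we append it to the unique_characters list. If we find it again we remove it from the list, because we only want to know which characters are unique in our string.

Keeping track of the characters already seen matters for characters that appear three or more times: without it, the third occurrence would be appended to the list again.

To remove a character from the unique_characters list we use the list remove() method.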

And the output is exactly the same as the one we got in the previous example:

The list of unique characters is: ['l', 'd']
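
Another way to write the loop, sketched here as an alternative rather than as the method used above, is to count every character in a plain dictionary first and then keep only the characters whose count is 1. This also handles characters that appear three or more times. The function name unique_characters_with_loop is made up for this example.

```python
def unique_characters_with_loop(text):
    """Return the characters that appear exactly once in text (case-insensitive)."""
    counts = {}
    for character in text.lower():
        # dict.get() returns 0 the first time a character is seen
        counts[character] = counts.get(character, 0) + 1

    unique = []
    for character, count in counts.items():
        if count == 1:
            unique.append(character)
    return unique


print(unique_characters_with_loop("London"))  # ['l', 'd']
print(unique_characters_with_loop("banana"))  # ['b']
```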

Conclusion

In this tutorial we have learned how to:

  • get a list of the characters in a string without including duplicates.
  • generate a list of the characters that are unique in a string.

We have used multiple approaches:

  • set() function with string join method.
  • for loop with string + concatenation operator.
  • for loop with list append method.
  • collections.Counter and list comprehension.
  • for loop with append() and remove() list methods.

Have you found this useful? Which method do you prefer?
