Python String Operations

A string is one of the many data types in Python. Other common ones include objects, lists, integers, and dates. At its core, a string is a sequence of characters, much like a list or array. Today we will cover Python string operations: how to concatenate, tokenize, manipulate, and search strings, and how to create string templates.

Concatenate

Concatenating strings is just taking two separate strings and gluing them together. In Python, it is extremely easy. You simply use the ‘+’ symbol to add the strings together.

For example, you might have two strings, phrase with the value “Is tired” and name with the value “Sean”.

If you want to create a single string out of these two strings, you concatenate them. You can either create a third string or modify one of the existing ones to contain the resulting string.

An example of how you might combine these two strings into a new one is:
newstring = phrase + name

The resulting new string will be equal to:
“Is tiredSean”

Notice that there is no space between tired and Sean. That is not a typo. We combined the two strings, but neither one contained a space at the join, so the words got stuck together. To get around this, we need to add a space. An example of that would be:
newstring = phrase + " " + name

The resulting value of newstring would be:
“Is tired Sean”

You can confirm by running:
print(newstring)
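
Putting the steps above together, a minimal sketch of the whole concatenation example could look like this. The values of phrase and name are inferred from the outputs shown in this section.

```python
# Values inferred from the outputs shown in this section
phrase = "Is tired"
name = "Sean"

# Gluing the strings together directly leaves no space between the words
newstring = phrase + name
print(newstring)  # Is tiredSean

# Adding a one-space string in the middle fixes that
newstring = phrase + " " + name
print(newstring)  # Is tired Sean
```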

String Templates

You can get away without using templates in Python. You can do a lot of string concatenations to construct the string you need. But as you do more of those, they become unwieldy. This is where templates come into play. If you have a given sequence of text that you use all the time, and you just need to do some substitutions, templates are a good way of solving that issue.

To begin we will create our template:

Next, we use the substitute function to fill in the variable:

The output will be:
“I like to play Baseball”
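
As a rough sketch, the two steps above could be written with the Template class from Python's standard string module. The placeholder name $sport and the variable names are assumptions chosen for illustration.

```python
from string import Template

# "$sport" marks a placeholder; the name "sport" is a made-up example
template = Template("I like to play $sport")

# substitute() returns a new string with the placeholder filled in
sentence = template.substitute(sport="Baseball")
print(sentence)  # I like to play Baseball
```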

This becomes even more useful when the template contains multiple variables:

The output would be:
“I like to Cook Food”
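
A template with two placeholders might look like the sketch below. The placeholder names $action and $thing are hypothetical; any valid identifier would work.

```python
from string import Template

# Two placeholders; the names "action" and "thing" are illustrative
template = Template("I like to $action $thing")

# Each placeholder is filled from the keyword argument of the same name
sentence = template.substitute(action="Cook", thing="Food")
print(sentence)  # I like to Cook Food
```

If a placeholder is left without a value, substitute() raises a KeyError, which helps catch missing data early.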

Manipulating

When working with strings, there are all kinds of reasons you might need to modify them. Perhaps you are trying to compare two strings, or perhaps you are preparing the data before you insert it into a database. Here are a few common operations you may need to perform on a string.

Convert to upper or lower case

When you compare strings, it is helpful for both strings to be in the same case. It doesn’t matter whether that is all upper or all lower case. In most programming languages, the strings “Sean” and “sEan” are two different strings. In our example, we will use the following two strings:

Converting them to all uppercase or all lowercase is very easy:

In the two examples above, we replace the original string with its upper or lower case version by assigning the result back to the same variable. However, we don’t have to change the original string. We could print the string in all caps by running:

Or we could compare the strings with an if statement:
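
Here is one way the case conversions and the comparison described above might look. The variable names and string values are assumptions for illustration, not the article's original examples.

```python
# Hypothetical example strings that differ only by case
string1 = "Sean"
string2 = "sEan"

# upper() and lower() return new strings; reassign to keep the result
string1 = string1.upper()  # "SEAN"
string2 = string2.lower()  # "sean"

# Convert only for display, leaving the variable unchanged
print(string2.upper())  # SEAN

# Compare without regard to case by lowering both sides
if string1.lower() == string2.lower():
    print("The strings match")
else:
    print("The strings are different")
```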

Strip the whitespace and characters from a string

At times you will have strings that have some extra characters that need to be removed. Let’s take the following two examples:

In String1 we have a bunch of extra spaces at both the beginning and end of the string. We can remove these extra spaces by using the strip() function as shown here:

The above code will remove all of the extra spaces. You can confirm by running:

Next we have String2. It has a similar problem to String1: a bunch of extra hash marks at both ends. We can use the strip() function on this string as well; we just have to pass an extra argument. By default the strip() function removes leading and trailing whitespace, but we can pass in any character we want. For example, if we run:

The output will be:
Wasn’t that Awesome?
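
A minimal sketch of both strip() calls follows. The value of String1 is an assumption, and String2 is reconstructed from the outputs shown in this section.

```python
# Assumed example values; String2 is reconstructed from the article's outputs
String1 = "     Hello World     "
String2 = "#######Wasn't that Awesome?########"

# With no argument, strip() removes leading and trailing whitespace
String1 = String1.strip()
print(String1)  # Hello World

# With an argument, strip() removes those characters from both ends
print(String2.strip("#"))  # Wasn't that Awesome?
```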

If you want to remove characters from only one side or the other, you can use the lstrip() and rstrip() functions. For example, in the case of String2:

String2.lstrip("#") would output:
“Wasn’t that Awesome?########”

String2.rstrip("#") would output:
“#######Wasn’t that Awesome?”
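
The one-sided variants can be sketched the same way, again assuming a String2 reconstructed from the outputs above.

```python
# String2 reconstructed from the outputs shown above
String2 = "#######Wasn't that Awesome?########"

# lstrip() trims only the left side
left_trimmed = String2.lstrip("#")
print(left_trimmed)   # Wasn't that Awesome?########

# rstrip() trims only the right side
right_trimmed = String2.rstrip("#")
print(right_trimmed)  # #######Wasn't that Awesome?
```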

Next, what if we want to replace a word or character in the middle of a string? We can use the replace() function for that. The following will replace the word “that” with nothing, effectively removing it from the string:

Or, we could insert additional text:
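
Both uses of replace() might look like the sketch below. The variable names and the inserted text are assumptions for illustration.

```python
sentence = "Wasn't that Awesome?"

# Replacing "that" with an empty string removes the word,
# but the spaces on both sides of it remain
removed = sentence.replace("that", "")
print(removed)  # Wasn't  Awesome?

# Replacing with longer text inserts additional words
expanded = sentence.replace("that", "that really")
print(expanded)  # Wasn't that really Awesome?
```

Note the double space left behind in the first result. Replacing "that " (with a trailing space) instead would avoid it.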

What if we want to remove some of the # marks from the beginning of a string, but not all of them? For that we don’t use a function. Instead, we slice the string as follows:

The above will remove the first six characters from the string. So the output would be:
#Wasn’t that Awesome?########

This operation requires a little bit more explanation. As was stated earlier, a string is a list/array of characters. In this operation we have told the system to show us String2 starting from index 6 and continuing all the way to the end of the array. If we wanted to remove just the first character, we could run:

This works because the first character in the list has an index of zero. So when you start the slice at index 1, you skip the first character.
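
The two slices described above can be sketched as follows, assuming a String2 reconstructed from the outputs in this section.

```python
# String2 reconstructed from the outputs shown above
String2 = "#######Wasn't that Awesome?########"

# [6:] keeps everything from index 6 onward, dropping the first six characters
print(String2[6:])  # #Wasn't that Awesome?########

# [1:] starts at index 1, so only the first character (index 0) is dropped
print(String2[1:])  # ######Wasn't that Awesome?########
```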

You can also use this method to remove the last several characters from a string, but first you have to know how many characters are in the array. You can find that information with the len() function. Example:

Once you know the length of your string, you know the index of its last character. In our case, String2 is 37 characters long, so counting from zero, the last character in the string has an index location of 36.
If we want to strip the last character from our string, we would run:

The output would be the original string, minus the last character. You can combine both operations to remove both the first and last characters with the following:
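
To keep the indexes easy to follow, this sketch uses a short made-up string rather than String2. The same slices work on a string of any length.

```python
# A short hypothetical string so the indexes are easy to count
text = "#Hello#"

length = len(text)
print(length)  # 7, so the last character has index 6

# Keep everything up to, but not including, the last index
without_last = text[:length - 1]
print(without_last)  # #Hello

# A negative index counts from the end, so this is equivalent
print(text[:-1])  # #Hello

# Drop both the first and the last character
print(text[1:-1])  # Hello
```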

Searching

Python has a find() function which allows you to search strings for other strings. In this example, we will use the following three strings:

The first thing we want to know is whether String1 contains the word “drive”. To find out, we will run:

Or, we could run:

If String1 contains the word “drive”, the function will return the index location where it found the word. In this case, it should return 13.

Next, let’s do a search for a word that does not exist:

String1 does not contain the word “orange”, so it will return:
-1
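
A minimal sketch of both searches is shown here. It assumes String1 has the same value used in the Tokenizing section, which is consistent with the index of 13 mentioned above. The "in" operator at the end is an extra option, not necessarily the alternative the article had in mind.

```python
# Assumed to match String1 from the Tokenizing section
String1 = "I went for a drive to the store"

# find() returns the index where the match begins
print(String1.find("drive"))   # 13

# When the text is not found, find() returns -1
print(String1.find("orange"))  # -1

# If you only need a yes/no answer, the "in" operator also works
print("drive" in String1)      # True
```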

Now that we know a bit about searching for words within strings, we should make one more enhancement to this process. These searches are case sensitive, so the word “Drive” is not the same as the word “drive”. Before we do our searches, we should convert all of our strings to lowercase using the .lower() function. Here is one example of doing that:
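
One possible version of that lowercase search, using a hypothetical string with a capitalized “Drive”:

```python
# Hypothetical string containing a capitalized "Drive"
String1 = "I went for a Drive to the store"

# A case-sensitive search misses the capitalized word
print(String1.find("drive"))          # -1

# Lowercasing the string first makes the search case-insensitive
print(String1.lower().find("drive"))  # 13
```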

Tokenizing Strings

Tokenizing strings is when you take a string and break it up into tokens that you can work with individually. An example of this is converting an existing string into a list or array. The simple way to do this is with the .split() function.
String1 = "I went for a drive to the store"
String2 = "Orange,Apple,Grape,Kiwi"

If we run:

Array1 will be an array of the words from String1.
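
A short sketch of that default split, using the String1 defined above:

```python
String1 = "I went for a drive to the store"

# With no argument, split() breaks the string on whitespace
Array1 = String1.split()
print(Array1)
# ['I', 'went', 'for', 'a', 'drive', 'to', 'the', 'store']
```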

Alternatively, we can run:

By default, the split() function splits the string on whitespace. But you can pass in other characters as well. In this case we are performing the split based on the commas in our string. Now that we have an array, we can get the first word from the array by running:

Or we could print each word one at a time by running:
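
The comma split, the first-element lookup, and the loop might be written like this. The name Array2 follows the article's own usage, while the loop variable word is an arbitrary choice.

```python
String2 = "Orange,Apple,Grape,Kiwi"

# Passing "," splits on commas instead of whitespace
Array2 = String2.split(",")

# Lists are indexed from zero, so [0] is the first word
print(Array2[0])  # Orange

# Loop over the list to print one word per line
for word in Array2:
    print(word)
```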

Once we are done working with the array, we might need to convert it back into a string. You can do that with the join() function. To use the join function, we specify the separating character we want between each word, and then call the function. For example, if we want to have a "-" between each word in our new string, we would run:

The above will create a new string called NewString. It will take every element in Array2 and insert it into NewString separated by a "-". The output would look like this:
“Orange-Apple-Grape-Kiwi”
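
In code, that join could look like the following sketch. Note that join() is called on the separator string and takes the list as its argument.

```python
Array2 = ["Orange", "Apple", "Grape", "Kiwi"]

# The separator comes first; the list to join is passed in
NewString = "-".join(Array2)
print(NewString)  # Orange-Apple-Grape-Kiwi
```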

You can use whatever separator you want. You could do a space separator:

Or a Tab Separator:
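
Both alternatives can be sketched as follows. The variable names space_separated and tab_separated are illustrative.

```python
Array2 = ["Orange", "Apple", "Grape", "Kiwi"]

# A single space between each word
space_separated = " ".join(Array2)
print(space_separated)  # Orange Apple Grape Kiwi

# "\t" is the escape sequence for a tab character
tab_separated = "\t".join(Array2)
print(tab_separated)
```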
