Creating word clouds in Python is easy thanks to a few open source libraries. Today, we’ll use the amueller word cloud library and matplotlib to draw some word clouds.
Word clouds are a useful way to see the general theme of a document at a glance. The more often a word appears in the document, the larger it is drawn in the word cloud, while less frequent words are drawn smaller.
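To make the idea concrete, here is a minimal sketch of the counting that sits behind a word cloud. It uses only the standard library; the helper names `word_frequencies` and `relative_sizes` and the sample sentence are made up for illustration. The word cloud library's real sizing and layout are more involved (for example, it filters common stop words), so treat this as a rough picture rather than the library's algorithm.

```python
import re
from collections import Counter


def word_frequencies(text):
    """Count how often each word appears, ignoring case."""
    # Illustrative tokenizer: runs of letters and apostrophes count as words.
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words)


def relative_sizes(counts):
    """Scale each count against the most frequent word (1.0 = largest)."""
    if not counts:
        return {}
    top = max(counts.values())
    return {word: count / top for word, count in counts.items()}


sample = "the rose is a rose and the rose smells sweet"
counts = word_frequencies(sample)
sizes = relative_sizes(counts)

print(counts.most_common(2))  # [('rose', 3), ('the', 2)]
print(sizes["rose"])          # 1.0, the most frequent word
```

A word that appears three times as often as another ends up with three times the relative size in this sketch, which is the intuition to keep in mind when reading a word cloud.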
Before we get started, you will need to install the prerequisites by running the following commands:
pip3 install wordcloud
pip3 install matplotlib
If you don’t have pip installed, see our article on getting started with Python, which has a section on how to install pip.
Downloading a lexicon
Before you can create your word cloud, you need a sample text to generate your image from. In this example, I will be using the works of Shakespeare. Compliments of Gutenberg.org, you can download the works of Shakespeare here.
If you don’t like the works of Shakespeare, you can also try the US Constitution by clicking here.
Creating your word cloud
Below is the commented code for creating your word cloud. The code assumes that your text file is in the same folder as the Python script you are executing.
from os import path
from wordcloud import WordCloud
import matplotlib.pyplot as plt
# Set the directory containing your lexicon
dirname = path.dirname(path.abspath(__file__))
# Read the whole text (assumes the file is UTF-8 encoded).
with open(path.join(dirname, 'shakespear.txt'), encoding='utf-8') as f:
    text = f.read()
# Generate a word cloud object and plot it on the x and y axis
wordcloud = WordCloud().generate(text)
plt.imshow(wordcloud, interpolation='bilinear')
# Turn off the axis. Otherwise you will see a bunch of extra numbers around the word cloud
plt.axis("off")
# Show the word cloud
plt.show()
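If you would rather save the image to disk than view it in a window, something like the following sketch may help. The function name `save_word_cloud` and the settings chosen here (image size, white background, word limit) are illustrative assumptions rather than part of the script above; `width`, `height`, `background_color`, `max_words`, and `to_file` come from the word cloud library.

```python
from pathlib import Path


def save_word_cloud(text, out_path, max_words=100):
    """Render `text` as a word cloud and write it to `out_path` as an image."""
    # Imported inside the function so this definition loads even if the
    # wordcloud package from the install step is not available yet.
    from wordcloud import WordCloud

    cloud = WordCloud(
        width=800,                 # image width in pixels
        height=400,                # image height in pixels
        background_color="white",  # the library's default background is black
        max_words=max_words,       # upper limit on how many words are drawn
    ).generate(text)
    cloud.to_file(str(out_path))   # writes the image file; no window is opened
    return Path(out_path)
```

With the `text` variable from the script above, calling `save_word_cloud(text, 'shakespeare_cloud.png')` would write the image next to wherever you run the script. The output file name here is only an example.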
Below are a couple of examples of word clouds using the works of Shakespeare and the US Constitution: