Artificial intelligence has come a long way. Its applications in text processing and natural language processing have eased the lives of authors and writers everywhere.
When we talk about artificial intelligence, how can we leave out Python? The two are almost synonymous at this point.
Python is an object-oriented programming (OOP) language that is simultaneously very powerful and much easier to use than other OOP languages.
Python has many libraries and pre-defined functions that allow a programmer to efficiently write code. In particular, the libraries related to mathematical functions and artificial intelligence are very fleshed out.
What is Paraphrasing?
Paraphrasing is the act of writing some text using different words, phrases, and sentence structures without changing its meaning. Paraphrasing is an important technique utilized in writing.
When a writer has to explain the ideas and concepts of another person in their own words, then they need to resort to paraphrasing. The need may arise because the original wording might be too difficult for readers to grasp.
Artificial intelligence is used to automate the process of paraphrasing. There exist many paraphrase tools that can paraphrase text automatically. However, programmers can also directly create a Python program to paraphrase text.
In this article, we are going to check out both methods of paraphrasing, a) by using Python, and b) by using paraphrasing tools.
How to Paraphrase Your Text Using Python?
There are plenty of different ways to create a paraphrasing program using Python. However, currently one of the most popular methods is to use transformers.
Transformers are artificial neural networks that learn how the words in a text relate to one another, so they capture context instead of treating each word in isolation. This makes them well suited to paraphrasing, as they are far less likely to butcher the meaning of the text.
In the example we are going to follow, we will be using the Pegasus model. It is a transformer that uses an encoder-decoder architecture for sequence-to-sequence learning. You can learn more about the Pegasus model by reading its documentation.
Pegasus stands for “Pre-training with Extracted Gap-Sentences for Abstractive Summarization”. Don’t be confused by the name. Abstractive summarization is closely related to paraphrasing: an abstractive summary restates the main points of a text in a concise form using different wording.
Now that we have covered the basics, let’s see how we can use Pegasus for paraphrasing.
- Set Up Your Environment
If you are an experienced Python user, then you probably have an IDE (Integrated Development Environment) set up already. But if you are a new user, fear not: simply use Google Colaboratory (Colab). It is a free notebook service that lets you run code without any setup. It is a cloud service, so you need an internet connection to use it.
Step 1 is to install the required dependencies. That is very simple to do; you only need to run the following three commands.
In text form (so that you can copy it for yourself)
!pip install sentence-splitter
!pip install transformers
!pip install SentencePiece
This will prompt the system to download and install these dependencies on the cloud computer allocated to you (or on your own system if you are not using Colaboratory; in a regular terminal, drop the leading “!”).
The output on Colaboratory looks something like this.
In this screenshot, there is more text because the system in question already has most of the dependencies installed, so the “requirement already satisfied” messages add to the output.
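Before moving on, it can help to confirm that the three packages are actually importable. The following is a minimal sketch using only the standard library. The helper name check_installed is hypothetical, and the import names (sentence_splitter, transformers, sentencepiece) are the module names these packages are assumed to expose.

```python
import importlib.util


def check_installed(module_names):
    """Return a dict mapping each module name to True if it can be imported."""
    return {
        name: importlib.util.find_spec(name) is not None
        for name in module_names
    }


# Import names assumed for the three packages installed above.
status = check_installed(["sentence_splitter", "transformers", "sentencepiece"])
for name, ok in status.items():
    print(f"{name}: {'installed' if ok else 'missing'}")
```

If any entry is reported as missing, rerun the matching install command before continuing.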
- Setting Up Pegasus
The next thing we need to do is import the Pegasus model and set it up, because it is what performs the paraphrasing.
To set up Pegasus, we need PyTorch first. PyTorch is a Python package that provides tensor computation and deep neural networks, and it is the framework that the Pegasus implementation used here runs on. Colab ships with it preinstalled; on your own machine you may need to run “pip install torch” first.
The code to import these packages and load the model is given below.
In text form:
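import torch
from transformers import PegasusForConditionalGeneration, PegasusTokenizer
# Pegasus checkpoint fine-tuned for paraphrasing
model_name = 'tuner007/pegasus_paraphrase'
# Use the GPU when one is available, otherwise fall back to the CPU
torch_device = 'cuda' if torch.cuda.is_available() else 'cpu'
# Download (on first run) and load the tokenizer and the model
tokenizer = PegasusTokenizer.from_pretrained(model_name)
model = PegasusForConditionalGeneration.from_pretrained(model_name).to(torch_device)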
Running this code should download several files. The PyTorch model .bin file is pretty large, so don’t worry if it takes a while to download.
Then we need to test whether the model is working. There is a simple way to do that: define a helper function, get_response, and call it on some sample text.
Enter your sample text in quotation marks, in place of the orange text in the picture above. After running “get_response(text, 5)”, the output will be a list of 5 slightly paraphrased sentences.
The output will look like this.
If you get an output with some paraphrased sentences then the model is working properly.
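The helper itself appears only as a screenshot in the original, so here is a hedged reconstruction rather than the exact code. It assumes the tokenizer and model behave like the Hugging Face objects loaded in the setup step. The names paraphrase_with and make_get_response, the length cap of 60 tokens, and the beam width of 10 are illustrative assumptions.

```python
from functools import partial


def paraphrase_with(tokenizer, model, device, input_text, num_return_sequences):
    """Return `num_return_sequences` paraphrase candidates for one sentence."""
    # Tokenize the sentence into tensors and move them to the model's device.
    batch = tokenizer(
        [input_text],
        truncation=True,
        padding="longest",
        max_length=60,  # assumed cap; longer inputs are truncated
        return_tensors="pt",
    ).to(device)
    # Beam search; num_beams must be at least num_return_sequences.
    generated = model.generate(
        **batch,
        max_length=60,
        num_beams=max(10, num_return_sequences),
        num_return_sequences=num_return_sequences,
    )
    # Convert the generated token ids back into plain strings.
    return tokenizer.batch_decode(generated, skip_special_tokens=True)


def make_get_response(tokenizer, model, device):
    """Bind the loaded objects so callers can use get_response(text, n)."""
    return partial(paraphrase_with, tokenizer, model, device)


# In the notebook, after the setup cell has run:
#   get_response = make_get_response(tokenizer, model, torch_device)
#   get_response("Your sample sentence goes here.", 5)
```

Passing the tokenizer and model in explicitly keeps the function easy to reuse with a different checkpoint, while make_get_response recovers the two-argument form used in the rest of this article.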
- Break the Text into Individual Sentences
The next step in paraphrasing is to break the provided sample text into sentences. This is because it is easier to paraphrase one sentence rather than an entire paragraph.
To do that, you first have to input your text. Do that by assigning it to a variable named “context”.
Your text should go to the right of the “=” sign, in quotation marks. If you haven’t made any mistakes, it should be orange in color.
The output should simply be the entire text you entered. On Colaboratory, it should show up beneath the cell you used to input the text.
Once the text is in the system, it needs to be split into a list of separate sentences. The code for that is given below.
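As a small illustration of this input step, the sketch below assigns a sample paragraph to the context variable. The sample text is only a placeholder, so swap in whatever you want to paraphrase.

```python
# Placeholder sample text; replace it with the paragraph you want to paraphrase.
context = (
    "Paraphrasing is the act of writing some text using different words. "
    "It is an important technique utilized in writing. "
    "Artificial intelligence can automate the process."
)

# In a notebook, a bare `context` on the last line of the cell displays it too.
print(context)
```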
In text form:
from sentence_splitter import SentenceSplitter
# Create an English-language splitter and break the text into sentences
splitter = SentenceSplitter(language='en')
sentence_list = splitter.split(context)
sentence_list
This will output a list of all the sentences in the paragraph. It will look like this:
- Paraphrase the Paragraph
This step uses a loop to go through the list and rephrase each sentence. Because the helper returns a list of candidates for each sentence, we then flatten those results into plain strings.
Finally, we combine them to make a coherent paragraph. The next part is mostly code, so just follow along.
First, we have to create a loop that goes through the list of sentences we generated in the previous step.
The code for that is as follows:
# Collect one paraphrase per sentence
paraphrase = []
for i in sentence_list:
    a = get_response(i, 1)
    paraphrase.append(a)
paraphrase
This will generate the following output, which is just another list but the sentences are rephrased.
Each entry in that list is itself a one-item list, so next we join each entry into a single string using the following code.
paraphrase2 = [' '.join(x) for x in paraphrase]
paraphrase2
The output looks similar to the previous one. Finally, we combine these sentences into a paragraph. The code for doing that is:
In text form:
# Join the sentences directly; converting a list with str() would leave
# brackets and quote characters that are fragile to strip afterwards
paraphrased_text = ' '.join(paraphrase2)
paraphrased_text
The output will be a paragraph (not a list of sentences). This paragraph is our final paraphrased version.
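The loop-and-join steps above can also be condensed into one function. This is a minimal sketch, not the original author's code: paraphrase_paragraph and fake_paraphrase are hypothetical names, and the stand-in paraphraser only uppercases its input so that the example runs without downloading the model.

```python
def paraphrase_paragraph(sentences, paraphrase_fn):
    """Paraphrase each sentence and join the results into one paragraph.

    `paraphrase_fn(sentence, n)` is expected to return a list of n candidate
    strings, like the get_response helper used in this article.
    """
    rewritten = []
    for sentence in sentences:
        candidates = paraphrase_fn(sentence, 1)
        if candidates:  # skip sentences that produced no candidate
            rewritten.append(candidates[0])  # keep the top candidate
    return " ".join(rewritten)


# Stand-in paraphraser so the sketch runs without the Pegasus model.
def fake_paraphrase(sentence, n):
    return [sentence.upper()] * n


print(paraphrase_paragraph(["First sentence.", "Second one."], fake_paraphrase))
# prints: FIRST SENTENCE. SECOND ONE.
```

In the notebook, you would call paraphrase_paragraph(sentence_list, get_response) instead of passing the stand-in.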
We can use a simple command to write both the original and rephrased paragraph together so that we can compare them.
Simply print the ‘context’ variable which contains the original text as well as print the ‘paraphrased_text’ variable whose name is very self-explanatory.
The output should look like this:
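A minimal sketch of that comparison is shown here. The helper name format_comparison and the two sample sentences are hypothetical; in the notebook you would pass the context and paraphrased_text variables instead.

```python
def format_comparison(original, paraphrased):
    """Build a labelled, two-block string for comparing the two versions."""
    return f"Original:\n{original}\n\nParaphrased:\n{paraphrased}"


# In the notebook: print(format_comparison(context, paraphrased_text))
print(format_comparison("The cat sat on the mat.", "A cat was sitting on the mat."))
```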
And that is how you can use Python to paraphrase text. The upside of this approach is that there are no word limits, so you can rewrite entire articles and essays. However, processing a very long text will take a lot of time.
(Freely available code, courtesy of Viktor Dey.)
3 Suggested Paraphrasing Tools
We just looked at how you can use Python for paraphrasing. Even with the help of tools such as Google Colaboratory, however, it was inconvenient and relatively time-consuming.
There is a way to avoid that hassle altogether: you can simply get help from an online paraphrasing tool. Most of these tools are free, and not the kind of free where you have to give your credit card info. Many don’t even require an account.
So, you can get assistance from an online paraphrasing tool with no strings attached. They do come with some drawbacks, such as ads and word limits, but that is an acceptable compromise.
Now we will look at some reliable paraphrasing tools that you can utilize.
- Paraphraser
The paraphrasing tool by Paraphraser.io is free and does not require registration. Hence it is an extremely accessible tool that anyone with an internet connection can use.
Accessibility features aside, the tool itself is also pretty good. It comes with multiple modes, two of which are free. They are called:
- Standard mode
- Fluency mode
Standard mode utilizes the synonym exchange technique to replace a few words with their synonyms. It does not make many changes to the text, but the output is still visibly different.
The other mode in this paraphraser is called Fluency mode. It changes the text considerably compared to the Standard mode. The highlight, however, is its ability to make the text more readable. The synonyms it uses are easier to read, and it replaces and restructures sentences that are confusing.
The other two modes in this paraphrasing tool are called:
- Creative
- Smarter
Both of these modes require the user to upgrade to premium. The only other downside (practicality-wise) is that only 1,000 words can be rephrased in one session.
- Linguix.com
The paraphrasing tool by Linguix.com is another great entry on this list. This is a tool that is more suited to learning how to paraphrase, as it requires more user input than most tools.
It is free to use, but it does require creating an account before the user can access its full features. It does not come with fancy modes, but it does provide many variations of each sentence. It is up to the user to choose which variation they want to use.
The method of using it is quite simple. Once the text has been uploaded/written in the tool’s interface, the user has to select it using either the mouse or the “Ctrl+A” shortcut. Then they have to click the small icon at the end of the text.
This opens a drop-down menu under the first selected sentence. This menu contains different paraphrased versions of the text. The user can select from these versions and move on to the next sentence.
Obviously, the drawback here is that the tool is not fully automatic, but the end result is much more personalized compared to other tools.
One great thing about this tool is that it does not have word limits. So, users can paraphrase very large documents if they like.
- Editpad.org
The final tool in our list is the paraphraser by Editpad.org. This tool is also free and does not require registration. But unlike the other two tools on this list, it does not have any premium features or options to upgrade its functionality.
The tool comes with four modes:
- Smooth
- Formal
- Smart
- Improver
The Smooth mode in this paraphrasing tool increases the readability of the text. It makes the text smoother to read.
The Formal mode rephrases the text to change its tone and make it sound more official and respectful.
Smart mode is kind of a mixed bag. It tries to make the text more unique, and to do that it adds some minor details that were not in the original, which changes the text considerably.
Improver mode does literally just that: it improves the text by making the sentence structure better and reducing unnecessary contractions.
Four free modes are a pretty good deal and the only downside to this tool is that it only lets users rephrase 1,000 words per session.
Conclusion
Python is a powerful programming language and it has found much success in the field of artificial intelligence. In this article, we looked at how Python and the Pegasus transformer can be used to paraphrase text.
The method was overall quite straightforward and there was not any complicated programming involved. Nevertheless, it was inconvenient because of the setup and installation processes involved.
Thus, we looked at the alternative as well, which was to use online paraphrasers. Three free tools were discussed along with their features, such as multiple modes. With the exception of Linguix, the tools had 1,000-word limits, which are quite restrictive.
It is up to user discretion which tools they want to use. People interested in programming and artificial intelligence can try the Python method as well.