
Understanding the Levenshtein Distance Equation for Beginners

What?

I have minimal experience with matrices and have never taken Linear Algebra, so initially, I was bewildered. Eventually, however, I was able to piece together an elementary understanding of what is happening, which I’ll attempt to explain below. This explanation is meant for beginners: anyone who is confused by the Levenshtein distance equation or has never taken higher-level mathematics courses. Warning: if you don’t fall into that camp, you’ll probably find the explanation below needlessly tedious.

Introduction

The Levenshtein distance is a number that tells you how different two strings are. The higher the number, the more different the two strings are.

For example, the Levenshtein distance between “kitten” and “sitting” is 3 since, at a minimum, 3 edits are required to change one into the other.

An “edit” is an insertion of a character, a deletion of a character, or a replacement of one character with another.
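To make the three edit types concrete, here is a minimal Python sketch. The helper names (replace_at, insert_at, delete_at) are made up for illustration, and positions are 0-indexed as usual in Python. It walks through one possible sequence of 3 edits that turns “kitten” into “sitting”:

```python
def replace_at(s, pos, ch):
    """Replace the character at 0-indexed position pos with ch."""
    return s[:pos] + ch + s[pos + 1:]


def insert_at(s, pos, ch):
    """Insert ch before 0-indexed position pos."""
    return s[:pos] + ch + s[pos:]


def delete_at(s, pos):
    """Delete the character at 0-indexed position pos."""
    return s[:pos] + s[pos + 1:]


word = "kitten"
word = replace_at(word, 0, "s")  # "sitten": replace k with s
word = replace_at(word, 4, "i")  # "sittin": replace e with i
word = insert_at(word, 6, "g")   # "sitting": insert g at the end
print(word)

# The third edit type, shown on a different word:
print(delete_at("cats", 3))  # "cat"
```

No shorter sequence of edits exists for this pair, which is why the distance is 3 and not lower.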

Functions

A quick refresher if you haven’t looked at functions recently… The first thing to understand is that a function is a set of instructions that takes a given input and yields an output. You probably saw lots of basic functions in your high school math courses.

Piecewise Functions

Piecewise functions are more complex functions. In a piecewise function, there are multiple sets of instructions. You choose one set over another based on a certain condition. Consider the example below:

In the above example, we use different sets of instructions based on what the input is. Piecewise functions are denoted by the brace { symbol.
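As a stand-in illustration (not the author's original example), here is a hypothetical piecewise function, the absolute value, sketched in Python. Each branch of the if statement plays the role of one row next to the brace:

```python
def absolute_value(x):
    """Piecewise: return x when x >= 0, otherwise return -x."""
    if x >= 0:
        return x
    return -x


print(absolute_value(5))   # 5
print(absolute_value(-5))  # 5
```

The condition (is x at least 0?) decides which set of instructions runs, which is the same idea the Levenshtein equation relies on.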

With that in mind, the Levenshtein Distance equation should look a little more readable.
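For reference, and assuming the equation in question is the standard piecewise definition (the form commonly shown on the Levenshtein Distance Wikipedia page), it can be written in LaTeX as:

```latex
\operatorname{lev}_{a,b}(i,j) =
\begin{cases}
  \max(i,j) & \text{if } \min(i,j) = 0, \\[4pt]
  \min
  \begin{cases}
    \operatorname{lev}_{a,b}(i-1,j) + 1 \\
    \operatorname{lev}_{a,b}(i,j-1) + 1 \\
    \operatorname{lev}_{a,b}(i-1,j-1) + 1_{(a_i \neq b_j)}
  \end{cases} & \text{otherwise.}
\end{cases}
```

The first row handles the case where one of the two positions is 0 (an empty prefix). The second row takes the smallest of three options, which correspond to a deletion, an insertion, and a replacement (or a free match when the two characters are equal).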

[Figures: the original equation, followed by the same equation restated in plain words]

What do a, b, i, and j stand for?

a = string #1

b = string #2

i = the terminal character position of string #1

j = the terminal character position of string #2.

The positions are 1-indexed. Consider the example below, where we compare the string “cat” with the string “cap”:
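A minimal Python sketch of the same comparison follows. The helper names char_at and prefix are hypothetical, and the only subtlety is that Python strings are 0-indexed, so position i in the equation maps to index i - 1. Under the standard definition, the value at (i, j) is the distance between the first i characters of a and the first j characters of b, which is what prefix shows:

```python
a = "cat"
b = "cap"


def char_at(s, position):
    """Return the character at a 1-indexed position (Python itself is 0-indexed)."""
    return s[position - 1]


def prefix(s, position):
    """Return the first `position` characters of s."""
    return s[:position]


i = len(a)  # 3, the terminal character position of string #1
j = len(b)  # 3, the terminal character position of string #2

print(char_at(a, i), char_at(b, j))  # t p
print(prefix(a, 2), prefix(b, 2))    # ca ca
```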

The conditional (aᵢ ≠ bⱼ)

aᵢ refers to the character of string a at position i

bⱼ refers to the character of string b at position j

We want to check that these are not equal, because if they are equal, no edit is needed, so we should not add 1. Conversely, if they are not equal, we want to add 1 to account for a necessary edit.
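In code, this conditional is just a comparison that yields 1 or 0. A small sketch, where mismatch is a made-up name and positions are 1-indexed to match the equation:

```python
def mismatch(a, b, i, j):
    """Return 1 if a_i != b_j (1-indexed positions), otherwise 0."""
    return 1 if a[i - 1] != b[j - 1] else 0


print(mismatch("cat", "cap", 1, 1))  # 0, because "c" == "c"
print(mismatch("cat", "cap", 3, 3))  # 1, because "t" != "p"
```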

Solving Using a Matrix

The Levenshtein distance for strings a and b can be calculated by using a matrix. It is initialized in the following way:
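One way to sketch the initialization in Python is shown here, under the assumption that string a runs down the rows and string b runs across the columns (the opposite orientation works equally well). The function name init_matrix is hypothetical. The first row and first column come straight from the max(i, j) case of the equation, and None marks the cells that still need to be filled in:

```python
def init_matrix(a, b):
    """Build a (len(a)+1) x (len(b)+1) matrix with only row 0 and column 0 filled."""
    rows, cols = len(a) + 1, len(b) + 1
    matrix = [[None] * cols for _ in range(rows)]
    for i in range(rows):
        matrix[i][0] = i  # min(i, 0) == 0, so the value is max(i, 0) == i
    for j in range(cols):
        matrix[0][j] = j  # min(0, j) == 0, so the value is max(0, j) == j
    return matrix


for row in init_matrix("cat", "cap"):
    print(row)
# [0, 1, 2, 3]
# [1, None, None, None]
# [2, None, None, None]
# [3, None, None, None]
```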

From here, our goal is to fill out the entire matrix starting from the upper-left corner. Afterwards, the bottom-right corner will yield the Levenshtein distance.

Let’s fill out the matrix by following the piecewise function.

Now we can fill out the rest of the matrix by applying the same piecewise function to every remaining cell.
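The whole fill-in procedure can be sketched in a few lines of Python. This is an illustrative version rather than a reference implementation: the name levenshtein_matrix is made up, and it again assumes a runs down the rows and b across the columns. Each cell takes the minimum of the cell above plus 1, the cell to the left plus 1, and the diagonal cell plus the mismatch cost:

```python
def levenshtein_matrix(a, b):
    """Return the fully filled matrix for strings a (rows) and b (columns)."""
    rows, cols = len(a) + 1, len(b) + 1
    matrix = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        matrix[i][0] = i
    for j in range(cols):
        matrix[0][j] = j

    for i in range(1, rows):
        for j in range(1, cols):
            cost = 1 if a[i - 1] != b[j - 1] else 0
            matrix[i][j] = min(
                matrix[i - 1][j] + 1,         # cell above: deletion
                matrix[i][j - 1] + 1,         # cell to the left: insertion
                matrix[i - 1][j - 1] + cost,  # diagonal: replacement or match
            )
    return matrix


matrix = levenshtein_matrix("cat", "cap")
for row in matrix:
    print(row)
# [0, 1, 2, 3]
# [1, 0, 1, 2]
# [2, 1, 0, 1]
# [3, 2, 1, 1]
print(matrix[-1][-1])  # 1
```

The bottom-right value is 1 because a single replacement (t to p) turns “cat” into “cap”.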

One more example:

If you feel comfortable with the equation at this point, try to fill out the rest of the matrix. The result is posted below:

Since the lower-right corner is 3, the Levenshtein distance between “kitten” and “sitting” is 3, which matches the value given in the introduction. The same matrix appears on the Levenshtein Distance Wikipedia page.
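As a cross-check, the piecewise function can also be translated almost word for word into a recursive Python function. The sketch below (with the hypothetical name levenshtein_table) uses functools.lru_cache so that each (i, j) pair is only computed once, and then evaluates the function at every position to rebuild the completed matrix, with “kitten” down the rows and “sitting” across the columns:

```python
from functools import lru_cache


def levenshtein_table(a, b):
    """Evaluate the piecewise definition at every (i, j) and return the table."""

    @lru_cache(maxsize=None)
    def lev(i, j):
        if min(i, j) == 0:
            return max(i, j)
        cost = 1 if a[i - 1] != b[j - 1] else 0
        return min(
            lev(i - 1, j) + 1,
            lev(i, j - 1) + 1,
            lev(i - 1, j - 1) + cost,
        )

    return [[lev(i, j) for j in range(len(b) + 1)] for i in range(len(a) + 1)]


table = levenshtein_table("kitten", "sitting")
for row in table:
    print(row)
# [0, 1, 2, 3, 4, 5, 6, 7]
# [1, 1, 2, 3, 4, 5, 6, 7]
# [2, 2, 1, 2, 3, 4, 5, 6]
# [3, 3, 2, 1, 2, 3, 4, 5]
# [4, 4, 3, 2, 1, 2, 3, 4]
# [5, 5, 4, 3, 2, 2, 3, 4]
# [6, 6, 5, 4, 3, 3, 2, 3]
```

The recursion and the matrix are two views of the same computation: the matrix simply stores the answers to the smaller subproblems so they can be looked up instead of recomputed.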

Conclusion

Oh, I see!

Hopefully, the above looks less intimidating to you now. Even better, I hope you understand what it means!

Vladimir Levenshtein, inventor of the Levenshtein algorithm

