This report describes and evaluates in depth a similarity function for detecting similar test steps in manual test cases written in natural language. Using an industrial data set of 65 000 test steps, we show that even though the similarity function builds on standard functions from the open-source database Postgres, it finds similarities on par with what the state of the art suggests. Relatively few misclassifications were found. We also show that fine-tuning the function reduces the number of clusters of similar test steps by 13%. Manual inspection further shows that there is potential to reduce the set of clusters even more.