dev@glassfish.java.net

Re: "Did you mean" in asadmin

From: Cheng Fang <Cheng.Fang_at_Sun.COM>
Date: Fri, 11 Aug 2006 16:26:55 -0400

The basic idea of string edit distance is, given two strings, to
compute the minimum number of single-character inserts, removes, and
replaces it takes to get from one to the other.
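
A minimal sketch of that computation in Java, assuming the usual
dynamic-programming formulation. This is not the EditDistance.java linked
in this thread; the class and method names are made up for illustration.

```java
// Illustrative sketch only; EditDistanceSketch and distance() are hypothetical names.
final class EditDistanceSketch {

    /**
     * Returns the minimum number of single-character inserts, removes,
     * and replaces needed to turn a into b.
     */
    static int distance(String a, String b) {
        // prev[j] holds the distance between the first (i - 1) chars of a
        // and the first j chars of b; cur[j] is the same for i chars of a.
        int[] prev = new int[b.length() + 1];
        int[] cur = new int[b.length() + 1];

        // Turning the empty string into the first j chars of b takes j inserts.
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }

        for (int i = 1; i <= a.length(); i++) {
            // Turning the first i chars of a into the empty string takes i removes.
            cur[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int replace = prev[j - 1] + (a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1);
                int remove = prev[j] + 1;
                int insert = cur[j - 1] + 1;
                cur[j] = Math.min(replace, Math.min(remove, insert));
            }
            // Reuse the two rows instead of allocating a full table.
            int[] tmp = prev;
            prev = cur;
            cur = tmp;
        }
        return prev[b.length()];
    }
}
```

With this definition, distance("foo", "bar") is 3 and
distance("foo", "foolish") is 4, while distance("startDomain",
"start-domain") is only 2.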

>
> So sure, you can sort them in order, or cut by certain threshold, etc.
>
> The code I wrote isn't going to scale for a large dataset, but
> since we are talking about 100 max, it's not a problem.
>
> http://fisheye5.cenqua.com/browse/jaxb2-sources/jaxb-ri/runtime/src/com/sun/xml/bind/v2/util/EditDistance.java?r=1.1
>
>
>> "Your command "foo" was not recognized. Perhaps you meant 'go-foo',
>> 'foo-bar', or 'foo-lish'?"
>
> Edit distance computation won't find this kind of match, for the
> reasons Kedar mentioned. "foo" is closer to "bar" (3 steps) than
> "foolish" (4 steps), according to the definition of the edit distance.
>
> But it does find "list-components" from "list-comopnent" or
> "start-domain" from "startDomain", things like that.
>
I would really like to see a singular-plural correction.

I suggest we compile a list of common typos and have asadmin consult this
static mapping first, then fall back to a matching algorithm. That way,
when we learn of a new common typo, we can add it directly to the static
mapping without retrofitting our code to produce the result.

With this static mapping, we could also add this feature right away with
minimal coding and no new dependencies.
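
To make the idea concrete, here is a rough Java sketch of "static mapping
first, then a matching algorithm". Everything in it is an assumption for
illustration: CommandSuggester, addKnownTypo, suggest and the maxDistance
cutoff are hypothetical names, not existing asadmin code, and the fallback
is a plain edit-distance scan over the known commands.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; none of these names come from asadmin itself.
final class CommandSuggester {
    // Static mapping from a known typo to the command the user meant.
    private final Map<String, String> knownTypos = new HashMap<>();
    private final List<String> commands;
    private final int maxDistance;

    CommandSuggester(List<String> commands, int maxDistance) {
        this.commands = commands;
        this.maxDistance = maxDistance;
    }

    /** Registers a known typo so it is corrected without running any matching. */
    void addKnownTypo(String typo, String command) {
        knownTypos.put(typo, command);
    }

    /** Returns the suggested command, or null if nothing is close enough. */
    String suggest(String input) {
        // 1. Consult the static mapping first.
        String mapped = knownTypos.get(input);
        if (mapped != null) {
            return mapped;
        }
        // 2. Fall back to the closest command within maxDistance edits.
        String best = null;
        int bestDistance = maxDistance + 1;
        for (String command : commands) {
            int d = distance(input, command);
            if (d < bestDistance) {
                bestDistance = d;
                best = command;
            }
        }
        return best;
    }

    /** Edit distance: minimum number of single-character inserts, removes, replaces. */
    static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] cur = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            cur[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int replace = prev[j - 1] + (a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1);
                cur[j] = Math.min(replace, Math.min(prev[j] + 1, cur[j - 1] + 1));
            }
            int[] tmp = prev;
            prev = cur;
            cur = tmp;
        }
        return prev[b.length()];
    }
}
```

A new common typo then becomes a one-line addKnownTypo(...) call, for
example mapping "list-component" to "list-components", and inputs that are
not in the mapping still go through the edit-distance fallback.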