On Aug 11, 2006, at 1:26 PM, Cheng Fang wrote:
> The basic algorithm of the string edit distance is, given two
> strings, compute the number of insert/remove/replace chars it takes
> to get from one to the other.
>
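To make the definition quoted above concrete, here is a minimal sketch of the usual dynamic-programming (Levenshtein) formulation in Java. It is not the EditDistance.java linked in this thread; the class and method names are made up for illustration only.

```java
class EditDistanceSketch {
    /**
     * Minimum number of single-character inserts, removals, or
     * replacements needed to turn a into b (case-sensitive).
     */
    static int editDistance(String a, String b) {
        // prev holds the distances for the previous prefix of a,
        // cur is filled in for the current prefix.
        int[] prev = new int[b.length() + 1];
        int[] cur = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // empty prefix of a -> j inserts
        }
        for (int i = 1; i <= a.length(); i++) {
            cur[0] = i; // prefix of a -> empty string: i removals
            for (int j = 1; j <= b.length(); j++) {
                int cost = (a.charAt(i - 1) == b.charAt(j - 1)) ? 0 : 1;
                int replace = prev[j - 1] + cost;
                int remove = prev[j] + 1;
                int insert = cur[j - 1] + 1;
                cur[j] = Math.min(replace, Math.min(remove, insert));
            }
            int[] tmp = prev;
            prev = cur;
            cur = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        System.out.println(editDistance("foo", "bar"));
        System.out.println(editDistance("foo", "foolish"));
    }
}
```

Only two rows of the table are kept at a time, so memory use grows with the length of one string rather than with the product of both lengths.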
>>
>> So sure, you can sort them in order, or cut by certain threshold,
>> etc.
>>
>> The code I wrote isn't going to scale for a large dataset, but
>> since we are talking about 100 max, it's not a problem.
>>
>> http://fisheye5.cenqua.com/browse/jaxb2-sources/jaxb-ri/runtime/src/com/sun/xml/bind/v2/util/EditDistance.java?r=1.1
>>
>>> "Your command "foo" was not recognized. Perhaps you meant
>>> 'go-foo', 'foo-bar', or 'foo-lish'?"
>>
>> Edit distance computation won't find this kind of match, for the
>> reasons Kedar mentioned. "foo" is closer to "bar" (3 steps) than
>> "foolish" (4 steps), according to the definition of the edit
>> distance.
>>
>> But it does find "list-components" from "list-comopnent" or
>> "start-domain" from "startDomain", things like that.
>>
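The "sort them in order, or cut by a certain threshold" idea quoted above could look roughly like the following sketch. The class name, the suggest method, and the threshold of 3 are assumptions for illustration; nothing here is taken from asadmin or from the linked EditDistance.java. A linear scan like this is only reasonable for a small command list such as the roughly 100 entries mentioned.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class CommandSuggester {
    /** Plain Levenshtein distance using a full table (case-sensitive). */
    static int editDistance(String a, String b) {
        int[][] d = new int[a.length() + 1][b.length() + 1];
        for (int i = 0; i <= a.length(); i++) d[i][0] = i;
        for (int j = 0; j <= b.length(); j++) d[0][j] = j;
        for (int i = 1; i <= a.length(); i++) {
            for (int j = 1; j <= b.length(); j++) {
                int cost = (a.charAt(i - 1) == b.charAt(j - 1)) ? 0 : 1;
                d[i][j] = Math.min(d[i - 1][j - 1] + cost,
                          Math.min(d[i - 1][j] + 1, d[i][j - 1] + 1));
            }
        }
        return d[a.length()][b.length()];
    }

    /**
     * Returns the known commands within maxDistance of what was typed,
     * closest first. Ties keep their original order (List.sort is stable).
     */
    static List<String> suggest(String typed, List<String> known, int maxDistance) {
        List<String> hits = new ArrayList<>();
        for (String candidate : known) {
            if (editDistance(typed, candidate) <= maxDistance) {
                hits.add(candidate);
            }
        }
        hits.sort(Comparator.comparingInt(c -> editDistance(typed, c)));
        return hits;
    }

    public static void main(String[] args) {
        // Hypothetical command list, for illustration only.
        List<String> known = Arrays.asList(
                "list-components", "start-domain", "stop-domain", "create-domain");
        System.out.println(suggest("list-comopnent", known, 3));
        System.out.println(suggest("startDomain", known, 3));
        // The limitation described above: "bar" is within 3 edits of
        // "foo", while "foolish" is 4 edits away and is cut off.
        System.out.println(suggest("foo", Arrays.asList("bar", "foolish"), 3));
    }
}
```

The last call shows why a pure edit-distance cut does not produce prefix-style suggestions.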
> I would really like to see a singular-plural correction.
>
> I suggest we compile a list of common typos and have asadmin consult
> this static mapping first, then try any matching algorithm. So
> when we know of a new common typo, we can directly add it to the
> static mapping, without trying to retrofit our code to produce
> this result.
As a Web 2.0 feature, we could collect all users' erroneous commands
and upload them to a wiki page that we would use dynamically to
figure out what the user really meant.
Craig
>
> With this static mapping, we could also add this feature right away
> with minimal coding and dependency.
>
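For what it's worth, the "static mapping first, then any matching algorithm" proposal quoted above might be sketched like this. Everything here is hypothetical: the TypoCorrector class, its methods, and the sample typo entry are illustrations, not existing asadmin code. The fallback is passed in as a function so that any matching algorithm can be plugged in later.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

class TypoCorrector {
    // Known typo -> intended command. Consulted before anything else.
    private final Map<String, String> knownTypos = new HashMap<>();
    // Any matching algorithm, tried only when the static mapping misses.
    private final Function<String, Optional<String>> fallback;

    TypoCorrector(Function<String, Optional<String>> fallback) {
        this.fallback = fallback;
    }

    /** Registers a newly discovered common typo without touching the matcher. */
    void addTypo(String typo, String intended) {
        knownTypos.put(typo, intended);
    }

    /** Static mapping first, then the fallback matcher. */
    Optional<String> correct(String typed) {
        String mapped = knownTypos.get(typed);
        if (mapped != null) {
            return Optional.of(mapped);
        }
        return fallback.apply(typed);
    }

    public static void main(String[] args) {
        // A fallback that never matches stands in for a real algorithm.
        TypoCorrector corrector = new TypoCorrector(typed -> Optional.empty());
        // Hypothetical singular/plural entry.
        corrector.addTypo("list-component", "list-components");
        System.out.println(corrector.correct("list-component"));
        System.out.println(corrector.correct("foo"));
    }
}
```

Adding a new common typo is then a one-line data change rather than a change to the matching code.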
Craig Russell
Architect, Sun Java Enterprise System
http://java.sun.com/products/jdo
408 276-5638 mailto:Craig.Russell_at_sun.com
P.S. A good JDO? O, Gasp!