dev@fi.java.net

Re: Generation of external vocabularies

From: Paul Sandoz <Paul.Sandoz_at_Sun.COM>
Date: Wed, 07 Sep 2005 14:16:35 +0200

Kohsuke Kawaguchi wrote:
> Paul Sandoz wrote:
>
>> It would not be necessary to do exact matches based on validation.
>>
>> All that would be necessary is, given a set of qualified names (
>> {namespace}localName ) in the schema, to count how many occurrences of
>> those qualified names there are in a set of n documents.
>>
>> Once everything has been counted, sort the set of qualified names
>> according to the number of occurrences, the qualified name with the
>> highest number of occurrences being first.
>>
>> Then assign an index to each qualified name whose value is the position
>> of the qualified name in the sorted set.
>
>
> I see. Well, then do you really need a schema? Can't you just build up
> a histogram of names just by looking at some number of instances and
> assign indices to them?
>

That is true, good point. I wanted to combine both since the set of
samples may not contain all the names in use; it only reflects what is
most commonly used. For small documents, size can be affected in the
uncommon cases.
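
A minimal sketch of how the two sources could be combined, assuming the
schema's qualified names have already been extracted into a set (for
example with XSOM, not shown here). The class name VocabularySketch, the
method assignIndices, the 0-based index and the alphabetical tie-break
are assumptions made for illustration only; they are not part of the
Fast Infoset code base. Only element names are counted.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper, for illustration only.
class VocabularySketch {

    // Returns qualified name -> index, most frequently occurring name first.
    // Names are written as {namespace}localName.
    static Map<String, Integer> assignIndices(Set<String> schemaNames,
                                              List<String> sampleDocuments) {
        // Seed with the schema's names at count 0, so that names missing
        // from the samples still receive an index (after all names seen).
        Map<String, Integer> counts = new TreeMap<>();
        for (String name : schemaNames) {
            counts.put(name, 0);
        }

        // Count every element start tag in every sample document.
        XMLInputFactory factory = XMLInputFactory.newInstance();
        for (String xml : sampleDocuments) {
            try {
                XMLStreamReader reader =
                        factory.createXMLStreamReader(new StringReader(xml));
                while (reader.hasNext()) {
                    if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                        // QName.toString() yields {namespace}localName
                        counts.merge(reader.getName().toString(), 1, Integer::sum);
                    }
                }
                reader.close();
            } catch (XMLStreamException e) {
                throw new IllegalArgumentException("Malformed sample document", e);
            }
        }

        // Sort by descending count; ties are broken alphabetically
        // (an assumption, chosen only to make the result deterministic).
        List<String> names = new ArrayList<>(counts.keySet());
        names.sort((a, b) -> {
            int byCount = Integer.compare(counts.get(b), counts.get(a));
            return byCount != 0 ? byCount : a.compareTo(b);
        });

        // The index is the position in the sorted list (0-based here).
        Map<String, Integer> indices = new LinkedHashMap<>();
        for (int i = 0; i < names.size(); i++) {
            indices.put(names.get(i), i);
        }
        return indices;
    }
}
```

With this arrangement the names seen most often in the samples get the
smallest indices, and a name that appears only in the schema ends up at
the tail of the table instead of being left out.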


> In any case, if you need XSOM help, please let me know...
>

Thanks, when I have some time and get round to this I will ask if help
is needed :-)

Paul.
-- 
| ? + ? = To question
----------------\
    Paul Sandoz
         x38109
+33-4-76188109