Oracle Text 10g: New Features, Test Results
Loading a test table with imp took a very long time (my PC is rather small), but creating a ctxsys.CONTEXT text index on it was quicker. I created the index online and noticed that the build seems to go through two phases: in the first, CPU load was very high with almost no I/O; in the second, there was massive I/O traffic and low CPU usage. The build finished after approx. 20 minutes, which is quite fast in my opinion.
While the index was being built, it was possible to issue updates against the base table. Note that online creation is only possible for indexes of type CONTEXT.
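As a minimal sketch of what such an online build looks like (the table docs, its columns and the index name docs_text_idx are made up for illustration and are not the ones from my test):

```sql
-- Hypothetical base table; all names here are illustrative assumptions.
CREATE TABLE docs (
  id   NUMBER PRIMARY KEY,
  text VARCHAR2(4000)
);

-- Build the CONTEXT index online, so that DML against docs
-- remains possible while the index is being created.
CREATE INDEX docs_text_idx ON docs (text)
  INDEXTYPE IS CTXSYS.CONTEXT
  ONLINE;
```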
An annoying behaviour of older Oracle Text versions is that the more results a query produces, the longer it takes to complete. Without additional restrictions or joins, a query with a small result set completes in e.g. 90 ms, but with, say, a few thousand hits it takes 60 seconds, even for count(*) queries (Jakarta Lucene is much better here). In 10g this behaviour is still present. Under normal conditions this is still much faster than a full table scan, or than an index fast full scan if the column has a B-tree index. But there are cases in which an ordinary index is faster: in my tests, some wildcard search expressions completed faster without a text index.
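A comparison of this kind can be run in SQL*Plus roughly as follows. This is only a sketch: the table docs, the column text and the search term are assumptions, and the two predicates are not strictly equivalent (CONTAINS matches tokens anywhere in the text, LIKE 'ora%' only matches values that start with the prefix).

```sql
-- Assumes a hypothetical table docs(id, text) with a CONTEXT index
-- and, for the second query, an ordinary B-tree index on docs(text).
SET TIMING ON

-- Wildcard search through the text index.
SELECT COUNT(*) FROM docs WHERE CONTAINS(text, 'ora%') > 0;

-- Prefix search that can be served by the B-tree index instead.
SELECT COUNT(*) FROM docs WHERE text LIKE 'ora%';
```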
Another annoyance is that query execution sometimes aborts (e.g. with DRG-50937). This happens when a wildcard expression becomes too complex internally, in other words when it expands to too many hits. My tests showed that 10g handles this much better, but the problem still exists (even with the wordlist attribute WILDCARD_MAXTERMS set to its maximum allowed value of 15000).
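For reference, WILDCARD_MAXTERMS is set on a BASIC_WORDLIST preference that is then handed to the index. A minimal sketch, where the preference name docs_wordlist, the table docs and the index name are hypothetical:

```sql
-- Create a wordlist preference and raise the wildcard expansion limit.
BEGIN
  CTX_DDL.CREATE_PREFERENCE('docs_wordlist', 'BASIC_WORDLIST');
  CTX_DDL.SET_ATTRIBUTE('docs_wordlist', 'WILDCARD_MAXTERMS', '15000');
END;
/

-- Use the preference when creating the index
-- (hypothetical table docs with a text column).
CREATE INDEX docs_text_idx ON docs (text)
  INDEXTYPE IS CTXSYS.CONTEXT
  PARAMETERS ('WORDLIST docs_wordlist');
```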
SYNC ON COMMIT TRANSACTIONAL
This works fine: once a DML statement against the base table returns, the index is in sync, so the new content is immediately searchable.
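A minimal sketch of such a setup, again with a hypothetical table docs(id, text) and made-up index name and test data:

```sql
-- Index that is synchronized at commit time and supports
-- transactional queries.
CREATE INDEX docs_text_idx ON docs (text)
  INDEXTYPE IS CTXSYS.CONTEXT
  PARAMETERS ('SYNC (ON COMMIT) TRANSACTIONAL');

-- New content should be searchable without a manual index sync.
INSERT INTO docs (id, text) VALUES (1, 'freshly inserted text');
COMMIT;

SELECT id FROM docs WHERE CONTAINS(text, 'freshly') > 0;
```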
New German Spelling
The basic lexer now has an attribute NEW_GERMAN_SPELLING. It worked fine in my tests, but there are some minor restrictions. Text only produces hits for the alternate spellings when searching for whole words; a wildcard search does not hit all spellings. The same applies to words that may now be written as two words (Radfahren / Rad fahren). Finally, there is an inconsistency when combining new German spelling with base letter conversion. For this I refer to Metalink (notes 279399.995 and 249991.1).
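The attribute is switched on through a BASIC_LEXER preference. The following is a sketch only; the preference name docs_lexer, the table docs and the example word are assumptions chosen for illustration:

```sql
-- Lexer preference with the new German spelling support enabled.
BEGIN
  CTX_DDL.CREATE_PREFERENCE('docs_lexer', 'BASIC_LEXER');
  CTX_DDL.SET_ATTRIBUTE('docs_lexer', 'NEW_GERMAN_SPELLING', 'YES');
END;
/

-- Hypothetical table docs with a text column.
CREATE INDEX docs_text_idx ON docs (text)
  INDEXTYPE IS CTXSYS.CONTEXT
  PARAMETERS ('LEXER docs_lexer');

-- Whole-word query: intended to find both the old spelling
-- (Schiffahrt) and the new one (Schifffahrt).
SELECT id FROM docs WHERE CONTAINS(text, 'Schifffahrt') > 0;
```

Given the restrictions described above, a wildcard variant of this query should not be expected to return every spelling.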