
CHAPTER 5. Viewing Highlighted Text
This chapter describes how to view highlighted text using the PL/SQL procedures provided by Oracle ConText Option.
The topics covered in this chapter are:
Overview of Viewing
CTX_QUERY.HIGHLIGHT is a stored procedure provided by ConText Option that creates various forms of a document, which an application can use to produce viewable output.
Other stored procedures in the CTX_QUERY package provide for managing the result tables used to store the viewing output.
CTX_QUERY.HIGHLIGHT Procedure
The PL/SQL procedure CTX_QUERY.HIGHLIGHT generates filtered text, marked-up highlight text, and highlight information. In an application, the CTX_QUERY.HIGHLIGHT procedure is called after executing a text query.
CTX_QUERY.HIGHLIGHT can be used to generate the following output for a document:
- an unfiltered version of the original document (stored in a NOFILTAB table)
- a plain (ASCII) text version of the document (stored in a PLAINTAB table)
- a marked-up ASCII version of the document with occurrences of specified query expressions highlighted (stored in a MUTAB table)
- highlight information that identifies the position and length of the query terms found in the source document (stored in a HIGHTAB table)
The positions and lengths of the query terms are specified as offsets from the beginning of the ASCII text version of the document.
Note: If the document is an HTML document filtered through the internal HTML filter, the marked-up ASCII text version generated by HIGHLIGHT and stored in an MUTAB table retains the original HTML tags from the document.
A fifth type of output, ICF, is generated automatically by HIGHLIGHT when a document in one of the supported formats is viewed in the Windows 32-bit viewer.
For more information about the Windows 32-bit viewer, see "Viewing in a 32-bit Windows Environment" in this chapter.
Highlighting Markup
The markup that is used to indicate the start and end of a highlighted word or phrase can be specified when CTX_QUERY.HIGHLIGHT is called for a document.
If no markup is specified, CTX_QUERY.HIGHLIGHT uses default markup, which differs depending on the format of the source document.
If the source document is an ASCII document or a formatted document, the default highlighting markup is three angle brackets immediately to the left (<<<) and right (>>>) of each term.
If the source document is an HTML document filtered through an external filter, the default highlighting markup is the same as the highlighting markup for ASCII or formatted documents (<<< and >>>).
If the source document is an HTML document filtered through the internal HTML filter, the default highlighting markup is the HTML tags used to indicate the start and end of a font change:
- <FONT = . . .> to the immediate left of the term
- </FONT> to the immediate right of the term
For more information about internal and external filters, see Oracle ConText Option Administrator's Guide.
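As an illustration of overriding the default markup, the following sketch calls CTX_QUERY.HIGHLIGHT with explicit start and end markup. The parameter names starttag and endtag are an assumption, since they do not appear elsewhere in this chapter; check the CTX_QUERY.HIGHLIGHT specification for the actual names. The policy name, textkey, and MU_TEXT result table are borrowed from the code sample later in this chapter.

```sql
/* Sketch only: starttag and endtag are assumed parameter names. */
begin
  ctx_query.highlight(
    cspec    => 'EMP_HISTORY',
    textkey  => '7369',
    query    => 'used',
    id       => 7369,
    mutab    => 'MU_TEXT',
    starttag => '<B>',     /* assumed: markup placed before each term */
    endtag   => '</B>');   /* assumed: markup placed after each term  */
end;
/
```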
Using CTX_QUERY.HIGHLIGHT
To provide document and highlight viewing in an application, you must perform the following tasks:
Figure 5 - 1. Diagram of PL/SQL Viewing Tasks
Allocate Result Tables
The result tables required by the HIGHLIGHT procedure can be allocated manually using the CREATE TABLE command in SQL or using the CTX_QUERY.GETTAB procedure provided by ConText Option.
Perform a Text Query
A one-step, two-step, or in-memory query is performed to return a hitlist of documents.
The hitlist provides the textkeys that are used to generate highlight and display output for specified documents in the hitlist.
Call CTX_QUERY.HIGHLIGHT
The application passes to CTX_QUERY.HIGHLIGHT a pointer to a document (generally the textkey obtained from the hitlist) and a query expression.
Note: While the query expression is usually the same as the expression used to return documents in the text query, it is not required that the query expressions match.
For example, the application developer might allow a user to search for all articles by a particular author and then allow the user to view highlighted references to a specified subject in the returned documents.
CTX_QUERY.HIGHLIGHT returns to the application various forms of the specified document that can be further processed or displayed by the application.
The highlight offset information and marked-up ASCII text are generated using the query expression specified in the HIGHLIGHT procedure. In addition, the offset information is based on the ASCII text version of the document.
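The case in which the highlight expression differs from the original query expression might look like the following sketch. It assumes that a hitlist was produced by some other query, that textkey 7698 is one of the hits, and that a MU_TEXT result table exists as in the code sample later in this chapter. The textkey and the term manager are illustrative values only.

```sql
/* The hitlist came from one query; the highlighting below */
/* uses a different expression for the same document.      */
begin
  ctx_query.highlight(
    cspec   => 'EMP_HISTORY',
    textkey => '7698',       /* textkey taken from the hitlist */
    query   => 'manager',    /* expression to highlight        */
    id      => 7698,
    mutab   => 'MU_TEXT');
end;
/
```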
Provide HIGHLIGHT Output for Viewing
Use the highlight information in the HIGHTAB table to mark up documents manually within the application, or display the document versions that the HIGHLIGHT procedure stored in the other result tables.
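Manual markup from highlight information can be sketched in PL/SQL as follows. This is a minimal illustration, not the method used by ConText Option: the text and the offset/length pairs are hard-coded hypothetical values, because the column layout of the HIGHTAB table is not described in this chapter, and the sketch assumes that offsets are counted from 0 at the start of the plain text. The pairs are applied from last to first so that inserting markup does not shift the offsets still to be processed.

```sql
set serveroutput on

declare
  type num_list is table of number index by binary_integer;
  offs num_list;   /* hypothetical offsets (assumed 0-based) */
  lens num_list;   /* hypothetical lengths                   */
  doc  varchar2(400) := 'Blake used to be a manager and used to travel';
begin
  offs(1) := 6;   lens(1) := 4;
  offs(2) := 31;  lens(2) := 4;
  /* work backwards so earlier offsets remain valid */
  for i in reverse 1 .. 2 loop
    doc := substr(doc, 1, offs(i))
        || '<<<' || substr(doc, offs(i) + 1, lens(i)) || '>>>'
        || substr(doc, offs(i) + lens(i) + 1);
  end loop;
  /* prints: Blake <<<used>>> to be a manager and <<<used>>> to travel */
  dbms_output.put_line(doc);
end;
/
```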
Release Result Tables
After documents have been processed by the HIGHLIGHT procedure and displayed to the user, drop the result tables.
If the tables were allocated using CTX_QUERY.GETTAB, use CTX_QUERY.RELTAB to release the tables.
If the tables were created manually, drop the tables using the SQL command DROP TABLE.
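For a manually created result table, cleanup might look like the following sketch. It assumes the MU_TEXT table from the code sample in this chapter, with its id column; the id value is illustrative. Tables obtained through CTX_QUERY.GETTAB should be released with CTX_QUERY.RELTAB instead.

```sql
/* Remove the output for one document and keep the table for reuse */
delete from mu_text where id = 7369;
commit;

/* Or, when the table is no longer needed, drop it */
drop table mu_text;
```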
CTX_QUERY.HIGHLIGHT Example
In the following code sample, a table called MU_TEXT is created to receive marked-up documents. The CTX_QUERY.HIGHLIGHT procedure is then used to locate specific text in the documents and display the marked-up documents. The text to be searched for and marked up is the same.
To use the code sample, perform the following steps:
1. Copy the code in the sample and save it as a SQL script.
2. Start a SQL*Plus session and execute the script.
Note: As a prerequisite, this example assumes that the EMP demonstration table distributed with ConText Option has been installed, an Oracle database is running, and a ConText server has been started with the Query personality.
Code Sample
create table mu_text (id number, document long);
set termout off
set verify off
col ct new_value ct
col sess new_value sess
col score format 990 head 'RANK'
col textkey format a4 head 'KEY'
col document format a45 word_wrap
set long 60
/* Since this application uses a shared results table */
/* for the hitlist, the application must create a */
/* unique id for each user sharing the results table. */
/* In this example, the unique id is created from the */
/* "sessionid" parameter of the user's environment. */
select userenv('sessionid') sess from dual;
/*Run an initial text query to return a hitlist */
begin
  ctx_query.contains('EMP_HISTORY', '&1', 'CTX_TEMP', 1, &sess);
end;
/
/* Count the hits */
select count(*) ct from EMP,CTX_TEMP
where empno=textkey
and conid=&sess;
/* Clear prior results from MU_TEXT */
delete MU_TEXT;
commit;
declare
  tk     varchar2(12);
  numtk  number(10);
  mudoc  varchar2(2000);
  cursor s is
    select textkey from CTX_TEMP, EMP
    where empno=textkey
    and conid=&sess;
begin
  open s;
  /* for each hit, produce a marked-up row */
  for i in 1 .. &ct loop
    fetch s into tk;
    /* create numeric id for MU_TEXT from textkey */
    numtk := to_number(tk);
    /* call ctx_query.highlight for each document using */
    /* the same policy and query expression from the */
    /* initial query. Create marked-up ASCII output for */
    /* document, stored in mutab table named MU_TEXT. */
    begin
      ctx_query.highlight(
        cspec   => 'EMP_HISTORY',
        textkey => tk,
        query   => '&1',
        id      => numtk,
        mutab   => 'MU_TEXT');
    end;
  end loop;
  close s;
end;
/
/*Join the hits with the marked-up docs*/
select score RANK, textkey KEY, document
from CTX_TEMP, MU_TEXT
where id=to_number(textkey)
and conid=&sess
order by score desc;
set echo on
Example
In the following example, the sample code has been stored as a script named highlight.sql and the query term is the word used:
@highlight 'used'
The output generated by highlight.sql is:
RANK KEY DOCUMENT
-------------------
10 7369 <<<Used>>> to build horse shoes
10 7698 Blake <<<used>>> to be a manager at apple
The query term used was found in two documents (textkeys 7369 and 7698). CTX_QUERY.HIGHLIGHT highlighted the specified term using the default markup '<<<' and '>>>'.
Viewing in Windows
In addition to viewing text in a PL/SQL environment, Oracle ConText Option Workbench enables you to view highlighted documents in the following Windows environments:
- 16-bit (i.e. Windows 3.1)
- 32-bit (i.e. Windows NT or Windows 95)
In a Windows 16-bit environment, you must still use the CTX_QUERY.HIGHLIGHT procedure on the server side to create the necessary highlight table, whose contents are output to the operating system for subsequent viewing with the 16-bit viewer (CTXV16).
However, in a 32-bit Windows environment, you can embed the Oracle ConText Option Viewer Control in a client-side application to allow users to view documents with query terms highlighted. To view a document, you need not write any PL/SQL code; given the database connection, the textkey, and the query term, the 32-bit control displays the document with highlights.
Viewing in a 16-bit Windows Environment
To view highlighted documents in a Windows 16-bit environment, you can use the 16-bit viewer (CTXV16). The procedure for using CTXV16 is:
3. Copy the output from the result table to an operating system file.
4. Call the viewer from the command-line, using the file name as input.
The CTXIO16 utility, provided with Oracle ConText Option Workbench, enables you to automate the above steps for viewing a highlighted document in a 16-bit Windows environment.
For example, to view a Microsoft Word document that is already loaded and indexed in a text table, the CTXIO16 utility enables you to do the following:
1. Create a MUTAB result table. The MUTAB table contains ASCII text with highlight markup; if you want to view a document without highlights, you can use the PLAINTAB table.
2. Call CTX_QUERY.HIGHLIGHT to create a highlighted ASCII version of the Word document in the MUTAB table.
3. Download the highlighted ASCII document to the operating system.
4. Download the original Word document to the operating system.
5. View the highlighted Word document using the CTXV16 utility.
For more information about viewing in a 16-bit environment, see "Windows 16-bit Viewing (Appendix A)".
Supported Formats
You can use the Oracle ConText Option Viewer for Windows 3.1 (CTXV16) to view documents in the following formats:
- Ami Pro for Windows, versions 1, 2, and 3
- Lotus 1-2-3, versions 2, 3, 4, and 5
- Microsoft Word for Windows, versions 2, 6.x, and 7
- Microsoft Word for DOS, version 5.X
- Microsoft Word for Macintosh, versions 5 and 6
- Microsoft Excel for Windows, versions 5 and 7
- WordPerfect for Windows, versions 2 and 5
- Xerox XIF for Unix, versions 5 and 6
Note: Many of these formats are not supported by the internal filters provided by ConText Option. For documents in those formats that are not supported by the internal filters, external filters must be used.
For more information about internal and external filters, see Oracle ConText Option Administrator's Guide.
Viewing in a 32-bit Windows Environment
You can use the Oracle ConText Option Viewer Control (CTXV32.OCX) to provide viewing of highlighted documents in a Windows 32-bit environment. The viewer enables the user to browse documents in the supported formats with query terms highlighted.
The user can view a Word document, for example, as it would appear in Microsoft Word. The user can also scroll through the document using the Next and Previous buttons to jump to the next or previous occurrence of the search term(s).
Using the ConText Option Viewer Control
Because OCX modules are not stand-alone executables, you need a development environment such as Visual C++ or Visual Basic to use the ConText Option Viewer Control. Within such an environment, you can add the control to the tool palette and then place instances of the control on a form or canvas.
For example, in Visual Basic 4.0, you add the control to the tool palette by selecting Custom Controls from the Tool menu. Use the browser to select the Oracle ConText Option Viewer Control, CTXV32.OCX, from the oracle_home\BIN directory.
Alternatively, you can create instances of the control dynamically, using the identification string "CTXV32.CTXViewer.1".
If the viewer control is embedded in an HTML page, the browser must support ActiveX components and the client machine must have the viewer installed on it with all required support files. The viewer uses SQL*Net to communicate with the database. Within HTML, you can invoke the methods using Visual Basic scripting, for example, and change properties with the OBJECT tag and parameter settings syntax.
For more information about the methods and properties associated with the 32-bit viewer control, see the ConText Option Viewer Control help file, CTXV32.HLP. This file has Visual Basic and HTML examples.
Supported Formats
You can use the ConText Option Viewer Control to view documents in the following server-side supported formats:
- Microsoft Word for Windows 2, 6.x
- WordPerfect for Windows 5.x, 6.x
- WordPerfect for DOS 5.0, 5.1, 6.0