com.endeca.BulkLoad
Class BulkIngester

java.lang.Object
  extended by com.endeca.BulkLoad.BulkIngester

public class BulkIngester
extends java.lang.Object

The primary entry point for the client-side Bulk Load Interface for loading data into an Endeca data domain. It makes a socket connection to the data domain and spawns a thread to handle replies. Clients of this interface must:

  1. Define classes that implement the four callback interfaces, ErrorCallback, FinishedCallback, AbortCallback, and StatusCallback, and handle each event in their handler methods, which are called on the response thread.
  2. Instantiate a BulkIngester object with the data domain hostname and port number, the four callback objects, and the remaining arguments defined in the constructor. The hostname and port number can be obtained by calling the allocateBulkPort method of the manage web service.
  3. Call the begin method to start the response thread. If begin is not called, an IOException will be thrown.
  4. Call sendRecord repeatedly to send Data.Record objects to the data domain.
  5. When finished sending records, call endIngest to terminate the response thread and close the socket.
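
The five steps above can be sketched as follows. Because the library classes are not reproduced on this page, the sketch is written against a hypothetical RecordSink interface that mirrors only the documented begin, sendRecord, and endIngest methods. The type parameter R stands in for Data.Record, and RecordingSink is an illustrative in-memory stand-in, not part of the API. Calling endIngest in a finally block is a choice made by this sketch, on the assumption that the response thread and socket should be released even when a send fails.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical interface mirroring the documented BulkIngester lifecycle
// methods. R stands in for Data.Record.
interface RecordSink<R> {
    void begin() throws IOException;
    void sendRecord(R rec) throws IOException;
    void endIngest() throws IOException, InterruptedException;
}

// Illustrative in-memory stand-in that logs each call; not part of the API.
class RecordingSink<R> implements RecordSink<R> {
    final List<String> calls = new ArrayList<>();

    @Override
    public void begin() { calls.add("begin"); }

    @Override
    public void sendRecord(R rec) { calls.add("sendRecord:" + rec); }

    @Override
    public void endIngest() { calls.add("endIngest"); }
}

class IngestLifecycle {
    // Steps 3 to 5: start the response thread, send every record, then
    // shut down. Returns the number of records sent.
    static <R> int loadAll(RecordSink<R> sink, Iterable<R> records)
            throws IOException, InterruptedException {
        int sent = 0;
        sink.begin();
        try {
            for (R rec : records) {
                sink.sendRecord(rec);
                sent++;
            }
        } finally {
            // Assumption: always release the response thread and socket.
            sink.endIngest();
        }
        return sent;
    }
}
```

In real code, an adapter implementing RecordSink would delegate each call to a BulkIngester constructed as described in step 2.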


Constructor Summary
protected BulkIngester(java.io.DataInputStream din, java.io.DataOutputStream dout, ErrorCallback errorCallback, FinishedCallback finishedCallback, AbortCallback abortCallback, StatusCallback statusCallback)
          Alternative constructor for unit testing only.
  BulkIngester(java.lang.String host, int port, boolean useSSL, boolean doFinalMerge, boolean doUpdateDictionary, int timeout, ErrorCallback errorCallback, FinishedCallback finishedCallback, AbortCallback abortCallback, StatusCallback statusCallback)
          Primary constructor.
 
Method Summary
 void begin()
          Spawn a thread to asynchronously read the data domain's responses.
 void endIngest()
          Terminates the response thread and closes the socket.
 void requestStatusUpdate()
          Poll the data domain for its current status.
 void sendCancel()
          Tell the data domain that the client wants to cancel the bulk ingest process.
 void sendRecord(Data.Record rec)
          Sends the provided record over the wire to the data domain.
 void setFinalMerge(boolean doFinalMerge)
          Set whether to perform the final merge after this ingestion.
 void setTransactionId(java.lang.String transactionId)
          Set the transaction ID.
 void setUpdateDictionary(boolean doUpdateDictionary)
          Set whether to rebuild the aspell dictionary after this ingestion, so that the newly added data will be available for spelling hints and autocorrect.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

BulkIngester

public BulkIngester(java.lang.String host,
                    int port,
                    boolean useSSL,
                    boolean doFinalMerge,
                    boolean doUpdateDictionary,
                    int timeout,
                    ErrorCallback errorCallback,
                    FinishedCallback finishedCallback,
                    AbortCallback abortCallback,
                    StatusCallback statusCallback)
             throws java.io.IOException
Primary constructor. The client application must define and instantiate the callback classes for handling responses from the data domain.

Parameters:
host - Hostname of the data domain.
port - Port number to connect to on the data domain.
useSSL - Whether to connect via SSL.
doFinalMerge - Whether to perform the final merge after ingestion.
doUpdateDictionary - Whether to update the aspell dictionary after ingestion.
timeout - Timeout for connecting to the data domain.
errorCallback - ErrorCallback object to handle error conditions.
finishedCallback - FinishedCallback object to be called when ingestion finishes.
abortCallback - AbortCallback object to handle aborts.
statusCallback - StatusCallback object to handle status updates.
Throws:
java.io.IOException
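
Because the handler methods run on the response thread, any state a callback shares with the sending thread should be thread-safe. The sketch below shows one way to collect errors safely. IngestErrorListener and its handleError(String) method are hypothetical stand-ins: the real ErrorCallback handler name and arguments are not shown on this page.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for the library's ErrorCallback interface; the real
// handler name and arguments are assumptions.
interface IngestErrorListener {
    void handleError(String message);
}

// Collects errors reported on the response thread so that the sending
// thread can inspect them safely.
class CollectingErrorListener implements IngestErrorListener {
    private final Queue<String> errors = new ConcurrentLinkedQueue<>();
    private final AtomicBoolean failed = new AtomicBoolean(false);

    @Override
    public void handleError(String message) {
        errors.add(message);   // safe to call from the response thread
        failed.set(true);
    }

    // True once at least one error has been reported.
    boolean hasFailed() {
        return failed.get();
    }

    // Copy of the errors seen so far, in arrival order.
    List<String> snapshot() {
        return new ArrayList<>(errors);
    }
}
```

A sending loop could check hasFailed() between sendRecord calls and stop early when the data domain has reported a problem.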

BulkIngester

protected BulkIngester(java.io.DataInputStream din,
                       java.io.DataOutputStream dout,
                       ErrorCallback errorCallback,
                       FinishedCallback finishedCallback,
                       AbortCallback abortCallback,
                       StatusCallback statusCallback)
Alternative constructor for unit testing only. Does not use sockets. Do not use outside of testing.

Parameters:
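din - Input stream used in place of the socket's input, from which data domain replies are read.
dout - Output stream used in place of the socket's output, to which records and requests are written.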
errorCallback - ErrorCallback object to handle error conditions.
finishedCallback - FinishedCallback object to be called when ingestion finishes.
abortCallback - AbortCallback object to handle aborts.
statusCallback - StatusCallback object to handle status updates.

Method Detail

setTransactionId

public void setTransactionId(java.lang.String transactionId)
Set the transaction ID. Can be called at any time before begin.

Parameters:
transactionId - The transaction ID.

setFinalMerge

public void setFinalMerge(boolean doFinalMerge)
Set whether to perform the final merge after this ingestion. This re-indexes the database to optimize query performance, and defaults to true. When performing multiple consecutive bulk load operations, set this to false on all except the last one. Can be called at any time before endIngest.

Parameters:
doFinalMerge - true if the final merge is to be performed, otherwise false.

setUpdateDictionary

public void setUpdateDictionary(boolean doUpdateDictionary)
Set whether to rebuild the aspell dictionary after this ingestion, so that the newly added data will be available for spelling hints and autocorrect. This defaults to true. When performing multiple consecutive bulk load operations, set this to false on all except the last one. Can be called at any time before endIngest.

Parameters:
doUpdateDictionary - true if the dictionary is to be updated, otherwise false.
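
For a sequence of consecutive bulk loads, the guidance for setFinalMerge and setUpdateDictionary amounts to enabling both options only on the last load. A minimal sketch of that decision follows. BatchFlags and flagsFor are hypothetical helper names, not part of the API; the resulting values would be passed to the constructor or to the two setters of each load's BulkIngester.

```java
// Hypothetical helper: decides the post-ingest options for one load in a
// sequence of consecutive bulk loads.
final class BatchFlags {
    final boolean doFinalMerge;
    final boolean doUpdateDictionary;

    private BatchFlags(boolean doFinalMerge, boolean doUpdateDictionary) {
        this.doFinalMerge = doFinalMerge;
        this.doUpdateDictionary = doUpdateDictionary;
    }

    // batchIndex is zero-based; batchCount is the total number of loads.
    static BatchFlags flagsFor(int batchIndex, int batchCount) {
        if (batchCount <= 0 || batchIndex < 0 || batchIndex >= batchCount) {
            throw new IllegalArgumentException("batchIndex out of range");
        }
        // Only the last load pays for the merge and the dictionary rebuild.
        boolean isLast = batchIndex == batchCount - 1;
        return new BatchFlags(isLast, isLast);
    }
}
```

With three loads, flagsFor(0, 3) and flagsFor(1, 3) disable both options, and flagsFor(2, 3) enables them.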

begin

public void begin()
           throws java.io.IOException
Spawn a thread to asynchronously read the data domain's responses.

Throws:
java.io.IOException

endIngest

public void endIngest()
               throws java.io.IOException,
                      java.lang.InterruptedException
Terminates the response thread and closes the socket. Must be called after the client finishes sending data.

Throws:
java.io.IOException
java.lang.InterruptedException

sendRecord

public void sendRecord(Data.Record rec)
                throws java.io.IOException
Sends the provided record over the wire to the data domain. This will block if the data domain's input queue is full.

Parameters:
rec - The record to send.
Throws:
java.io.IOException

requestStatusUpdate

public void requestStatusUpdate()
                         throws java.io.IOException
Poll the data domain for its current status. This method blocks only while sending the request; the reply is received asynchronously on the response thread.

Throws:
java.io.IOException
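
Since the status reply arrives asynchronously, a caller that wants to wait for it needs some bridge between the response thread and its own thread. The sketch below shows one such bridge built on CompletableFuture. StatusListener and its onStatus(String) method are hypothetical: the real StatusCallback handler signature is not shown on this page, and the reply text used here is a placeholder. The response thread is simulated with a plain Thread.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical callback shape; the real StatusCallback handler is an assumption.
interface StatusListener {
    void onStatus(String status);
}

// Bridges an asynchronous status reply to a future the caller can wait on.
class StatusWaiter implements StatusListener {
    private final CompletableFuture<String> reply = new CompletableFuture<>();

    @Override
    public void onStatus(String status) {
        reply.complete(status);   // called on the response thread
    }

    // Blocks the calling thread until the reply arrives or the timeout expires.
    String await(long timeout, TimeUnit unit)
            throws InterruptedException, ExecutionException, TimeoutException {
        return reply.get(timeout, unit);
    }
}

class StatusDemo {
    // Simulates one request: a separate thread stands in for the response
    // thread and delivers a placeholder reply.
    static String pollOnce()
            throws InterruptedException, ExecutionException, TimeoutException {
        StatusWaiter waiter = new StatusWaiter();
        Thread responseThread = new Thread(() -> waiter.onStatus("status-placeholder"));
        responseThread.start();
        return waiter.await(5, TimeUnit.SECONDS);
    }
}
```

In real code, the waiter would be wrapped in the StatusCallback passed to the constructor, and requestStatusUpdate would be called before waiting on it.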

sendCancel

public void sendCancel()
                throws java.io.IOException
Tell the data domain that the client wants to cancel the bulk ingest process. Depending on the structure of the data being ingested, canceling may leave the data domain in an inconsistent state.

Throws:
java.io.IOException