Oracle® In-Database Container for Hadoop Java API Reference
Release 1.0.1

E54638-01

oracle.sql.hadoop
Class Job

java.lang.Object
  extended by org.apache.hadoop.mapreduce.task.JobContextImpl
      extended by org.apache.hadoop.mapreduce.Job
          extended by oracle.sql.hadoop.Job
All Implemented Interfaces:
org.apache.hadoop.mapreduce.JobContext, org.apache.hadoop.mapreduce.MRJobConfig

public class Job
extends org.apache.hadoop.mapreduce.Job

The job submitter's view of the Job.

It allows the user to configure the job and run it. The set methods work only until the job is submitted; afterwards they throw an IllegalStateException.
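The typical lifecycle described above can be sketched as follows. This is an illustrative fragment only: the mapper/reducer classes and table names are hypothetical placeholders, and the sketch assumes it runs inside an Oracle database session where the oc4hadoop library is available.

```java
// Hypothetical sketch: configure a Job, then run it against database tables.
// WordCountMapper, WordCountReducer, IN_TABLE, and OUT_TABLE are placeholders.
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import oracle.sql.hadoop.Job;

public class WordCountSketch {
    public static void main(String[] args) throws Exception {
        Job job = new Job();                          // default Configuration
        job.setJobName("wordcount");
        job.setMapperClass(WordCountMapper.class);    // hypothetical Mapper
        job.setReducerClass(WordCountReducer.class);  // hypothetical Reducer
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(IntWritable.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        job.setCreateOutputTable(true);   // create (or drop and recreate) output table
        job.run("IN_TABLE", "OUT_TABLE"); // run with explicit input/output tables
    }
}
```

Note that all `set` calls must precede submission, per the class description above.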


Nested Class Summary
static class Job.HiveTypes
           
protected static class Job.JobState
           
static class Job.Operation
          Operations available in oc4hadoop
 
Nested classes/interfaces inherited from class org.apache.hadoop.mapreduce.Job
org.apache.hadoop.mapreduce.Job.TaskStatusFilter
 
Field Summary
protected static java.lang.String CONF_KEY
           
protected static java.lang.String CREATE_DESERIALIZED_SPLITS_TABLE_ATTR
           
protected static java.lang.String CREATE_OUTPUT_TABLE_ATTR
           
protected static java.lang.String CREATE_SERIALIZED_SPLITS_TABLE_ATTR
           
protected static java.lang.String DATA_TYPE_CHECK_FLAG
           
protected static java.lang.String DEBUG_MODE_FLAG
           
protected static java.lang.String DESERIALIZED_SPLITS_TABLE_NAME_PROPERTY
           
protected static java.lang.String HINPUT_FUNCTION
           
protected static java.lang.String HINPUTANDMAP_FUNCTION
           
protected static java.lang.String ID_STRING_ATTR
           
protected static java.lang.String INPUT_CLASSES_ATTR
           
protected static java.lang.String INPUT_KEY_CLASS_ATTR
           
protected static java.lang.String INPUT_SPLIT_RECORD_CLASSES_ATTR
           
protected static java.lang.String INPUT_SPLIT_RECORD_DB_TYPES_PROPERTY
           
protected static java.lang.String INPUT_SPLIT_RECORD_KEY_CLASS_ATTR
           
protected static java.lang.String INPUT_SPLIT_RECORD_KEY_DB_TYPE_PROPERTY
           
protected static java.lang.String INPUT_SPLIT_RECORD_OUTTYPE_PROPERTY
           
protected static java.lang.String INPUT_SPLIT_RECORD_OUTTYPESET_PROPERTY
           
protected static java.lang.String INPUT_SPLIT_RECORD_VALUE_CLASS_ATTR
           
protected static java.lang.String INPUT_SPLIT_RECORD_VALUE_DB_TYPE_PROPERTY
           
protected static java.lang.String INPUT_TABLE_NAME_PROPERTY
           
protected static java.lang.String INPUT_VALUE_CLASS_ATTR
           
protected static java.lang.String MAP_FUNCTION
           
protected static java.lang.String MAP_METHOD
           
protected static java.lang.String MAP_OUTPUT_KEY_CLASS_ATTR
           
protected static java.lang.String MAP_OUTPUT_KEY_DB_TYPE_PROPERTY
           
protected static java.lang.String MAP_OUTPUT_VALUE_CLASS_ATTR
           
protected static java.lang.String MAP_OUTPUT_VALUE_DB_TYPE_PROPERTY
           
protected static java.lang.String MAP_OUTTYPE_PROPERTY
           
protected static java.lang.String MAP_OUTTYPESET_PROPERTY
           
protected static java.lang.String MAP_SIGNATURE
           
protected static java.lang.String MAPPER_IMPL
           
protected static java.lang.String MULTI_VALUE
           
protected static java.lang.String NUM_HINPUT_TASKS_ATTR
           
protected static java.lang.String NUM_INSERT_TASKS_ATTR
           
protected static java.lang.String OPERATION_ATTR
           
protected static java.lang.String OUTPUT_KEY_CLASS_ATTR
           
protected static java.lang.String OUTPUT_KEY_DB_TYPE_PROPERTY
           
protected static java.lang.String OUTPUT_SCHEMA_NAME_ATTR
           
protected static java.lang.String OUTPUT_TABLE_NAME_PROPERTY
           
protected static java.lang.String OUTPUT_VALUE_CLASS_ATTR
           
protected static java.lang.String OUTPUT_VALUE_DB_TYPE_PROPERTY
           
protected static java.lang.String OUTTYPE_PROPERTY
           
protected static java.lang.String OUTTYPESET_PROPERTY
           
protected static java.lang.String REDUCE_FUNCTION
           
protected static java.lang.String REDUCER_IMPL
           
protected static java.lang.String SCHEMA_NAME_ATTR
           
protected static java.lang.String SERIALIZED_SPLITS_TABLE_NAME_PROPERTY
           
protected static java.lang.String SERIALIZED_SPLITS_TABLE_RAW_SIZE_ATTR
           
protected static java.lang.String SINGLE_COLUMN_ATTR
           
protected static java.lang.String SQLTYPES
           
 
Fields inherited from class org.apache.hadoop.mapreduce.Job
COMPLETION_POLL_INTERVAL_KEY, OUTPUT_FILTER, PROGRESS_MONITOR_POLL_INTERVAL_KEY, SUBMIT_REPLICATION, USED_GENERIC_PARSER
 
Fields inherited from class org.apache.hadoop.mapreduce.task.JobContextImpl
conf, credentials
 
Fields inherited from interface org.apache.hadoop.mapreduce.MRJobConfig
APPLICATION_ATTEMPT_ID, APPLICATION_ATTEMPT_ID_ENV, APPLICATION_MASTER_CLASS, APPLICATION_TOKENS_FILE, CACHE_ARCHIVES, CACHE_ARCHIVES_SIZES, CACHE_ARCHIVES_TIMESTAMPS, CACHE_ARCHIVES_VISIBILITIES, CACHE_FILE_TIMESTAMPS, CACHE_FILE_VISIBILITIES, CACHE_FILES, CACHE_FILES_SIZES, CACHE_LOCALARCHIVES, CACHE_LOCALFILES, CACHE_SYMLINK, CLASSPATH_ARCHIVES, CLASSPATH_FILES, COMBINE_CLASS_ATTR, COMBINE_RECORDS_BEFORE_PROGRESS, COMBINER_GROUP_COMPARATOR_CLASS, COMPLETED_MAPS_FOR_REDUCE_SLOWSTART, COUNTER_GROUP_NAME_MAX_DEFAULT, COUNTER_GROUP_NAME_MAX_KEY, COUNTER_GROUPS_MAX_DEFAULT, COUNTER_GROUPS_MAX_KEY, COUNTER_NAME_MAX_DEFAULT, COUNTER_NAME_MAX_KEY, COUNTERS_MAX_DEFAULT, COUNTERS_MAX_KEY, DEFAULT_JOB_ACL_MODIFY_JOB, DEFAULT_JOB_ACL_VIEW_JOB, DEFAULT_JOB_AM_ACCESS_DISABLED, DEFAULT_JOB_TOKEN_TRACKING_IDS_ENABLED, DEFAULT_LOG_LEVEL, DEFAULT_MAP_CPU_VCORES, DEFAULT_MAP_MEMORY_MB, DEFAULT_MAPRED_ADMIN_JAVA_OPTS, DEFAULT_MAPRED_ADMIN_USER_ENV, DEFAULT_MAPREDUCE_APPLICATION_CLASSPATH, DEFAULT_MAX_SHUFFLE_FETCH_RETRY_DELAY, DEFAULT_MR_AM_COMMAND_OPTS, DEFAULT_MR_AM_COMMITTER_CANCEL_TIMEOUT_MS, DEFAULT_MR_AM_CONTAINERLAUNCHER_THREAD_COUNT_LIMIT, DEFAULT_MR_AM_CPU_VCORES, DEFAULT_MR_AM_HISTORY_COMPLETE_EVENT_FLUSH_TIMEOUT_MS, DEFAULT_MR_AM_HISTORY_JOB_COMPLETE_UNFLUSHED_MULTIPLIER, DEFAULT_MR_AM_HISTORY_MAX_UNFLUSHED_COMPLETE_EVENTS, DEFAULT_MR_AM_HISTORY_USE_BATCHED_FLUSH_QUEUE_SIZE_THRESHOLD, DEFAULT_MR_AM_IGNORE_BLACKLISTING_BLACKLISTED_NODE_PERCENT, DEFAULT_MR_AM_JOB_CLIENT_THREAD_COUNT, DEFAULT_MR_AM_JOB_REDUCE_PREEMPTION_LIMIT, DEFAULT_MR_AM_JOB_REDUCE_RAMP_UP_LIMIT, DEFAULT_MR_AM_LOG_LEVEL, DEFAULT_MR_AM_NUM_PROGRESS_SPLITS, DEFAULT_MR_AM_STAGING_DIR, DEFAULT_MR_AM_TASK_ESTIMATOR_SMOOTH_LAMBDA_MS, DEFAULT_MR_AM_TASK_LISTENER_THREAD_COUNT, DEFAULT_MR_AM_TO_RM_HEARTBEAT_INTERVAL_MS, DEFAULT_MR_AM_TO_RM_WAIT_INTERVAL_MS, DEFAULT_MR_AM_VMEM_MB, DEFAULT_MR_CLIENT_MAX_RETRIES, DEFAULT_MR_CLIENT_TO_AM_IPC_MAX_RETRIES, DEFAULT_REDUCE_CPU_VCORES, DEFAULT_REDUCE_MEMORY_MB, 
DEFAULT_SHELL, GROUP_COMPARATOR_CLASS, HADOOP_WORK_DIR, ID, INDEX_CACHE_MEMORY_LIMIT, INPUT_FORMAT_CLASS_ATTR, IO_SORT_FACTOR, IO_SORT_MB, JAR, JAR_UNPACK_PATTERN, JOB_ACL_MODIFY_JOB, JOB_ACL_VIEW_JOB, JOB_AM_ACCESS_DISABLED, JOB_CANCEL_DELEGATION_TOKEN, JOB_CONF_FILE, JOB_JAR, JOB_JOBTRACKER_ID, JOB_LOCAL_DIR, JOB_NAME, JOB_NAMENODES, JOB_SPLIT, JOB_SPLIT_METAINFO, JOB_SUBMIT_DIR, JOB_SUBMITHOST, JOB_SUBMITHOSTADDR, JOB_TOKEN_TRACKING_IDS, JOB_TOKEN_TRACKING_IDS_ENABLED, JOB_UBERTASK_ENABLE, JOB_UBERTASK_MAXBYTES, JOB_UBERTASK_MAXMAPS, JOB_UBERTASK_MAXREDUCES, JVM_NUMTASKS_TORUN, KEY_COMPARATOR, MAP_CLASS_ATTR, MAP_COMBINE_MIN_SPILLS, MAP_CPU_VCORES, MAP_DEBUG_SCRIPT, MAP_ENV, MAP_FAILURES_MAX_PERCENT, MAP_INPUT_FILE, MAP_INPUT_PATH, MAP_INPUT_START, MAP_JAVA_OPTS, MAP_LOG_LEVEL, MAP_MAX_ATTEMPTS, MAP_MEMORY_MB, MAP_OUTPUT_COLLECTOR_CLASS_ATTR, MAP_OUTPUT_COMPRESS, MAP_OUTPUT_COMPRESS_CODEC, MAP_OUTPUT_KEY_CLASS, MAP_OUTPUT_KEY_FIELD_SEPERATOR, MAP_OUTPUT_VALUE_CLASS, MAP_SKIP_INCR_PROC_COUNT, MAP_SKIP_MAX_RECORDS, MAP_SORT_SPILL_PERCENT, MAP_SPECULATIVE, MAPRED_ADMIN_USER_ENV, MAPRED_ADMIN_USER_SHELL, MAPRED_MAP_ADMIN_JAVA_OPTS, MAPRED_REDUCE_ADMIN_JAVA_OPTS, MAPREDUCE_APPLICATION_CLASSPATH, MAPREDUCE_JOB_CREDENTIALS_BINARY, MAPREDUCE_JOB_DIR, MAPREDUCE_JOB_USER_CLASSPATH_FIRST, MAPREDUCE_TASK_CLASSPATH_PRECEDENCE, MAPREDUCE_V2_CHILD_CLASS, MAX_SHUFFLE_FETCH_RETRY_DELAY, MAX_TASK_FAILURES_PER_TRACKER, MR_AM_COMMAND_OPTS, MR_AM_COMMITTER_CANCEL_TIMEOUT_MS, MR_AM_CONTAINERLAUNCHER_THREAD_COUNT_LIMIT, MR_AM_CPU_VCORES, MR_AM_CREATE_JH_INTERMEDIATE_BASE_DIR, MR_AM_ENV, MR_AM_HISTORY_COMPLETE_EVENT_FLUSH_TIMEOUT_MS, MR_AM_HISTORY_JOB_COMPLETE_UNFLUSHED_MULTIPLIER, MR_AM_HISTORY_MAX_UNFLUSHED_COMPLETE_EVENTS, MR_AM_HISTORY_USE_BATCHED_FLUSH_QUEUE_SIZE_THRESHOLD, MR_AM_IGNORE_BLACKLISTING_BLACKLISTED_NODE_PERECENT, MR_AM_JOB_CLIENT_PORT_RANGE, MR_AM_JOB_CLIENT_THREAD_COUNT, MR_AM_JOB_NODE_BLACKLISTING_ENABLE, MR_AM_JOB_RECOVERY_ENABLE, MR_AM_JOB_REDUCE_PREEMPTION_LIMIT, 
MR_AM_JOB_REDUCE_RAMPUP_UP_LIMIT, MR_AM_JOB_SPECULATOR, MR_AM_LOG_LEVEL, MR_AM_NUM_PROGRESS_SPLITS, MR_AM_PREFIX, MR_AM_SECURITY_SERVICE_AUTHORIZATION_CLIENT, MR_AM_SECURITY_SERVICE_AUTHORIZATION_TASK_UMBILICAL, MR_AM_STAGING_DIR, MR_AM_TASK_ESTIMATOR, MR_AM_TASK_ESTIMATOR_EXPONENTIAL_RATE_ENABLE, MR_AM_TASK_ESTIMATOR_SMOOTH_LAMBDA_MS, MR_AM_TASK_LISTENER_THREAD_COUNT, MR_AM_TO_RM_HEARTBEAT_INTERVAL_MS, MR_AM_TO_RM_WAIT_INTERVAL_MS, MR_AM_VMEM_MB, MR_CLIENT_MAX_RETRIES, MR_CLIENT_TO_AM_IPC_MAX_RETRIES, MR_JOB_END_NOTIFICATION_MAX_ATTEMPTS, MR_JOB_END_NOTIFICATION_MAX_RETRY_INTERVAL, MR_JOB_END_NOTIFICATION_PROXY, MR_JOB_END_NOTIFICATION_URL, MR_JOB_END_RETRY_ATTEMPTS, MR_JOB_END_RETRY_INTERVAL, MR_PREFIX, NUM_MAP_PROFILES, NUM_MAPS, NUM_REDUCE_PROFILES, NUM_REDUCES, OUTPUT, OUTPUT_FORMAT_CLASS_ATTR, OUTPUT_KEY_CLASS, OUTPUT_VALUE_CLASS, PARTITIONER_CLASS_ATTR, PRESERVE_FAILED_TASK_FILES, PRESERVE_FILES_PATTERN, PRIORITY, QUEUE_NAME, RECORDS_BEFORE_PROGRESS, REDUCE_CLASS_ATTR, REDUCE_CPU_VCORES, REDUCE_DEBUG_SCRIPT, REDUCE_ENV, REDUCE_FAILURES_MAXPERCENT, REDUCE_INPUT_BUFFER_PERCENT, REDUCE_JAVA_OPTS, REDUCE_LOG_LEVEL, REDUCE_MARKRESET_BUFFER_PERCENT, REDUCE_MARKRESET_BUFFER_SIZE, REDUCE_MAX_ATTEMPTS, REDUCE_MEMORY_MB, REDUCE_MEMORY_TOTAL_BYTES, REDUCE_MEMTOMEM_ENABLED, REDUCE_MEMTOMEM_THRESHOLD, REDUCE_MERGE_INMEM_THRESHOLD, REDUCE_SKIP_INCR_PROC_COUNT, REDUCE_SKIP_MAXGROUPS, REDUCE_SPECULATIVE, SETUP_CLEANUP_NEEDED, SHUFFLE_CONNECT_TIMEOUT, SHUFFLE_FETCH_FAILURES, SHUFFLE_INPUT_BUFFER_PERCENT, SHUFFLE_MEMORY_LIMIT_PERCENT, SHUFFLE_MERGE_PERCENT, SHUFFLE_NOTIFY_READERROR, SHUFFLE_PARALLEL_COPIES, SHUFFLE_READ_TIMEOUT, SKIP_OUTDIR, SKIP_RECORDS, SKIP_START_ATTEMPTS, SPECULATIVE_SLOWNODE_THRESHOLD, SPECULATIVE_SLOWTASK_THRESHOLD, SPECULATIVECAP, SPLIT_FILE, STDERR_LOGFILE_ENV, STDOUT_LOGFILE_ENV, TASK_ATTEMPT_ID, TASK_CLEANUP_NEEDED, TASK_DEBUGOUT_LINES, TASK_ID, TASK_ISMAP, TASK_LOG_DIR, TASK_LOG_SIZE, TASK_MAP_PROFILE_PARAMS, TASK_OUTPUT_DIR, TASK_PARTITION, 
TASK_PROFILE, TASK_PROFILE_PARAMS, TASK_REDUCE_PROFILE_PARAMS, TASK_TEMP_DIR, TASK_TIMEOUT, TASK_TIMEOUT_CHECK_INTERVAL_MS, TASK_USERLOG_LIMIT, USER_LOG_RETAIN_HOURS, USER_NAME, WORKDIR, WORKING_DIR
 
Constructor Summary
Job()
          Constructor for the Job class
Job(org.apache.hadoop.conf.Configuration conf)
          Constructor for the Job class
Job(org.apache.hadoop.conf.Configuration conf, java.lang.String jobName)
          Constructor for the Job class
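The three constructor forms can be sketched as below; this is a non-runnable fragment, and the job name shown is a placeholder.

```java
// Sketch of the three constructor forms listed above.
import org.apache.hadoop.conf.Configuration;
import oracle.sql.hadoop.Job;

Job j1 = new Job();              // default Hadoop configuration
Configuration conf = new Configuration();
Job j2 = new Job(conf);          // explicit configuration
Job j3 = new Job(conf, "myJob"); // explicit configuration and job name
```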
 
Method Summary
 java.lang.Boolean checkDataType()
          Return the flag for data type check.
static Job copy(long key)
          Retrieve the configuration stored under key, if any.
 void drop()
          If there is no stored version of the configuration, signal an error; otherwise remove the stored configuration and the other database state created for the job by the init or run method.
 long getConfKey()
          Get the retrieval key for the Hadoop job configuration.
 boolean getCreateDeserializedSplitsTable()
          Return the boolean value indicating whether the output table will be created (or dropped and recreated) as part of executing the run method for ReadInputSplits operations.
 boolean getCreateOutputTable()
          Return the boolean value indicating whether the output table will be created (or dropped and recreated) as part of executing the run method for MapReduce operations.
 boolean getCreateSerializedSplitsTable()
          Return the boolean value indicating whether the output table will be created (or dropped and recreated) as part of executing the run method for GetInputSplits operations.
 java.lang.String getDeserializedSplitsTableName()
          Get the name of the table to hold deserialized InputSplits.
 java.lang.String getIdString()
          Get the job id string used for constructing schema object names
 java.lang.Class[] getInputClasses()
          Return an array of the classes provided for identifying the argument types of the map method.
 java.lang.Class<?> getInputKeyClass()
          Get the input key class for the job.
 java.lang.Class[] getInputSplitRecordClasses()
          For the ReadInputSplits operations, return an array of the java class types in the output record type.
 java.lang.String getInputSplitRecordDBType()
          Get the database object type (row type of the table) for the input split record data.
 java.lang.String[] getInputSplitRecordDBTypes()
          For the ReadInputSplits operations, return an array of the database column types in the output record type.
 java.lang.String getInputSplitRecordDBTypeSet()
          Get the database table type for the input split record data.
 java.lang.Class<?> getInputSplitRecordKeyClass()
          Get the InputSplit record key class for the job.
 java.lang.String getInputSplitRecordKeyDBType()
          Get the input split record key database type.
 java.lang.Class<?> getInputSplitRecordValueClass()
          Get the InputSplit record value class for the job.
 java.lang.String getInputSplitRecordValueDBType()
          Get the input split record value database type.
 java.lang.String getInputTableName()
          Get the input table name.
 java.lang.Class<?> getInputValueClass()
          Get the input value class for the job.
 java.lang.String getMapMethodName()
          Get the Map method name
 java.lang.Class[] getMapMethodSignature()
          For the GetReadAndMapReduceInputSplits and ReadAndMapReduceInputSplits operations, this method queries the actual argument types of the map method that will be used.
 java.lang.String getMapOutputDBType()
          Get the database object type (row type of the table) for the map output data.
 java.lang.String getMapOutputDBTypeSet()
          Get the database table type for the map output data.
 java.lang.Class<?> getMapOutputKeyClass()
          Get the key class for the map output data.
 java.lang.String getMapOutputKeyDBType()
          Get the map output key database type.
 java.lang.Class<?> getMapOutputValueClass()
          Get the value class for the map output data.
 java.lang.String getMapOutputValueDBType()
          Get the map output value database type.
 java.lang.String getMapperImpl()
          Get the database object type for the Mapper call specification.
 int getNumHInputTasks()
          Get the number of hinput tasks for the job.
 int getNumMapTasks()
          Get the number of map tasks for the job.
 Job.Operation getOperation()
          Get the operation value for the Job.
 java.lang.String getOutputDBType()
          Get the database object type (row type of the table) for the (final) job output data.
 java.lang.String getOutputDBTypeSet()
          Get the database table type for the (final) job output data.
 java.lang.Class<?> getOutputKeyClass()
          Get the key class for the (final) job output data.
 java.lang.String getOutputKeyDBType()
          Get the output key database type.
 java.lang.String getOutputSchemaName()
          Get the schema name for output tables.
 java.lang.String getOutputTableName()
          Get the output table name.
 java.lang.Class<?> getOutputValueClass()
          Get the output value class.
 java.lang.String getOutputValueDBType()
          Get the output value database type.
 java.lang.Class<? extends org.apache.hadoop.mapreduce.Partitioner<?,?>> getPartitionerClass()
          Get the Partitioner class for the job.
 java.lang.String getReducerImpl()
          Get the database object type for the Reducer call specification.
 java.lang.String getSchemaName()
          Get the schema name for input tables.
 java.lang.String getSerializedSplitsTableName()
          Get the name of the table to hold serialized InputSplits.
 int getSerializedSplitsTableRawSize()
          For the GetInputSplits operation, return the size of the RAW column used in the output table.
 boolean getUsingHive()
          Wrapper over getHiveMetaStoreData that forces reconnection and retrieval of metadata if the properties required for connection are present, and returns whether Hive is in use (i.e., whether those properties are set).
 long init()
          Initialize a job by creating the database state required for executing the job.
 long init(Job.Operation operation)
          Initialize a job by creating the database state required for executing the job.
static void initializeLogging()
          Initializes log4j from the oc4hadoop_log4j.properties file loaded as a Java resource.
static Job lookup(long key)
          Retrieve the configuration stored under key, if any.
 void reopen(boolean dropGeneratedObjects)
          Change state back to DEFINE so that fields can be modified.
 void resetConfigDBStore()
          Reset the sequence and table for storing Hadoop job configurations.
 void run()
          Run the job with the table names as set in the configuration.
 void run(java.lang.String table)
          Run the job for the GetInputSplits operation.
 void run(java.lang.String inTable, java.lang.String outTable)
          Run the job with the given tables.
 void setCreateDeserializedSplitsTable(boolean value)
          Set the boolean value determining whether the output table will be created (or dropped and recreated) as part of executing the run method for ReadInputSplits operations.
 void setCreateOutputTable(boolean value)
          Set the boolean value determining whether the output table will be created (or dropped and recreated) as part of executing the run method for MapReduce operations.
 void setCreateSerializedSplitsTable(boolean value)
          Set the boolean value determining whether the output table will be created (or dropped and recreated) as part of executing the run method for GetInputSplits operations.
 void setDataTypeCheck(java.lang.Boolean flag)
          Set whether data type checks between SQL and Hadoop writable types for map/reduce input data are performed at the beginning of each stage.
 void setDeserializedSplitsTableName(java.lang.String theTable)
          Set the name of the table to hold deserialized InputSplits.
 void setIdString(java.lang.String theId)
          Set the job id string used for constructing schema object names
 void setInputClasses(java.lang.String value)
          Set the class names for identifying the argument types of the map method.
 void setInputFormatClass(java.lang.Class<? extends org.apache.hadoop.mapreduce.InputFormat> cls)
          Set the InputFormat for the job.
 void setInputKeyClass(java.lang.Class<?> theClass)
          Set the input key class for the job.
 void setInputSplitRecordClasses(java.lang.String value)
          For the ReadInputSplits operations, set the values in the array returned by getInputSplitRecordClasses.
 void setInputSplitRecordDBTypes(java.lang.String value)
          For the ReadInputSplits operations, set the values in the array returned by getInputSplitRecordDBTypes.
 void setInputSplitRecordKeyClass(java.lang.Class<?> theClass)
          Set the InputSplit record key class for the job.
 void setInputSplitRecordKeyDBType(java.lang.String theType)
          Set the key database type for the input split record data.
 void setInputSplitRecordValueClass(java.lang.Class<?> theClass)
          Set the InputSplit record value class for the job.
 void setInputSplitRecordValueDBType(java.lang.String theType)
          Set the value database type for the input split record data.
 void setInputTableName(java.lang.String theTable)
          Set the input table name.
 void setInputValueClass(java.lang.Class<?> theClass)
          Set the input value class for the job.
 void setJobName(java.lang.String name)
          Set the user-specified job name.
 void setMapMethodName(java.lang.String methodName)
          Set the Map method name
 void setMapOutputKeyClass(java.lang.Class<?> theClass)
          Set the key class for the map output data.
 void setMapOutputKeyDBType(java.lang.String theType)
          Set the key database type for the map output data.
 void setMapOutputValueClass(java.lang.Class<?> theClass)
          Set the value class for the map output data.
 void setMapOutputValueDBType(java.lang.String theType)
          Set the value database type for the map output data.
 void setMapperClass(java.lang.Class<? extends org.apache.hadoop.mapreduce.Mapper> cls)
          Set the Mapper for the job.
 void setMultiValue(boolean value)
           
 void setNumHInputTasks(int tasks)
          Set the number of hinput tasks for the job.
 void setNumMapTasks(int tasks)
          Set the number of map tasks for the job.
 void setNumReduceTasks(int tasks)
          Set the number of reduce tasks for the job.
 void setOperation(Job.Operation operation)
          Sets the operation property of Job.
 void setOutputKeyClass(java.lang.Class<?> theClass)
          Set the key class for the (final) job output data.
 void setOutputKeyDBType(java.lang.String theType)
          Set the key database type for the (final) job output data.
 void setOutputSchemaName(java.lang.String theSchema)
          Set the optional schema name for output tables.
 void setOutputTableName(java.lang.String theTable)
          Set the output table name.
 void setOutputValueClass(java.lang.Class<?> theClass)
          Set the value class for the (final) job output data.
 void setOutputValueDBType(java.lang.String theType)
          Set the value database type for the (final) job output data.
 void setPartitionerClass(java.lang.Class<? extends org.apache.hadoop.mapreduce.Partitioner> cls)
          Set the Partitioner for the job.
 void setReducerClass(java.lang.Class<? extends org.apache.hadoop.mapreduce.Reducer> cls)
          Set the Reducer for the job.
 void setSchemaName(java.lang.String theSchema)
          Set the schema name for input tables.
 void setSerializedSplitsTableName(java.lang.String theTable)
          Set the name of the table to hold serialized InputSplits.
 void setSerializedSplitsTableRawSize(int value)
          For the GetInputSplits operation, set the size of the RAW column used in the output table.
 void setTypeMap(java.util.Map<java.lang.String,java.lang.Class<?>> map)
          Map a user-defined type defined in SQL to the Java class to be used as the Hadoop key and value.
 boolean singleColumn()
           
 long store()
          Create or update the stored version of the configuration.
 long update()
          Update the stored version of the configuration created by a previous call to the init, run or store methods, so that the stored version matches the current settings in the job.
 boolean waitForCompletion(boolean verbose)
          Submit the job and wait for it to finish.
 
Methods inherited from class org.apache.hadoop.mapreduce.Job
addArchiveToClassPath, addCacheArchive, addCacheFile, addFileToClassPath, cleanupProgress, createSymlink, failTask, getCluster, getCompletionPollInterval, getCounters, getFinishTime, getHistoryUrl, getInstance, getInstance, getInstance, getInstance, getInstance, getInstance, getInstance, getJobFile, getJobName, getJobState, getJobSubmitter, getPriority, getProgressPollInterval, getSchedulingInfo, getStartTime, getStatus, getTaskCompletionEvents, getTaskCompletionEvents, getTaskDiagnostics, getTaskOutputFilter, getTaskReports, getTrackingURL, isComplete, isRetired, isSuccessful, isUber, killJob, killTask, mapProgress, monitorAndPrintJob, reduceProgress, setCacheArchives, setCacheFiles, setCancelDelegationTokenUponJobCompletion, setCombinerClass, setCombinerKeyGroupingComparatorClass, setGroupingComparatorClass, setJar, setJarByClass, setJobSetupCleanupNeeded, setMapSpeculativeExecution, setMaxMapAttempts, setMaxReduceAttempts, setOutputFormatClass, setPriority, setProfileEnabled, setProfileParams, setProfileTaskRange, setReduceSpeculativeExecution, setSortComparatorClass, setSpeculativeExecution, setTaskOutputFilter, setupProgress, setUser, setUserClassesTakesPrecedence, setWorkingDirectory, submit, toString
 
Methods inherited from class org.apache.hadoop.mapreduce.task.JobContextImpl
getArchiveClassPaths, getArchiveTimestamps, getCacheArchives, getCacheFiles, getCombinerClass, getCombinerKeyGroupingComparator, getConfiguration, getCredentials, getFileClassPaths, getFileTimestamps, getGroupingComparator, getInputFormatClass, getJar, getJobID, getJobSetupCleanupNeeded, getLocalCacheArchives, getLocalCacheFiles, getMapperClass, getMaxMapAttempts, getMaxReduceAttempts, getNumReduceTasks, getOutputFormatClass, getProfileEnabled, getProfileParams, getProfileTaskRange, getReducerClass, getSortComparator, getSymlink, getTaskCleanupNeeded, getUser, getWorkingDirectory, setJobID, userClassesTakesPrecedence
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 
Methods inherited from interface org.apache.hadoop.mapreduce.JobContext
getArchiveClassPaths, getArchiveTimestamps, getCacheArchives, getCacheFiles, getCombinerClass, getCombinerKeyGroupingComparator, getConfiguration, getCredentials, getFileClassPaths, getFileTimestamps, getGroupingComparator, getInputFormatClass, getJar, getJobID, getJobSetupCleanupNeeded, getLocalCacheArchives, getLocalCacheFiles, getMapperClass, getMaxMapAttempts, getMaxReduceAttempts, getNumReduceTasks, getOutputFormatClass, getProfileEnabled, getProfileParams, getProfileTaskRange, getReducerClass, getSortComparator, getSymlink, getTaskCleanupNeeded, getUser, getWorkingDirectory, userClassesTakesPrecedence
 

Field Detail

OPERATION_ATTR

protected static final java.lang.String OPERATION_ATTR
See Also:
Constant Field Values

SCHEMA_NAME_ATTR

protected static final java.lang.String SCHEMA_NAME_ATTR
See Also:
Constant Field Values

OUTPUT_SCHEMA_NAME_ATTR

protected static final java.lang.String OUTPUT_SCHEMA_NAME_ATTR
See Also:
Constant Field Values

INPUT_TABLE_NAME_PROPERTY

protected static final java.lang.String INPUT_TABLE_NAME_PROPERTY
See Also:
Constant Field Values

OUTPUT_TABLE_NAME_PROPERTY

protected static final java.lang.String OUTPUT_TABLE_NAME_PROPERTY
See Also:
Constant Field Values

SERIALIZED_SPLITS_TABLE_NAME_PROPERTY

protected static final java.lang.String SERIALIZED_SPLITS_TABLE_NAME_PROPERTY
See Also:
Constant Field Values

DESERIALIZED_SPLITS_TABLE_NAME_PROPERTY

protected static final java.lang.String DESERIALIZED_SPLITS_TABLE_NAME_PROPERTY
See Also:
Constant Field Values

INPUT_KEY_CLASS_ATTR

protected static final java.lang.String INPUT_KEY_CLASS_ATTR
See Also:
Constant Field Values

INPUT_VALUE_CLASS_ATTR

protected static final java.lang.String INPUT_VALUE_CLASS_ATTR
See Also:
Constant Field Values

INPUT_CLASSES_ATTR

protected static final java.lang.String INPUT_CLASSES_ATTR
See Also:
Constant Field Values

MAP_OUTPUT_KEY_CLASS_ATTR

protected static final java.lang.String MAP_OUTPUT_KEY_CLASS_ATTR
See Also:
Constant Field Values

MAP_OUTPUT_VALUE_CLASS_ATTR

protected static final java.lang.String MAP_OUTPUT_VALUE_CLASS_ATTR
See Also:
Constant Field Values

OUTPUT_KEY_CLASS_ATTR

protected static final java.lang.String OUTPUT_KEY_CLASS_ATTR
See Also:
Constant Field Values

OUTPUT_VALUE_CLASS_ATTR

protected static final java.lang.String OUTPUT_VALUE_CLASS_ATTR
See Also:
Constant Field Values

INPUT_SPLIT_RECORD_KEY_CLASS_ATTR

protected static final java.lang.String INPUT_SPLIT_RECORD_KEY_CLASS_ATTR
See Also:
Constant Field Values

INPUT_SPLIT_RECORD_VALUE_CLASS_ATTR

protected static final java.lang.String INPUT_SPLIT_RECORD_VALUE_CLASS_ATTR
See Also:
Constant Field Values

INPUT_SPLIT_RECORD_CLASSES_ATTR

protected static final java.lang.String INPUT_SPLIT_RECORD_CLASSES_ATTR
See Also:
Constant Field Values

SINGLE_COLUMN_ATTR

protected static final java.lang.String SINGLE_COLUMN_ATTR
See Also:
Constant Field Values

NUM_HINPUT_TASKS_ATTR

protected static final java.lang.String NUM_HINPUT_TASKS_ATTR
See Also:
Constant Field Values

NUM_INSERT_TASKS_ATTR

protected static final java.lang.String NUM_INSERT_TASKS_ATTR
See Also:
Constant Field Values

MAP_OUTPUT_KEY_DB_TYPE_PROPERTY

protected static final java.lang.String MAP_OUTPUT_KEY_DB_TYPE_PROPERTY
See Also:
Constant Field Values

MAP_OUTPUT_VALUE_DB_TYPE_PROPERTY

protected static final java.lang.String MAP_OUTPUT_VALUE_DB_TYPE_PROPERTY
See Also:
Constant Field Values

OUTPUT_KEY_DB_TYPE_PROPERTY

protected static final java.lang.String OUTPUT_KEY_DB_TYPE_PROPERTY
See Also:
Constant Field Values

OUTPUT_VALUE_DB_TYPE_PROPERTY

protected static final java.lang.String OUTPUT_VALUE_DB_TYPE_PROPERTY
See Also:
Constant Field Values

INPUT_SPLIT_RECORD_KEY_DB_TYPE_PROPERTY

protected static final java.lang.String INPUT_SPLIT_RECORD_KEY_DB_TYPE_PROPERTY
See Also:
Constant Field Values

INPUT_SPLIT_RECORD_VALUE_DB_TYPE_PROPERTY

protected static final java.lang.String INPUT_SPLIT_RECORD_VALUE_DB_TYPE_PROPERTY
See Also:
Constant Field Values

INPUT_SPLIT_RECORD_DB_TYPES_PROPERTY

protected static final java.lang.String INPUT_SPLIT_RECORD_DB_TYPES_PROPERTY
See Also:
Constant Field Values

MAP_OUTTYPE_PROPERTY

protected static final java.lang.String MAP_OUTTYPE_PROPERTY
See Also:
Constant Field Values

MAP_OUTTYPESET_PROPERTY

protected static final java.lang.String MAP_OUTTYPESET_PROPERTY
See Also:
Constant Field Values

OUTTYPE_PROPERTY

protected static final java.lang.String OUTTYPE_PROPERTY
See Also:
Constant Field Values

OUTTYPESET_PROPERTY

protected static final java.lang.String OUTTYPESET_PROPERTY
See Also:
Constant Field Values

INPUT_SPLIT_RECORD_OUTTYPE_PROPERTY

protected static final java.lang.String INPUT_SPLIT_RECORD_OUTTYPE_PROPERTY
See Also:
Constant Field Values

INPUT_SPLIT_RECORD_OUTTYPESET_PROPERTY

protected static final java.lang.String INPUT_SPLIT_RECORD_OUTTYPESET_PROPERTY
See Also:
Constant Field Values

MAPPER_IMPL

protected static final java.lang.String MAPPER_IMPL
See Also:
Constant Field Values

REDUCER_IMPL

protected static final java.lang.String REDUCER_IMPL
See Also:
Constant Field Values

MAP_FUNCTION

protected static final java.lang.String MAP_FUNCTION
See Also:
Constant Field Values

REDUCE_FUNCTION

protected static final java.lang.String REDUCE_FUNCTION
See Also:
Constant Field Values

HINPUT_FUNCTION

protected static final java.lang.String HINPUT_FUNCTION
See Also:
Constant Field Values

HINPUTANDMAP_FUNCTION

protected static final java.lang.String HINPUTANDMAP_FUNCTION
See Also:
Constant Field Values

MAP_METHOD

protected static final java.lang.String MAP_METHOD
See Also:
Constant Field Values

MAP_SIGNATURE

protected static final java.lang.String MAP_SIGNATURE
See Also:
Constant Field Values

CONF_KEY

protected static final java.lang.String CONF_KEY
See Also:
Constant Field Values

DEBUG_MODE_FLAG

protected static final java.lang.String DEBUG_MODE_FLAG
See Also:
Constant Field Values

DATA_TYPE_CHECK_FLAG

protected static final java.lang.String DATA_TYPE_CHECK_FLAG
See Also:
Constant Field Values

ID_STRING_ATTR

protected static final java.lang.String ID_STRING_ATTR
See Also:
Constant Field Values

CREATE_OUTPUT_TABLE_ATTR

protected static final java.lang.String CREATE_OUTPUT_TABLE_ATTR
See Also:
Constant Field Values

CREATE_SERIALIZED_SPLITS_TABLE_ATTR

protected static final java.lang.String CREATE_SERIALIZED_SPLITS_TABLE_ATTR
See Also:
Constant Field Values

CREATE_DESERIALIZED_SPLITS_TABLE_ATTR

protected static final java.lang.String CREATE_DESERIALIZED_SPLITS_TABLE_ATTR
See Also:
Constant Field Values

SERIALIZED_SPLITS_TABLE_RAW_SIZE_ATTR

protected static final java.lang.String SERIALIZED_SPLITS_TABLE_RAW_SIZE_ATTR
See Also:
Constant Field Values

SQLTYPES

protected static final java.lang.String SQLTYPES
See Also:
Constant Field Values

MULTI_VALUE

protected static final java.lang.String MULTI_VALUE
See Also:
Constant Field Values
Constructor Detail

Job

public Job()
    throws java.io.IOException
Constructor for the Job class

Throws:
java.io.IOException

Job

public Job(org.apache.hadoop.conf.Configuration conf)
    throws java.io.IOException
Constructor for the Job class

Parameters:
conf - the Hadoop configuration
Throws:
java.io.IOException

Job

public Job(org.apache.hadoop.conf.Configuration conf,
           java.lang.String jobName)
    throws java.io.IOException
Constructor for the Job class

Parameters:
conf - the Hadoop configuration
jobName - the job name
Throws:
java.io.IOException
Method Detail

singleColumn

public boolean singleColumn()
Returns:
whether the input table has a single column

update

public long update()
            throws java.lang.Exception
Update the stored version of the configuration created by a previous call to the init, run or store methods, so that the stored version matches the current settings in the job. Signal an error if there has been no previous call to create a stored version of the configuration for this job.

Returns:
key for the stored configuration
Throws:
java.lang.Exception

store

public long store()
           throws java.lang.Exception
Create or update the stored version of the configuration.

Returns:
key for the stored configuration.
Throws:
java.lang.Exception

drop

public void drop()
          throws java.lang.Exception
If there is no stored version of the configuration, signal an error; otherwise remove the stored configuration and any other database state created for the job by the init or run method.

Throws:
java.lang.Exception

reopen

public void reopen(boolean dropGeneratedObjects)
            throws java.lang.Exception
Change the job state back to DEFINE so that fields can be modified. If dropGeneratedObjects is true, remove any database objects created when Job.init was called (if it had in fact been called in the past, as indicated by the state having a value other than DEFINE).

Parameters:
dropGeneratedObjects - remove any database objects created when Job.init was called
Throws:
java.lang.Exception

lookup

public static Job lookup(long key)
Retrieve the configuration stored under key, if any. If found, construct and return a Job object j such that j.getConfKey() == key; otherwise return null. Jobs are not interned, so successive calls with the same key return distinct objects.

Parameters:
key - job key
Returns:
Job

copy

public static Job copy(long key)
Retrieve the configuration stored under key, if any. If found, construct and return a Job object j whose getConfKey() is not initialized; otherwise return null.

Parameters:
key - configuration key
Returns:
Job

getOperation

public Job.Operation getOperation()
Get the operation value for the Job.

Returns:
operation value

setOperation

public void setOperation(Job.Operation operation)
Sets the operation property of Job.

Parameters:
operation - operation of job

getCreateOutputTable

public boolean getCreateOutputTable()
Return the boolean value indicating whether the output table will be created (or dropped and recreated) as part of executing the run method for MapReduce operations.

Returns:
value indicating whether the output table will be created

setCreateOutputTable

public void setCreateOutputTable(boolean value)
Set the boolean value determining whether the output table will be created (or dropped and recreated) as part of executing the run method for MapReduce operations.

Parameters:
value - boolean value determining whether the output table will be created (or dropped and recreated)

getCreateSerializedSplitsTable

public boolean getCreateSerializedSplitsTable()
Return the boolean value indicating whether the output table will be created (or dropped and recreated) as part of executing the run method for GetInputSplits operations.

Returns:
value indicating whether the output table will be created

setCreateSerializedSplitsTable

public void setCreateSerializedSplitsTable(boolean value)
Set the boolean value determining whether the output table will be created (or dropped and recreated) as part of executing the run method for GetInputSplits operations.

Parameters:
value - boolean value determining whether the output table will be created (or dropped and recreated)

getSerializedSplitsTableRawSize

public int getSerializedSplitsTableRawSize()
For the GetInputSplits operation, return the size of the RAW column used in the output table.

Returns:
size of RAW column

setSerializedSplitsTableRawSize

public void setSerializedSplitsTableRawSize(int value)
For the GetInputSplits operation, set the size of the RAW column used in the output table. A value outside the range [1, 32767] is treated as 32767.

Parameters:
value - size of the RAW column used in the output table
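The clamping rule described above can be sketched as a small stand-alone helper; clampRawSize is a hypothetical illustration of the documented behavior, not part of the API:

```java
public class RawSizeClampDemo {
    // Hypothetical sketch of the documented rule: a value outside the
    // range [1, 32767] is treated as 32767 (the maximum RAW column size).
    static int clampRawSize(int value) {
        return (value >= 1 && value <= 32767) ? value : 32767;
    }

    public static void main(String[] args) {
        System.out.println(clampRawSize(2000));   // in range, kept as-is
        System.out.println(clampRawSize(0));      // below range -> 32767
        System.out.println(clampRawSize(50000));  // above range -> 32767
    }
}
```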

getCreateDeserializedSplitsTable

public boolean getCreateDeserializedSplitsTable()
Return the boolean value indicating whether the output table will be created (or dropped and recreated) as part of executing the run method for ReadInputSplits operations.

Returns:
value indicating whether the output table will be created

setCreateDeserializedSplitsTable

public void setCreateDeserializedSplitsTable(boolean value)
Set the boolean value determining whether the output table will be created (or dropped and recreated) as part of executing the run method for ReadInputSplits operations.

Parameters:
value - boolean value determining whether the output table will be created (or dropped and recreated)

setNumMapTasks

public void setNumMapTasks(int tasks)
                    throws java.lang.IllegalStateException
Set the number of map tasks for the job.

Parameters:
tasks - the number of map tasks
Throws:
java.lang.IllegalStateException

getNumMapTasks

public int getNumMapTasks()
Get the number of map tasks for the job.

Returns:
the number of map tasks

setNumReduceTasks

public void setNumReduceTasks(int tasks)
                       throws java.lang.IllegalStateException
Set the number of reduce tasks for the job.

Overrides:
setNumReduceTasks in class org.apache.hadoop.mapreduce.Job
Parameters:
tasks - number of reduce tasks
Throws:
java.lang.IllegalStateException

setNumHInputTasks

public void setNumHInputTasks(int tasks)
                       throws java.lang.IllegalStateException
Set the number of hinput tasks for the job.

Parameters:
tasks - the number of hinput tasks
Throws:
java.lang.IllegalStateException

getNumHInputTasks

public int getNumHInputTasks()
Get the number of hinput tasks for the job.

Returns:
the number of hinput tasks

setJobName

public final void setJobName(java.lang.String name)
                      throws java.lang.IllegalStateException
Set the user-specified job name.

Overrides:
setJobName in class org.apache.hadoop.mapreduce.Job
Parameters:
name - the job's new name
Throws:
java.lang.IllegalStateException

setIdString

public void setIdString(java.lang.String theId)
                 throws java.lang.IllegalStateException
Set the job id string used for constructing schema object names

Parameters:
theId - the id string
Throws:
java.lang.IllegalStateException

getIdString

public java.lang.String getIdString()
Get the job id string used for constructing schema object names

Returns:
the id string

setSchemaName

public void setSchemaName(java.lang.String theSchema)
                   throws java.lang.IllegalStateException
Set the schema name for input tables.

Parameters:
theSchema - the schema name
Throws:
java.lang.IllegalStateException

getSchemaName

public java.lang.String getSchemaName()
Get the schema name for input tables.

Returns:
the schema name

setOutputSchemaName

public void setOutputSchemaName(java.lang.String theSchema)
                         throws java.lang.IllegalStateException
Set the optional schema name for output tables.

Parameters:
theSchema - the schema name
Throws:
java.lang.IllegalStateException

getOutputSchemaName

public java.lang.String getOutputSchemaName()
Get the schema name for output tables. If it is not set, return the default schema name.

Returns:
the output schema name

setInputTableName

public void setInputTableName(java.lang.String theTable)
                       throws java.lang.IllegalStateException
Set the input table name.

Parameters:
theTable - the input table name
Throws:
java.lang.IllegalStateException

getInputTableName

public java.lang.String getInputTableName()
Get the input table name.

Returns:
the input table name

setOutputTableName

public void setOutputTableName(java.lang.String theTable)
                        throws java.lang.IllegalStateException
Set the output table name.

Parameters:
theTable - the output table name
Throws:
java.lang.IllegalStateException

getOutputTableName

public java.lang.String getOutputTableName()
Get the output table name.

Returns:
the output table name

setSerializedSplitsTableName

public void setSerializedSplitsTableName(java.lang.String theTable)
                                  throws java.lang.IllegalStateException
Set the name of the table to hold serialized InputSplits.

Parameters:
theTable - the name of the table to hold serialized InputSplits
Throws:
java.lang.IllegalStateException

getSerializedSplitsTableName

public java.lang.String getSerializedSplitsTableName()
Get the name of the table to hold serialized InputSplits.

Returns:
the name of the table to hold serialized InputSplits

setDeserializedSplitsTableName

public void setDeserializedSplitsTableName(java.lang.String theTable)
                                    throws java.lang.IllegalStateException
Set the name of the table to hold deserialized InputSplits.

Parameters:
theTable - the name of the table to hold deserialized InputSplits
Throws:
java.lang.IllegalStateException

getDeserializedSplitsTableName

public java.lang.String getDeserializedSplitsTableName()
Get the name of the table to hold deserialized InputSplits.

Returns:
the name of the table to hold deserialized InputSplits

setInputFormatClass

public void setInputFormatClass(java.lang.Class<? extends org.apache.hadoop.mapreduce.InputFormat> cls)
                         throws java.lang.IllegalStateException
Set the InputFormat for the job.

Overrides:
setInputFormatClass in class org.apache.hadoop.mapreduce.Job
Parameters:
cls - the InputFormat to use
Throws:
java.lang.IllegalStateException - if the job is submitted

setMapperClass

public void setMapperClass(java.lang.Class<? extends org.apache.hadoop.mapreduce.Mapper> cls)
                    throws java.lang.IllegalStateException
Set the Mapper for the job.

Overrides:
setMapperClass in class org.apache.hadoop.mapreduce.Job
Parameters:
cls - the Mapper to use
Throws:
java.lang.IllegalStateException

setReducerClass

public void setReducerClass(java.lang.Class<? extends org.apache.hadoop.mapreduce.Reducer> cls)
                     throws java.lang.IllegalStateException
Set the Reducer for the job.

Overrides:
setReducerClass in class org.apache.hadoop.mapreduce.Job
Parameters:
cls - the Reducer to use
Throws:
java.lang.IllegalStateException

setPartitionerClass

public void setPartitionerClass(java.lang.Class<? extends org.apache.hadoop.mapreduce.Partitioner> cls)
                         throws java.lang.IllegalStateException
Set the Partitioner for the job.

Overrides:
setPartitionerClass in class org.apache.hadoop.mapreduce.Job
Parameters:
cls - the Partitioner to use
Throws:
java.lang.IllegalStateException - if the job is submitted

getPartitionerClass

public java.lang.Class<? extends org.apache.hadoop.mapreduce.Partitioner<?,?>> getPartitionerClass()
                                                                                            throws java.lang.ClassNotFoundException
Get the Partitioner class for the job. Overrides the JobContext version to change the default from HashPartitioner.class to null.

Specified by:
getPartitionerClass in interface org.apache.hadoop.mapreduce.JobContext
Overrides:
getPartitionerClass in class org.apache.hadoop.mapreduce.task.JobContextImpl
Returns:
the Partitioner class for the job.
Throws:
java.lang.ClassNotFoundException

setInputKeyClass

public void setInputKeyClass(java.lang.Class<?> theClass)
                      throws java.lang.IllegalStateException
Set the input key class for the job.

Parameters:
theClass - the input key class
Throws:
java.lang.IllegalStateException

getInputKeyClass

public java.lang.Class<?> getInputKeyClass()
Get the input key class for the job. If it is not set, return the default LongWritable class.

Returns:
the input key class

setInputValueClass

public void setInputValueClass(java.lang.Class<?> theClass)
                        throws java.lang.IllegalStateException
Set the input value class for the job.

Parameters:
theClass - the input value class
Throws:
java.lang.IllegalStateException

getInputValueClass

public java.lang.Class<?> getInputValueClass()
Get the input value class for the job. If it is not set, return the default Text class.

Returns:
the input value class

setInputSplitRecordKeyClass

public void setInputSplitRecordKeyClass(java.lang.Class<?> theClass)
                                 throws java.lang.IllegalStateException
Set the InputSplit record key class for the job.

Parameters:
theClass - the InputSplit record key class
Throws:
java.lang.IllegalStateException

getInputSplitRecordKeyClass

public java.lang.Class<?> getInputSplitRecordKeyClass()
Get the InputSplit record key class for the job. If it is not set, return the default LongWritable class.

Returns:
the InputSplit record key class

setInputSplitRecordValueClass

public void setInputSplitRecordValueClass(java.lang.Class<?> theClass)
                                   throws java.lang.IllegalStateException
Set the InputSplit record value class for the job.

Parameters:
theClass - the InputSplit record value class
Throws:
java.lang.IllegalStateException

getInputSplitRecordValueClass

public java.lang.Class<?> getInputSplitRecordValueClass()
Get the InputSplit record value class for the job. If it is not set, return the default Text class.

Returns:
the InputSplit record value class

setMapOutputKeyClass

public void setMapOutputKeyClass(java.lang.Class<?> theClass)
                          throws java.lang.IllegalStateException
Set the key class for the map output data. This allows the user to specify the map output key class to be different than the final output key class.

Overrides:
setMapOutputKeyClass in class org.apache.hadoop.mapreduce.Job
Parameters:
theClass - the map output key class
Throws:
java.lang.IllegalStateException

getMapOutputKeyClass

public java.lang.Class<?> getMapOutputKeyClass()
Get the key class for the map output data. If it is not set, use the (final) output key class. This allows the map output key class to be different than the final output key class.

Specified by:
getMapOutputKeyClass in interface org.apache.hadoop.mapreduce.JobContext
Overrides:
getMapOutputKeyClass in class org.apache.hadoop.mapreduce.task.JobContextImpl
Returns:
the map output key class

setMapOutputValueClass

public void setMapOutputValueClass(java.lang.Class<?> theClass)
                            throws java.lang.IllegalStateException
Set the value class for the map output data. This allows the user to specify the map output value class to be different than the final output value class.

Overrides:
setMapOutputValueClass in class org.apache.hadoop.mapreduce.Job
Parameters:
theClass - the map output value class
Throws:
java.lang.IllegalStateException

getMapOutputValueClass

public java.lang.Class<?> getMapOutputValueClass()
Get the value class for the map output data. If it is not set, use the (final) output value class. This allows the map output value class to be different than the final output value class.

Specified by:
getMapOutputValueClass in interface org.apache.hadoop.mapreduce.JobContext
Overrides:
getMapOutputValueClass in class org.apache.hadoop.mapreduce.task.JobContextImpl
Returns:
the map output value class

setOutputKeyClass

public void setOutputKeyClass(java.lang.Class<?> theClass)
                       throws java.lang.IllegalStateException
Set the key class for the (final) job output data.

Overrides:
setOutputKeyClass in class org.apache.hadoop.mapreduce.Job
Parameters:
theClass - the output key class
Throws:
java.lang.IllegalStateException

getOutputKeyClass

public java.lang.Class<?> getOutputKeyClass()
Get the key class for the (final) job output data. If it is not set, return the default LongWritable class.

Specified by:
getOutputKeyClass in interface org.apache.hadoop.mapreduce.JobContext
Overrides:
getOutputKeyClass in class org.apache.hadoop.mapreduce.task.JobContextImpl
Returns:
the output key class

setOutputValueClass

public void setOutputValueClass(java.lang.Class<?> theClass)
                         throws java.lang.IllegalStateException
Set the value class for the (final) job output data.

Overrides:
setOutputValueClass in class org.apache.hadoop.mapreduce.Job
Parameters:
theClass - the output value class
Throws:
java.lang.IllegalStateException

getOutputValueClass

public java.lang.Class<?> getOutputValueClass()
Get the output value class. If it is not set, return the default Text class.

Specified by:
getOutputValueClass in interface org.apache.hadoop.mapreduce.JobContext
Overrides:
getOutputValueClass in class org.apache.hadoop.mapreduce.task.JobContextImpl
Returns:
the output value class

setMapOutputKeyDBType

public void setMapOutputKeyDBType(java.lang.String theType)
                           throws java.lang.IllegalStateException
Set the key database type for the map output data.

Parameters:
theType - the map key database type
Throws:
java.lang.IllegalStateException

getMapOutputKeyDBType

public java.lang.String getMapOutputKeyDBType()
Get the map output key database type. If it is not set, return the (final) output key database type.

Returns:
the map output key database type

setMapOutputValueDBType

public void setMapOutputValueDBType(java.lang.String theType)
                             throws java.lang.IllegalStateException
Set the value database type for the map output data.

Parameters:
theType - the map value database type
Throws:
java.lang.IllegalStateException

getMapOutputValueDBType

public java.lang.String getMapOutputValueDBType()
Get the map output value database type. If it is not set, return the (final) output value database type.

Returns:
the map output value database type

setOutputKeyDBType

public void setOutputKeyDBType(java.lang.String theType)
                        throws java.lang.IllegalStateException
Set the key database type for the (final) job output data.

Parameters:
theType - the output key database type
Throws:
java.lang.IllegalStateException

getOutputKeyDBType

public java.lang.String getOutputKeyDBType()
Get the output key database type. If it is not set, return the default NUMBER type.

Returns:
the output key database type

setOutputValueDBType

public void setOutputValueDBType(java.lang.String theType)
                          throws java.lang.IllegalStateException
Set the value database type for the (final) job output data.

Parameters:
theType - the output value database type
Throws:
java.lang.IllegalStateException

getOutputValueDBType

public java.lang.String getOutputValueDBType()
Get the output value database type. If it is not set, return the default VARCHAR2(4000) type.

Returns:
the output value database type

setInputSplitRecordKeyDBType

public void setInputSplitRecordKeyDBType(java.lang.String theType)
                                  throws java.lang.IllegalStateException
Set the key database type for the input split record data.

Parameters:
theType - the input split record key database type
Throws:
java.lang.IllegalStateException

getInputSplitRecordKeyDBType

public java.lang.String getInputSplitRecordKeyDBType()
Get the input split record key database type. If it is not set, return the default NUMBER type.

Returns:
the input split record key database type

setInputSplitRecordValueDBType

public void setInputSplitRecordValueDBType(java.lang.String theType)
                                    throws java.lang.IllegalStateException
Set the value database type for the input split record data.

Parameters:
theType - the input split record value database type
Throws:
java.lang.IllegalStateException

getInputSplitRecordValueDBType

public java.lang.String getInputSplitRecordValueDBType()
Get the input split record value database type. If it is not set, return the default VARCHAR2(4000) type.

Returns:
the input split record value database type

getMapOutputDBType

public java.lang.String getMapOutputDBType()
Get the database object type (row type of the table) for the map output data.

Returns:
the map output database type

getMapOutputDBTypeSet

public java.lang.String getMapOutputDBTypeSet()
Get the database table type for the map output data.

Returns:
the map output database table type

getOutputDBType

public java.lang.String getOutputDBType()
Get the database object type (row type of the table) for the (final) job output data.

Returns:
the (final) output database type

getOutputDBTypeSet

public java.lang.String getOutputDBTypeSet()
Get the database table type for the (final) job output data.

Returns:
the (final) job output database table type

getInputSplitRecordDBType

public java.lang.String getInputSplitRecordDBType()
Get the database object type (row type of the table) for the input split record data.

Returns:
the input split record database type

getInputSplitRecordDBTypeSet

public java.lang.String getInputSplitRecordDBTypeSet()
Get the database table type for the input split record data.

Returns:
the input split record database table type

getInputSplitRecordDBTypes

public java.lang.String[] getInputSplitRecordDBTypes()
For the ReadInputSplits operations, return an array of the database column types in the output record type. When the data source is Hive, the values in this array are initially derived from the Hive input field types obtained from the Hive metadata. When the data source is not Hive, the values in the array reflect those given by getInputSplitRecordKeyDBType and getInputSplitRecordValueDBType. The values in the array may be overridden using the method setInputSplitRecordDBTypes.

Returns:
an array of the database column types in the output record type

setInputSplitRecordDBTypes

public void setInputSplitRecordDBTypes(java.lang.String value)
For the ReadInputSplits operations, set the values in the array returned by getInputSplitRecordDBTypes. The format of the input value argument is a comma separated list of the values that are to be used in the array. The values established by this method are used in the output record type if the data source is Hive. Otherwise, in order to affect the output record type one must use setInputSplitRecordKeyDBType and setInputSplitRecordValueDBType.

Parameters:
value - comma separated list of the values that are used to set the values in the array returned by getInputSplitRecordDBTypes

getInputSplitRecordClasses

public java.lang.Class[] getInputSplitRecordClasses()
For the ReadInputSplits operations, return an array of the Java class types in the output record type. When the data source is Hive, the values in this array are initially derived from the Hive input field types obtained from the Hive metadata. When the data source is not Hive, the values in the array reflect those given by getInputSplitRecordKeyClass and getInputSplitRecordValueClass. The values in the array may be overridden using the method setInputSplitRecordClasses.

Returns:
an array of the java class types in the output record type

setInputSplitRecordClasses

public void setInputSplitRecordClasses(java.lang.String value)
For the ReadInputSplits operations, set the values in the array returned by getInputSplitRecordClasses. The format of the input value argument is a comma separated list of the values that are to be used in the array. The values established by this method are used when converting data to the output record type if the data source is Hive. Otherwise, in order to affect the conversion of output data one must use setInputSplitRecordKeyClass and setInputSplitRecordValueClass.

Parameters:
value - comma separated list of the values that are to be used to set the values in the array returned by getInputSplitRecordClasses

getInputClasses

public java.lang.Class[] getInputClasses()
Returns an array of classes provided for identifying the argument types of the map method. For the GetReadAndMapReduceInputSplits and ReadAndMapReduceInputSplits operations, the map method must be identified so that the fields read from the InputSplits can be passed to it at the Java level. When the method name alone is not sufficient to identify it, this method returns an array of classes identifying the argument types.

Returns:
an array of classes provided for identifying the argument types.

setInputClasses

public void setInputClasses(java.lang.String value)
                     throws java.lang.IllegalStateException,
                            java.lang.ClassNotFoundException
Sets the class names for identifying the argument types of the map method. For the GetReadAndMapReduceInputSplits and ReadAndMapReduceInputSplits operations, the map method must be identified so that the fields read from the InputSplits can be passed to it at the Java level. When the method name alone is not sufficient to identify it, this method sets the class names identifying the argument types. For a multi-value map method, it also identifies the input types of the job.

Parameters:
value - a single string comprised of comma separated class names
Throws:
java.lang.IllegalStateException
java.lang.ClassNotFoundException
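As a rough illustration of the comma-separated form this method accepts, the following self-contained sketch resolves such a string to a Class[] with Class.forName; parseClasses is a hypothetical helper, not the library's implementation:

```java
import java.util.Arrays;

public class InputClassesDemo {
    // Hypothetical sketch: resolve a comma-separated list of class names
    // (the format accepted by setInputClasses) to an array of Class objects.
    static Class<?>[] parseClasses(String value) throws ClassNotFoundException {
        String[] names = value.split(",");
        Class<?>[] classes = new Class<?>[names.length];
        for (int i = 0; i < names.length; i++) {
            classes[i] = Class.forName(names[i].trim());
        }
        return classes;
    }

    public static void main(String[] args) throws Exception {
        // A Job would receive the same single string, e.g.
        // job.setInputClasses("java.lang.Long,java.lang.String");
        System.out.println(Arrays.toString(
                parseClasses("java.lang.Long, java.lang.String")));
    }
}
```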

setMultiValue

public void setMultiValue(boolean value)
Set the multi-value property of the job.

getMapperImpl

public java.lang.String getMapperImpl()
Get the database object type for the Mapper call specification.

Returns:
the name of the Mapper class specification

getReducerImpl

public java.lang.String getReducerImpl()
Get the database object type for the Reducer call specification.

Returns:
the name of the Reducer class specification

setMapMethodName

public void setMapMethodName(java.lang.String methodName)
                      throws java.lang.IllegalStateException
Set the Map method name

Parameters:
methodName - the name of the Map method
Throws:
java.lang.IllegalStateException

getMapMethodName

public java.lang.String getMapMethodName()
Get the Map method name

Returns:
the name of the Map method, defaulting to "map"

getMapMethodSignature

public java.lang.Class[] getMapMethodSignature()
For the GetReadAndMapReduceInputSplits and ReadAndMapReduceInputSplits operations, this method provides for querying the actual argument types of the map method that will be used.

Returns:
array of Class objects for the argument types
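Why argument types are needed to identify the map method can be seen with plain Java reflection; WordMapper below is a hypothetical mapper with an overloaded map method, not part of the API:

```java
import java.lang.reflect.Method;
import java.util.Arrays;

public class MapSignatureDemo {
    // Hypothetical mapper with an overloaded map method; the method name
    // alone does not identify which overload is meant.
    static class WordMapper {
        public void map(Long key, String value) { }
        public void map(String value) { }
    }

    public static void main(String[] args) throws Exception {
        // Supplying the argument classes disambiguates the overload,
        // analogous to setInputClasses("java.lang.Long,java.lang.String").
        Method m = WordMapper.class.getMethod("map", Long.class, String.class);
        System.out.println(Arrays.toString(m.getParameterTypes()));
    }
}
```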

getConfKey

public long getConfKey()
Get the retrieval key for the Hadoop job configuration.

Returns:
the retrieval key

setDataTypeCheck

public void setDataTypeCheck(java.lang.Boolean flag)
                      throws java.lang.IllegalStateException
Set whether data type checks between SQL and Hadoop Writable types for map/reduce input data are performed at the beginning of each stage. The check only verifies, at the top level, whether the SQL data type can be converted to the Hadoop type; it is very basic and might not cover all convertible types.

Parameters:
flag - the flag for data type check
Throws:
java.lang.IllegalStateException

checkDataType

public java.lang.Boolean checkDataType()
Return the flag for data type check.

Returns:
the flag for data type check

resetConfigDBStore

public void resetConfigDBStore()
                        throws java.sql.SQLException
Reset the sequence and table for storing Hadoop job configurations. This method first calls dropConfigDBStore() to drop the corresponding sequence and table, and then calls createConfigDBStore() to create them again. This is a drastic way to reset the configuration storage in the database; a more graceful approach is to rely on a logoff trigger to remove the entries created in the current session from the table.

Throws:
java.sql.SQLException

getUsingHive

public boolean getUsingHive()
                     throws java.lang.Exception
Wrapper over getHiveMetaStoreData that forces reconnection and retrieval of metadata if the properties required for connection are present, and returns whether Hive is in use (i.e., whether those properties are set).

Returns:
value indicating whether Hive is in use
Throws:
java.lang.Exception

initializeLogging

public static void initializeLogging()
Initializes log4j from the oc4hadoop_log4j.properties file loaded as a Java resource.


init

public long init()
          throws java.lang.Exception
Initialize a job by creating the database state required for executing the job. This includes creating the output types, the table functions and storing the configuration in an internal table from which it can be retrieved using a numeric key. The return value from this method is that key. The job must be in the DEFINE state when the init method is called, else an IllegalStateException is thrown. The init method moves the job to the INITIALIZED state. A job is in the DEFINE state when created and can be moved back to that state with the reopen method. The job is initialized for the operation previously specified in a call to the setOperation method, or if no such call has occurred, for the MapReduce operation.

Returns:
the retrieval key
Throws:
java.lang.Exception

init

public long init(Job.Operation operation)
          throws java.lang.Exception
Initialize a job by creating the database state required for executing the job. This includes creating the output types, the table functions and storing the configuration in an internal table from which it can be retrieved using a numeric key. The return value from this method is that key. The job must be in the DEFINE state when the init method is called, else an IllegalStateException is thrown. The init method moves the job to the INITIALIZED state. A job is in the DEFINE state when created and can be moved back to that state with the reopen method. The job is initialized for the operation specified in operation argument.

Parameters:
operation - operation of job
Returns:
the retrieval key
Throws:
java.lang.Exception

run

public void run()
         throws java.lang.Exception
Run the job with the table names as set in the configuration.

Throws:
java.lang.Exception

run

public void run(java.lang.String inTable,
                java.lang.String outTable)
         throws java.lang.Exception
Run the job with the given tables.

Parameters:
inTable - the input table name
outTable - the output table name
Throws:
java.lang.Exception

run

public void run(java.lang.String table)
         throws java.lang.Exception
Runs the job for the GetInputSplits operation.

Parameters:
table - output table
Throws:
java.lang.Exception

waitForCompletion

public boolean waitForCompletion(boolean verbose)
                          throws java.io.IOException
Submit the job and wait for it to finish.

Overrides:
waitForCompletion in class org.apache.hadoop.mapreduce.Job
Parameters:
verbose - print the progress to the user
Returns:
true if the job succeeded
Throws:
java.io.IOException - thrown if the Job execution fails

setTypeMap

public void setTypeMap(java.util.Map<java.lang.String,java.lang.Class<?>> map)
Map a user-defined type (UDT) defined in SQL to the Java class to be used as the Hadoop key and value. This mapping needs to be defined when a custom Writable is used for a Hadoop key or value and a corresponding SQL object type has been defined. The user-defined classes specified as Mapper or Reducer input/output are mapped automatically by the framework; this method should be used to map other classes that require conversion (e.g., a field inside a custom Writable class that also implements the SQLData interface).

Parameters:
map - the java.util.Map&lt;String, Class&lt;?&gt;&gt; object that maps the SQL object type name to the Java class implementing the java.sql.SQLData interface
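A minimal sketch of building such a map; SCOTT.POINT_T and the PointT class are hypothetical examples of a SQL object type and its SQLData implementation:

```java
import java.sql.SQLData;
import java.sql.SQLException;
import java.sql.SQLInput;
import java.sql.SQLOutput;
import java.util.HashMap;
import java.util.Map;

public class TypeMapDemo {
    // Hypothetical Java class for a SQL object type SCOTT.POINT_T;
    // it implements java.sql.SQLData so it can be converted to and
    // from the SQL object type.
    static class PointT implements SQLData {
        double x, y;
        public String getSQLTypeName() { return "SCOTT.POINT_T"; }
        public void readSQL(SQLInput in, String typeName) throws SQLException {
            x = in.readDouble();
            y = in.readDouble();
        }
        public void writeSQL(SQLOutput out) throws SQLException {
            out.writeDouble(x);
            out.writeDouble(y);
        }
    }

    public static void main(String[] args) {
        Map<String, Class<?>> typeMap = new HashMap<>();
        typeMap.put("SCOTT.POINT_T", PointT.class);
        // A Job instance would then register it: job.setTypeMap(typeMap);
        System.out.println(typeMap.get("SCOTT.POINT_T").getSimpleName());
    }
}
```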

Copyright © 2014, Oracle and/or its affiliates. All rights reserved.