Two sets of configurable properties control the behavior of the Deployment Template fault tolerance mechanism and the frequency of status checks for components.
You can now configure fault tolerance (i.e., retries) for any component (such as Forge, Dgidx, and Dgraph) when invoked through the EAC. This functionality also extends to the CAS server when running a crawl with the CAS component. The name of the fault-tolerance property is maxMissedStatusQueriesAllowed.
When a component is run, the Deployment Template instructs the EAC to start it, then polls at a regular interval to check whether the component is running, stopped, or failed. If one of these status checks itself fails, the Deployment Template would otherwise assume that the component has failed and end the script. The maxMissedStatusQueriesAllowed property sets how many consecutive failed status checks are tolerated before the script ends.
<forge id="Forge" host-id="ITLHost">
  <properties>
    <property name="numStateBackups" value="10"/>
    <property name="numLogBackups" value="10"/>
    <property name="maxMissedStatusQueriesAllowed" value="10"/>
  </properties>
  ...
</forge>
The default number of allowed consecutive failures is 5. Note that only consecutive failures are counted: every time a status query returns successfully, the counter is reset to zero.
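The counter behavior described above can be illustrated with a minimal sketch. This is not the Deployment Template's own code: the class name MissedStatusCounter and the method firstFatalQuery are hypothetical, and the sketch assumes that the script ends only when the number of consecutive failures exceeds the allowed value.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not part of the Deployment Template or the EAC Toolkit.
class MissedStatusCounter {

    /**
     * Walks a sequence of status query results (true = query succeeded,
     * false = query failed) and returns the index of the query at which
     * the script would give up, or -1 if it never gives up.
     * Assumption: the script ends once consecutive failures exceed maxMissedAllowed.
     */
    static int firstFatalQuery(List<Boolean> queryResults, int maxMissedAllowed) {
        int consecutiveMisses = 0;
        for (int i = 0; i < queryResults.size(); i++) {
            if (queryResults.get(i)) {
                // A successful status query resets the counter to zero.
                consecutiveMisses = 0;
            } else {
                consecutiveMisses++;
                if (consecutiveMisses > maxMissedAllowed) {
                    return i;
                }
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        List<Boolean> results =
            Arrays.asList(true, false, false, true, false, false, false);
        // With 2 misses allowed, the third consecutive failure is fatal.
        System.out.println(firstFatalQuery(results, 2));
        // With the default of 5, this sequence is tolerated.
        System.out.println(firstFatalQuery(results, 5));
    }
}
```

In this sketch, the two failures before the successful query at position 3 do not count toward the later run of failures, which is the effect of resetting the counter.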
Keep in mind that you can use different fault-tolerance settings for your components. For example, you could set a value of 10 for the Forge component, a value of 8 for Dgidx, and a value of 6 for the Dgraph.
As described in the previous section, the Deployment Template polls at a regular interval to check whether a started component is running, stopped, or failed. A set of four properties is available to configure, for each component, how frequently the Deployment Template polls for status while the component is running. Because each property has a default value, you need to set only those properties that are important to you.
<forge id="Forge" host-id="ITLHost">
  <properties>
    <property name="numStateBackups" value="10"/>
    <property name="numLogBackups" value="10"/>
    <property name="standardPollingIntervalMs" value="60000"/>
    <property name="slowPollingIntervalMs" value="600000"/>
    <property name="minWaitSeconds" value="28800"/>
    <property name="maxMissedStatusQueriesAllowed" value="10"/>
  </properties>
  ...
</forge>
The result of this configuration would be that for the first 8 hours (minWaitSeconds=28800), Forge’s status would be checked every 10 minutes (slowPollingIntervalMs=600000), after which time the status would be checked every minute (standardPollingIntervalMs=60000). If a status check fails, a maximum of 10 consecutive retries (maxMissedStatusQueriesAllowed=10) will be attempted, based on the standardPollingIntervalMs setting.
Keep in mind that these values can be set independently for each component.
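The schedule that results from the Forge example above can be sketched as follows. This is only an illustration of the arithmetic: the class PollingSchedule and its methods are hypothetical names, and the sketch assumes that the switch to the standard interval happens exactly when the elapsed time reaches minWaitSeconds.

```java
// Hypothetical illustration only; not part of the Deployment Template or the EAC Toolkit.
class PollingSchedule {

    /**
     * Returns the polling interval (in milliseconds) that applies after the
     * component has been running for elapsedSeconds.
     * Assumption: the slow interval applies strictly before minWaitSeconds.
     */
    static long intervalMs(long elapsedSeconds, long minWaitSeconds,
                           long slowPollingIntervalMs, long standardPollingIntervalMs) {
        return elapsedSeconds < minWaitSeconds
            ? slowPollingIntervalMs
            : standardPollingIntervalMs;
    }

    /** Number of slow-interval polls that fit into the initial wait period. */
    static long slowPollCount(long minWaitSeconds, long slowPollingIntervalMs) {
        return (minWaitSeconds * 1000L) / slowPollingIntervalMs;
    }

    public static void main(String[] args) {
        // Values taken from the Forge example: 8 hours, 10 minutes, 1 minute.
        long minWaitSeconds = 28800L;
        long slowMs = 600000L;
        long standardMs = 60000L;

        // One hour into the run: still in the slow phase.
        System.out.println(intervalMs(3600L, minWaitSeconds, slowMs, standardMs));
        // Nine hours into the run: standard polling applies.
        System.out.println(intervalMs(32400L, minWaitSeconds, slowMs, standardMs));
        // 28800 s * 1000 / 600000 ms = 48 slow polls in the first 8 hours.
        System.out.println(slowPollCount(minWaitSeconds, slowMs));
    }
}
```

With these values, Forge would be polled about 48 times during the first 8 hours and once a minute thereafter.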
If you do not call any of these setter methods, the utility uses the default values listed in the two previous sections.
// create the target dir, if it doesn't already exist
mkDirUtil = new CreateDirUtility(CAS.getAppName(),
    CAS.getEacHost(), CAS.getEacPort(), CAS.isSslEnabled());
mkDirUtil.init(Forge.getHostId(), destDir, CAS.getWorkingDir());
mkDirUtil.run();
// create the target dir, if it doesn't already exist
mkDirUtil = new CreateDirUtility(CAS.getAppName(),
    CAS.getEacHost(), CAS.getEacPort(), CAS.isSslEnabled());
mkDirUtil.init(Forge.getHostId(), destDir, CAS.getWorkingDir());
mkDirUtil.setMinWaitSeconds(30);
mkDirUtil.setMaxWaitSeconds(120);
mkDirUtil.setMaxMissedStatusQueriesAllowed(10);
mkDirUtil.setPollingIntervalMs(5000);
mkDirUtil.setSlowPollingIntervalMs(30000);
mkDirUtil.run();
<copy id="MyCopy" src-host-id="ITLHost" dest-host-id="MDEXHost" recursive="true">
  <src>./path/to/files</src>
  <dest>./path/to/target</dest>
</copy>
MyCopy.setMaxMissedStatusQueriesAllowed(10);
MyCopy.run();
For more information on the Utility methods, see the Javadocs for the EAC Toolkit package.