Wednesday 14 April 2010

Array jobs and ordered arrays

The current multiple job interface - the array job semantics - was pulled from SGE/Torque, without a lot of reflection.

This is not necessarily a bad thing, given that it matched existing interfaces and provided functionality that users needed. However, it offers only a single model of sub-job interaction: the trivially parallel case.

Within the home context of qsub, this is not unreasonable. If there are other dependency models, they can be handled separately, with low overhead. For example, an initial calculation needed by subsequent jobs would be run first, a final gather step run afterwards, etc. In general, if a user had complex job chaining needs, they could always get qsub installed on the cluster, and have jobs launch jobs. That's not available from Grid contexts, and the latency on jobs makes doing initial/final jobs more painful.

So, I'm thinking about doing four specific modes of operation for array jobs. Firstly, the current behaviour, which will remain the default: bag of tasks (unordered). The next two are slight changes on that: first-first and last-last. These guarantee, respectively, that the first task will run to completion before any other task starts, or that the final task won't be started until all the other tasks have completed. This would typically be used with the environment variables, so the same script will have all the job control built in, and select the behaviour at run time. Clearly, first-first and last-last could both be used at the same time.
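As a rough sketch of what "all the job control built in" might look like, here is a single script that picks its role from its position in the array. The variable names TASK_INDEX, TASK_FIRST and TASK_LAST are placeholders I've made up for illustration; the real names depend on the submission system, and the role names are arbitrary too.

```python
import os


def select_role(index, first, last):
    """Decide what this task should do from its position in the array."""
    if index == first:
        # With first-first, this task finishes before any other task starts.
        return "setup"
    if index == last:
        # With last-last, this task starts only after all the others finish.
        return "gather"
    return "work"


if __name__ == "__main__":
    # Placeholder variable names: substitute whatever the batch system exports.
    index = int(os.environ.get("TASK_INDEX", "1"))
    first = int(os.environ.get("TASK_FIRST", "1"))
    last = int(os.environ.get("TASK_LAST", "1"))
    print(select_role(index, first, last))
```

With both first-first and last-last switched on, the "setup" branch can safely write files that every "work" task reads, and the "gather" branch can assume all the outputs exist.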

The final sequencing mode would be strict-ordering. Each task is performed in order, and each task will complete before the next one is started. This allows any task to refer to the results of previous tasks. This also implies first-first and last-last, as a side effect.

I can compile these sequencing modes down to the WMS's DAG jobs - as they are clearly a subset of the Directed Acyclic Graph of tasks. My gut feeling is that these modes will give the most common sequencing types that are used, without having to handle complex specifications - although I think I need to see if I can dig out some data on that.
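To make the "compile down to a DAG" idea concrete, here is a minimal sketch that turns the sequencing modes into dependency edges between tasks numbered 1 to n. The function name and flags are hypothetical, and this is not the WMS's own job description format; it only shows which parent/child pairs each mode would need.

```python
def dag_edges(n, first_first=False, last_last=False, strict=False):
    """Return sorted (parent, child) pairs for tasks numbered 1..n.

    A pair (a, b) means task b may not start until task a has completed.
    """
    edges = set()
    if strict:
        # Strict ordering: a simple chain 1 -> 2 -> ... -> n.
        edges.update((k, k + 1) for k in range(1, n))
    if first_first:
        # Task 1 must complete before any other task starts.
        edges.update((1, k) for k in range(2, n + 1))
    if last_last:
        # Task n starts only after every other task has completed.
        edges.update((k, n) for k in range(1, n))
    return sorted(edges)


if __name__ == "__main__":
    print(dag_edges(4))                                    # bag of tasks
    print(dag_edges(4, first_first=True, last_last=True))  # both together
    print(dag_edges(4, strict=True))                       # strict chain
```

The bag of tasks has no edges at all, and the strict chain already orders task 1 before everything and task n after everything by transitivity, which is why it gives first-first and last-last for free.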

At the moment, I'm tagging these job ordering features as unscheduled - so many ideas, working on them ordered by user feedback. If you'd use these ordered sub-tasks, let me know, and I'll bump it up the list!
