fstagni edited this page Jun 12, 2015 · 20 revisions

The main point of this version is the introduction of a new type of pilot which, for the most part, implements the points discussed in https://github.com/DIRACGrid/DIRAC/wiki/Pilots-2.0:-generic,-configurable-pilots. These changes will be transparent to VOs. Several changes to the Data Management system have also been made.

Changes for the pilot

If your VO only uses Grid resources, its pilots are only sent by SiteDirector and TaskQueueDirector agents, and you don't plan to have any specific pilot behaviour, you can skip the rest of this section: you will not notice any difference between the new pilot and the old one.

If instead you want, for example, to install DIRAC in a different way, or you want your pilot to perform some VO-specific action, you should carefully read RFC 18 and what follows. You should also keep reading if your resources include IAAS and IAAC types of resources, such as Virtual Machines.

The files to consider are in https://github.com/DIRACGrid/DIRAC/tree/rel-v6r12/WorkloadManagementSystem/PilotAgent. The main file to look at is https://github.com/DIRACGrid/DIRAC/blob/rel-v6r12/WorkloadManagementSystem/PilotAgent/dirac-pilot.py, which also contains a good explanation of how the system works.

The system works with "commands", as explained in the RFC. Any command can be added. If your command is executed before the "InstallDIRAC" command, be aware that DIRAC functionality will not be available to it.

We have introduced a special command named "GetPilotVersion" in https://github.com/DIRACGrid/DIRAC/blob/rel-v6r12/WorkloadManagementSystem/PilotAgent/pilotCommands.py that you should use, and possibly extend, if you want to send or start pilots that don't know beforehand which (VO)DIRAC version they are going to install. In this case, you have to provide a freely accessible JSON file that contains the pilot version. This is typically the case for VMs in IAAS and IAAC.
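
As a rough illustration of the idea, the sketch below extracts a pilot version from a JSON document. The JSON layout (a top-level mapping from a setup name to a "Version" entry), the function name pilot_version_from_json, the setup name and the version string are all assumptions made for this example; check pilotCommands.py for the format that GetPilotVersion actually expects.

```python
import json


def pilot_version_from_json(text, setup):
    """Return the version listed for `setup` in a JSON document.

    Assumed (hypothetical) layout: {"<setup>": {"Version": "<version>"}}
    """
    content = json.loads(text)
    try:
        return content[setup]["Version"]
    except KeyError:
        raise ValueError("no pilot version defined for setup %r" % setup)


# Illustrative content only: setup name and version are made up.
example = '{"MyVO-Production": {"Version": "v6r12p1"}}'
print(pilot_version_from_json(example, "MyVO-Production"))
```

In a real pilot the JSON text would first be fetched from the freely accessible location mentioned above; that step is left out here to keep the sketch self-contained.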

Beware that sending pilots containing a specific list of commands via SiteDirector agents requires a SiteDirector extension.

Changes in the WMS

There are a number of changes in the DB classes of the WMS, especially in JobDB. These changes are not strictly necessary, and were actually introduced in patch release v6r11p14, together with some changes at the code level, with PR https://github.com/DIRACGrid/DIRAC/pull/2093. They are nevertheless highly recommended, and should make your DB react faster and be more reliable. We recommend that you update the MySQL schema to match what is in the py and JDL DB files in DIRAC.WorkloadManagementSystem.DB, starting from JobDB. Note: some tables have been dropped; if your extension needs them, please open a GitHub issue.

It is now possible to set the delay at which jobs in final states are removed from the WMS, via the JobCleaningAgent CS parameters RemoveStatusDelay/Done, RemoveStatusDelay/Killed, RemoveStatusDelay/Failed (default is 7 days).
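
The effect of these parameters can be sketched as follows. This is not the JobCleaningAgent code: the function is_removable, the use of a "last update" timestamp, and the choice of >= for the comparison are assumptions made for illustration. Only the three status names and the 7-day default come from the text above.

```python
from datetime import datetime, timedelta

FINAL_STATES = ("Done", "Killed", "Failed")
DEFAULT_DELAY_DAYS = 7  # default stated in the text above


def is_removable(status, last_update, now, delays=None):
    """Tell whether a job in a final state is older than its removal delay.

    `delays` maps a status name to a delay in days, mirroring the
    RemoveStatusDelay/<Status> options; statuses not listed use the default.
    """
    if status not in FINAL_STATES:
        return False
    days = (delays or {}).get(status, DEFAULT_DELAY_DAYS)
    return now - last_update >= timedelta(days=days)


now = datetime(2015, 6, 12)
# 11 days old with the default 7-day delay: removable.
print(is_removable("Done", datetime(2015, 6, 1), now))
# Same age, but Failed jobs are kept for 30 days here: not removable.
print(is_removable("Failed", datetime(2015, 6, 1), now, {"Failed": 30}))
```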

Changes for the DFC (DIRAC File Catalog)

As visible in https://github.com/DIRACGrid/DIRAC/pull/1983, some fixes and improvements to the DFC require the tables to use the INNODB engine. It is thus necessary to update your DB so that all the tables use that engine (ALTER TABLE myTable ENGINE = INNODB;).
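
If many tables need converting, the statements can be generated rather than typed by hand. The following is a minimal sketch that only builds the SQL text; the helper name innodb_statements and the table names are hypothetical, and the real list of tables should be taken from your own DFC database.

```python
def innodb_statements(tables):
    """Build one ALTER TABLE statement per table name."""
    return ["ALTER TABLE %s ENGINE = INNODB;" % name for name in tables]


# Hypothetical table names, for illustration only.
for statement in innodb_statements(["myTable", "myOtherTable"]):
    print(statement)
```

The output can be reviewed and then fed to the MySQL client. Generating the text first, instead of executing it directly, leaves room to check it before touching the database.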

See also these notes: https://github.com/DIRACGrid/DIRAC/wiki/DIRAC-v6r12p14

Changes for ResourceManagementDB (Resource Status System)

As committed in https://github.com/DIRACGrid/DIRAC/pull/1950, there is a new field in the DowntimeCache table: 'GOCDBServiceType' : 'VARCHAR(32) NOT NULL'

Changes for SystemLoggingDB (Framework)

As committed in https://github.com/fstagni/DIRAC/commit/23c0c741014e0589fcdbd6ba17dabfdf3558c8e4, the column SystemLoggingDB.FixedTextMessages.FixedTextString has been changed to VARCHAR(767).

Multi-DB accounting

Since v6r12 each accounting type can be stored in a different DB. By default, the data of all accounting types is stored in the database defined under /Systems/Accounting/Instance/Databases/AccountingDB. To store the data of one type (say WMSHistory) in a different database, define that database's location under the Databases directory. Then define /Systems/Accounting/Instance/Databases/MultiDB and set an option whose name is the type name and whose value points to the database to use. For instance:

Systems
{
  Accounting
  {
    Development
    {
      Databases
      {
        AccountingDB
        {
          Host = localhost
          User = dirac
          Password = dirac
          DBName = accounting
        }
        Acc2
        {
          Host = somewhere.internet.net
          User = dirac
          Password = dirac
          DBName = infernus
        }
        MultiDB
        {
          WMSHistory = Acc2
        }
      }
    }
  }
}

With the previous configuration, all accounting data will be stored in and retrieved from the usual database, except for the WMSHistory type, which will be stored in and retrieved from the Acc2 database.
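
The lookup rule described here can be sketched in a few lines. This is an illustration of the rule only, not DIRAC's implementation: the function database_for_type is hypothetical, and "Job" merely stands in for any accounting type that has no entry in MultiDB.

```python
def database_for_type(type_name, multi_db, default="AccountingDB"):
    """Return the database section to use for an accounting type.

    `multi_db` mirrors the MultiDB section: type name -> database name.
    Types without an entry fall back to the default database.
    """
    return multi_db.get(type_name, default)


multi_db = {"WMSHistory": "Acc2"}  # same mapping as in the example above
print(database_for_type("WMSHistory", multi_db))
print(database_for_type("Job", multi_db))
```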

Changes in the configuration of RequestExecutingAgent (RequestManagementSystem)

There is a new RequestOperation to be added, so in the list of OperationHandlers found in the CS you should add:

SetFileStatus
{
  Location = DIRAC/TransformationSystem/Agent/RequestOperations/SetFileStatus
  MaxAttempts = 256
}

Changes in the run configuration of a few agents

The following agents/executors should be run with CSEC_MECH=ID (add to "run" file):

  • All Optimizers
  • All WorkflowTask agents
  • TransformationAgent
  • ValidateOutputDataAgent

This is required after the changes made in https://github.com/DIRACGrid/DIRAC/pull/2199.

Changes for the Transformation System WorkflowTaskAgent and RequestTaskAgent

If these agents submit tasks, they need a shifterProxy option defined in the CS.
