Events and event issues
Things we want the event system to do, and some of the issues we need to address (transcript of pictures).
Model of event communication:
"server" side:
(e.g. swim.gat.com)
- web portal (displays messages from db)
- MySQL db (messages get stored here)
- runID service
- message service (AKA fgm) (gets messages to post on portal)
"client" side:
(e.g. mhd@pppl.gov)
- IPS framework (and associated physics codes, etc.) - gets runID from runID service on server side
- event service from IPS framework generates messages and sends them to the message service (fgm) on the server side
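The flow above can be sketched in-process as follows. This is only an illustration of who talks to whom, not the actual services: every class and method name here (Database, RunIDService, MessageService, Portal, EventService, new_run_id, post, render, send) is hypothetical, a Python list stands in for the MySQL db, and a simple counter stands in for however the real runID service allocates IDs.

```python
import itertools
from dataclasses import dataclass, field


@dataclass
class Database:
    """Stand-in for the MySQL db: just a list of stored messages."""
    messages: list = field(default_factory=list)


class RunIDService:
    """Server side: hands out unique run IDs (assumed counter scheme)."""

    def __init__(self):
        self._counter = itertools.count(1)

    def new_run_id(self):
        return next(self._counter)


class MessageService:
    """Server side (fgm): receives messages and stores them in the db."""

    def __init__(self, db):
        self._db = db

    def post(self, run_id, text):
        self._db.messages.append({"run_id": run_id, "text": text})


class Portal:
    """Server side: displays the messages found in the db."""

    def __init__(self, db):
        self._db = db

    def render(self, run_id):
        return [m["text"] for m in self._db.messages if m["run_id"] == run_id]


class EventService:
    """Client side: the framework's event service, forwarding to fgm."""

    def __init__(self, run_id_service, message_service):
        # The framework first obtains its runID from the server side.
        self.run_id = run_id_service.new_run_id()
        self._fgm = message_service

    def send(self, text):
        self._fgm.post(self.run_id, text)


# Server side
db = Database()
run_ids = RunIDService()
fgm = MessageService(db)
portal = Portal(db)

# Client side
events = EventService(run_ids, fgm)
events.send("job start")
events.send("job end")
print(portal.render(events.run_id))
```

In a real deployment the two sides are on different hosts, so the calls to new_run_id and post would be remote requests rather than method calls.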
------------------------------
Things that we want to express in events to portal:
* Job start/end/submit
* component start/end
  - method start/end
* exceptions/faults (levels of severity)
* file movement/creation
* time consuming operations
* links to log file (most useful)
* security issues
* physics timestep
* metadata
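One possible way to carry the event kinds listed above in a single record is sketched here. The names (EventType, Severity, Event, to_json) and the three severity levels are assumptions made for illustration; the notes do not fix a schema or a wire format, and JSON is used only because it is easy to inspect.

```python
import json
from dataclasses import dataclass, field
from enum import Enum


class EventType(Enum):
    """One member per kind of event in the list above (names are assumed)."""
    JOB_SUBMIT = "job_submit"
    JOB_START = "job_start"
    JOB_END = "job_end"
    COMPONENT_START = "component_start"
    COMPONENT_END = "component_end"
    METHOD_START = "method_start"
    METHOD_END = "method_end"
    FAULT = "fault"
    FILE_OP = "file_op"
    LONG_OPERATION = "long_operation"
    LOG_LINK = "log_link"
    SECURITY = "security"
    TIMESTEP = "timestep"
    METADATA = "metadata"


class Severity(Enum):
    """Assumed severity levels for exceptions/faults."""
    INFO = "info"
    WARNING = "warning"
    ERROR = "error"


@dataclass
class Event:
    run_id: int
    type: EventType
    severity: Severity = Severity.INFO
    payload: dict = field(default_factory=dict)  # free-form details

    def to_json(self):
        """Serialize to a JSON string suitable for sending to the portal."""
        return json.dumps(
            {
                "run_id": self.run_id,
                "type": self.type.value,
                "severity": self.severity.value,
                "payload": self.payload,
            },
            sort_keys=True,
        )


# Example: a fault event that also carries a link to the log file.
fault = Event(
    run_id=1,
    type=EventType.FAULT,
    severity=Severity.ERROR,
    payload={"component": "solver", "log": "logs/solver.log"},
)
print(fault.to_json())
```

Keeping the details in a free-form payload means new kinds of information (physics timestep, metadata) can be added without changing the record layout.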
--------------------------------
event/notification/pubsub system desiderata:
1. Wide(r) community support (best: wider than just the sci comp community).
2. Easy to use from Python. Secondarily, usable directly from C/C++/Fortran.
3. Does not require Java (but a plus if it also supports clients in Java).
4. Supports multiple clients/listeners. Secondarily, multiple publishers as well.
5. Remote communication to/from portal or database, ...
6. Robust, easy, quick to start/stop.
7. Security provided but not onerous. Secondarily, can be used without security enabled.
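To make item 4 concrete, here is a minimal in-process publish/subscribe sketch with several listeners on one topic. It is not a candidate system and does nothing about remote communication or security; EventBus, subscribe, publish, and the topic name "run.events" are all hypothetical.

```python
from collections import defaultdict


class EventBus:
    """Toy topic-based pub/sub: any number of listeners per topic."""

    def __init__(self):
        self._listeners = defaultdict(list)

    def subscribe(self, topic, callback):
        """Register a callable to be invoked for each message on topic."""
        self._listeners[topic].append(callback)

    def publish(self, topic, message):
        """Deliver message to every listener; return how many received it."""
        listeners = list(self._listeners[topic])
        for callback in listeners:
            callback(message)
        return len(listeners)


bus = EventBus()

# Two independent listeners, e.g. a portal feed and a db writer.
portal_feed, db_rows = [], []
bus.subscribe("run.events", portal_feed.append)
bus.subscribe("run.events", db_rows.append)

# Any code holding the bus can publish, so multiple publishers come for free.
bus.publish("run.events", "component start")
bus.publish("run.events", "component end")
print(portal_feed, db_rows)
```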
-----------------------------
Nomenclature Discussion:
* tag - additional info to identify the run (will be in config file and propagated to portal).
* runID - unique ID for a single instance of a run
* output prefix - differentiates between runs in a single tree (the string gets prepended to all file names within the tree) - <output prefix> + <filename>
* <<<ELIMINATED>>> IPS runID - another descriptor, but redundant
* simulation name - (root) name of top level folder of a run
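A small sketch of how these names could fit together, following the <output prefix> + <filename> rule above. RunNames, output_path, and the sample values are hypothetical, and placing the files directly under the simulation-name folder is an assumption about the tree layout.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class RunNames:
    sim_name: str        # (root) name of the top level folder of the run
    run_id: int          # unique ID for this instance of the run
    output_prefix: str   # distinguishes runs that share one tree
    tag: str = ""        # extra identifying info, propagated to the portal

    def output_path(self, filename):
        """Return <sim_name>/<output prefix><filename>."""
        return PurePosixPath(self.sim_name) / (self.output_prefix + filename)


names = RunNames(sim_name="my_sim", run_id=42, output_prefix="caseA_", tag="scan 1")
print(names.output_path("state.dat"))
```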