Instructions for running the DAQ

Responsible for this sheet: Vladimir Rybnikov. Last update: January 30th, 2003. Further information: DAQ WWW page.
Some internals of the DAQ system can be found in "HERA-B Run and Slow Control Tour" by Vladimir Rybnikov.

What to do if the expert is not reachable?     Contact your favorite DAQ expert: see expert shift list.




How to start a DAQ run setup.

 The DAQ should be run from the DAQ terminal (hb-cr10) closest to the shift crew terminals.
 

This chapter contains instructions to set up all processes required to take data with the complete HERA-B detector or parts of it. In this documentation, process names are printed in bold; button labels on graphical user interfaces and names of states each have their own distinct style.

prcmanag is responsible for setting up the processes that are common to all ongoing runs. There should always be a main prcmanag running on hbdaq0.

This process is launched automatically when crcui is started. During running, the number of prcmanag processes should increase depending on the run type. The abnormal termination of one of them implies a crash in one of its child processes (see Trouble shooting). In this case a list of process-kill messages will appear in the error logger window.

The user interface is called crcui and can be found in the usual online path.

Steps to create a run
 
 
THESE ARE THE INSTRUCTIONS TO BOOT ALL PROCESSES IN THE RUN.
IF THE SYSTEM IS ALREADY RUNNING, PLEASE FOLLOW THE INSTRUCTIONS IN Change RUN.
 

 

  1. Log on to hbshift on hb-cr10 (the password is noted in the paper logbook).
  2. Start crcui from the button bar at the upper left corner of the display (RUN_control).
  3. Press new run in the window that pops up.
  4. Select the run configuration "switch".
  5. After selecting the detector and run components (a red button means selected), press OK and confirm the pop-up.
  6. Kill the yellow Error-Logger window with Ctrl-C if it is there.
  7. Select the run type in the RUN cfg menu. Every option stands for a different Second Level Trigger code. The configurations for the different running conditions are shown below.
  8. FOR PRETRIGGER RUNS ---> Select FLT (should turn red) in the Run conf menu on the control panel.
  9. Press MENU in the Process Control section of the control panel (a pop-up will appear).
  10. Press Create on the pop-up to create processes.
  11. Wait until "state:" on the control panel is INITIALIZED (and watch the two error loggers: the white one is global and runs permanently on the hb-cr10 DAQ node; the yellow one is specific to the run).
  12. The "Trigger Monitor" will appear to monitor and control rates for non-FLT runs.
  13. A green "DAQ Monitor" window should appear to monitor the data stream just before logging.
  14. Press PAUSE in the control panel and you have a new run. The state should change to STANDBY.
  15. Please be patient; the transition can take a few minutes.
  16. Enable FLT triggers for PHYSICS RUNS. (Skip this step in other cases and go on to the next one.)
  17. Go to RUN by pressing the START button. You are taking data now!
  18. From now on you can handle new runs following the instructions in Change RUN.
  19. The FED error monitor provides the state of the subdetector readout online. It will generate alarms in case of severe errors.
  20. Normally there is a predefined value for the trigger rate. To change it, go to "Trigger Generator" and type the one you wish.

RUN TYPE CONFIGURATION.

RUN MODE      TYPE         FLT BIT  FLT  RUN type                          Random Trigger  Test Trigger
PRETRIGGER    PHYSICS      ON       ON   /Production/Physics/Archive/F     ON(4Hz)         OFF
MINIMUM BIAS  BX           OFF      OFF  /Production/BX/Archive/           ON              ON
CALIBRATION   CALIBRATION  OFF      OFF  /Production/Calibration/Archive/  ON              ON/OFF
INTERACTION   INTERACTION  OFF      OFF  /Production/Interaction/Archive/  ON              ON

Three logging configurations are available:

LOGGING MODE  Logging path    Shared Memory partition  ROBOT copy  Comments
ARCHIVE       /hb/data/REAL/  daq                      YES         PHYSICS RUNS
NONE          NONE            daq                      NO          DEBUGGING MODE
DISK          /hb/data/TEST/  daq                      NO          TEMPORARY DATA STORAGE



Change RUN number

Remember that the run configuration will remain the same. Changing the run number is typically needed in these cases:
  • Changes in any parameter external to the DAQ (e.g. target rate).
  • A detector in the run stops for a while (gas change, ...).

HOW TO CHANGE A RUN
    1. Go to the control panel running on the DAQ terminal (hb-cr10).
    2. Go to READY by pressing the STOP button.
    3. Disable FLT triggers if running. (Skip this step in other cases and go on to the next one.)
    4. Go back to STANDBY by pressing the PAUSE button.
    5. Enable FLT triggers for PHYSICS RUNS.
    6. Go back to RUN by pressing the START button.
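
The button sequence above can be read as a small state machine. The following is a minimal Python sketch of it, assuming only the transitions named on this page (PAUSE from INITIALIZED or READY to STANDBY, START from STANDBY to RUN, STOP from RUN to READY). The names `TRANSITIONS`, `press` and `change_run` are hypothetical and not part of crcui.

```python
# Transitions named on this page only; the real crcui state machine may have more.
TRANSITIONS = {
    ("INITIALIZED", "PAUSE"): "STANDBY",
    ("READY", "PAUSE"): "STANDBY",
    ("STANDBY", "START"): "RUN",
    ("RUN", "STOP"): "READY",
}


def press(state, button):
    """Return the state reached by pressing `button` in `state` (hypothetical helper)."""
    try:
        return TRANSITIONS[(state, button)]
    except KeyError:
        raise ValueError(f"{button} is not described for state {state}") from None


def change_run(state="RUN"):
    """Replay the 'change RUN' button sequence: STOP, PAUSE, START."""
    for button in ("STOP", "PAUSE", "START"):
        state = press(state, button)
    return state
```

Calling `change_run()` from RUN walks through READY and STANDBY and ends in RUN again; enabling and disabling the FLT triggers happens outside this sketch.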



    Steps to terminate run processes (KILL ALL PROCESSES)
    This is needed only in case of an error or a change of the RUN configuration.
    1. Disable the trigger in the "Trigger Generator".
    2. Disable the FLT trigger (see the FLT documentation).
    3. Write down the number of events and run remarks in the RUN checklist.
    4. VERY IMPORTANT: change the state to READY by pressing the STOP button in the run control. Wait until the state in the run control changes to READY.
    5. Press MENU on the control panel; a pop-up will appear.
    6. Press Terminate on the pop-up.
    Further documentation on standalone running

    How to Select a run configuration.

     Select the RUN configuration.

    The SLT process type selects the SLT code to be booted on the SLT farm. This means that a change does not take effect until the run processes are killed and restarted.
  • Physics -> triggered events.
  • Calibration -> ECAL calibration run.
  • Interaction -> event assembly with no trigger.
  • BX -> not enabled.
  • TEST -> not enabled.

    Logging:
  • Archive -> data saved in /hb/data/REAL/... and copied by the robot to tape.
  • None -> data not saved to disk.
  • Disk -> data saved in /hb/data/TEST/.... and not copied by the robot to tape.

    The different Trigger options are:
    Further Documentation

    How to Start/Pause/Stop an already setup run

    The HERA-B Run Control Panel (crcui) is the top-level run control for the time being. It contains the following important user controls:
    The create and terminate run buttons allow the setup and removal of the processes required to run. They should only be operated when a data taking period is starting or after the likely event of a system crash. In the latter case, start from "cleanall switch" in the run setup instructions.


    How to select a RUN type.

    To select the RUN type, go to Run Cfg in the panel and select the Type menu.

    The SLT process type selects the SLT code to be booted on the SLT farm. This means that a change does not take effect until the run processes are killed and restarted.

  • Physics -> triggered events.
  • Calibration -> ECAL calibration run.
  • Interaction -> event assembly with no trigger.
  • BX -> not enabled.
  • TEST -> not enabled.

    Logging:
  • Archive -> data saved in /hb/data/REAL/... and copied by the robot to tape.
  • None -> data not saved to disk.
  • Disk -> data saved in /hb/data/TEST/.... and not copied by the robot to tape.

    The different Trigger options are:
    Further Documentation

    Trigger Generator Control Panel

    The Trigger Generator Control Panel lets you control the random trigger and displays the data taking rate. The controls are only needed for non-standard interventions.
     


     

    The controls of the Trigger Generator Control Panel are:

    The  two monitors show the status of the DAQ rate:



    Fast Control System counters monitor.

    The FCS monitor displays the rates of events that are rejected for different reasons. It is also an easy way of displaying the actual settings of the FCS mother and daughters. This monitor normally runs on the upper terminal attached to hb-cr10.

     
     


    TRIGGER MASK

    The definition of the trigger masks as they are generated at the Fast Control System:

    TRIGGER MASK   HARDWARE MEANING   COMMENTS
    0 0000 0001    LEMO 1             ECAL internal LED calibration
    0 0000 0010    LEMO 2             Front panel trigger for test
    0 0000 0100    LEMO 3             Front panel trigger for test
    0 0000 1000    LEMO 4             Front panel trigger for test
    0 0001 0000    VME                Standalone trigger
    0 0010 0000    RANDOM             Random trigger
    0 0100 0000    FLT                FLT trigger
    0 1000 0000    PRNDM              Pseudorandom trigger
    1 0000 0000    TP                 Test pulse trigger

    Examples of composite trigger masks:

    TRIGGER MASK   HARDWARE MEANING
    0 0010 0001    LEMO 1 + RANDOM
    0 0010 0010    LEMO 2 + RANDOM
    0 0010 0100    LEMO 3 + RANDOM
    0 0010 1000    LEMO 4 + RANDOM
    0 0000 1111    LEMO 1 + LEMO 2 + LEMO 3 + LEMO 4
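
A composite mask is the bitwise OR of the single-source masks. The following Python sketch decodes a mask written in the notation of the tables above; the bit order is taken from the trigger mask definition table, while the function names `parse_mask`, `decode_mask` and `format_mask` are hypothetical helpers, not part of the Fast Control System software.

```python
# Bit positions follow the trigger mask table on this page (bit 0 = LEMO 1).
TRIGGER_BITS = ["LEMO 1", "LEMO 2", "LEMO 3", "LEMO 4",
                "VME", "RANDOM", "FLT", "PRNDM", "TP"]


def parse_mask(text):
    """Convert a mask written as on this page, e.g. '0 0010 0001', to an integer."""
    bits = text.replace(" ", "")
    if len(bits) != len(TRIGGER_BITS) or set(bits) - {"0", "1"}:
        raise ValueError(f"expected {len(TRIGGER_BITS)} binary digits, got {text!r}")
    return int(bits, 2)


def decode_mask(mask):
    """List the trigger sources whose bits are set in an integer mask."""
    return [name for bit, name in enumerate(TRIGGER_BITS) if mask & (1 << bit)]


def format_mask(mask):
    """Format an integer mask in the '0 0000 0000' style used on this page."""
    bits = format(mask, "09b")
    return f"{bits[0]} {bits[1:5]} {bits[5:]}"


print(decode_mask(parse_mask("0 0010 0001")))  # ['LEMO 1', 'RANDOM']
```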

     



    FED Error Monitor


    DAQ Readout Monitor


    The DAQ monitor looks at the event structure and reports information about the event content and errors. The display has three sections:


    How to know that a run setup is there and ready



    NAME SERVER

    This tool displays the names of the processes running under the HERA-B DAQ environment. (Be extremely careful when you use it, since you can disturb other people's work.)


    How to get online errors reported.

    The error logger "erwin" is started by typing "erwin" at the command line. It displays a window containing the most recent error and warning printouts. In general an error entry looks like:

    17:28:58 [Error] evc - Flt-id hasn't changed after 3 retries
    1. The first field is the time at which the error occurred.
    2. The second field is the severity level:
       • Inform: run progress information.
       • Warning: signs of potential misbehaviour which do not affect the data taking from a DAQ point of view.
       • Error: running is seriously affected (data most likely useless).
       • Fatal: continued running is impossible.
    3. Next, the process producing the error message is shown.
    4. Then comes a message string describing the cause of the message.
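
The entry layout described above can be split mechanically into its fields. The following Python sketch assumes the layout of the sample line ("HH:MM:SS [Severity] process - message") and the four severity names listed on this page; `parse_entry` and `at_least` are hypothetical helpers and not part of erwin.

```python
import re

# Layout assumed from the sample line: "HH:MM:SS [Severity] process - message".
LINE_RE = re.compile(
    r"^(?P<time>\d{2}:\d{2}:\d{2})\s+\[(?P<severity>\w+)\]\s+"
    r"(?P<process>\S+)\s+-\s+(?P<message>.*)$"
)
SEVERITIES = ("inform", "warning", "error", "fatal")  # increasing severity


def parse_entry(line):
    """Split one logger line into its four fields, or return None if it does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    entry = match.groupdict()
    entry["severity"] = entry["severity"].lower()  # "[Error]" and "[FATAL]" both occur
    return entry


def at_least(entry, level):
    """True if the entry's severity is `level` or worse (both must be in SEVERITIES)."""
    return SEVERITIES.index(entry["severity"]) >= SEVERITIES.index(level)


entry = parse_entry("17:28:58 [Error] evc - Flt-id hasn't changed after 3 retries")
print(entry["process"], at_least(entry, "warning"))
```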
    Further Documentation

    Trouble shooting
     Action to be taken when some of the DAQ processes die.

        During running, a few processes are allowed to die. A yellow pop-up window will tell you that the process died and offer you three options:


          Known actions for some dead processes:

    Process Name                            Action
    /RUN_switch/SLP/_*                      Ignore
    /RUN_switch/ECAL/PC_0                   Ignore
    /RUN_switch/SLT/PC_0                    Ignore
    /RUN_switch/SVD/PC_0                    Ignore
    /RUN_switch/SVD/dmon_slt_svd            Restart
    /RUN_switch/*/dmon_*                    Ignore
    /RUN_switch/ECAL/FLT/pt_moni.hb-vme*    Terminate RUN (contact ECAL expert)
    /RUN_switch/ECAL/FLT/pt_init.hb-vme*    Terminate RUN (contact ECAL expert)
    /RUN_switch/ECAL/fed_init.hb-vme*       Terminate RUN (contact ECAL expert)
    /RUN_switch/ECAL/status_agent           Ignore
    /RUN_switch/RHP/gatherer_4lt_*          Ignore
    /RUN_switch/FARM/l4aSender_*            Ignore
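
The table can be used as a lookup from a dead process name to the recommended action. The following Python sketch assumes that `*` in the table is a shell-style wildcard and that the first matching row wins (so the specific `dmon_slt_svd` row takes precedence over the general `dmon_*` row). The names `KNOWN_ACTIONS` and `action_for` are hypothetical.

```python
from fnmatch import fnmatchcase

# Patterns and actions copied from the table above, in table order.
# Assumption: "*" is a shell-style wildcard (fnmatch lets it match "/" as well).
KNOWN_ACTIONS = [
    ("/RUN_switch/SLP/_*", "Ignore"),
    ("/RUN_switch/ECAL/PC_0", "Ignore"),
    ("/RUN_switch/SLT/PC_0", "Ignore"),
    ("/RUN_switch/SVD/PC_0", "Ignore"),
    ("/RUN_switch/SVD/dmon_slt_svd", "Restart"),
    ("/RUN_switch/*/dmon_*", "Ignore"),
    ("/RUN_switch/ECAL/FLT/pt_moni.hb-vme*", "Terminate RUN (contact ECAL expert)"),
    ("/RUN_switch/ECAL/FLT/pt_init.hb-vme*", "Terminate RUN (contact ECAL expert)"),
    ("/RUN_switch/ECAL/fed_init.hb-vme*", "Terminate RUN (contact ECAL expert)"),
    ("/RUN_switch/ECAL/status_agent", "Ignore"),
    ("/RUN_switch/RHP/gatherer_4lt_*", "Ignore"),
    ("/RUN_switch/FARM/l4aSender_*", "Ignore"),
]


def action_for(process_name):
    """Return the action of the first matching row, or None if the process is not listed."""
    for pattern, action in KNOWN_ACTIONS:
        if fnmatchcase(process_name, pattern):
            return action
    return None


print(action_for("/RUN_switch/SVD/dmon_slt_svd"))  # Restart
```

A result of `None` means the process is not covered by the table, in which case the table gives no guidance.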


    How to cure a few known problems.

    Use the prcdir GUI to check and remove the processes on the corresponding nodes.
     
     

    08:48:10 [FATAL] DIST_0 - send_to_rps: Couldn't send message of type 0x1c00ff to 1331: (unknown class -1) error 65533
    08:48:10 [FATAL] DIST_0 - send_to_rps: Couldn't send message of type 0x270001 to 1331: (unknown class -1) error 65533
    08:48:10 [FATAL] DIST_0 - send_to_rps: Couldn't send message of type 0x120014 to 1337: (unknown class -1) error 65533
    08:48:11 [FATAL] DIST_0 - send_to_rps: Couldn't send message of type 0x270001 to 1331: (unknown class -1) error 65533

    What to do if the system gets stuck in a state machine transition and you cannot terminate the run:

    Please write in the logbook the processes which are still in transition, and dismiss.

    Click on all the green windows until they turn yellow.

    Click on ChgCfg. The state in the Central Run Control should change to EXIT.



    What to do if logger fails:

    Get DMA count from SLBs of a component

    Get VDS control register dump

    EXPERT INFORMATION.

    Where to find the subdetector-specific processes:

    SUBDETECTOR      VME CONTROLLERS
    ECAL Pretrigger  hb-vme42, hb-vme43
    FLT              hb-vme63
    TRD              hb-vme58
    ITR              hb-vme59, hb-vme72, hb-vme74
    SVD              hb-vme33, hb-vme34, hb-vme57
    ECAL             hb-vme45, hb-vme41, hb-vme62
    OTR              hb-vme39
    RICH             hb-vme36
    HIPT             hb-vme49
    MUON             hb-vme40
    EVC              hb-vme25.1 (the one with controller)
    SWITCH           hb-vme25.1, hb-vme25.2, hb-vme27

    Where to find the SPC_GUI processes:

    COMPONENT  PROCESS                NODE
    COMMON     HV_veto                hb-cr21
               DB_Monitor             hb-cr21
               4LT_Control            hbdaq10
               Central_crate_control  hbdaq10
    DAQ        Nodes_Monitor          hbdaq9
    TARG       targ_db_mirror         hbdaq9
    SVD        High_Voltage           hb-cr26
               Temp_Scan              hb-cr26
               Chip_Power             hb-cr26
    OTR        Crate_control          hbdaq9
               Gas_Gain               hb-cr20
               ASD8_Control           hbdaq9
               High_Voltage           hbdaq9
    ITR        High_Voltage           hbdaq9
               High_Voltage_Current   hbdaq9
               High_Voltage_GEM       hbdaq9
               Low_Voltage            hbdaq9
               Gas_Monitor            hbdaq9
               spc_itr_lhcbvcur       hb-cr21
    HighPT     High_Voltage           hbdaq9
               Crate_Control          hbdaq9
               Low_Voltage            hbdaq9
    RICH       High_Voltage           hbdaq9
               Low_Voltage            hbdaq9
    ECAL       Crate_Control          hbdaq9
    MUON       Crate_Control          hbdaq9
               High_Voltage           hbdaq9

     
     
     
     

    Where to find the rest of the processes:

    NODE                    FUNCTION
    hbdaq2                  Error logger
    hbdaq2                  RHP gatherers and savers
    hbdaq3                  target poller
    hbdaq2                  cna_manager
    hbdaq2                  SMC branch
    hbdaq2                  snc
    hbdaq2                  Name server
    hb-cr17                 ECAL monitor node
    hbdaq1                  sub prc managers for the run
    hb-cr10                 crcui
    hbserv/hera-b/hb2ltlog  LOGGING
    hb4ltctl                l2 to l4 controllers
    hb4ltxxx                4LT
    hbsltxxx                FLTE/EVA/SLT
    hbslt238                TARGET DISTRIBUTOR
    hbdaq0                  TARGET POLLER, main prcmanager
    hbslt239                SLT to 4LT bridge controller

     

    LAYOUT FOR VME CRATES DISTRIBUTION IN THE FIRST FLOOR.




    What to do if I don't understand the instructions on this page

    If you feel it could adversely affect the quality of the data being taken, contact the on-call expert. In any case, send an e-mail to the person responsible for this page with your suggestions for improvements.
     

    Run configuration by experts

    The CRC GUI has been changed recently to prevent the shift crew from making mistakes during run configuration. As a result, CRCUI contains default settings for all HERA-B standard runs.

    The following CRC parameters are defined by the default settings:

    To configure a run with any of the above-mentioned parameters different from the default settings, one has to do the following:
      If the run is terminated, you have to configure the next run again as described.

    Running in case of problems with one of LOGGING nodes

    There are three nodes used for event logging: hb4ltlog, hb-delfi4, hb-delfi5.
    They are employed by the DAQ in that order. In case of any problems with hb4ltlog or hb-delfi4,
    runs have to be configured in such a way that the faulty node is not used. To configure a run with
    hb4ltlog or hb-delfi4 excluded, one has to do the following:
      logging type (see figure below) (no_delfi4 or no_4ltlog)
    Logging type selection
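
The choice of logging type can be sketched as a lookup from the faulty node to the option that leaves it out. The following Python sketch is illustrative only: the node order is taken from the text above, while the mapping from each logging type to the node it excludes is an assumption based on the option names, and `logging_type_for` and `nodes_used` are hypothetical helpers.

```python
# Node order as stated above.
LOGGING_NODES = ["hb4ltlog", "hb-delfi4", "hb-delfi5"]

# Assumption: each logging type excludes the node its name refers to.
EXCLUDED_BY_TYPE = {"no_4ltlog": "hb4ltlog", "no_delfi4": "hb-delfi4"}


def logging_type_for(faulty_node):
    """Return the logging type that leaves out `faulty_node`, if one exists."""
    for logging_type, node in EXCLUDED_BY_TYPE.items():
        if node == faulty_node:
            return logging_type
    raise ValueError(f"no logging type excludes {faulty_node}")


def nodes_used(logging_type):
    """Nodes still used, in DAQ order, for the given logging type."""
    excluded = EXCLUDED_BY_TYPE.get(logging_type)
    return [node for node in LOGGING_NODES if node != excluded]


print(logging_type_for("hb-delfi4"), nodes_used("no_delfi4"))
```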
     
     

    Problems with  the Central Network Switch in 201

    If all the windows on hb-cr10 and hb-cr21 are frozen and you cannot ping any machine in the Control Room, there is most probably a problem
    with the Central Network Switch in room 201: it is hanging. Since all the computers in the West Hall are connected via this switch, the experiment
    has no network.

    To solve the problem, please follow the instructions below:

    1. Go to room 201 and re-plug the power cable of the switch.
    2. Once you see that the computers in the Control Room are back to life (you can ping them), execute the script ~hbshift/recover_vmes/recover_vmes on hb-cr10.
    3. After the script has finished, kill the old runs (slow, switch or repro).
    4. Restart the Slow Control.
    5. Now you can start the data taking 'switch' or reprocessing 'repro' run.
