Author: Jose Hernandez, e-mail: Jose.Hernandez@desy.de,
office: 7224 (Zeuthen), 4846 (Hamburg), mobile 01704509401
Starting the DAQ gui for reprocessing
On hb-cr10, logged in to the hbshift account, click on the "Reprocessing"
button (or run /online/ONLINE/pro/Linux_intel/bin/start_repro directly).
Configuring the DAQ for running reprocessing
Click on "New Run". The run configuration
window will pop up. In the list of run types click
on "repro". The repro menu will pop up:
1: Experiment number and reprocessing number (fixed).
2: To select a range of runs manually, enter the first and last run
   numbers and click the "All Runs" button.
3: One can also add a single run by typing its run number and pressing
   <return>. The run will appear in the list. Click on the run to select it.
4: If a run has already been partially reprocessed, the background turns
   pink when selected.
5: If a run has not been reprocessed at all, the background turns green
   when selected.
6: If the background of a run does not change color when selected, the
   run has already been completely reprocessed.
7: One can change the order in which the runs are reprocessed using the
   buttons top, bottom, up
8: Pressing this button removes the completely reprocessed runs from the
   list.
9: The standard way of selecting the runs to be reprocessed is to press
   this button. A pop-up window appears where one can select from the list
   of runs available for the current reprocessing number.
10: List of runs and total number of events selected in the list of runs
   (runs with pink or green
11: Timeout for changing from one run to the next when all events of the
   run have been provided to the reconstruction node but the total number
   of reprocessed events is smaller than the total number of events in
   the run.
12: This button shows the list of the runs reprocessed since the gui was
   started.
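
The color coding in items 4 to 6 and the clean-up in item 8 can be pictured
with a small sketch. It is an illustration only, not the gui's code: the
names ReproStatus, classify_run and remove_completed are hypothetical, and
the assumption that the gui decides by comparing event counts is ours.

```python
from enum import Enum


class ReproStatus(Enum):
    """Background shown when a run is selected (items 4 to 6)."""
    NOT_REPROCESSED = "green"
    PARTIAL = "pink"
    COMPLETE = "unchanged"


def classify_run(total_events, reprocessed_events):
    """Classify a run by comparing event counts.

    Assumption: status follows from the number of reprocessed events
    relative to the total number of events in the run.
    """
    if reprocessed_events <= 0:
        return ReproStatus.NOT_REPROCESSED
    if reprocessed_events < total_events:
        return ReproStatus.PARTIAL
    return ReproStatus.COMPLETE


def remove_completed(runs):
    """Drop completely reprocessed runs from the list (item 8).

    `runs` is a list of (run_number, total_events, reprocessed_events).
    """
    return [
        run for run in runs
        if classify_run(run[1], run[2]) is not ReproStatus.COMPLETE
    ]
```

With made-up run numbers, remove_completed([(1, 100, 100), (2, 100, 40)])
keeps only run 2, which would be shown in pink.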
- Make sure the logging type is "Archive" and the maximum number
of nodes in the SLT and 4LT farms is selected.
Selecting runs to be reprocessed
Click on "Runs from DB" to select the
list of runs.
- The number of runs selected and the total number of events will be
displayed under "Runs:" and "Events:"
- Click OK in the repro menu and click OK in the window that pops up
asking you if you are ready to start the run.
Starting the run
- In the run control window press "Menu" and then "create". All the
processes will be created in the same way as in a normal /RUN_switch/ run.
When the system is in the INITIALIZED state, press "Ready" to bring the
system to the READY state and then press "Start" to bring the system to
the RUN state.
- When the reprocessing of a run is completed, the system moves automatically
to the next run, performing by itself the transitions RUN->STANDBY->
- The repro menu is accessible from the RunCfg->REPROCESSING->Reprocessing
menu of the run control window. At any moment runs can be added or deselected.
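
The operator-driven start-up sequence described above can be summarized as a
small state table. This is a sketch under assumptions, not the run control
code: the starting state name "IDLE" is a placeholder (the text does not name
the state before "create"), and press and start_sequence are hypothetical
helpers.

```python
# (current state, operator action) -> next state, as described in the text.
# "IDLE" is a placeholder name for the state before "create".
TRANSITIONS = {
    ("IDLE", "create"): "INITIALIZED",
    ("INITIALIZED", "Ready"): "READY",
    ("READY", "Start"): "RUN",
}


def press(state, action):
    """Return the state reached by pressing `action` in `state`."""
    try:
        return TRANSITIONS[(state, action)]
    except KeyError:
        raise ValueError(
            f"action {action!r} is not valid in state {state!r}"
        ) from None


def start_sequence(state="IDLE"):
    """Apply create, Ready and Start in order and return the final state."""
    for action in ("create", "Ready", "Start"):
        state = press(state, action)
    return state
```

Pressing a button out of order, for example "Start" before "Ready", raises a
ValueError in this sketch.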
Monitoring the reprocessing
- The reprocessing monitor shows the input rate (upper left) and the output
rates of every logger (middle and lower left). The number of free ARTE
processes is displayed in the upper right plot.
- The yellow error logger window displays useful info and error messages.
In many cases one can diagnose a problem by reading these messages.
- The OSM Robot staging and archiving status and queues can be checked
here: Usage, Queues.
Known problems and solutions:
- Sometimes the RHP gatherer 4LT processes have problems with the state
transitions. When changing from one run to the next, if the transition
gets stuck for a few minutes you might have to terminate and restart the
reprocessing. In the "expert" menu, select the "checkDaughtherList"
item to check which process is holding up the transition. The reprocessing
cannot be terminated while the state is STANDBY or RUN. If the state
transition is stuck, you will have to skip all the branches by selecting the
item "CheckDautherCfg" in the "expert" menu and changing all the components
from "active" to "skipped".
- It might happen that the number of active FARM nodes (caption of the
upper right plot) gets too small compared to the total number of booted
nodes. The reason is that the ARTE processes die from time to time due
to crashes in the reconstruction. The input and output rates decrease as
fewer ARTE processes are running. If the reprocessing rate has decreased
significantly, one should terminate and restart the reprocessing.
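
The checks behind these two problems can be written down as a rough sketch.
Nothing here is part of the DAQ software: the function names are
hypothetical, and the threshold of 0.5 is an arbitrary assumption, since the
text only says the rate has "significantly" decreased.

```python
def active_fraction(active_nodes, booted_nodes):
    """Fraction of booted FARM nodes that are still active."""
    if booted_nodes <= 0:
        raise ValueError("booted_nodes must be positive")
    return active_nodes / booted_nodes


def rate_has_dropped(current_rate, reference_rate, threshold=0.5):
    """True if the rate fell below `threshold` times the reference rate.

    Assumption: 0.5 is an illustrative threshold, not a documented value.
    """
    return current_rate < threshold * reference_rate


def can_terminate(state):
    """The reprocessing cannot be terminated in STANDBY or RUN."""
    return state not in ("STANDBY", "RUN")
```

For example, with 50 active nodes out of 200 booted the active fraction is
0.25. If the rate has dropped as well, one would terminate and restart, once
the system is in a state where can_terminate is true.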