Information collected after the meeting on 26.04.07 [message #4184]
Thu, 03 May 2007 09:36
Here is a summary of the information collected during the past week, since our first meeting on April 26, 2007. Points highlighted in green require action from Kilian, Horst or Carsten.
1. Running more AliRoot jobs in parallel on the D-Grid machines
I used the machines lxb255-6-7-8-9 and submitted up to 4 jobs at the SAME time (simulation and reconstruction of 100 events, with the configuration file and macros used in PDC06). All jobs completed without exceeding the available memory!!!
A test with 6 jobs on lxb255 is currently ongoing.
The tests were done with AliRoot v4-05-08, the only AliRoot version officially installed for 64-bit machines. PDC06 ran with v4-04-Rev-08 and then v4-04-Rev-10. I am still waiting for these two versions to be installed; once they are, I will continue testing.
I plan to use only lxb255 for further studies. All the other machines can be released for normal GRID usage.
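A test of this kind can be sketched in a few lines of Python. This is only an illustration, not the procedure actually used: the function name `run_parallel` is hypothetical, the real AliRoot command line is not given in this post and is replaced by a placeholder, and the memory figure is the peak resident size of the largest single child, not the sum over all jobs.

```python
import resource
import subprocess
import sys


def run_parallel(commands):
    """Start all commands at the same time, wait for them, and return
    (exit_codes, largest_child_rss).

    largest_child_rss is ru_maxrss for waited-for children: the peak
    resident set size of the largest child (kilobytes on Linux).
    """
    procs = [subprocess.Popen(cmd) for cmd in commands]
    exit_codes = [p.wait() for p in procs]
    usage = resource.getrusage(resource.RUSAGE_CHILDREN)
    return exit_codes, usage.ru_maxrss


# Placeholder job: the actual simulation/reconstruction command is an
# assumption left open here.
placeholder_job = [sys.executable, "-c", "pass"]
exit_codes, largest_rss = run_parallel([placeholder_job] * 4)
print("exit codes:", exit_codes)
print("largest child peak RSS:", largest_rss)
```

To compare against the memory available on a machine, the per-job peak would have to be multiplied by the number of concurrent jobs, or each job measured separately.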
2. Exact monitoring of memory usage and possible leaks in AliRoot
We are using valgrind to monitor the memory consumption of AliRoot and to identify possible problems. The source of any such problem will then be either addressed directly by Marian or reported to the people responsible for the code.
Some results are still pending because valgrind takes a very long time to run.
3. Size of PDC06 samples
AliEn has been back since May 2. I copied a few of the latest samples produced (run 5255), taking all the files I could see in the file catalogue. The size of one sample (100 events; a run contains about 1000 samples) is about 100 MB.
This list of files does not contain raw data. I do not know whether the center producing the sample (e.g. GSI) is supposed to copy more files to CERN than what I currently see in the file catalogue. This information should actually be in Kilian's or Carsten's hands!
The few samples copied are at /s/sma/pdc06/5255/.
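As a rough cross-check, the approximate figures quoted above (about 100 MB per sample, about 1000 samples per run, 100 events per sample) can be turned into a per-run and per-event estimate. The numbers are only as precise as those rounded inputs, and they exclude raw data, which is not in the list of files.

```python
# Approximate figures taken from the text above.
SAMPLE_SIZE_MB = 100      # size of one sample, without raw data
SAMPLES_PER_RUN = 1000    # "about 1000 samples" per run
EVENTS_PER_SAMPLE = 100

# Decimal units: 1 GB = 1000 MB.
run_size_gb = SAMPLE_SIZE_MB * SAMPLES_PER_RUN / 1000
event_size_mb = SAMPLE_SIZE_MB / EVENTS_PER_SAMPLE

print(f"about {run_size_gb:.0f} GB per run")
print(f"about {event_size_mb:.0f} MB per event")
```

So a complete run at this sample size would be on the order of 100 GB, or about 1 MB per event.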
4. Copy to the xrootd cluster at GSI
On Monday, April 23, Horst modified the configuration, removing all connections to the MSS (mass storage). I ran systematic checks at the end of last week and found (yippee! yippee!) 0 failures in the transfer of a few hundred GB. Horst proposes to leave the system as it is now and to resume the investigations related to the MSS after Kilian's return.
I need a mass deletion of the old data copied there, all of which are corrupted. This has to happen before I continue with the new copy, to avoid terrible confusion. No further action will be taken until the deletion has taken place.
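For reference, a systematic transfer check of the kind mentioned above can be sketched as a checksum comparison. This is not necessarily how the checks were done here: the function names are hypothetical, and the sketch assumes that both the source files and the copies are readable as ordinary local paths, which may not hold for an xrootd cluster.

```python
import hashlib
from pathlib import Path


def file_md5(path, chunk_size=1 << 20):
    """Return the MD5 hex digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_failed_transfers(source_dir, copy_dir):
    """List relative paths whose copy is missing or differs from the source."""
    source_dir, copy_dir = Path(source_dir), Path(copy_dir)
    failures = []
    for src in sorted(p for p in source_dir.rglob("*") if p.is_file()):
        rel = src.relative_to(source_dir)
        dst = copy_dir / rel
        if not dst.is_file() or file_md5(src) != file_md5(dst):
            failures.append(str(rel))
    return failures
```

An empty list returned by `find_failed_transfers` corresponds to "0 failures"; any entry names a file that should be copied again.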