Research

We try to gather information about publications that describe the GWA, use the data and/or tools from the GWA, or are related to the GWA. If you have any suggestions (e.g., additions or corrections), please contact us at gwa@tudelft.nl.

Published work

Research in grid resource management

  • A. Iosup, C. Dumitrescu, D. H. Epema, H. Li, L. Wolters, How are real grids used? The analysis of four grid traces and its implications, in: Int'l. Conf. on Grid Computing (GRID), IEEE Computer Society, 2006, pp. 262-269. (Analysis of four grid traces from the GWA, with a focus on virtual organizations, on users, and on individual job characteristics. The main finding is that real grid workloads differ significantly from those used in grid simulation research, and in particular that grid workloads comprise mostly single-processor jobs.)
  • A. Iosup, M. Jan, O. Sonmez, D. Epema, The characteristics and performance of groups of jobs in grids, in: Int'l. European Conference on Parallel and Distributed Computing (Euro-Par), Lecture Notes in Computer Science, Springer, 2007, pp. 382-393. (Analysis of the existence of batches of jobs in grids. Main finding: batches are responsible for 85-95% of the jobs, and for 30-96% of the total consumed CPU.)
  • A. Iosup, D. Epema, T. Tannenbaum, M. Farrellee, M. Livny, Inter-operating grids through delegated matchmaking, in: Proc. of the ACM/IEEE Conference on High Performance Networking and Computing (SC), ACM Press, 2007. (Assesses the imbalance of job arrivals in multi-cluster grids. Main finding: overall imbalance of 5:1 between the most used and the least used clusters.)
  • H. Li, R. Heusdens, M. Muskulus, L. Wolters, Analysis and synthesis of pseudoperiodic job arrivals in grids: A matching pursuit approach, in: IEEE/ACM Intl. Symp. on Cluster Computing and the Grid (CCGrid), IEEE Computer Society, 2007, pp. 183-196. (Statistical analysis of cluster and grid level workload data from LCG.)
  • H. Li, M. Muskulus, L. Wolters, Modeling job arrivals in a data-intensive grid, in: E. Frachtenberg, U. Schwiegelshohn (Eds.), Int'l. Workshop on Job Scheduling Strategies for Parallel Processing (JSSPP), Revised Selected Papers, Vol. 4376 of Lecture Notes in Computer Science, Springer, 2007, pp. 210-231. (Statistical analysis of cluster and grid level workload data from LCG.)
  • H. Li, L. Wolters, Towards a better understanding of workload dynamics on data-intensive clusters and grids, in: Int'l. Parallel & Distributed Processing Symposium (IPDPS), IEEE Computer Society, 2007, pp. 1-10. (Statistical analysis of cluster and grid level workload data from LCG.)
  • H. Li, Long range dependent job arrival process and its implications in grid environments, in: Proc. of MetroGrid Workshop, Int'l. Conference on Networks for Grid Applications (GridNets07), ACM Press, 2007 (in press). (Main finding: the correlations between workload characteristics significantly influence system performance, both at the local and at the grid level.)
  • A. Iosup, P. Garbacki, D. H. Epema, Provisioning and scheduling resources for world-wide data-sharing services, in: IEEE Int'l. Conf. on e-Science and Grid Computing (e-Science), IEEE Computer Society, 2006, pp. 84-84. (Provisioning and scheduling policies for data-sharing services are evaluated using trace-based simulation; traces are taken from the GWA.)
  • H. Li, D. L. Groep, J. Templon, L. Wolters, Predicting job start times on clusters, in: IEEE/ACM Intl. Symp. on Cluster Computing and the Grid (CCGrid), IEEE Computer Society, 2004, pp. 301-308. (Investigates the factors that affect the job start time prediction in (grid) clusters.)
  • C. Stratan, C. Cirstoiu, A. Iosup, On the accuracy of off-line monitoring information in grids, in: Proc. of the 17th Intl. Conference on Control Systems and Computer Science (CSCS-17), May 2007, Bucharest, Romania. (Assesses the trade-off between the quality of information and the monitoring overhead; simulation results show that a reduction of 90% in monitoring overhead can be achieved with a loss in accuracy of at most 10%.)

Grid maintenance and operation

Note: this section includes work on trace-based performance evaluation of real systems.

  • A. Iosup, D. H. J. Epema, GRENCHMARK: A framework for analyzing, testing, and comparing grids, in: IEEE/ACM Intl. Symp. on Cluster Computing and the Grid (CCGrid), IEEE Computer Society, 2006, pp. 313-320. (GrenchMark, a framework for testing, analyzing, and comparing grids, uses traces from the GWA to assess the job success rate in a multi-cluster grid.)

Education

If you use (or plan to use) the GWA data or tools in the classroom (e.g., as input for a grid simulator in student assignments), please contact us at gwa@tudelft.nl.
