| LOFAR (the Low-Frequency Array) is a conceptually new radio telescope currently being constructed in the Netherlands and extending into Europe (e-LOFAR). The multi-directional observing mode of LOFAR allows astronomers, for the first time, to probe large areas of the sky at long radio wavelengths (~2 m) with sufficient sensitivity and speed. However, the large data-sets and the instrumental and astrophysical challenges posed by LOFAR long-wavelength observations require analysis tools and software that go beyond what traditional techniques and analysis packages can handle. All these challenges come together in the LOFAR Epoch-of-Reionization Key-Science-Project (EoR-KSP), led by our team. This project aims to detect the redshifted 21-cm radio emission from neutral hydrogen at the Epoch of Reionization (EoR; the Universe's first billion years), during which the first stars, quasars and galaxies formed. This epoch influenced all subsequent cosmic events and as such is one of the most important phases of the Universe, largely determining how we perceive it today. The goal of our KSP is to statistically detect and quantify this signal as a function of cosmic time and to constrain different scenarios of early baryonic structure formation (e.g. formation of proto-galaxies, large-scale structure and quasars). Analysing petabyte (PB) data-sets, however, is as challenging as building the instrument itself. The ~1 PB data-set from the LOFAR EoR-KSP will be reprocessed after initial calibration, compressed to ~100 terabytes (TB), and then studied using analysis tools that we are currently developing. Analysing the LOFAR EoR data involves solving >10^4 large, independent, complex-valued linear systems using extraordinarily computationally demanding maximum-likelihood (ML) techniques. 
The current, less complex processing pipeline requires access to a 1000+ CPU cluster for >1 year, and analysing the results requires a further 10-100 Tflop/s of processing power for ~1 year to perform the ML inversions. Neither the purchase of a CPU-only cluster nor access to general-purpose supercomputers in the Netherlands or abroad for a sustained period of 1-2 years is feasible or realistic. However, the linear equations describing the LOFAR data model lend themselves perfectly to being solved not on classical CPUs but on Graphics Processing Units (GPUs). Our KSP is already implementing basic simulation, inversion and analysis codes on a mini-cluster of 3 NVIDIA Tesla S870 units, and in several tests we obtain GPU/CPU speed-up ratios of 30-100 in the relevant linear operations, including I/O, similar to tests by other groups. Based on the 0.1-1 PB data-set, the demanding computational requirements (10^21-10^22 flops) and our scaling tests, we request funding to (1) purchase a dedicated cluster of 50 NVIDIA Tesla S870 plus quad-core PC units reaching >10 Tflop/s effective performance, with 2-4 TB/unit of temporary storage and supporting hardware (racks, cables, connections), and (2) hire a GPU programmer for two years to fully optimise the final LOFAR EoR-KSP analysis software pipeline ported to this architecture. This will allow us to analyse our data in 1-2 years (including overhead) and to accomplish the scientific goals set out by the LOFAR EoR-KSP: to detect, for the first time, the emission of neutral hydrogen from the first billion years after the Big Bang and to start studying the Universe in its earliest infancy. |