Year: 2019
Role: Researcher
Duration: 3 Months
Relevant links: Publication, MSc Thesis, GitHub
Summary: I developed a parallelized Gaussian process-based algorithm for optimal sensor placement in large-scale systems, significantly improving accuracy and efficiency over existing methods.
I did this work as part of my Master's thesis at the Data Science Institute at Imperial College London.
Optimal sensor placement is crucial for many complex systems, yet existing algorithms struggle to handle the large-scale data generated by computational simulations. These algorithms typically rely on small, expensive datasets, which leads to imprecise results over large domains. My thesis addresses this by parallelizing the computations of a Gaussian process (GP)-based sensor placement algorithm, enabling GPs to be used with big data in spatial statistics at an unprecedented scale.
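To make the approach concrete, here is a minimal sketch of one common GP-based placement strategy: greedily picking the location where the posterior variance is highest. The squared-exponential kernel, its hyperparameters, and the candidate grid are illustrative assumptions, not the exact setup from the thesis.

```python
import numpy as np

def rbf_kernel(X, Y, lengthscale=1.0, variance=1.0):
    """Squared-exponential covariance between two sets of points."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return variance * np.exp(-0.5 * d2 / lengthscale**2)

def greedy_placement(candidates, n_sensors, noise=1e-6):
    """Iteratively pick the candidate with the largest GP posterior variance."""
    chosen = []
    K = rbf_kernel(candidates, candidates)
    var = np.diag(K).copy()
    for _ in range(n_sensors):
        i = int(np.argmax(var))  # most uncertain location so far
        chosen.append(i)
        S = candidates[chosen]
        K_ss = rbf_kernel(S, S) + noise * np.eye(len(chosen))
        K_cs = rbf_kernel(candidates, S)
        # Posterior variance at every candidate after observing the chosen sensors.
        var = np.diag(K) - np.einsum("ij,jk,ik->i", K_cs, np.linalg.inv(K_ss), K_cs)
    return chosen

# Example: place 5 sensors on a 2-D grid of candidate locations.
grid = np.stack(np.meshgrid(np.linspace(0, 1, 20),
                            np.linspace(0, 1, 20)), -1).reshape(-1, 2)
print(greedy_placement(grid, n_sensors=5))
```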
In my research, I optimized sensor placements for a test site around London South Bank University. Instead of running the algorithm over the entire domain at once, I focused on the relevant sub-domains obtained through Fluidity's native domain decomposition. By combining multiprocessing and multithreading, I kept all available compute resources fully utilized.
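The parallelization pattern could look roughly like the sketch below: one process per sub-domain, with NumPy's threaded linear algebra keeping the cores inside each process busy. It reuses greedy_placement from the previous sketch; load_subdomain is a hypothetical stand-in for reading one Fluidity partition, not the thesis's actual I/O.

```python
import numpy as np
from concurrent.futures import ProcessPoolExecutor

def load_subdomain(subdomain_id, n_points=200):
    """Hypothetical loader: random 2-D candidates standing in for a Fluidity partition."""
    rng = np.random.default_rng(subdomain_id)
    return rng.uniform(0, 1, size=(n_points, 2))

def optimize_subdomain(subdomain_id):
    candidates = load_subdomain(subdomain_id)
    return subdomain_id, greedy_placement(candidates, n_sensors=5)

if __name__ == "__main__":
    # One worker process per group of sub-domains; BLAS threads fill each process.
    with ProcessPoolExecutor(max_workers=4) as pool:
        for sid, sensors in pool.map(optimize_subdomain, range(8)):
            print(f"sub-domain {sid}: sensor indices {sensors}")
```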
I validated the results by comparing the pollution levels predicted by the posterior GP against the actual pollution levels, and by measuring performance in a data-assimilation setting. Both measures showed that my algorithm achieved near-optimal sensor placements: in one sub-domain, for example, the mean estimation error was only 6.15e-03, compared with an error of over 1.93 for random placement.
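The first of these checks can be illustrated in the same spirit: fit the posterior GP on the selected sensors, predict the field at every candidate location, and compare the mean absolute error against a random placement of equal size. The synthetic pollution field below is an assumed stand-in for the Fluidity output, and the code reuses rbf_kernel, greedy_placement, and grid from the first sketch.

```python
import numpy as np

def gp_posterior_mean(X_train, y_train, X_test, noise=1e-6):
    """Posterior mean of a zero-mean GP at the test locations."""
    K = rbf_kernel(X_train, X_train) + noise * np.eye(len(X_train))
    K_star = rbf_kernel(X_test, X_train)
    return K_star @ np.linalg.solve(K, y_train)

rng = np.random.default_rng(0)
truth = np.sin(3 * grid[:, 0]) * np.cos(3 * grid[:, 1])  # synthetic "pollution" field

optimal = greedy_placement(grid, n_sensors=10)
random_ = rng.choice(len(grid), size=10, replace=False)

for name, idx in [("optimal", optimal), ("random", random_)]:
    pred = gp_posterior_mean(grid[idx], truth[idx], grid)
    print(name, "mean abs error:", np.abs(pred - truth).mean())
```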