Today, I attended a meeting at the Paris Observatory about the upcoming launch of the Gaia satellite and the associated data (mega-)challenges. To borrow from the webpage, the mission's goal is “to create the largest and most precise three dimensional chart of our Galaxy by providing unprecedented positional and radial velocity measurements for about one billion stars in our Galaxy and throughout the Local Group.” The amount of data that will be produced by this satellite is staggering: Gaia will take images of roughly one gigapixel that will be processed both on board and on Earth, transmitting a petabyte of data over five years, data that need to be processed fairly efficiently to be of any use! The European consortium operating this satellite has planned specific tasks dedicated to data handling and processing, which is a fabulous opportunity for would-be astrostatisticians! (Unsurprisingly, at least half of the tasks are statistics-related, either at the noise reduction stage or at the estimation stage.) Another amazing feature of the project is that it will result in open data, the outcome of the observations being available to everyone for analysis… I am clearly looking forward to the next meeting to better understand the structure of the data and the challenges that simulation methods could help to solve!