European Space Agency: Living Planet Symposium

Nextflow in the stars ✨

Invited speaker 2025-06-25 Vienna

Invited to give a talk at the ESA Living Planet Symposium 2025: “Nextflow: Reproducible and scalable data analysis pipelines”.

The presentation is paper number 2092 in session D.03.03: “Impact through Reproducibility in Earth Observation Science and Applications”.

Abstract

Reproducibility is a cornerstone of robust science, and in the field of Earth Observation (EO), it is more critical than ever. The increasing complexity and scale of EO data necessitate tools and practices that ensure research findings can be replicated, validated, and applied across diverse domains. This presentation highlights the utility of Nextflow, an open-source workflow manager, as a transformative tool for reproducible and scalable EO research.

Traditionally rooted in the life sciences, Nextflow (https://nextflow.io) has established itself as a robust and generalist platform, with over 10 years of development and a thriving user base. Central to this success is the nf-core community (https://nf-co.re), a global network of more than 10,000 contributors dedicated to open-source workflows and collaborative development. The nf-core community organises regular hackathons and events, fostering innovation, knowledge-sharing, and collective problem-solving across disciplines. Its ethos of transparency, openness, and accessibility has helped create a rich ecosystem of high-quality, reusable workflows that serve as a model for Open Science practices.

One such example is nf-core/rangeland (https://nf-co.re/rangeland), a pipeline developed to process remotely sensed satellite imagery alongside auxiliary data in multiple steps, to arrive at a set of trend files related to land-cover changes. This pipeline demonstrates Nextflow’s potential to facilitate EO applications, from the desktop to High-Performance Computing (HPC) clusters and cloud platforms.

The portability of Nextflow workflows ensures they can run seamlessly across various infrastructures, bridging the gap between researchers in different institutions and geographical regions. Furthermore, its scalability allows for the processing of massive datasets, a common requirement in EO, encompassing millions of files and terabytes of data.

This presentation will address the following key aspects:

  1. Reproducibility in EO Science: How Nextflow ensures that workflows produce consistent results across different computing environments and collaborators.
  2. Portability Across Infrastructures: Demonstrating the seamless execution of workflows on local systems, HPC clusters, and major cloud providers.
  3. Scalability to Meet EO Challenges: Case studies showcasing Nextflow’s ability to handle the increasing data volume and complexity in EO applications.
  4. Community-Driven Innovation: Lessons from the nf-core community and the life sciences that can be applied to EO, leveraging Nextflow’s mature ecosystem and existing resources.

The FAIR principles (Findable, Accessible, Interoperable, and Reusable) are embedded within Nextflow workflows, aligning closely with the growing emphasis on Open Science in EO. We will explore how tools like nf-core/rangeland and the broader Nextflow ecosystem can be catalysts for reproducible science, enabling researchers to extend their impact and share knowledge effectively across domains. By bridging the gap between life sciences and EO, this work champions a cross-disciplinary approach to reproducibility. Nextflow’s proven reliability and adaptability, coupled with the thriving nf-core community, offer a model for other scientific domains to emulate. Together, they empower the EO community to tackle pressing global challenges with confidence and collaboration at their core.
