CHAIMELEON Project Concludes: Tracing Its Legacy and Lasting Impact

Each year, Europe sees 3.7 million new cancer cases, highlighting the urgent need for better tools to improve diagnosis and treatment. Artificial Intelligence (AI) applied to health data offers great potential for cancer care. However, the lack of large, high-quality imaging datasets limits the development of effective AI tools. Creating these imaging biobanks is difficult due to technical, institutional, legal, and ethical challenges.

The CHAIMELEON project was created to tackle these obstacles by building a cancer imaging repository. It includes 12,384 complete cases and 10,189 “imaging-only” cases covering lung, prostate, breast, colon, and rectum cancers, helping to fill a significant gap in databases available to researchers in Europe and beyond. The repository provides secure access to anonymized, high-quality imaging data relevant for clinical use. Now complete, CHAIMELEON offers a centralised data infrastructure that works with existing biobanks and supports safe data sharing and reuse, serving as a single access point for AI developers.

To ensure consistency across images acquired at different sites, with different scanners and acquisition protocols, innovative harmonization solutions were developed and validated. These solutions improve the reliability of imaging biomarkers and were shown to enhance the reproducibility and performance of AI-based tools.

Key achievements of CHAIMELEON include:

  • A secure data repository comprising over 12,000 complete cancer cases and 10,000 image-only cases.
  • A secure cloud-based platform that integrates the entire AI development lifecycle, from data ingestion and dataset creation to AI model training and validation.
  • AI-driven harmonization solutions to minimize differences across scanners and enhance AI models’ generalization.
  • An in-silico clinical validation interface that enables the evaluation of AI models by simulating real-world clinical environments.
  • New AI models to support clinical decisions across each cancer type.
  • Strategies for long-term impact and sustainability through its federation within the European Cancer Imaging (EUCAIM) initiative.
  • Generation and dissemination of scientific knowledge through 16+ publications and active contributions to the research community.

Figure 1. Main axes of the CHAIMELEON Repository.

Benchmarking outcomes

CHAIMELEON’s benchmarking study showed it is one of Europe’s largest cancer imaging repositories, with over 13,000 cases focused specifically on oncology. While larger biobanks like NAKO and the UK Biobank include more participants, they focus on general epidemiology with fewer cancer-specific cases. Unlike the US-based TCIA, which has many cases but outdated images and limited clinical data, CHAIMELEON offers high-quality, harmonized imaging combined with detailed clinical metadata, supporting the development of reliable and clinically relevant AI tools.

Clinical relevance

The CHAIMELEON project developed a user-friendly, cloud-based in-silico clinical interface that allows clinicians to test AI models for five major cancers on real cases; it received strong positive feedback. The platform was successfully deployed for 54 clinicians across more than 12 centers in 8 countries: Croatia, Italy, France, Germany, Portugal, Lithuania, Türkiye, and Spain. Among the participants, 54% were female clinicians, and data on years of specialty experience were recorded. The clinicians provided valuable insights for improving the platform’s usability and integration, while highlighting the need for better trust and explainability. The AI models were rigorously validated, showing effectiveness in diagnosis and treatment planning.

Figure 2 shows the in-silico clinical validation interface. Panel (a) shows the main screen accessed by users performing the validations, a user-friendly interface that lets them track their progress and access the different cancer types to be reviewed. In (b), the user has opened the prostate cancer review section, which lists several cases that have been clinically validated without support from AI. Panel (c) shows the customized DICOM viewer tailored for prostate cancer clinical review. In (d), the viewer is further customized to support AI-assisted clinical review, including detailed model explainability information. Finally, (e) shows how the AI-assisted clinical review is adapted to each specific cancer type within the CHAIMELEON project.

Figure 2. In-silico clinical validation. (a) Main screen. (b) Prostate cancer review section. (c) DICOM viewer tailored for prostate cancer clinical review. (d) AI-assisted clinical review. (e) AI-assisted clinical review for each specific cancer type.

When supported by AI, clinicians experienced a reduction in analysis time per case. Improvements were noted across multiple tasks, including overall survival (lung), histological classification (breast), risk group identification (prostate), TNM staging evaluation (colon), and structural invasion assessment (rectum). Overall, measured against the ground truth, 55% of clinical decisions improved with AI assistance, 17% remained unchanged, and 28% worsened.

Figure 3. Flowchart (Sankey plots) illustrating the performance of clinicians with AI assistance for each cancer type. The left side shows the percentage of clinicians participating for each cancer type, while the right side displays the changes in clinical decision-making influenced by the AI: those in green improved (55%), those in grey remained unchanged (17%), and those in red worsened (28%). The ground truth was used as reference to determine the clinicians’ relative performance.

These efforts also contributed to the creation of guidelines and standards for AI use in medical imaging, influencing policy at EU and international levels.

Sustainability

CHAIMELEON’s sustainability plan balances immediate visibility with long-term stability by integrating its metadata into the EIBIR Imaging Biobank Catalogue and hosting Open Challenges to engage users. For lasting impact, it is partnering with European health data initiatives through a phased integration into EUCAIM, providing continued free and secure data-sharing infrastructure. This strategy aligns with the European Health Data Space vision, promoting secure, open data reuse across Europe and supporting ongoing innovation in medical imaging AI.

Generated knowledge

The CHAIMELEON project developed a robust infrastructure to support AI-driven cancer research, including a virtual processing environment with traceability services, an end-to-end data ingestion workflow, a web platform for dataset creation and exploration, and an in-silico clinical validation system. It also introduced advanced tools for data processing, such as deidentification, quality control, automatic MR series classification, data subset selection, and protection against adversarial attacks. To ensure ethical and secure data use, the project established legal frameworks and built expertise in data access governance and sustainability. Additionally, it implemented harmonisation tools, including self-supervised and GAN-based methods, to standardize imaging data across different sources.

In terms of AI development, CHAIMELEON conducted an Open Challenge, open to the entire research community, to externally validate the CHAIMELEON dataset and platform and to build AI models tailored to clinical oncology needs, based on real-world data. In addition, new solutions were developed by CHAIMELEON partners, including NM staging and risk stratification for prostate cancer, TNM staging for colon cancer, and overall survival prediction for lung cancer. The project also contributed to explainable AI and established a sandbox environment for testing imaging AI model guidelines. Notably, in breast cancer, the team developed segmentation models from a multicentric, multivendor and geographically heterogeneous digital mammography dataset, as well as models for N staging that integrate imaging biomarkers with clinical data, enhancing diagnostic precision and clinical utility.

Figure 4. CHAIMELEON developed solutions. Through an Open Challenge, five different AI models were created during the project. From left to right, they include: NM staging and risk stratification for prostate cancer, overall survival prediction for lung cancer, lymph node involvement in breast cancer, TNM staging for colon cancer, and invasion detection in rectal cancer.

The CHAIMELEON project has significantly contributed to the scientific community through 12 peer-reviewed publications, with more than six additional papers currently under review or in preparation.

Lessons learned and next steps

CHAIMELEON offered valuable experience applicable to EUCAIM and other initiatives. From the start, the project proved to be a collaborative effort involving all project work packages, requiring strong coordination with the management board to ensure long-term success. Engaging key partners, such as data centers and tool developers, early on was crucial. Their involvement helped identify challenges quickly and brought in experts when needed to address them effectively. The project also showed that timely consultation with local ethics committees and Data Protection Officers (DPOs) is essential. Their guidance helped navigate local regulations and ensured compliance with national laws, which is vital for sustainable implementation.

The next steps for CHAIMELEON focus on deepening technical integration, ensuring long-term sustainability, and expanding its impact across Europe. This includes the interoperability of CHAIMELEON’s cancer imaging repository at tier 3 level within EUCAIM’s federated infrastructure, aligning authentication systems, federated search and distributed processing capabilities. The project will broaden its dataset coverage, onboard new clinical partners and cancer types, and support real-world pilot use cases. AI innovation will be accelerated through federated learning and collaboration with SMEs, while governance and data access models will be refined to meet EU legal and ethical standards. Outreach efforts and stakeholder engagement will further extend CHAIMELEON’s reach and adoption across the research, healthcare, and industry communities.