A new analysis by Phesi, a global provider of patient-centric data analytics, has found that colorectal cancer now has the largest volume of real-world patient data, with almost six million patient records from more than 18,000 cohorts. The finding comes from the company's examination of 167 million contextualized patient data records via its AI-powered Trial Accelerator™ platform.
The report, released ahead of ASCO 2025, shows that colorectal cancer leads the data volume rankings, followed by breast cancer (4.8 million), lung cancer (3.0 million), liver cancer (2.3 million), and prostate cancer (2.2 million).
Gap Between Data Availability and Trial Modernization
Despite this unprecedented wealth of information, Phesi warns that clinical trial design and execution have not evolved at the same pace as our understanding of cancer biology and patient genetics.
"Oncology R&D is increasingly defined by a more precise understanding of the molecular mechanisms of cancer and of individual patients. But clinical development operations, from planning to implementation, are yet to reflect the same level of precision," said Dr. Gen Li, CEO of Phesi.
Dr. Li further explained the growing challenge: "The data haystack is growing as our knowledge of cancer expands, but the needle is shrinking as we understand the critical role patient genetics play in cancer outcomes – as we've seen with breakthroughs like Keytruda for non-small-cell lung cancer."
Persistent Inefficiencies in Trial Operations
The report highlights significant operational inefficiencies in oncology trials. A previous Phesi analysis found that almost one-fifth of investigator sites contribute just 3% of patients, while the best-performing 16% of trial sites contribute almost half of all enrollments.
Additionally, in an analysis of 471 recruiting Phase I non-small-cell lung cancer trials, Phesi found that although these trials targeted more than 20 specific biomarkers, 20% of investigators lacked a background in lung cancer recruitment.
Jonathan Peachey, Chief Operating Officer at Phesi, criticized the current approach: "Historically, sponsors have taken a scattergun approach to oncology development that results in costly and poorly performing trials – but with the right data and technology, they can be laser guided."
The Path to Precision Oncology
Phesi advocates a shift toward what it calls "precision oncology" in clinical development. This approach would use the large volume of available data to create digital twins that could accelerate and modernize clinical trials.
According to Peachey, there are four elements sponsors need to optimize to achieve greater precision:
- The targeted patient profile
- The program
- The protocol
- The operations plan
"Leveraging clinical data science will enable them to optimize these elements, as well as facilitating accurate scenario and prediction modelling before a wet trial ever gets underway, and to develop accurate digital twins," Peachey explained.
Biomarker-Driven Success Stories
The report points to breast cancer as an example of how precise patient profiling has advanced clinical development. Several biomarkers, such as HER2, PD-L1, PIK3CA, BRCA1, and BRCA2, are now established, and knowledge of hormone receptors such as ER and PR guides the use of hormone therapies. Together, these have led to improved patient outcomes.
This stands in contrast to pancreatic cancer, which remains challenging because early disease is often asymptomatic and only non-specific biomarkers such as CA 19-9 are established, resulting in poorer patient outcomes.
Regulatory Support for Real-World Evidence
The push for modernization comes as regulators are actively encouraging the use of real-world evidence and digital health tools. The UK's Medicines and Healthcare products Regulatory Agency (MHRA) launched its Real-World Evidence Scientific Dialogue Programme in 2025, aiming to help innovators refine evidence generation strategies and clarify regulatory expectations for using real-world data.
New Tools for Oncology Research
In conjunction with the report, Phesi announced the launch of the latest version of its Digital Patient Profile (DPP) Catalogue at ASCO. The updated catalogue contains 40 DPPs across multiple therapeutic areas, including new oncology profiles for breast cancer with PIK3CA and diffuse large B-cell lymphoma (DLBCL).
Each DPP provides a comprehensive view of the patient population for specific disease areas, including key demographics, comorbidities, outcome measures, and concomitant medications.
As the volume of available patient data continues to grow, the challenge for the oncology research community will be to harness it effectively. That means moving from what Phesi describes as a "scattergun approach" to a more precise, data-driven methodology that can accelerate the development of new cancer treatments.