# 03_QUEST_NEW_DATA

Great, now we have a working recreation of figure3 using real data. Next, let's
find some new datasets to display.

## 1. Objective and goal

Using the same data repositories and exclusion criteria as the original paper,
find a new dataset that was not included in the training or testing
of the SCimilarity model.

Write a Python script that queries the CZ CELLxGENE Discover API to find a dataset that meets the following criteria:
- **Species:** Human
- **Age:** Adult
- **Disease Status:** Normal (not cancer)
- **Tissue:** Bone Marrow
- **Assay/Platform:** 10x 3' v3 sequencing

The script should print the **Dataset ID**, **Tissue**, and **Download URL** for each dataset that matches these criteria.
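
As a starting point, here is a minimal sketch of how such a script might be structured. It rests on several assumptions that should be checked against the current CELLxGENE Discover API documentation: the endpoint URL in `DATASETS_URL`, the record field names (`organism`, `disease`, `tissue`, `assay`, `development_stage`, `assets`), and the exact ontology labels (`"normal"`, `"bone marrow"`, `"10x 3' v3"`). The `is_adult` helper is a hypothetical heuristic, not an official definition of adulthood, and it will miss stage labels that do not mention "adult" or an age in years. The sketch does not check whether a dataset was used to train or test SCimilarity; that comparison still has to be done against the paper's dataset list.

```python
import json
import re
import urllib.request

# Assumed endpoint of the CELLxGENE Discover curation API; verify before use.
DATASETS_URL = "https://api.cellxgene.cziscience.com/curation/v1/datasets"


def labels(record, field):
    """Return the lowercase ontology labels stored under `field` (assumed schema)."""
    return [str(term.get("label", "")).lower() for term in record.get(field) or []]


def is_adult(stage_label):
    """Hypothetical heuristic: 'adult' in the label, or 'N-year-old' with N >= 18."""
    if "adult" in stage_label.lower():
        return True
    match = re.search(r"(\d+)-year-old", stage_label)
    return bool(match) and int(match.group(1)) >= 18


def matches_criteria(record):
    """True if the record is human, adult-only, normal-only, bone marrow, 10x 3' v3."""
    stages = labels(record, "development_stage")
    return (
        "homo sapiens" in labels(record, "organism")
        and labels(record, "disease") == ["normal"]
        and "bone marrow" in labels(record, "tissue")
        and "10x 3' v3" in labels(record, "assay")
        and bool(stages)
        and all(is_adult(stage) for stage in stages)
    )


def h5ad_url(record):
    """Return the URL of the first H5AD asset, or None if there is none."""
    for asset in record.get("assets") or []:
        if str(asset.get("filetype", "")).upper() == "H5AD":
            return asset.get("url")
    return None


def find_matches(records):
    """Filter dataset records and keep only the fields the exercise asks for."""
    return [
        {
            "dataset_id": record.get("dataset_id"),
            "tissue": ", ".join(t.get("label", "") for t in record.get("tissue") or []),
            "download_url": h5ad_url(record),
        }
        for record in records
        if matches_criteria(record)
    ]


def fetch_datasets(url=DATASETS_URL):
    """Download the dataset catalogue as a list of dicts (requires network access)."""
    with urllib.request.urlopen(url, timeout=120) as response:
        return json.load(response)


def main():
    """Fetch the live catalogue and print one line per matching dataset."""
    for hit in find_matches(fetch_datasets()):
        print(hit["dataset_id"], hit["tissue"], hit["download_url"], sep="\t")
```

Calling `main()` runs the live query. Keeping the filter in `find_matches` separate from `fetch_datasets` means the criteria can be tested on a few hand-written records without network access. Note that this sketch requires *every* disease label to be "normal" and every development stage to look adult, but only requires bone marrow and 10x 3' v3 to be *among* a dataset's tissues and assays; tighten or loosen those rules to match the paper's exclusion criteria.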

{{Once that is working, turn it into a "skill"}}

## N. Web application

{{Update the web application to be able to display a data set of your choosing instead
of overwriting the files in web/data/}}
copyright: © 2026 Sonia Timberlake & Ryan Bellmore
license: Proprietary - Authorized Workshop Participants Only
distribution_allowed: false
