# Project Handoff Document

**Welcome!** Your colleague has handed this workshop session over to you. You are currently participating in the Bio-IT World 2026 workshop: *AI Upskilling for Computational Biologists*.

We are using an "Agentic Workflow" (Spec-Driven Development) to build a single-cell visualization pipeline. 

---

## 1. What Has Been Completed So Far

You are stepping in after the successful completion of **Quest 1** and **Quest 2**. 

### The Application State
*   **Web App:** We have built a functioning, single-page web application (`web/index.html` & `web/visualization.js`) using Plotly to visualize single-cell UMAPs and a Concordance Heatmap. It is styled using an art-inspired "Starry Night" color palette. 
*   **Data Processing:** We wrote a Python script (`src/process_real_data.py`) that takes raw kidney single-cell data (`.h5ad`), predicts cell types with the SCimilarity foundation model, harmonizes the predicted labels with the authors' original labels using a tiered ontology-mapping strategy, and exports the result as JSON for the web app.
*   **Testing:** We have automated tests (`tests/test_data.py`) that check the biological accuracy (>75% concordance) and the structural integrity of the data.
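
To make the concordance check concrete, here is a minimal sketch of how such a test could be computed. It assumes concordance means the fraction of cells whose harmonized predicted label equals the harmonized original label; the actual definition in `tests/test_data.py` may differ. The function name and the toy labels are hypothetical.

```python
from typing import Sequence

CONCORDANCE_THRESHOLD = 0.75  # from the ">75% concordance" criterion above


def concordance(predicted: Sequence[str], original: Sequence[str]) -> float:
    """Fraction of cells whose predicted label equals the original label."""
    if len(predicted) != len(original):
        raise ValueError("label lists must have the same length")
    if not predicted:
        raise ValueError("no cells to compare")
    matches = sum(p == o for p, o in zip(predicted, original))
    return matches / len(predicted)


# Toy labels, invented for illustration only (not real results).
predicted_labels = ["T cell", "T cell", "B cell", "podocyte", "podocyte"]
original_labels = ["T cell", "T cell", "B cell", "podocyte", "T cell"]

score = concordance(predicted_labels, original_labels)
assert score > CONCORDANCE_THRESHOLD, f"concordance {score:.2f} is too low"
```

Comparing labels only after harmonization matters here: two differently spelled names for the same cell type would otherwise count as a mismatch.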

### Important Project Files
*   **`01_QUEST_START_HERE.md` & `02_QUEST_FIG3.md`:** The completed Specification (SPEC) documents that define the rules and goals for the work we just finished. 
*   **`WORKFLOW_EXPLANATION.md`:** A plain-language guide explaining how the AI agent (me) and the human (you) collaborate using the Plan->Act->Validate loop.
*   **`label_mapping.csv`:** The output of our label harmonization, mapping raw cell names to standard Cell Ontology (CL) IDs.
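
As a rough illustration of what a tiered mapping from raw cell names to CL IDs can look like, the sketch below tries an exact match first, then a normalized match, then a curated synonym. These three tiers, the lookup tables, and the function names are assumptions made for illustration; the tiers actually used in `src/process_real_data.py` are not described in this document.

```python
from typing import Optional, Tuple

# Illustrative lookup tables; the real mapping lives in label_mapping.csv.
EXACT = {"T cell": "CL:0000084", "B cell": "CL:0000236"}
SYNONYMS = {"t lymphocyte": "T cell", "b lymphocyte": "B cell"}


def normalize(label: str) -> str:
    """Lowercase, trim, and unify separators so near-duplicates compare equal."""
    return " ".join(label.replace("_", " ").replace("-", " ").lower().split())


NORMALIZED = {normalize(name): cl_id for name, cl_id in EXACT.items()}


def map_label(raw: str) -> Tuple[Optional[str], str]:
    """Return (CL ID or None, name of the tier that matched)."""
    if raw in EXACT:                      # Tier 1: exact match
        return EXACT[raw], "exact"
    key = normalize(raw)
    if key in NORMALIZED:                 # Tier 2: match after normalization
        return NORMALIZED[key], "normalized"
    if key in SYNONYMS:                   # Tier 3: curated synonym
        return EXACT[SYNONYMS[key]], "synonym"
    return None, "unmapped"               # Left for manual review


for raw_name in ["T cell", "t_cell", "B-lymphocyte", "mystery cluster 7"]:
    print(raw_name, "->", map_label(raw_name))
```

Recording which tier matched makes it easy to review the less certain mappings first and to list the labels that still need manual curation.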

---

## 2. Where You Are Taking Over: Quest 3

It is time to start **Quest 3: New Data**.

The SCimilarity paper demonstrated that the model works across many tissues. Our goal now is to show that our pipeline also works on a completely new dataset.

### Your Immediate Next Steps:

1.  **Read the New Spec:** Open and read `03_QUEST_NEW_DATA.md`.
2.  **Choose a Dataset:** We need to find a novel single-cell dataset (one that was not used to train SCimilarity) from a repository like the CZ CELLxGENE Discover portal.
    *   *Action:* Tell me (the AI agent) what tissue, disease, or system you want to study (e.g., "Let's find a dataset for human pancreas cells" or "Let's look at Melanoma").
3.  **Draft the SPEC:** Together, we will fill in the `{{ }}` placeholders in `03_QUEST_NEW_DATA.md` to define how we will process this new data.
4.  **Update the Web App:** We will modify the HTML/JS so that the web page can toggle between displaying the Quest 2 Kidney dataset and your new Quest 3 dataset.
5.  **Create a "Skill":** Finally, we will encapsulate this whole data-fetching and processing pipeline into a reusable AI "Skill."
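
While drafting the SPEC, it can help to check which `{{ }}` placeholders are still unfilled. The following is a minimal sketch of such a check; the helper name, the sample text, and the placeholder names `TISSUE` and `DATASET_ID` are hypothetical and may not match what `03_QUEST_NEW_DATA.md` actually contains.

```python
import re

# Matches {{ ... }} with anything except braces inside, including the empty form.
PLACEHOLDER = re.compile(r"\{\{([^{}]*)\}\}")


def find_placeholders(text: str) -> list:
    """Return (line_number, placeholder_name) pairs for every unfilled placeholder."""
    found = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for match in PLACEHOLDER.finditer(line):
            found.append((line_number, match.group(1).strip()))
    return found


# Invented sample text; the real spec's placeholder names may differ.
sample_spec = """# Quest 3: New Data
Tissue: {{ TISSUE }}
Dataset source: CZ CELLxGENE Discover
Dataset ID: {{ DATASET_ID }}
"""

for line_number, name in find_placeholders(sample_spec):
    print(f"line {line_number}: {name} is still unfilled")
```

To run the same check on the real spec, the file contents could be loaded with `pathlib.Path("03_QUEST_NEW_DATA.md").read_text()` and passed to `find_placeholders`.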

When you are ready, just tell me what kind of biology you want to investigate for Quest 3!