# 04_QUEST_OPEN_TARGETS

Congratulations on reaching the final, open-ended quest! In the previous steps, we reproduced the core analysis from the SCimilarity paper and extended it to new datasets. Now, we're transitioning from pure data analysis to **Competitive Intelligence and Translational Strategy**.

In this quest, you will investigate the clinical viability of the biological markers identified in your earlier analysis. You will use LLM tooling such as Model Context Protocol (MCP) servers, or external APIs (like Open Targets or ClinicalTrials.gov), to generate a comprehensive clinical landscape report.

In this document, replace every `{{}}` placeholder with your own details so that Gemini can work with you. Because this is the most open-ended quest, the LLM will rely heavily on the specifics of your SPEC to assist you.

---

## 1. Background and Goals

**Objective:** Extract actionable competitive intelligence for key targets identified via SCimilarity (e.g., Fibrosis-Associated Macrophage markers) by querying external databases and structuring the results into a visual report.

**Example Scenarios:**
- **Target Intel:** Identify competitive crowding and current clinical phases for `SPP1`, `MARCO`, and `CD163`.
- **Diligence/Whitespace:** Assess if the identified novel targets are already under development, and if trials are actually stratifying patients using these biomarkers.

**Deliverables:**
1. {{Describe the specific Python script, Agent logic, or skill you intend to build}}
2. {{Describe the final output format. E.g., An HTML dashboard with specific charts, or a Markdown landscape report, that is incorporated as a sub-page of the original HTML report}}

## 2. Tech Stack and Project Structure

You can continue using the existing project structure and virtual environment. For this quest, you will need to integrate external APIs or agentic frameworks.

**Suggested Tooling Approaches (Choose one or propose your own):**
- **Direct API querying:** Using `requests` and Python to query the Open Targets GraphQL API or ClinicalTrials.gov REST API.
- **K-Dense AI:** An open-source Agent Skills framework for querying public databases (ClinicalTrials.gov, Open Targets, ChEMBL, FDA).
- **Gosset MCP:** A commercial MCP providing curated biotech competitive intelligence (if you have access/API keys).
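
If you choose the direct-API route, a ClinicalTrials.gov lookup could look like the sketch below. It assumes the v2 REST endpoint (`/api/v2/studies`) with the `query.term` and `pageSize` parameters, and a response whose studies nest fields under `protocolSection`; verify both against the current API documentation. The helper names `search_trials` and `summarize_studies` are hypothetical.

```python
CTGOV_STUDIES_URL = "https://clinicaltrials.gov/api/v2/studies"  # assumed v2 endpoint


def summarize_studies(payload):
    """Flatten a studies response into one small dict per trial."""
    rows = []
    for study in payload.get("studies", []):
        protocol = study.get("protocolSection", {})
        rows.append({
            "nct_id": protocol.get("identificationModule", {}).get("nctId"),
            "phases": protocol.get("designModule", {}).get("phases", []),
            "conditions": protocol.get("conditionsModule", {}).get("conditions", []),
        })
    return rows


def search_trials(term, page_size=20):
    """Query ClinicalTrials.gov for a free-text term (makes a network call)."""
    # Imported lazily so summarize_studies also works on saved JSON
    # in environments where requests is not installed.
    import requests

    response = requests.get(
        CTGOV_STUDIES_URL,
        params={"query.term": term, "pageSize": page_size},
        timeout=30,
    )
    response.raise_for_status()
    return summarize_studies(response.json())
```

Calling `search_trials("CD163")` would return a list of trial summaries. Keeping the parsing step separate from the request makes it easy to test against a saved response.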

{{Specify which tools, APIs, and libraries you will use to build your pipeline.}}

## 3. Inputs & Target Selection

You need to select specific targets from the SCimilarity paper (or your novel dataset analysis) to investigate. For example, the paper highlights Fibrosis-Associated Macrophage markers.

**Targets Chosen:**
- **Target 1:** `CD163` (A marker often associated with alternatively activated macrophages and fibrosis in single-cell datasets).

**Research Questions:**
- What are the top diseases/indications currently in clinical trials involving CD163?
- What is the highest phase of clinical development for drugs targeting or involving CD163?

## 4. Outputs

Generate a clean **Markdown Report** (`web/cd163_intelligence.md`) containing:
- **Landscape Overview:** Number of trials and the highest clinical phase.
- **Top Indications:** A list of the diseases most frequently targeted.
- **Web UI Update:** Add a link to this Markdown report (or a rendered HTML version of it) to the `web/index.html` Hub page.

*Example considerations for an HTML report:*
- **Landscape Overview:** Known drugs and highest phase.
- **Trial Context:** Number of trials by disease in descending order with stacked barplots (Phase 0/1/2/3/Approved).
- **Precision Medicine:** Analysis of inclusion criteria (e.g., checking for "Biopsy confirmed" or "SPP1 high").
- **Outcomes:** Efficacy and safety stats if available.
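
The inclusion-criteria check mentioned above can be approximated with a keyword scan over each trial's free-text eligibility criteria. This is a minimal sketch: the pattern names and regular expressions are illustrative assumptions, and a real analysis would need a broader, curated vocabulary.

```python
import re

# Hypothetical patterns; extend these for the biomarkers you care about.
BIOMARKER_PATTERNS = {
    "biopsy_confirmed": re.compile(r"biopsy[\s-]*(confirmed|proven)", re.IGNORECASE),
    "spp1_high": re.compile(
        r"\bSPP1\b[^.\n]*\b(high|elevated|positive)\b", re.IGNORECASE
    ),
}


def flag_criteria(criteria_text):
    """Return which biomarker patterns appear in one trial's criteria text."""
    text = criteria_text or ""
    return {name: bool(p.search(text)) for name, p in BIOMARKER_PATTERNS.items()}


def stratification_rate(criteria_texts):
    """Fraction of trials whose criteria match at least one pattern."""
    texts = list(criteria_texts)
    if not texts:
        return 0.0
    hits = sum(any(flag_criteria(text).values()) for text in texts)
    return hits / len(texts)
```

A keyword scan only flags candidates for manual review. It will miss paraphrases and can match text in exclusion criteria, so treat the rate as a rough signal.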

## 5. Implementation Strategy

### Connecting to APIs / MCPs
Use the **Open Targets Platform GraphQL API** (`https://api.platform.opentargets.org/api/v4/graphql`). Write a Python script (`scripts/fetch_open_targets.py`) that uses the `requests` library to execute a GraphQL query for the target `CD163` (Ensembl ID: `ENSG00000177575`).
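
A minimal sketch of what `scripts/fetch_open_targets.py` might contain follows. The endpoint constant, the query shape (`target` → `knownDrugs` → `rows`), and the selected fields reflect my understanding of the Open Targets Platform schema and are assumptions; confirm them in the API's GraphQL browser before relying on them. The function names are hypothetical.

```python
# Assumed Open Targets Platform GraphQL endpoint; verify against the current docs.
OT_GRAPHQL_URL = "https://api.platform.opentargets.org/api/v4/graphql"

# Field names below are assumptions about the schema, not confirmed by this spec.
KNOWN_DRUGS_QUERY = """
query KnownDrugs($ensemblId: String!, $size: Int!) {
  target(ensemblId: $ensemblId) {
    approvedSymbol
    knownDrugs(size: $size) {
      count
      rows {
        phase
        status
        drug { name }
        disease { name }
      }
    }
  }
}
"""


def build_payload(ensembl_id, size=200):
    """Build the JSON body for a GraphQL POST request."""
    return {
        "query": KNOWN_DRUGS_QUERY,
        "variables": {"ensemblId": ensembl_id, "size": size},
    }


def fetch_known_drugs(ensembl_id, size=200):
    """Return the knownDrugs rows for a target (makes a network call)."""
    # Imported lazily so build_payload can be used without the dependency.
    import requests

    response = requests.post(
        OT_GRAPHQL_URL, json=build_payload(ensembl_id, size), timeout=30
    )
    response.raise_for_status()
    body = response.json()
    if body.get("errors"):
        raise RuntimeError(f"GraphQL errors: {body['errors']}")
    target = (body.get("data") or {}).get("target") or {}
    known_drugs = target.get("knownDrugs") or {}
    return known_drugs.get("rows", [])
```

Calling `fetch_known_drugs("ENSG00000177575")` would return the rows for `CD163`. GraphQL servers can report errors in the response body even with an HTTP 200 status, which is why the sketch checks the `errors` key explicitly.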

### Data Processing
Extract the `knownDrugs` object from the JSON response. Across its drug records, take the maximum clinical phase to find the highest stage of development, and count how often each `disease.name` occurs to rank the top indications.
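
The aggregation could be sketched as follows. It assumes each record is a dict with a numeric `phase` and a nested `disease` dict holding `name`; that shape is an assumption about the response, and `summarize_known_drugs` is a hypothetical helper name.

```python
from collections import Counter


def summarize_known_drugs(rows, top_n=5):
    """Compute the highest phase and the most frequent indications."""
    # Skip records with a missing phase rather than treating them as phase 0.
    phases = [row["phase"] for row in rows if row.get("phase") is not None]
    diseases = Counter(
        row["disease"]["name"]
        for row in rows
        if row.get("disease") and row["disease"].get("name")
    )
    return {
        "n_records": len(rows),
        "highest_phase": max(phases, default=None),
        "top_indications": diseases.most_common(top_n),
    }
```

Note that a record here is a drug and indication pairing, not necessarily a single trial, so `n_records` is not a trial count. If you need trial counts, deduplicate on trial identifiers from a source such as ClinicalTrials.gov.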

### Visualizations & Report Generation
The Python script will write the aggregated data and a brief LLM-style summary into the Markdown report (`web/cd163_intelligence.md`), optionally rendering it to a static HTML page (`web/cd163_intelligence.html`) as well. Update `web/index.html` to include a new card linking to this report.
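
One way to sketch the report-writing step is shown below, using Markdown output as named in the Outputs section. The summary dict keys (`n_records`, `highest_phase`, `top_indications`) and the function names are hypothetical, and the layout is only a starting point.

```python
from pathlib import Path


def render_report(symbol, summary):
    """Render the aggregated summary as a Markdown string."""
    lines = [
        f"# {symbol} Clinical Landscape",
        "",
        "## Landscape Overview",
        "",
        f"- Records retrieved: {summary['n_records']}",
        f"- Highest clinical phase: {summary['highest_phase']}",
        "",
        "## Top Indications",
        "",
    ]
    lines += [
        f"{rank}. {name} ({count})"
        for rank, (name, count) in enumerate(summary["top_indications"], start=1)
    ]
    return "\n".join(lines) + "\n"


def write_report(symbol, summary, out_dir="web"):
    """Write the report to <out_dir>/<symbol>_intelligence.md and return its path."""
    path = Path(out_dir) / f"{symbol.lower()}_intelligence.md"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(render_report(symbol, summary), encoding="utf-8")
    return path
```

Separating rendering from writing lets you swap in an HTML template later without touching the file handling. The `web/index.html` card is left out here because its markup depends on your existing Hub page.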

## 6. Open Exploration

Because you are at the end of the workshop, there are few restrictions! If you finish your target analysis early, use this space to outline other advanced AI workflows, custom agents, or biological questions you'd like to explore with Gemini.

{{Your open-ended ideas here}}
copyright: © 2026 Sonia Timberlake & Ryan Bellmore
license: Proprietary - Authorized Workshop Participants Only
distribution_allowed: false
