User Guide
Thema provides a topological data analysis pipeline that transforms raw tabular data into representative graph models through preprocessing, dimensionality reduction, and Mapper graph construction.
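The three stages named above can be illustrated with a small, library-free sketch. It does not use Thema's API: every function name here (`standardize`, `cover`, `clusters`, `mapper_graph`) is hypothetical. A single coordinate stands in for a real embedding such as PCA or t-SNE, and the clustering is a plain distance-threshold rule. Treat it as a rough picture of what a Mapper-style pipeline does, not as how Thema implements it.

```python
import math
from itertools import combinations


def standardize(rows):
    """Preprocessing stand-in: scale each column to zero mean, unit variance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # A constant column has zero spread; fall back to 1.0 to avoid dividing by 0.
    stds = [(sum((v - m) ** 2 for v in c) / len(c)) ** 0.5 or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def cover(values, n_intervals, overlap):
    """Split the lens range into equal intervals, each widened to overlap."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_intervals
    pad = width * overlap / 2
    return [(lo + i * width - pad, lo + (i + 1) * width + pad)
            for i in range(n_intervals)]


def clusters(indices, points, eps):
    """Group points whose chains of neighbours lie within eps of each other."""
    unvisited = set(indices)
    found = []
    for seed in indices:
        if seed not in unvisited:
            continue
        unvisited.discard(seed)
        component, frontier = {seed}, [seed]
        while frontier:
            p = frontier.pop()
            near = {q for q in unvisited
                    if math.dist(points[p], points[q]) <= eps}
            unvisited -= near
            component |= near
            frontier.extend(near)
        found.append(frozenset(component))
    return found


def mapper_graph(points, lens, n_intervals=3, overlap=0.5, eps=1.0):
    """Nodes are clusters inside each cover interval; edges join clusters
    that share at least one point."""
    nodes = []
    for lo, hi in cover(lens, n_intervals, overlap):
        members = [i for i, v in enumerate(lens) if lo <= v <= hi]
        nodes.extend(clusters(members, points, eps))
    edges = {(a, b) for a, b in combinations(range(len(nodes)), 2)
             if nodes[a] & nodes[b]}
    return nodes, edges


# Toy table: two well-separated groups of three rows each.
rows = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
        [5.0, 5.0], [5.1, 5.2], [5.2, 5.1]]
scaled = standardize(rows)
lens = [row[0] for row in scaled]  # 1-D stand-in for PCA or t-SNE
nodes, edges = mapper_graph(scaled, lens, n_intervals=3, overlap=0.5, eps=0.5)
print(len(nodes), "nodes,", len(edges), "edges")
```

On this toy table the two groups become two nodes with no edge between them. Making the intervals wider or the groups closer produces shared points, and therefore edges. The number of intervals, the overlap, and the clustering threshold all change the resulting graph, which is why parameter tuning and selection have their own guide.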
How to Use This Guide
This guide is organized for both learning and reference:
New users: Start with Installation → Quickstart → Getting Started
YAML-driven workflows: See Quickstart and Getting Started
Programmatic control: See Programmatic Pipeline and component guides (Preprocessing, Embeddings, Graphs & Selection)
Parameter tuning: See Tuning and Selection
Advanced customization: See Customizing Thema
Guides by Task
Getting Started
Installation - Install Thema via pip or set up a development environment
Quickstart - Run the full pipeline with a minimal YAML configuration
Getting Started - Complete walkthrough, using uv, from setup to results
Pipeline Components
Data Preprocessing - Clean, encode, scale, and impute data with Planet
Embeddings - Generate low-dimensional projections with Oort (t-SNE, PCA)
Graphs & Selection - Build Mapper graphs and select representatives with Galaxy
Workflows
Manual Configuration Guide - Build pipelines programmatically without YAML, with complete parameter reference
Tuning and Selection - Fine-tune parameter grids, apply filters, and optimize selection strategies
Customizing Thema - Write custom filters and graph builders, scale to large datasets
Reference
Best Practices - Recommended workflows, parameter choices, and troubleshooting
Testing - Test suite information
Overview - High-level architecture and terminology