Embedding-Space CoT (GPT-2/nanoGPT)
Started 8/24/2025
Experimenting with inducing chain-of-thought-like computation inside the embedding space of small models (GPT-2, nanoGPT), without emitting explicit CoT text.
Ideas:
- Inject an auxiliary loss to encourage multi-step latent reasoning trajectories (first sketch after this list).
- Use projection heads to read out intermediate steps; supervise with synthetic CoT targets (same sketch).
- Contrastive objectives across latent steps for iterative refinement (second sketch below).
- Probe token vs. latent-step alignment in the token-free inner loop (third sketch below).
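
First sketch: one way the auxiliary loss and projection heads could fit together, assuming per-step hidden states are collected from the backbone and the synthetic CoT targets are pre-embedded. `LatentReadout`, `aux_cot_loss`, and the target layout are hypothetical names, not nanoGPT API.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LatentReadout(nn.Module):
    """Projection head that reads a latent step out into the space of
    synthetic CoT targets (hypothetical module)."""
    def __init__(self, d_model: int, d_target: int):
        super().__init__()
        self.proj = nn.Linear(d_model, d_target)

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        return self.proj(h)

def aux_cot_loss(step_hiddens, readouts, cot_targets):
    """Supervise one readout per latent step against a synthetic CoT target.
    step_hiddens: list of (B, T, d_model) hidden states, one per latent step
    readouts:     matching list of LatentReadout heads
    cot_targets:  (B, num_steps, d_target) pre-embedded synthetic CoT steps
    """
    loss = 0.0
    for step, (h, head) in enumerate(zip(step_hiddens, readouts)):
        pred = head(h[:, -1, :])  # read out from the last position at this step
        loss = loss + F.mse_loss(pred, cot_targets[:, step, :])
    return loss / len(step_hiddens)

# Usage: total = lm_loss + lambda_aux * aux_cot_loss(hiddens, heads, targets)
```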
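Second sketch: an InfoNCE instantiation of the contrastive idea, treating the same example at consecutive latent steps as the positive pair and other batch elements as negatives. That choice of positives is an assumption, not settled design.

```python
import torch
import torch.nn.functional as F

def latent_step_infonce(z_t: torch.Tensor, z_next: torch.Tensor,
                        temperature: float = 0.1) -> torch.Tensor:
    """InfoNCE between latent steps t and t+1.
    z_t, z_next: (B, d) latent states of the same batch at adjacent steps.
    Positives sit on the diagonal; other rows act as in-batch negatives."""
    z_t = F.normalize(z_t, dim=-1)
    z_next = F.normalize(z_next, dim=-1)
    logits = z_t @ z_next.t() / temperature            # (B, B) similarities
    labels = torch.arange(z_t.size(0), device=z_t.device)
    return F.cross_entropy(logits, labels)
```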
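Third sketch: one way to probe the alignment, logit-lens style, assuming tied token embeddings so the (V, d) embedding matrix (`wte` in GPT-2) doubles as a readout basis. High max cosine similarity would suggest a latent step is still token-like; low would suggest genuinely token-free computation.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def token_alignment(latent: torch.Tensor, wte: torch.Tensor):
    """How token-like is a latent inner-loop step?
    latent: (B, d) latent state at one step
    wte:    (V, d) token embedding matrix
    Returns the nearest token id and max cosine similarity per example."""
    lat = F.normalize(latent, dim=-1)
    emb = F.normalize(wte, dim=-1)
    sims = lat @ emb.t()                 # (B, V) cosine similarities
    max_sim, top_tok = sims.max(dim=-1)
    return top_tok, max_sim
```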
Planned:
- nanoGPT fork with latent controller + step budget (controller sketch after this list).
- Tasks: arithmetic, math word problems (GSM8K-style small splits), synthetic graph tasks.
- Eval: accuracy vs. step count; visualize latent paths with t-SNE/UMAP (plotting sketch below).
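
Controller sketch: one plausible shape for the planned latent controller, feeding the last hidden state back in as a soft token for a fixed step budget before decoding. It assumes the backbone accepts `inputs_embeds` and returns (B, T, d) hidden states; that holds for HF GPT-2's `last_hidden_state` but would need a small patch in nanoGPT's `forward`.

```python
import torch
import torch.nn as nn

class LatentController(nn.Module):
    """Runs `step_budget` token-free inner-loop steps on a GPT-style
    backbone by appending the last hidden state as the next input embedding."""
    def __init__(self, backbone: nn.Module, step_budget: int = 4):
        super().__init__()
        self.backbone = backbone
        self.step_budget = step_budget

    def forward(self, inputs_embeds: torch.Tensor):
        embeds = inputs_embeds                       # (B, T, d) prompt embeddings
        latents = []
        for _ in range(self.step_budget):
            h = self.backbone(inputs_embeds=embeds)  # (B, T', d) hidden states
            z = h[:, -1:, :]                         # last position = latent "thought"
            latents.append(z.squeeze(1))
            embeds = torch.cat([embeds, z], dim=1)   # append thought as a soft token
        return embeds, latents                       # hand off to the LM head after
```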
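Plotting sketch: project the per-step latents jointly with t-SNE and draw each example's trajectory; a minimal version using scikit-learn and matplotlib.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

def plot_latent_paths(latents):
    """latents: list of (B, d) numpy arrays, one per latent step."""
    steps, B = len(latents), latents[0].shape[0]
    X = np.concatenate(latents, axis=0)              # (steps * B, d)
    tsne = TSNE(n_components=2, init="pca", perplexity=min(30, steps * B - 1))
    Y = tsne.fit_transform(X).reshape(steps, B, 2)
    for b in range(B):                               # one path per example
        plt.plot(Y[:, b, 0], Y[:, b, 1], marker="o", alpha=0.6)
    plt.title("Latent reasoning paths (t-SNE)")
    plt.show()
```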
Links (todo): Repo + experiment notes.