Long-Context Corpus-Reasoning Suite
Task complexity (CTC) legend
Sections
Data Explorer: every task in the suite, with real examples sampled across its context-length (or item-count) ladder.
Experiments: the continued-pretraining (CPT) data-mixing runs, which test whether mixing CPT text back into supervised fine-tuning (SFT) recovers long-context (RULER) ability.