Bruce N.H.
mechanistic interpretability / quantitative systems
Links
GitHub: github.com/brucenh
LinkedIn: linkedin.com/in/brucenh
Paper: Detecting and Steering Deception Representations in LLM Reasoning Traces
Resume: view resume