About Me

Hyungjoo Chae is a Research Scientist at LG AI Research and a Master's student at Yonsei University, where he is pursuing his M.S. under Professor Jinyoung Yeo. He works on improving how AI agents interact with computers, both through code generation and within GUI environments. His notable projects include world models for web navigation and COFFEE-GYM, an environment for evaluating and improving natural language feedback on erroneous code. He has published at major conferences such as EMNLP and ACL, and at LG AI Research he works on inference-time scaling. His research aims to create more capable digital agents that can handle complex tasks autonomously.

Interests
  • Digital Agents
  • Code LLMs
  • RL for Long-Horizon Tasks
Education
  • MS in Computer Science

    Yonsei University

  • BS in Computer Science

    Yonsei University

Recent Publications
(2024). Web Agents with World Models: Learning and Leveraging Environment Dynamics in Web Navigation. ICLR 2025.
(2024). Coffee-Gym: An Environment for Evaluating and Improving Natural Language Feedback on Erroneous Code. EMNLP 2024.
(2024). Language Models as Compilers: Simulating Pseudocode Execution Improves Algorithmic Reasoning in Language Models. EMNLP 2024.
(2024). Evidence-Focused Fact Summarization for Knowledge-Augmented Zero-Shot Question Answering. EMNLP 2024.
(2024). VERIFINER: Verification-augmented NER via Knowledge-grounded Reasoning with Large Language Models. ACL 2024.
(2023). Dialogue Chain-of-Thought Distillation for Commonsense-aware Conversational Agents. EMNLP 2023.
(2023). TUTORING: Instruction-grounded Conversational Agent for Language Learners. AAAI 2023 Demo.
(2023). CoTEVer: Chain of Thought Prompting Annotation Toolkit for Explanation Verification. EACL 2023 Demo.
(2022). Mind the Gap! Injecting Commonsense Knowledge for Abstractive Dialogue Summarization. COLING 2022.
Preprints
(2024). Evaluating Robustness of Reward Models for Mathematical Reasoning. arXiv preprint / Under review at ICLR 2025.
(2024). Do LLMs Have Distinct and Consistent Personality? TRAIT: Personality Testset designed for LLMs with Psychometrics. arXiv preprint / Under review at NAACL 2025.
(2024). Towards Lifelong Dialogue Agents via Relation-aware Memory Construction and Timeline-augmented Response Generation. arXiv preprint / Under review at NAACL 2025.
(2024). The BiGGen Bench: A Principled Benchmark for Fine-grained Evaluation of Language Models with Language Models. arXiv preprint / Under review at NAACL 2025.
Recent News