Rongwu Xu (许融武)


0xrwxu@gmail.com or
xrw22@mails.tsinghua.edu.cn

Github | X (Twitter) | LinkedIn
Google Scholar

Research

I am an artificial intelligence (AI) researcher with interdisciplinary interests. I try to understand how the design of AI systems and their interactions with humans can lead to unexpected behaviors and increased societal risks. I approach this work through behavioral experiments, machine learning (ML), interpretability tools, and psychology. I publish my findings in the Natural Language Processing (NLP), AI, and ML communities.

Currently, I am interested in the following research topics:

  1. AI Safety and Alignment: Identifying potential safety and ethics risks associated with AI R&D and developing strategies to align AI systems with human values, behaviors, and expectations.
  2. Machine Behavior: Investigating the similarities and differences between the behaviors of AI models and humans, and using psychology-inspired experiments to test and understand machines.
  3. AI and Psychology: Studying both the psychological impacts of AI systems on humans (Psychology of AI, a subset of Psychology of Technology) and the application of AI in psychological research (AI for Psychology, a subset of AI for Science).

My other general interests include model evaluation and real-world applications of such models.

News

AI/NLP research can be challenging for newcomers. If you're interested in my work or have ideas to explore, I'd be happy to guide you. We can work on submitting papers to top venues. Feel free to drop me an email if interested.

  • Mar 2025 Got accepted to UIUC CS, UW CSE and JHU CS. Grateful for the opportunities!
  • Jan 2025 I am looking for PhD opportunities starting 2025. Don't hesitate to reach out if you think I can be a good candidate.
  • Oct 2024 Six papers accepted to EMNLP 2024! Thanks to my collaborators!
  • Sep 2024 I received the National Scholarship from the Ministry of Education of China!
  • Aug 2024 My paper "The Earth is Flat because...: Investigating LLMs' Belief towards Misinformation via Persuasive Conversation" received an Outstanding Paper Award at ACL 2024!
  • Jul 2024 Check out our talk (Chinese) on knowledge conflicts for (RAG) LLMs! [Paper][Resource][机器之心][Slides]
  • May 2024 Two papers accepted to ACL 2024! Thanks to my collaborators!
  • May 2024 Check out LLMs' safety vulnerabilities discovered by tricking them into believing misinformation! [Paper][Resource][机器之心][Video]
  • Apr 2024 I passed the PhD qualification exam (preliminary+oral) at IIIS, Tsinghua!
  • Dec 2023 I received the Overall Excellence Scholarship at Tsinghua!
  • Apr 2023 One paper accepted to EuroS&P 2023! Thanks to my collaborators!
  • Dec 2022 Debut of my academic homepage.
  • Aug 2022 Enrolled as a graduate student at IIIS, Tsinghua University.

Selected Publications

* Equal Contribution, ^ Advising Role

Awards

Talks

Professional Service