Abstract
Successful teaching requires an assumption about how the learner learns: how the learner uses experiences from the world to update their internal states. We investigate what expectations people have about a learner with a behavioral experiment in which human teachers were asked to teach a sequential decision-making task to an artificial dog in an online manner using rewards and punishments. The artificial dogs were implemented as either an Action Signaling agent or a Q-learner with one of several discount factors. Our findings are threefold. First, we used machine teaching to prove that the optimal teaching complexity is the same across all the learners, and thus the differences in human performance were solely due to the discrepancy between the human teachers’ theory of mind and the actual student model. Second, we found that Q-learners with small discount factors were easier to teach than Action Signaling agents, challenging the established conclusion from prior work. Third, we showed that teaching efficiency increased monotonically as the discount factor decreased, suggesting that humans’ theory of mind is biased towards myopic learners.
Original language | English (US) |
---|---|
Pages | 1159-1165 |
Number of pages | 7 |
State | Published - 2021 |
Event | 43rd Annual Meeting of the Cognitive Science Society: Comparative Cognition: Animal Minds, CogSci 2021 - Virtual, Online, Austria; Duration: Jul 26 2021 → Jul 29 2021 |
Conference
Conference | 43rd Annual Meeting of the Cognitive Science Society: Comparative Cognition: Animal Minds, CogSci 2021 |
---|---|
Country/Territory | Austria |
City | Virtual, Online |
Period | 7/26/21 → 7/29/21 |
Keywords
- machine teaching
- reinforcement learning
- theory of mind
ASJC Scopus subject areas
- Cognitive Neuroscience
- Artificial Intelligence
- Computer Science Applications
- Human-Computer Interaction