Optimizing Personalized Robot Actions with Ranking of Trajectories

Hao Huang, Yiyun Liu, Shuaihang Yuan, Congcong Wen, Yu Hao, Yi Fang

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Intelligent robots designed for real-world human interactions need to adapt to the diverse preferences of individuals. Preference-based Reinforcement Learning (PbRL) offers promising potential to teach robots personalized behaviors by learning through interactions with humans, eliminating the need for intricate, manually crafted reward functions. However, the current PbRL approaches are hampered by sub-optimal feedback efficiency and limited exploration within state and reward spaces, resulting in subpar performance in complex interactive tasks. To enhance the effectiveness of PbRL, we integrate prior task knowledge into the PbRL framework. Subsequently, we develop a reward model based on ranking a set of multiple robot trajectories. This acquired reward is then utilized to refine the robot’s policy, ensuring alignment with human preferences. To validate our method, we showcase its versatility in different human-robot assistive tasks. The experimental results demonstrate that our approach offers a useful, effective, and broadly applicable solution for personalized human-robot interaction.
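The abstract does not say how the ranking-based reward model is trained. As a minimal sketch only, one common way to learn from a ranked set of trajectories is a Plackett-Luce listwise likelihood over predicted trajectory returns. The linear per-step reward, the array shapes, and the function names `trajectory_returns` and `plackett_luce_nll` are all illustrative assumptions, not the authors' method.

```python
import numpy as np

def trajectory_returns(reward_weights, trajectories):
    """Sum an assumed linear per-step reward over each trajectory.

    trajectories: array of shape (K, T, D), K trajectories of T steps
    with D features per step. reward_weights: array of shape (D,).
    Returns an array of shape (K,) with one predicted return each.
    """
    return (trajectories @ reward_weights).sum(axis=1)

def plackett_luce_nll(returns, ranking):
    """Negative log-likelihood of a human ranking under Plackett-Luce.

    ranking lists trajectory indices from most to least preferred.
    At each position, the chosen trajectory competes via a softmax
    against all trajectories that have not been ranked yet.
    """
    ordered = returns[np.asarray(ranking)]
    nll = 0.0
    for i in range(len(ordered)):
        tail = ordered[i:]
        m = tail.max()  # subtract the max for a stable log-sum-exp
        log_norm = m + np.log(np.exp(tail - m).sum())
        nll += log_norm - ordered[i]
    return nll

# Hypothetical data: 3 trajectories, 4 steps, 2 features per step.
rng = np.random.default_rng(0)
demo_trajectories = rng.normal(size=(3, 4, 2))
demo_returns = trajectory_returns(np.array([1.0, -0.5]), demo_trajectories)
demo_loss = plackett_luce_nll(demo_returns, ranking=[2, 0, 1])
```

Minimizing this loss with respect to the reward parameters pushes higher-ranked trajectories toward higher predicted returns. With only two trajectories, the likelihood reduces to the Bradley-Terry pairwise preference model that is widely used in PbRL.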

Original language: English (US)
Title of host publication: Pattern Recognition - 27th International Conference, ICPR 2024, Proceedings
Editors: Apostolos Antonacopoulos, Subhasis Chaudhuri, Rama Chellappa, Cheng-Lin Liu, Saumik Bhattacharya, Umapada Pal
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 1-16
Number of pages: 16
ISBN (Print): 9783031781094
DOIs
State: Published - 2025
Event: 27th International Conference on Pattern Recognition, ICPR 2024 - Kolkata, India
Duration: Dec 1 2024 to Dec 5 2024

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 15329 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 27th International Conference on Pattern Recognition, ICPR 2024
Country/Territory: India
City: Kolkata
Period: 12/1/24 to 12/5/24

Keywords

  • Assistive Gym
  • Human-robot interaction
  • Multiple trajectory ranking
  • Preference-based reinforcement learning (PbRL)

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science
