University of Lincoln

Learning of non-parametric control policies with high-dimensional state features

journal contribution
posted on 2023-10-29, 11:25 authored by Herke Van Hoof, Jan Peters, Gerhard Neumann
Learning complex control policies from high-dimensional sensory input is a challenge for reinforcement learning algorithms. Kernel methods that approximate value functions or transition models can address this problem. Yet, many current approaches rely on unstable greedy maximization. In this paper, we develop a policy search algorithm that integrates robust policy updates and kernel embeddings. Our method can learn non-parametric control policies for infinite-horizon continuous MDPs with high-dimensional sensory representations. We show that our method outperforms related approaches, and that our algorithm can learn an underpowered swing-up task directly from high-dimensional image data.

History

School affiliated with

  • School of Computer Science (Research Outputs)

Publication Title

Journal of Machine Learning Research: Workshop and Conference Proceedings

Volume

38

Pages/Article Number

995-1003

Publisher

MIT Press

ISSN

1532-4435

eISSN

1533-7928

Date Submitted

2017-02-24

Date Accepted

2015-05-12

Date of First Publication

2015-05-12

Date of Final Publication

2015-05-12

Event Name

18th International Conference on Artificial Intelligence and Statistics (AISTATS)

Date Document First Uploaded

2017-01-12

ePrints ID

25757
