automatically selecting autopilot gains
#18
I come from the MATLAB world, so I can't suggest an open-source tool (though I know some good ones exist), and I don't currently have the time or the data to train a machine learning autopilot. However, this is an awesome project and I have some experience with machine learning. I think a good approach would be to do it in a few steps.

1) Train a deep neural net to predict boat behaviour.

-Use the last x seconds of data available to the autopilot as input.
     -A longer window risks overfitting and will require more data; a shorter window risks being too shortsighted.
-Train it to predict the next set of sensor inputs to the autopilot.

This will require a lot of data and a lot of compute time to train, but running the trained neural net is fast. Doing this first lets you confirm you have enough data for machine learning and lets you try different hidden layer architectures without messing up your boat's steering.
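To make the data side of step 1 concrete, here is a minimal sketch of slicing a logged run into training pairs: the last few samples as input, the following sample as the target. The function name `make_windows`, the `(heading_deg, rudder_deg)` sample layout, and the toy numbers are all assumptions for illustration, not anything from a real autopilot log.

```python
from typing import List, Sequence, Tuple

# One time step of sensor readings. The layout is hypothetical.
Sample = Sequence[float]


def make_windows(
    log: Sequence[Sample], window_len: int
) -> List[Tuple[List[float], List[float]]]:
    """Turn a time-ordered sensor log into (input, target) training pairs.

    input  = the last `window_len` samples, flattened into one vector
    target = the sample that immediately follows that window
    """
    if window_len < 1:
        raise ValueError("window_len must be at least 1")
    pairs = []
    for end in range(window_len, len(log)):
        window = log[end - window_len:end]
        flat = [value for sample in window for value in sample]
        pairs.append((flat, list(log[end])))
    return pairs


# Toy log, one (heading_deg, rudder_deg) sample per time step (made-up values).
log = [(10.0, 0.0), (11.0, 1.0), (12.0, 1.0), (12.5, 0.5), (12.0, -0.5)]
pairs = make_windows(log, window_len=2)
```

With a fixed sample rate, `window_len` is just x seconds times the rate, so trying a longer or shorter history only means rebuilding the pairs and retraining.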

2) Use the boat behaviour neural net to train one of the reinforcement learning agents that use a neural net as a function approximator rather than a state matrix (probably some variant of Q-learning or SARSA, with the problem framed as a Markov decision process).

-Train by choosing your x-second initial state from your real data at random, then running the neural net from there with the reinforcement learning agent controlling the rudder inputs at each time step. With enough initial state/commanded VMG combinations, the reinforcement learning agent will learn how to drive your neural net boat model. The neural net model lets you run this for as long as it takes to converge. This type of learning takes even more data than the neural net from step one, so we're using the model to kind of cheat.

-The reward function for the agent would be some weighted combination of maximizing commanded VMG, minimizing rudder inputs, and minimizing course deviations. It will probably work best with one of real or apparent wind commands. It's probably best to use that mode for all piloting, and to do GPS mode autopilot by converting the GPS command (or the worse kind of wind direction command) into the better type of wind direction command.
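A rough sketch of what one simulated episode from step 2 could look like: pick a random starting window, let the agent choose the rudder, let the boat model predict the next sample, and add up the reward. Everything named here is an assumption for illustration: the weights, the `(heading_deg, vmg_knots)` sample layout, penalizing rudder *changes* as the reading of "minimizing rudder inputs", and the `toy_model` and `toy_policy` stand-ins, which take the place of the trained neural net and the learning agent. It only scores an episode; the actual Q-learning or SARSA update is left out.

```python
import random
from typing import Callable, List, Tuple

Sample = Tuple[float, float]  # (heading_deg, vmg_knots), hypothetical layout
Window = List[Sample]


def angle_diff(a_deg: float, b_deg: float) -> float:
    """Signed difference a - b wrapped into [-180, 180) degrees."""
    return (a_deg - b_deg + 180.0) % 360.0 - 180.0


def reward(vmg: float, rudder_change_deg: float, course_error_deg: float,
           w_vmg: float = 1.0, w_rudder: float = 0.05,
           w_course: float = 0.02) -> float:
    """Weighted sum: reward VMG, penalize rudder movement and course error.

    The weights are placeholders and would need tuning.
    """
    return (w_vmg * vmg
            - w_rudder * abs(rudder_change_deg)
            - w_course * abs(course_error_deg))


def rollout(start_windows: List[Window],
            model: Callable[[Window, float], Sample],
            policy: Callable[[Window], float],
            target_heading_deg: float,
            steps: int,
            rng: random.Random) -> float:
    """Run one simulated episode and return the summed reward."""
    window = list(rng.choice(start_windows))  # random initial state from real data
    prev_rudder = 0.0
    total = 0.0
    for _ in range(steps):
        rudder = policy(window)               # agent picks the rudder angle
        heading, vmg = model(window, rudder)  # boat model predicts next sample
        total += reward(vmg, rudder - prev_rudder,
                        angle_diff(heading, target_heading_deg))
        window = window[1:] + [(heading, vmg)]  # slide the history forward
        prev_rudder = rudder
    return total


def toy_model(window: Window, rudder_deg: float) -> Sample:
    """Stand-in for the trained net: rudder nudges heading, VMG stays put."""
    heading, vmg = window[-1]
    return ((heading + 0.5 * rudder_deg) % 360.0, vmg)


def toy_policy(window: Window) -> float:
    """Stand-in for the agent: always hold the rudder centered."""
    return 0.0


total = rollout([[(90.0, 5.0), (90.0, 5.0)]], toy_model, toy_policy,
                target_heading_deg=90.0, steps=3, rng=random.Random(0))
```

Because the model is just a function call, episodes like this can be repeated as many times as the agent needs, which is the "cheat" described above.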

3) Once the learning agent has learned to steer the neural net boat model, it's time to move it onto the boat, where it will keep learning online and continuously get better.

4) If and when there is a working solution, repeating the steps with a larger data set that includes more boats and boat types might make it possible to create a general agent that can drive any boat without doing stupid things. People who aren't interested in machine learning could then just load that general pre-trained agent and have it learn to sail their specific boat out on the water, with little input from them.
Messages In This Thread
RE: automatically selecting autopilot gains - by someboatguy - 2020-08-18, 01:52 AM
