
Findings about Unsupervised Behavioral Learning (UBL)

CireNeikual November 25, 2021

Hi,

We have been working on many different variants of AOgmaNeo that support Unsupervised Behavioral Learning (UBL).

For those who don't know, UBL is a sort of alternative to classic reinforcement learning (RL). It is a somewhat different paradigm: instead of optimizing a reward function, it learns the dynamics of the environment and provides a kind of programmable interface to it. The main motivation behind UBL is that we want an agent that is easier to use with real-world robotics, which may otherwise require a lot of hand-crafting. It can also handle instantaneously changing objectives, which regular RL cannot really do (even with goal conditioning).
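To make the contrast with RL concrete, here is a minimal sketch of the UBL idea in plain Python (not the AOgmaNeo API; all class and function names here are hypothetical): learn the environment's dynamics from observation, then pick whichever action is predicted to land closest to the current goal state - a goal that can be swapped out at any step, with no reward function involved.

```python
import numpy as np

# Hypothetical sketch (not the AOgmaNeo API): a learned dynamics model
# predicts the next state for each candidate action, and the agent picks
# the action whose prediction is closest to the current goal state.
class TabularDynamicsModel:
    def __init__(self):
        self.transitions = {}  # (state, action) -> running mean next state

    def observe(self, state, action, next_state):
        key = (state, action)
        prev = self.transitions.get(key, np.array(next_state, dtype=float))
        # Exponential moving average of observed transitions
        self.transitions[key] = 0.9 * prev + 0.1 * np.array(next_state, dtype=float)

    def predict(self, state, action):
        return self.transitions.get((state, action), None)

def select_action(model, state, goal, actions):
    # Pick the action whose predicted next state best matches the goal.
    # The goal can change at every step - no reward function is involved.
    best, best_dist = None, float("inf")
    for a in actions:
        pred = model.predict(state, a)
        if pred is None:
            continue
        d = np.linalg.norm(pred - np.array(goal, dtype=float))
        if d < best_dist:
            best, best_dist = a, d
    return best
```

Training here is completely passive (just `observe` whatever happens), which mirrors how the real UBL agents can be trained by driving them around semi-randomly.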

Currently, the best performing UBL branches are able to reproduce some of the results from the original RL version, but not yet all. There is still work to do!

As a result, it will still take a bit before UBL has a chance at making it into the master branch. If you wish to try it anyway, the latest experimental branches are "

Read more

New release 1.5.0 + a possible paradigm shift?

CireNeikual October 23, 2021

Hi all!

Sorry for the late update! We have some new developments to share.

We just released AOgmaNeo/PyAOgmaNeo 1.5.0, which contains several new features as well as internal optimizations.

In this new release, the most notable change is the inclusion of a new ImageEncoder, which is now itself (optionally) hierarchical, allowing for higher-quality image representations. You can get some pretty crisp image reconstructions with it!

The Python bindings now have a ton of runtime checks that give (hopefully) useful information to the user. This is good for both debugging and learning how to use the library. If high performance is desired, one should use the C++ version (regular AOgmaNeo), which doesn't have any runtime checks.

Internally we started using more handles to make copying hierarchies less error-prone.

We are also working on some cool new demos that use our UBL (Unsupervised Behavioral Learning) paradigm instead of regular RL (

Read more

The Return of Topology + Experiments with Fast Weights

CireNeikual September 5, 2021

Hi all,

Since the last update, we have mostly been researching new encoders again. In particular, we are interested in encoders that perform well with recurrent versions of AOgmaNeo. Recurrent branches seem to have better compression on many tasks (but not all), so it will likely be an optional thing to enable.

Two new encoders are looking promising, the more general one being a topology-preserving encoder, and the more narrow one being a "fast weights" encoder.

The topology-preserving encoder is once again based on self-organizing maps, much like our previous experiments in this area. This time, however, it minimizes reconstruction error in a single pass rather than over several iterations, which both improves encoding quality and runs a fair bit faster. This encoder is now a suitable candidate to replace the old ESR encoder still used in AOgmaNeo: it seems to perform the same or better on just about every task, and has the added benefi

Read more

UBL demo + collab!

CireNeikual August 4, 2021

Hi all,

We recently published a proof-of-concept demonstration of something called Unsupervised Behavioral Learning (UBL), a method we had been researching for a while and finally got into good enough shape to handle some navigation tasks.

So, we made a little T-maze "rat" demo, where a tiny robot (the "rat") must figure out how to navigate the T-maze given various "goal states". UBL lets us specify arbitrary goal states, which the rat then attempts to match as well as it can. It can be trained completely passively, so we trained it by driving around the maze semi-randomly and then testing various goal states.

If you want to try it out: The branch "goal4_nodestride" was used to make the video.


Also, we helped out James Bruton in one of his amazing projects on YouTube with AOgmaNeo!

Read more

"GAN Theft Auto" but with AOgmaNeo

CireNeikual July 2, 2021

Hi,

Recently I tried to use the "GAN Theft Auto" sample dataset from YouTuber Sentdex to make my own simulated GTA stretch of road. Here is his GitHub repository: GANTheftAuto and his original video: video

Of course, instead of using Nvidia's "GameGAN" framework, I use AOgmaNeo!

Using AOgmaNeo for this task has one huge advantage - it runs much, much faster. Sentdex trained on an Nvidia DGX A100 workstation, which is currently one of the most powerful workstations Nvidia makes. With AOgmaNeo, however, one only needs a regular desktop CPU.

Here is a link to my results (using AOgmaNeo).

It's not as detailed as Sentdex's results, since I only have the sample dataset and am not using upscaling. However, I am surprised it works this well given the vast difference in compute required.

The code for training uses t

Read more

Moving Away from Dual-Encoder Setup - New Findings + An Interactive Demo!

CireNeikual June 1, 2021

Hi all,

So this last month there were two major developments for AOgmaNeo.

First, there is a new interactive demo hosted on the Ogma website that you can try. I compiled AOgmaNeo with Emscripten (WebAssembly), and it runs quite nicely in the browser. The demo is based on a real robot, turned into a web demo by learning a "simulator" for it with AOgmaNeo (simply by observing its response to motor commands through a camera). It's a fun demo that showcases the world-modeling capabilities of AOgmaNeo - give it a try! I called it "real2sim", a reversal of the more common "sim2real" paradigm in machine learning. It still uses the dual-encoder setup discussed in previous blog posts, but we now find that this is not needed, which leads to the next point.

Second, new findings show that the dual-encoder setup is actually not strictly necessary. With the inclusion of a new "importance" setting for each IO layer, one can now manu

Read more

A guide, "imagination", LD48, and lots of behind-the-scenes experiments

CireNeikual May 1, 2021

Hello,

Since the last post we have been performing tons of experiments with various improvements to the dual-encoder setup. Most didn't work, but some made it into the upcoming version of AOgmaNeo.

Importantly, there is now a guide in the AOgmaNeo repository that provides a brief overview of what AOgmaNeo is and what it does. It doesn't cover code usage yet; instead, it describes the algorithm. For code usage, the examples are still the main resource at the moment.

We also trained reinforcement learning (RL) agents in the DonkeyCar simulator (Website), which drove around the track quite nicely. Here is the "imagination" of the RL agent creating its own little simulation of the environment:

Finally, we made a Ludum Dare 48 entry that uses AOgmaNeo to control enemies. There wasn't enough time to really get it working well (the creature generation often cre

Read more

Dual Encoder Setup

CireNeikual April 1, 2021

Hi,

Since the last post, AOgmaNeo has had several important updates - most notably, it now uses a dual-encoder setup. This means that each layer in the hierarchy contains two encoders, one that is updated by minimizing reconstruction error w.r.t. the input (this one is generative), and another that is updated to minimize prediction errors (this one is discriminative).

Individually, these encoders both had problems - when one improved on some task, the other would fail. So, I decided to combine these two seemingly complementary encoders into the new dual-encoder setup.
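As a rough sketch of the idea (an illustrative linear toy, not the AOgmaNeo implementation): each layer trains one encoder against reconstruction error of the input and a second against prediction error of the next input, and exposes both codes to the layer above.

```python
import numpy as np

# Minimal sketch of the dual-encoder idea (illustrative, not the AOgmaNeo
# code): each layer holds two linear encoders over the same input. One is
# trained to minimize reconstruction error of the input (generative); the
# other is trained to minimize the error of predicting the next input
# (discriminative). Their codes are concatenated for the layer above.
class DualEncoderLayer:
    def __init__(self, input_size, code_size, lr=0.01, seed=0):
        rng = np.random.default_rng(seed)
        self.enc_gen = rng.normal(0, 0.1, (code_size, input_size))
        self.dec_gen = rng.normal(0, 0.1, (input_size, code_size))
        self.enc_dis = rng.normal(0, 0.1, (code_size, input_size))
        self.pred = rng.normal(0, 0.1, (input_size, code_size))
        self.lr = lr

    def step(self, x, x_next):
        x = np.asarray(x, dtype=float)
        x_next = np.asarray(x_next, dtype=float)
        # Generative path: code -> reconstruction of x
        code_g = self.enc_gen @ x
        err_r = x - self.dec_gen @ code_g
        self.dec_gen += self.lr * np.outer(err_r, code_g)
        self.enc_gen += self.lr * np.outer(self.dec_gen.T @ err_r, x)
        # Discriminative path: code -> prediction of x_next
        code_d = self.enc_dis @ x
        err_p = x_next - self.pred @ code_d
        self.pred += self.lr * np.outer(err_p, code_d)
        self.enc_dis += self.lr * np.outer(self.pred.T @ err_p, x)
        # The concatenated code is what a higher layer would see
        return np.concatenate([code_g, code_d])
```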

It is a bit slower than before, but it performs a lot better on the tasks I have tested. The API has had some functions renamed, but general usage remains more or less the same.

The user manual is also nearing completion and should be released soon - it explains things a bit more gently than before. Hopefully this makes these posts more accessible to a general audience as well!

Read more

More Error-Driven Encoders and Decoders + Upcoming Demos

CireNeikual March 2, 2021

Hello!

Recently I have been really buckling down on getting error-driven encoder/decoder pairs to work. There are currently two variants that seem promising - one that is similar to what I already had but with some minor (but important) modifications, and the other that uses feedback alignment (paper here).

Error-driven encoders/decoders promise much better compression than the older ones, as they only compress what is needed instead of everything.
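Feedback alignment itself is easy to sketch: the backward pass routes the error through a fixed random matrix instead of the transposed forward weights. Here is a toy regression example (illustrative only, not tied to the AOgmaNeo code):

```python
import numpy as np

# Sketch of feedback alignment (the variant mentioned above) on a toy
# two-layer regression network. Instead of backpropagating the error
# through the transpose of the forward weights W2, a fixed random
# feedback matrix B carries the error down to the hidden layer.
rng = np.random.default_rng(0)

def train_feedback_alignment(X, Y, hidden=16, lr=0.05, epochs=500):
    n_in, n_out = X.shape[1], Y.shape[1]
    W1 = rng.normal(0, 0.5, (n_in, hidden))
    W2 = rng.normal(0, 0.5, (hidden, n_out))
    B = rng.normal(0, 0.5, (n_out, hidden))  # fixed random feedback weights
    for _ in range(epochs):
        H = np.tanh(X @ W1)
        out = H @ W2
        err = Y - out
        # Output layer: ordinary delta rule
        W2 += lr * H.T @ err / len(X)
        # Hidden layer: error routed through B, not W2.T
        dH = (err @ B) * (1 - H ** 2)
        W1 += lr * X.T @ dH / len(X)
    return W1, W2
```

Despite the feedback weights never being learned, the forward weights tend to align with them over training, which is what makes the scheme work.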

With respect to AOgmaNeo as a whole, some minor changes include slightly reduced memory consumption (useful for embedded devices) by changing the way the receptive fields of columns project onto other layers.

I also plan on working on some more teaching resources, something a bit simpler than the old whitepaper.

Some upcoming demos:

  • AOgmaNeo-based Visual SLAM - allows us to map
Read more

ESR-light and conservative Q-learning

CireNeikual February 1, 2021

Hello,

Time for another update!

Since the last blog post, we have performed many new experiments with different encoders, decoders, recurrent versions, and flatter hierarchies. Out of these, the best new systems are:

  • New encoder - single-byte weights for ESR (exponential sparse reconstruction) encoder using a few re-scaling tricks. Great for Arduino!
  • New reinforcement learning decoder that performs conservative Q learning.
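As a rough illustration of the first item, single-byte weights boil down to keeping a scale and offset per weight array, quantizing to uint8, and rescaling on use. A sketch in plain Python (not the actual ESR implementation, which uses its own re-scaling tricks):

```python
import numpy as np

# Illustrative sketch (not the actual ESR code) of storing weights as
# single bytes: keep a per-array offset and scale, quantize to uint8,
# and rescale on use. This kind of trick is what makes the weights
# small enough to be Arduino-friendly.
def quantize(weights):
    lo, hi = weights.min(), weights.max()
    scale = (hi - lo) / 255.0 if hi > lo else 1.0
    q = np.round((weights - lo) / scale).astype(np.uint8)
    return q, lo, scale

def dequantize(q, lo, scale):
    return q.astype(np.float32) * scale + lo
```

The round-trip error is bounded by half the quantization step, which is usually tolerable for encoder weights.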

The conservative Q-learning decoder in particular is quite nice to have. Previously, we used a variant of the ACLA algorithm (Actor-Critic Learning Automaton) for reinforcement learning. It worked well, but it had some downsides. For instance, the "passive learning" ability of that decoder was basically a hack: it couldn't properly learn from rewards provided passively, only from the actions taken. It also did not function well with epsilon-greedy exploration.
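For readers unfamiliar with conservative Q-learning, here is a toy tabular sketch of the kind of update involved (illustrative only - the actual AOgmaNeo decoder operates on columnar representations, and the exact update rule here is an assumption):

```python
import numpy as np

# Toy tabular sketch of a conservative Q-learning update (illustrative,
# not the AOgmaNeo decoder). On top of the usual TD update, Q-values are
# pushed down in proportion to their softmax weight and pushed back up
# on the action actually taken - this keeps passively observed
# (off-policy) data from inflating the value of actions never tried.
def cql_update(Q, s, a, r, s_next, lr=0.1, gamma=0.9, alpha=0.5):
    # Standard Q-learning TD update
    td_target = r + gamma * np.max(Q[s_next])
    Q[s, a] += lr * (td_target - Q[s, a])
    # Conservative penalty: lower all actions by their softmax weight,
    # raise the observed action (gradient of logsumexp(Q[s]) - Q[s, a])
    soft = np.exp(Q[s] - np.max(Q[s]))
    soft /= soft.sum()
    Q[s] -= lr * alpha * soft
    Q[s, a] += lr * alpha
```

The conservative term is exactly what makes purely passive training viable: actions absent from the data end up with pessimistic values instead of spuriously high ones.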

We have tried Q-learning multiple times before, but this time we found the right method of updating the

Read more

"Do not use" inputs, multi-step predictions and error-driven encoders

CireNeikual December 31, 2020

Hello,

Time for another update!

We have added a new feature to the master branch of AOgmaNeo - the ability to supply "do not use" inputs. These are supplied simply by setting a column index to -1, which signals to the hierarchy that you do not want it to learn from or activate on that input. The hierarchy will still provide a prediction for the column, however.

This new feature can be used for cases where data is missing or known to be useless, and also allows AOgmaNeo to predict what those "missing" values should be.
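The -1 convention can be illustrated with a toy column predictor (the frequency-counting predictor itself is purely hypothetical; only the masking behavior mirrors what is described above):

```python
import numpy as np

# Toy illustration of the "do not use" convention: a column index of -1
# means "don't learn from or activate on this input, but still predict
# it". The frequency-counting predictor here is purely illustrative.
class ColumnPredictor:
    def __init__(self, num_columns, column_size):
        # counts[c][v] ~ how often column c took value v (init 1 as smoothing)
        self.counts = np.ones((num_columns, column_size))

    def step(self, columns):
        # Learn only from columns that are not marked -1
        for c, v in enumerate(columns):
            if v != -1:
                self.counts[c, v] += 1.0
        # Predict every column, including the "do not use" ones
        return [int(np.argmax(self.counts[c])) for c in range(len(columns))]
```

Passing -1 thus leaves the model untouched while still yielding a prediction for the missing value.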

We have also added new serialization features, which allow in-memory serialization of both the network state and all weights. This makes multi-step prediction possible: "checkpoint" the current state, predict a few more steps ahead, then revert to the saved state. The weights keep updating on new information as well, since they are not part of the "state" and are therefore not reset.
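The checkpoint-and-revert pattern looks roughly like this (method names here are hypothetical placeholders, not the PyAOgmaNeo API):

```python
# Sketch of the checkpoint-and-revert pattern enabled by in-memory
# serialization (method names are hypothetical, not the PyAOgmaNeo API).
# The recurrent state is saved, the model rolls forward on its own
# predictions for n steps, then the state is restored - weights are not
# part of the "state", so any learning that happened is kept.
def multi_step_prediction(model, n_steps):
    checkpoint = model.save_state()      # serialize recurrent state only
    predictions = []
    pred = model.predict()
    for _ in range(n_steps):
        predictions.append(pred)
        model.step(pred, learn=False)    # feed prediction back as input
        pred = model.predict()
    model.load_state(checkpoint)         # revert state; weights untouched
    return predictions
```

After the rollout, the model continues from exactly where it left off, as if the imagined steps never happened.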

Aside from these features, we have of course also been researchi

Read more

Demo Work and Topology

CireNeikual December 2, 2020

Here is our first update!

We have been working on some demos for AOgmaNeo. The two currently of most interest to us are:

  • A new version of the "learning to walk faster" demo with a different (custom) quadruped robot with brushless motors (and of course the latest version of AOgmaNeo)
  • A robotic rat that must solve classic rat maze tasks (made of cardboard). Here we are using a new version of our "smallest self-driving car" as the "rat"

These are progressing well. We are also of course experimenting with new things w.r.t. the AOgmaNeo software itself. For instance, we have discovered a new encoder that exploits the topology of the input by using 1D distributed self-organizing maps. We have researched topology-preserving encoders before, but this is the first time we have gotten one to work efficiently.
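A minimal 1D self-organizing-map column can be sketched as follows (illustrative plain Python, not the AOgmaNeo implementation; the learning rate, neighborhood falloff, and single-pass update here are assumptions):

```python
import numpy as np

# Illustrative sketch of a 1D topology-preserving (SOM-style) encoder
# column, not AOgmaNeo's actual code. Each input is encoded as the index
# of its best-matching unit; the winner and its 1D neighbors are pulled
# toward the input in a single pass, so nearby indices come to represent
# similar inputs (that is the "topology" being preserved).
class Som1DEncoder:
    def __init__(self, num_units, input_size, lr=0.1, radius=1, seed=0):
        rng = np.random.default_rng(seed)
        self.weights = rng.random((num_units, input_size))
        self.lr = lr
        self.radius = radius

    def encode(self, x, learn=True):
        x = np.asarray(x, dtype=float)
        # Winner = unit with the smallest reconstruction error
        errors = np.linalg.norm(self.weights - x, axis=1)
        winner = int(np.argmin(errors))
        if learn:
            # Single-pass update: winner and its 1D neighbors move toward
            # x, with influence falling off with index distance
            for i in range(max(0, winner - self.radius),
                           min(len(self.weights), winner + self.radius + 1)):
                falloff = 1.0 / (1 + abs(i - winner))
                self.weights[i] += self.lr * falloff * (x - self.weights[i])
        return winner

    def reconstruct(self, index):
        return self.weights[index]
```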

The idea is to have each column be a self-organizing map (1D). Each column also has a "priority" assigned to it (can be assig

Read more