Robot Behaviors

Exploring the T-Maze: Evolving Learning-Like Robot Behaviors using CTRNNs


3.2 Transfer to the Real Robot
One way of verifying evolutionary robotics results obtained in simulation is to test the evolved neural controllers on a real robot. For this purpose the T-Maze shown in figure 7 was built. The best individual from each of the 10 replications of the experiment 1 task was tested. Initially, however, the results of the tests were rather poor: none of the 10 controllers was able to navigate the robot reliably. This result indicates that the functioning of the evolved CTRNNs was specific to the sensory-motor conditions encountered in the simulator. The observation confirms our earlier results that CTRNNs, despite their ability to display learning-like behavior, lack the sensory-motor adaptability found in, for example, plastic Hebbian synapse networks [2]. However, several techniques for reducing this "reality gap" problem by adding noise at different levels of the simulation have been proposed [5][6]. With this in mind the simulator was changed in the following way: sensor noise levels were increased from 5% to 10%, and 10% uniform noise was added to the distance traveled by each wheel at each timestep. Furthermore, the initial conditions of each tested individual were changed: the starting position was randomized within a 4 by 4 cm square, and the orientation was randomized within +/- 15 degrees of the forward direction. With these modifications an incremental evolution lasting 20 generations was launched, seeded with a population from one of the original runs. This time the transfer to the real robot was perfect. The best individual of the last generation was able to score a fitness value of 20, i.e. it found the reward-zone in every trial. No significant behavioral differences compared to the simulation were observed.
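The simulator changes described above can be sketched as follows. This is a minimal illustration rather than the authors' simulator code: the function names are hypothetical, and the text does not say how the noise was applied, so the sketch assumes multiplicative noise drawn uniformly within the stated percentage of the current value.

```python
import math
import random

SENSOR_NOISE = 0.10       # from the text: raised from 5% to 10%
WHEEL_NOISE = 0.10        # from the text: 10% uniform noise on wheel travel
START_SQUARE_CM = 4.0     # from the text: 4 by 4 cm start square
HEADING_RANGE_DEG = 15.0  # from the text: forward +/- 15 degrees


def perturb(value, level, rng):
    """Scale value by a factor drawn uniformly from [1 - level, 1 + level].

    Assumption: the noise is multiplicative; the source gives only the level.
    """
    return value * (1.0 + rng.uniform(-level, level))


def noisy_wheel_step(left_cm, right_cm, rng):
    """Apply independent noise to the distance traveled by each wheel."""
    return (perturb(left_cm, WHEEL_NOISE, rng),
            perturb(right_cm, WHEEL_NOISE, rng))


def random_start(rng):
    """Draw a start pose: offset inside the square, heading around forward."""
    half = START_SQUARE_CM / 2.0
    x = rng.uniform(-half, half)
    y = rng.uniform(-half, half)
    heading = math.radians(rng.uniform(-HEADING_RANGE_DEG, HEADING_RANGE_DEG))
    return x, y, heading


if __name__ == "__main__":
    rng = random.Random(0)
    print(random_start(rng))
    print(noisy_wheel_step(1.0, 1.0, rng))
    print(perturb(0.5, SENSOR_NOISE, rng))
```

Drawing a fresh start pose for every tested individual keeps a controller from relying on one exact initial position or heading.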

4 Experiment 2: Simple T-Maze with Reward Switching

In order to further investigate the learning-like capabilities of CTRNNs, the task for the robot was now made slightly more complex. In experiment 1 the robots could rely on the fact that the position of the reward-zone remained fixed during a whole epoch. Evolved robots were able to explore the whole environment during trial 1 of each epoch in order to locate this position, but would they also be able to adapt if the reward position was changed later on within the same epoch? This turned out not to be the case. When the individual analyzed in section 3.1 was tested by placing the reward-zone to the left for 5 trials and then switching the reward position to the right without resetting the neural network, the robot continued to turn left at the T-junction in the trials after the switch took place.

A new experiment was now set up in order to check whether this lack of adaptivity to environmental changes taking place later in an epoch was due to a limitation in the learning capabilities of the network, or simply due to the fact that this condition was never met during evolution. The evolved robots could simply have found a minimalistic solution. In this new evolution the duration of each epoch was increased to 10 trials. The reward-zone position remained fixed in the first 5 trials but was then switched to the other side for the last 5 trials of each epoch. Each individual was still tested for 4 epochs, 2 with the reward initially to the left and 2 with the reward initially to the right. The fitness function remained the same, and since each individual was tested for 40 trials in total, the maximum possible fitness was now 40.
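The evaluation schedule of experiment 2 can be written out as a short sketch. The function names and the ordering of the four epochs are assumptions made for illustration; the text fixes only the counts (4 epochs, 10 trials each, switch after trial 5, two epochs starting on each side).

```python
def epoch_schedule(initial_side, n_trials=10, switch_after=5):
    """Return the reward side for each trial of one epoch.

    The reward stays on initial_side for the first switch_after trials
    and sits on the opposite side for the remaining trials.
    """
    other_side = "right" if initial_side == "left" else "left"
    return [initial_side] * switch_after + [other_side] * (n_trials - switch_after)


def evaluation_plan():
    """Four epochs per individual, two starting on each side.

    Assumption: the alternating order is illustrative; the source does not
    state the order in which the epochs were run.
    """
    return [epoch_schedule(side) for side in ("left", "right", "left", "right")]


if __name__ == "__main__":
    plan = evaluation_plan()
    for number, epoch in enumerate(plan, start=1):
        print(number, epoch)
    # 4 epochs of 10 trials give 40 trials, matching the stated maximum
    # fitness of 40 if each trial contributes at most one point.
    print(sum(len(epoch) for epoch in plan))
```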

The average result of 10 replications of the experiment is shown in figure 6. The resulting best fitness in the 10 replications varied between 22 in the worst case and 38 in the best. In the latter case the best individual did detect that the reward position had changed in trial 6, and was able to locate the new position. However, in some of the following trials it would still turn the wrong way, thus ending up in the poison-zone. In order to close this remaining performance gap, an incremental evolutionary approach was now applied. The evolutionary conditions remained the same, but instead of seeding the evolution with a random population, it was seeded with a population consisting of the best individual of each of the 200 generations from the best replication of the previous evolution. Again 10 replications were performed. In most of the runs the fitness level stayed at 38, and it even dropped to 32 in one case (graph not shown). It seemed that the fitness-38 solution was a local optimum that was difficult to escape. In one replication, however, the fitness level reached the maximal value of 40, and when tested afterwards the best individual from this replication could reliably solve the task. When the reward position switched at trial 6 of each epoch, the robot would at first move towards the previous location, but on no longer finding the reward-zone there it would turn around and initiate a search until the new reward position was located. In the remaining trials of each epoch the robot then again turned directly towards the reward-zone, resulting in the total fitness of 40.
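The seeding step used for the incremental evolution can be sketched as below. The data layout, a list of generations each holding `(fitness, genome)` pairs, and the function name are hypothetical; the source states only that the seed population consisted of the best individual of each of the 200 generations of the best replication.

```python
def seed_population(history):
    """Collect the best genome of every generation into a seed population.

    history: list of generations, each a list of (fitness, genome) pairs.
    Assumption: this layout is illustrative and not taken from the source.
    """
    return [max(generation, key=lambda pair: pair[0])[1]
            for generation in history]


if __name__ == "__main__":
    # Toy history with 3 generations; the run in the text had 200,
    # which would yield a seed population of 200 individuals.
    history = [
        [(12, "g0_a"), (20, "g0_b")],
        [(25, "g1_a"), (22, "g1_b")],
        [(38, "g2_a"), (31, "g2_b")],
    ]
    print(seed_population(history))
```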
