Project Sharing: Easy Imitation Learning using Pytorch with myCobot

Project Introduction

In this project, we will use a pre-trained imitation learning model to control myCobot in a simulated environment.

We will reference the project https://github.com/Sicelukwanda/simple-imitation-learning. The open-source project was authored by Sicelukwanda Zwane, who provided technical support, while Geraud Nangue Tasse, a supporter of the Robot Learning Workshop project, offered significant assistance to the project.

Project Objectives

● Simulate robot using PyBullet.

● Train the robot to imitate a given task using an imitation learning model.

● Implement control techniques for accurate manipulation, including joint velocity control and inverse kinematics.

● Visualize the robot’s actions in the simulation.

Project Setup

1. Setting Up the PyBullet Simulation

Firstly, we need to load the 6-axis robotic arm myCobot 320 in PyBullet and set up the physical simulation environment.

import pybullet as p
import pybullet_data as pd
import numpy as np
import time

# use PyBullet
client_id = p.connect(p.GUI)
p.setAdditionalSearchPath(pd.getDataPath())
p.setGravity(0, 0, -9.8)

# load 2 model  a plane and a robot
plane_id = p.loadURDF("plane.urdf")
robot_id = p.loadURDF("mycobot_description/urdf/mycobot/mycobot_urdf.urdf", useFixedBase=True)

# set step
time_step = 1/240
p.setTimeStep(time_step)

Alternatively, you can simply test robotic arm simulation using the code written by Sicelukwanda Zwane.

pip install pybullet numpygit clone https://github.com/Sicelukwanda/robot_learning_tutorial.gicd robot_learning_tutorial python3 move_basic.py

2. Collect demonstration data

In imitation learning, we need to collect a large amount of demonstration data, including:

● State: Joint angle and velocity of the robotic arm.

● Action: The target joint angle or velocity of the robotic arm.

Here we use pybullet to obtain the status of the robotic arm and define some demonstration actions with simple code.

states = []
actions = []

for i in range(100):
    joint_positions = [0, 0.3*np.sin(i/10), -np.pi/4, 0, np.pi/4, 0]
    states.append(joint_positions)
    actions.append(joint_positions)  # In imitation learning, the ideal action is the correct state
    
    p.setJointMotorControlArray(robot_id, range(6), p.POSITION_CONTROL, targetPositions=joint_positions)
    p.stepSimulation()
    time.sleep(time_step)

# save data
np.save("states.npy", np.array(states))
p.save("actions.npy", np.array(actions))

3. Training the Robot

We need to install PyTorch and start training with a neural network model.

import torch
import torch.nn as nn
import torch.optim as optim

# load 
X_train = np.load("states.npy")
y_train = np.load("actions.npy")

# trans to  PyTorch tensor
X_train = torch.tensor(X_train, dtype=torch.float32)
y_train = torch.tensor(y_train, dtype=torch.float32)

# load a simple network
class ImitationNetwork(nn.Module):
    def __init__(self, input_dim, output_dim):
        super(ImitationNetwork, self).__init__()
        self.model = nn.Sequential(
            nn.Linear(input_dim, 64),
            nn.ReLU(),
            nn.Linear(64, 64),
            nn.ReLU(),
            nn.Linear(64, output_dim)
        )
    
    def forward(self, x):
        return self.model(x)

model = ImitationNetwork(input_dim=6, output_dim=6)
optimizer = optim.Adam(model.parameters(), lr=0.001)
loss_fn = nn.MSELoss()

# start training
epochs = 100
for epoch in range(epochs):
    optimizer.zero_grad()
    output = model(X_train)
    loss = loss_fn(output, y_train)
    loss.backward()
    optimizer.step()
    print(f"Epoch {epoch+1}, Loss: {loss.item():.4f}")

# save
torch.save(model.state_dict(), "imitation_model.pth")

4. Test

# Load your model
model.load_state_dict(torch.load("imitation_model.pth"))
model.eval()

for i in range(100):
    joint_state = np.load("states.npy")[i]
    input_tensor = torch.tensor(joint_state, dtype=torch.float32).unsqueeze(0)
    predicted_action = model(input_tensor).detach().numpy().flatten()
    
    p.setJointMotorControlArray(robot_id, range(6), p.POSITION_CONTROL, targetPositions=predicted_action)
    p.stepSimulation()
    time.sleep(time_step)

Due to the motion of the robotic arm is generated using code rather than manually collected, it lacks authenticity and makes the learning results ideal. Therefore, in actual learning, real robotic arm can be used to collect better motion data.

Summary

This guide demonstrates how to use PyBullet for imitation learning with myCobot robotic arms, including data collection and validation.

In practical applications, you can extend this tutorial by:

● Using more advanced neural networks (such as LSTMs) to process time-series data.

● Integrating reinforcement learning (RL) to enable robots to autonomously optimize their behavior.

● Deploying the learned strategies to a real myCobot robotic arm.

You can try to modify the task objectives, such as allowing myCobot to perform more complex operations such as picking and placing. We’re excited to see more users and makers exploring the myCobot series robots to create innovative projects .

1 Like