[solved] Train initial hidden state of RNNs

raulpuric · October 10, 2017, 11:56pm

Yeah, this doesn’t seem to be what you want to do.

I think what you want is what the original Torch implementation of an NTM does here:

kaishengtai/torch-ntm/blob/74a3d3a77cb67afdc7e780a7856fb202a4a135af/NTM.lua#L241




  mtable[layer] = nn.CMulTable(){o, nn.Tanh()(ctable[layer])}
end


mtable = nn.Identity()(mtable)
ctable = nn.Identity()(ctable)
return mtable, ctable
end


-- Create a new module to read/write to memory
function NTM:new_mem_module(M_p, wr_p, ww_p, m)
-- read heads
local wr, r
if self.read_heads == 1 then
  wr, r = self:new_read_head(M_p, wr_p, m)
else
  wr, r = {}, {}
  for i = 1, self.read_heads do
    local prev_weights = nn.SelectTable(i)(wr_p)
    wr[i], r[i] = self:new_read_head(M_p, prev_weights, m)
  end

You want to initialize your memory matrix from a plain, non-trainable variable (either sampled from a normal distribution or filled with a constant value).
Then pass it through a linear layer. Because that layer's weights are trained with the rest of the model, you thereby learn a layer that produces your initial hidden state.
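
For reference, here is a minimal PyTorch sketch of that idea. It is not taken from the linked NTM code: the class name `RNNWithLearnedInit`, the `seed_size` argument, the all-ones seed, and the single-layer GRU are all assumptions made for illustration.

```python
import torch
import torch.nn as nn


class RNNWithLearnedInit(nn.Module):
    """Hypothetical example: a GRU whose initial hidden state comes from a trained linear layer."""

    def __init__(self, input_size, hidden_size, seed_size=1):
        super().__init__()
        self.hidden_size = hidden_size
        # Fixed, non-trainable seed (all ones here; a normal sample would also work).
        # A buffer follows the module across devices but receives no gradient.
        self.register_buffer("seed", torch.ones(1, seed_size))
        # Trainable layer that maps the constant seed to the initial hidden state.
        self.init_hidden = nn.Linear(seed_size, hidden_size)
        self.rnn = nn.GRU(input_size, hidden_size, batch_first=True)

    def forward(self, x):
        batch_size = x.size(0)
        # (1, hidden) -> (num_layers=1, batch, hidden); the same learned state is shared by the batch.
        h0 = self.init_hidden(self.seed).unsqueeze(0)
        h0 = h0.expand(1, batch_size, self.hidden_size).contiguous()
        return self.rnn(x, h0)


model = RNNWithLearnedInit(input_size=4, hidden_size=8)
out, h_n = model(torch.randn(3, 5, 4))  # batch of 3 sequences, 5 steps, 4 features
out.sum().backward()                    # gradients reach init_hidden through h0
```

With a constant seed, the linear layer amounts to learning a bias vector, so registering the initial state directly as an `nn.Parameter` of shape `(1, 1, hidden_size)` should behave much the same; the linear-layer form just mirrors the approach described above.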