Exploring Unsupervised Neural Networks with Pytorch Examples

Hi. I am quite new to PyTorch and am learning it by working through some example notebooks. The one I am busy with now involves an unsupervised neural network for solving an eigenvalue problem in quantum mechanics: the one-dimensional Schrödinger equation with an infinite square-well potential. The ipynb notebook is provided here:
eigeNN/BothBounds_Infinite.ipynb at master · henry1jin/eigeNN · GitHub.

In the context of this code, the neural network represents the eigenfunction, i.e. the solution of the differential equation. The two inputs are an adjustable scalar E, representing the eigenvalue (which is learned during training), and the independent variable t of the differential equation. The network's single output is the eigenfunction f(t). In code, this is:

ORIGINAL:

# Define the sin() activation function
class mySin(torch.nn.Module):
    @staticmethod
    def forward(input):
        return torch.sin(input)

class qNN1(torch.nn.Module):
    def __init__(self, D_hid=10):
        super(qNN1,self).__init__()

        # Define the activation 
        self.actF = mySin()
        
        # define layers
        self.Ein    = torch.nn.Linear(1,1)
        self.Lin_1  = torch.nn.Linear(2, D_hid)
        self.Lin_2  = torch.nn.Linear(D_hid, D_hid)
        self.out    = torch.nn.Linear(D_hid, 1)

    def forward(self,t):
        In1 = self.Ein(torch.ones_like(t))
        L1 = self.Lin_1(torch.cat((t,In1),1))
        h1 = self.actF(L1)
        L2 = self.Lin_2(h1)
        h2 = self.actF(L2)
        out = self.out(h2)
        return out, In1

The complete example code is able to compute the eigenvalues and eigenfunctions of the lowest bound states using a scanning mechanism. I'm trying a slight addition to the code which, in theory, shouldn't change the result: I'm treating the eigenvalues and eigenfunctions as complex-valued scalars rather than the real scalars of the original code. This means I needed to include two inputs and two outputs for the real and imaginary parts of E and f(t), respectively. Here I am not sure if I implemented that correctly:

MODIFIED:

# Define the sin() activation function
class mySin(torch.nn.Module):
    @staticmethod
    def forward(input):
        return torch.sin(input)

class qNN1(torch.nn.Module):
    def __init__(self, D_hid=10):
        super(qNN1,self).__init__()

        # Define the activation
        self.actF = mySin()
        
        # define layers
        self.E1in    = torch.nn.Linear(1,1)
        self.E2in    = torch.nn.Linear(1,1)
        self.Lin_1  = torch.nn.Linear(3, D_hid) # three inputs
        self.Lin_2  = torch.nn.Linear(D_hid, D_hid)
        self.out    = torch.nn.Linear(D_hid, 2) # two outputs

    def forward(self,t):
        In1 = self.E1in(torch.ones_like(t))
        In2 = self.E2in(torch.ones_like(t))
        L1 = self.Lin_1(torch.cat((t,In1,In2),1)) # In2 added
        h1 = self.actF(L1)
        L2 = self.Lin_2(h1)
        h2 = self.actF(L2)
        out = self.out(h2)
        return out, In1, In2
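To sanity-check that the extra input and output are wired the way I intend, here is a quick standalone shape test (a minimal sketch; the class is restated compactly so the snippet runs on its own, and mySin uses an ordinary forward method rather than the staticmethod above):

```python
import torch

# Compact restatement of the modified network, purely for a shape check
class mySin(torch.nn.Module):
    def forward(self, x):
        return torch.sin(x)

class qNN1(torch.nn.Module):
    def __init__(self, D_hid=10):
        super().__init__()
        self.actF = mySin()
        self.E1in = torch.nn.Linear(1, 1)
        self.E2in = torch.nn.Linear(1, 1)
        self.Lin_1 = torch.nn.Linear(3, D_hid)
        self.Lin_2 = torch.nn.Linear(D_hid, D_hid)
        self.out = torch.nn.Linear(D_hid, 2)

    def forward(self, t):
        In1 = self.E1in(torch.ones_like(t))
        In2 = self.E2in(torch.ones_like(t))
        h1 = self.actF(self.Lin_1(torch.cat((t, In1, In2), 1)))
        h2 = self.actF(self.Lin_2(h1))
        return self.out(h2), In1, In2

net = qNN1()
t = torch.linspace(0.0, 1.0, 5).reshape(-1, 1)
out, In1, In2 = net(t)
print(out.shape, In1.shape, In2.shape)  # expect (5, 2), (5, 1), (5, 1)
```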

Boundary conditions are imposed as physical constraints on the neural network model, and the differential equation itself is embedded in the loss function via automatic differentiation.

ORIGINAL:

def parametricSolutions(t, nn, t0, x1):
    # parametric solutions 
    N1,N2 = nn(t)
    dt =t-t0
    f = (1-torch.exp(-dt))*(1-torch.exp(dt-1)) # this parametric function ensures the neural network model satisfies the boundary conditions by construction
    psi_hat  = x1  + f*N1
    return psi_hat

def dfx(x,f):
    # Calculate the derivative with auto-differention
    return grad([f], [x], grad_outputs=torch.ones(x.shape, dtype=dtype), create_graph=True)[0]

def hamEqs_Loss(t,psi, E):
    # one-dimensional Schrodinger equation with an infinite square-well potential
    psi_dx = dfx(t,psi)
    psi_ddx= dfx(t,psi_dx)
    f = psi_ddx/2 + E*psi
    L  = (f.pow(2)).mean(); #part of the total loss function due to differential equation "residual"
    return L
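As a quick self-contained check of the automatic-differentiation step, dfx can be verified against known derivatives of sin (a minimal sketch; I have replaced the global dtype in grad_outputs with torch.ones_like(f), which should be equivalent here):

```python
import torch
from torch.autograd import grad

def dfx(x, f):
    # derivative of f with respect to x via autograd; create_graph=True
    # keeps the graph so a second derivative can be taken afterwards
    return grad([f], [x], grad_outputs=torch.ones_like(f), create_graph=True)[0]

t = torch.linspace(0.0, 1.0, 50).reshape(-1, 1).requires_grad_(True)
psi = torch.sin(t)
psi_dx = dfx(t, psi)       # should equal cos(t)
psi_ddx = dfx(t, psi_dx)   # should equal -sin(t)
print(torch.allclose(psi_dx, torch.cos(t)))    # True
print(torch.allclose(psi_ddx, -torch.sin(t)))  # True
```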

Continuing with the assumption I made above concerning complex scalars, the changes I made to cater for that are:

MODIFIED:

def parametricSolutions(t, nn, t0, x1):
    # parametric solutions 
    N1 = nn(t)[0] # first neural network output
    N2 = nn(t)[1] # second neural network output
    dt =t-t0
    f = (1-torch.exp(-dt))*(1-torch.exp(dt-1))
    psi1_hat = x1  + f*N1
    psi2_hat  = x1 + f*N2
    return psi1_hat, psi2_hat

def dfx(x,f):
    # Calculate the derivative with auto-differention
    return grad([f], [x], grad_outputs=torch.ones(f.shape, dtype=dtype), create_graph=True)[0]

def hamEqs_Loss(t,psi1, psi2, E1, E2):
    # one-dimensional S.E.
    psi1_dx = dfx(t,psi1)
    psi1_ddx= dfx(t,psi1_dx)
    psi2_dx = dfx(t,psi2)
    psi2_ddx= dfx(t,psi2_dx)
    f  = (psi1_ddx + psi2_ddx*1j)/2 + (E1 + E2*1j)*(psi1 + psi2*1j)
    f1 = torch.real(f)
    f2 = torch.imag(f)
    L  = (f1.pow(2)).mean() + (f2.pow(2)).mean(); 
    return L

My two main concerns here are the syntax for indexing the two outputs of the neural network and the change I made to grad_outputs in dfx(x, f). I'm not sure whether I've correctly assigned N1 and N2 in parametricSolutions(t, nn, t0, x1) to the two neural network outputs separately. The other change I'm worried about is switching the shape of grad_outputs from x.shape to f.shape to make dfx compatible with a two-output neural network.
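One variant I have been considering, in case the indexing is the problem: since the modified forward returns (out, In1, In2), nn(t)[0] would be the full two-column tensor out and nn(t)[1] would be In1 rather than out's second column. An alternative is to slice out explicitly and call the network only once (a sketch under that assumption, exercised with a hypothetical stand-in network):

```python
import torch

def parametricSolutions(t, nn, t0, x1):
    # call the network once and slice the two-column output explicitly
    out, In1, In2 = nn(t)
    N1 = out[:, 0:1]   # real-part output, kept as an (N, 1) column
    N2 = out[:, 1:2]   # imaginary-part output
    dt = t - t0
    f = (1 - torch.exp(-dt)) * (1 - torch.exp(dt - 1))
    psi1_hat = x1 + f * N1
    psi2_hat = x1 + f * N2
    return psi1_hat, psi2_hat

# hypothetical stand-in for the network, just to exercise the shapes
fake_nn = lambda t: (torch.cat((torch.sin(t), torch.cos(t)), 1), t, t)
t = torch.linspace(0.0, 1.0, 4).reshape(-1, 1)
psi1, psi2 = parametricSolutions(t, fake_nn, 0.0, 0.0)
print(psi1.shape, psi2.shape)  # expect (4, 1) and (4, 1)
```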

All this comes together in the training stage of the code:

ORIGINAL:

# Train the NN
def run_Scan_finitewell(t0, tf, x1, neurons, epochs, n_train,lr, minibatch_number = 1):
    fc0 = qNN1(neurons)
    ...
    SE_loss_history = []
    ...
    for tt in range(epochs): 
    ...
#  Network solutions 
            nn, En = fc0(t_mb) #t_mb related to the training inputs   
            psi  = parametricSolutions(t_mb, fc0, t0, x1) 
            Ltot = hamEqs_Loss(t_mb, psi, En)
            SE_loss_history.append(Ltot) 

My version with real and imaginary parts:

MODIFIED:

# Train the NN
def run_Scan_finitewell(t0, tf, x1, neurons, epochs, n_train,lr, minibatch_number = 1):
    fc0 = qNN1(neurons)
    ...
    SE_loss_history = []
    ...
    for tt in range(epochs): 
    ...
#  Network solutions 
            nn, E1n, E2n = fc0(t_mb) #t_mb related to the training inputs   
            psi1, psi2  = parametricSolutions(t_mb, fc0, t0, x1) 
            Ltot = hamEqs_Loss(t_mb, psi1, psi2, E1n, E2n)
            SE_loss_history.append(Ltot) 

After making these changes, I hoped to generate the same results as the original code, except with the imaginary parts equal to zero. However, when I run my modified version for the same number of epochs (120K) as the original example, RAM fills up faster than with the original code and the session crashes before training ends. I'd really appreciate suggestions from anyone regarding this.
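One thing I plan to try: SE_loss_history.append(Ltot) stores the loss tensor together with its attached autograd graph, which can keep every epoch's graph alive in memory; appending a plain float instead lets each graph be freed. A minimal sketch of the pattern, with a hypothetical stand-in model in place of the real one:

```python
import torch

# Hypothetical minimal training loop illustrating the logging pattern:
# append Ltot.item() (a plain Python float) rather than the tensor Ltot,
# so the computation graph attached to the loss is released each step.
model = torch.nn.Linear(1, 1)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
SE_loss_history = []
t = torch.linspace(0.0, 1.0, 8).reshape(-1, 1)
for epoch in range(3):
    Ltot = (model(t) ** 2).mean()
    opt.zero_grad()
    Ltot.backward()
    opt.step()
    SE_loss_history.append(Ltot.item())  # float only, no graph retained

print(SE_loss_history)
```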

Additionally, I would like to check whether I've correctly added the two neural network outputs and properly indexed them in the other parts of the code. My modified version runs without errors; the only apparent issue is the aforementioned crash after several epochs. Please assist.