I want a U-Net to produce an output whose size differs from its input.
I am currently using a U-Net to remove noise from speech.
The input is noisy speech and the target is clean speech.
The input shape is (4, 256, 1), where 4 is the number of frames.
I want to denoise the current frame using information from all 4 frames, which include past frames.
So I want the output shape to be (1, 256, 1) or (256, 1).
I can change the number of channels, but I cannot get an output shape other than (4, 256, 1).
The model below is nothing special.
What processing should I apply to the output to get the shape I want?
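To make the shape problem concrete, here is a small sketch that traces one spatial axis by hand through the layers of the model below. It uses plain Python arithmetic only, and the helper names (`same_conv`, `same_pool`, `upsample`, `valid_conv`, `trace_axis`) are hypothetical, introduced for illustration. It assumes stride-1 convolutions and the standard output-length rules for 'same' and 'valid' padding.

```python
import math

def same_conv(n):
    # padding='same' with stride 1 keeps the axis length unchanged
    return n

def same_pool(n, pool=2):
    # pooling with padding='same' gives ceil(n / pool)
    return math.ceil(n / pool)

def upsample(n, factor=2):
    # UpSampling2D repeats each element `factor` times
    return n * factor

def valid_conv(n, kernel):
    # padding='valid' with stride 1 gives n - kernel + 1
    return n - kernel + 1

def trace_axis(n):
    """Trace one spatial axis through the two-level encoder/decoder."""
    n = same_conv(n)   # conv1
    n = same_pool(n)   # pool1
    n = same_conv(n)   # conv2
    n = same_pool(n)   # pool2
    n = same_conv(n)   # conv3
    n = upsample(n)    # up4
    n = same_conv(n)   # conv4
    n = upsample(n)    # up5
    n = same_conv(n)   # conv5, conv6
    return n

frames_out = trace_axis(4)    # 4 -> 2 -> 1 -> 2 -> 4
freq_out = trace_axis(256)    # 256 -> 128 -> 64 -> 128 -> 256
print((frames_out, freq_out))

# A 'valid' convolution whose kernel spans all 4 frames would reduce 4 to 1
print(valid_conv(frames_out, kernel=4))
```

Because every convolution uses padding='same' and each pooling step is undone by an upsampling step, both axes come back to their input length, which is why only the channel count changes.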
# Imports are assumed here; the original post did not show them.
from tensorflow.keras.layers import (Input, Conv2D, MaxPooling2D, Dropout,
                                     UpSampling2D, concatenate)
from tensorflow.keras.models import Model

FRAME = 4  # number of frames per input, as described above


def UNet(input_size=(FRAME, 256, 1)):
    activation_func = 'relu'
    filter_size = 5
    inputs = Input(input_size)
    # Encoder: two conv blocks, each followed by 2x2 max pooling
    conv1 = Conv2D(32, filter_size, activation=activation_func, padding='same', kernel_initializer='he_normal')(inputs)
    conv1 = Conv2D(32, filter_size, activation=activation_func, padding='same', kernel_initializer='he_normal')(conv1)
    pool1 = MaxPooling2D(pool_size=(2, 2), padding='same')(conv1)
    conv2 = Conv2D(64, filter_size, activation=activation_func, padding='same', kernel_initializer='he_normal')(pool1)
    conv2 = Conv2D(64, filter_size, activation=activation_func, padding='same', kernel_initializer='he_normal')(conv2)
    pool2 = MaxPooling2D(pool_size=(2, 2), padding='same')(conv2)
    # Bottleneck
    conv3 = Conv2D(128, filter_size, activation=activation_func, padding='same', kernel_initializer='he_normal')(pool2)
    conv3 = Conv2D(128, filter_size, activation=activation_func, padding='same', kernel_initializer='he_normal')(conv3)
    drop3 = Dropout(0.2)(conv3)
    # Decoder: upsample, then concatenate with the matching encoder output
    up4 = Conv2D(64, 2, activation=activation_func, padding='same', kernel_initializer='he_normal')(
        UpSampling2D(size=(2, 2))(drop3))
    merge4 = concatenate([conv2, up4], axis=3)
    conv4 = Conv2D(64, filter_size, activation=activation_func, padding='same', kernel_initializer='he_normal')(merge4)
    conv4 = Conv2D(64, filter_size, activation=activation_func, padding='same', kernel_initializer='he_normal')(conv4)
    up5 = Conv2D(32, 2, activation=activation_func, padding='same', kernel_initializer='he_normal')(
        UpSampling2D(size=(2, 2))(conv4))
    merge5 = concatenate([conv1, up5], axis=3)
    conv5 = Conv2D(32, filter_size, activation=activation_func, padding='same', kernel_initializer='he_normal')(merge5)
    conv5 = Conv2D(32, filter_size, activation=activation_func, padding='same', kernel_initializer='he_normal')(conv5)
    conv5 = Conv2D(2, filter_size, activation=activation_func, padding='same', kernel_initializer='he_normal')(conv5)
    # Output: 1x1 convolution down to one channel; shape is still (FRAME, 256, 1)
    conv6 = Conv2D(1, 1, activation='relu')(conv5)
    model = Model(inputs=inputs, outputs=[conv6])
    return model
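One possible form of the output-side processing being asked about is a final convolution whose kernel spans the whole frame axis with padding='valid'. This is a sketch, not a verified solution for this task. `collapse_frames` is a hypothetical helper name, the single `Conv2D` standing in for the decoder is a placeholder for the full model, and the `tensorflow.keras` import path is an assumption.

```python
from tensorflow.keras.layers import Input, Conv2D
from tensorflow.keras.models import Model

FRAME = 4  # number of frames per input, as in the question


def collapse_frames(features, frames=FRAME):
    # Hypothetical helper: a (frames, 1) kernel with padding='valid'
    # reduces the frame axis from `frames` to 1 and leaves the 256 axis intact.
    return Conv2D(1, kernel_size=(frames, 1), padding='valid',
                  activation='relu')(features)


inputs = Input((FRAME, 256, 1))
# Placeholder for the decoder output of the full model (shape (FRAME, 256, 2))
features = Conv2D(2, 5, activation='relu', padding='same')(inputs)
outputs = collapse_frames(features)
model = Model(inputs=inputs, outputs=outputs)
print(model.output_shape)  # (None, 1, 256, 1)
```

In the full model this layer would take the place of the last 1x1 convolution, so the network learns how to weight the 4 frames. If a fixed selection is preferred instead, `Cropping2D(cropping=((FRAME - 1, 0), (0, 0)))` applied to the existing output would also give (None, 1, 256, 1), assuming the current frame is stored last along the frame axis.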