conv-relu-batchnorm-layer

xconfig/convolution.py

src/nnet3/nnet-convolutional-component.h

nnet3-TimeHeightConvolution code walkthrough

conv-relu-batchnorm-layer name=cnn1 $cnn_opts height-in=50 height-out=50 time-offsets=-1,0,1 height-offsets=-1,0,1 num-filters-out=64
conv-relu-batchnorm-layer name=cnn3 $cnn_opts height-in=50 height-out=25 height-subsample-out=2 time-offsets=-1,0,1 height-offsets=-1,0,1 num-filters-out=128
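
As a quick sanity check on the two layers above, the following minimal sketch computes each layer's output dimension as num-filters-out * height-out. The helper names (parse_layer_line, output_dim) are hypothetical and are not part of Kaldi; the parser only splits key=value tokens and skips shell variables such as $cnn_opts.

```python
def parse_layer_line(line):
    """Split an xconfig-style line into (layer_type, {key: value}).

    Hypothetical helper for illustration; Kaldi's own parser does more.
    """
    tokens = line.split()
    layer_type = tokens[0]
    opts = {}
    for tok in tokens[1:]:
        if "=" in tok:  # tokens without '=' (e.g. $cnn_opts) are skipped
            key, value = tok.split("=", 1)
            opts[key] = value
    return layer_type, opts


def output_dim(opts):
    """Output dimension of the layer: num-filters-out * height-out."""
    return int(opts["num-filters-out"]) * int(opts["height-out"])


CNN1 = ("conv-relu-batchnorm-layer name=cnn1 $cnn_opts height-in=50 "
        "height-out=50 time-offsets=-1,0,1 height-offsets=-1,0,1 "
        "num-filters-out=64")
CNN3 = ("conv-relu-batchnorm-layer name=cnn3 $cnn_opts height-in=50 "
        "height-out=25 height-subsample-out=2 time-offsets=-1,0,1 "
        "height-offsets=-1,0,1 num-filters-out=128")

for line in (CNN1, CNN3):
    _, opts = parse_layer_line(line)
    print(opts["name"], output_dim(opts))
```

Both layers come out at the same output dimension (64 * 50 and 128 * 25): cnn3 halves the height while doubling the number of filters.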

Parameter explanation

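height-offsets: the kernel extent along the height axis, i.e. the convolution along the feature-dimension direction (for example dim=50). If height-offsets=-1,0,1, then height 10 at the output takes its input from heights 9, 10, 11 at the input.
num-filters-out: the number of convolution kernels (filters). The output dimension of this layer is num-filters-out * height-out.
num-filters-in and height-in: if the previous layer's output dimension is 2048, then num-filters-in × height-in == 2048 is required. For example, with height-in set to 1024, num-filters-in is 2.
height-out and height-subsample-out: height-out is the desired output height, the output-side counterpart of height-in, and height-subsample-out is the subsampling factor along the height axis. They must satisfy height-out × height-subsample-out <= height-in, and the final output dimension of the layer is height-out × num-filters-out.
height-offsets, time-offsets: together these define the shape of the convolution kernel. Alternatively, if you prefer not to define it this way, you can specify offsets directly.
required-time-offsets: this can usually be ignored; in most cases required-time-offsets is the same as time-offsets.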
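
The constraints above can be sketched in a few lines of Python. This is an illustrative sketch only, not Kaldi code: all function names are hypothetical. Two points are assumptions rather than statements from these notes: that with subsampling an output height h reads input heights h * height-subsample-out + offset, and that offsets is a semicolon-separated list of time-offset,height-offset pairs.

```python
from itertools import product


def num_filters_in(input_dim, height_in):
    """Derive num-filters-in from num-filters-in * height-in == input_dim."""
    if input_dim % height_in != 0:
        raise ValueError("input dim must be a multiple of height-in")
    return input_dim // height_in


def check_heights(height_in, height_out, height_subsample_out=1):
    """Enforce height-out * height-subsample-out <= height-in."""
    if height_out * height_subsample_out > height_in:
        raise ValueError("height-out is too large for this height-in")


def input_heights(h_out, height_offsets, height_subsample_out=1):
    """Input heights that feed output height h_out.

    Assumption: the subsampling factor scales the output index before
    the offsets are added.
    """
    return [h_out * height_subsample_out + o for o in height_offsets]


def expand_offsets(time_offsets, height_offsets):
    """Build an offsets string from the two separate offset lists.

    Assumption: pairs are written as time-offset,height-offset and
    separated by ';'.
    """
    return ";".join(f"{t},{h}" for t, h in product(time_offsets, height_offsets))


print(num_filters_in(2048, 1024))             # the example from the notes
check_heights(50, 25, height_subsample_out=2)  # cnn3 satisfies the constraint
print(input_heights(10, [-1, 0, 1]))           # heights feeding output height 10
print(expand_offsets([-1, 0, 1], [-1, 0, 1]))  # the 3x3 kernel as pairs
```

A 3x3 kernel given as time-offsets=-1,0,1 and height-offsets=-1,0,1 expands to nine pairs; writing offsets by hand is mainly useful for kernels that are not a full rectangular grid.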