MelSpectrogram cannot be detected #6546

@futz12

error log

model

  1. original model
import torchaudio
from torchaudio.transforms import MelSpectrogram
import torch
# Use random noise in place of a loaded audio file: (batch, channels, samples)
waveform = torch.randn(1, 1, 44100)
sample_rate = 44100


# Create a MelSpectrogram transform
mel_transform = MelSpectrogram(
   sample_rate=sample_rate,
   n_fft=1024,
   hop_length=512,
   n_mels=128
)
# Apply the transform to the waveform
mel_spectrogram = mel_transform(waveform)
print(mel_spectrogram.shape) # Output: (batch, channels, n_mels, time)

import pnnx

pnnx.export(mel_transform, "mel_transform.pnnx", waveform)

how to reproduce

  1. Run the code above.
  2. It prints the following output:
torch.Size([1, 1, 128, 87])
pnnxparam = mel_transform.pnnx.param
pnnxbin = mel_transform.pnnx.bin
pnnxpy = mel_transform_pnnx.py
pnnxonnx = mel_transform.pnnx.onnx
ncnnparam = mel_transform.ncnn.param
ncnnbin = mel_transform.ncnn.bin
ncnnpy = mel_transform_ncnn.py
fp16 = 1
optlevel = 2
device = cpu
inputshape = [1,1,44100]f32
inputshape2 = 
customop = 
moduleop = 
get inputshape from traced inputs
inputshape = [1,1,44100]f32
############# pass_level0
inline module = torchaudio.transforms._transforms.MelScale
inline module = torchaudio.transforms._transforms.Spectrogram
inline module = torchaudio.transforms._transforms.MelScale
inline module = torchaudio.transforms._transforms.Spectrogram

----------------

############# pass_level1
############# pass_level2
############# pass_level3
############# pass_level4
############# pass_level5
############# pass_ncnn
force batch axis 233 for operand 1
fallback batch axis 233 for operand 0
fallback batch axis 233 for operand 2
fallback batch axis 233 for operand 3
fallback batch axis 233 for operand 4
fallback batch axis 233 for operand 6
fallback batch axis 233 for operand 7
fallback batch axis 233 for operand 8
fallback batch axis 233 for operand pnnx_expr_4_abs(4)
fallback batch axis 233 for operand pnnx_expr_4_pow(abs(4),2.0)
insert_reshape_linear 4
ignore torch.stft torch.stft_20 param center=True
ignore torch.stft torch.stft_20 param hop_length=512
ignore torch.stft torch.stft_20 param n_fft=1024
ignore torch.stft torch.stft_20 param normalized=False
ignore torch.stft torch.stft_20 param onesided=True
ignore torch.stft torch.stft_20 param pad_mode=reflect
ignore torch.stft torch.stft_20 param return_complex=True
ignore torch.stft torch.stft_20 param win_length=1024

pnnx

7767517
10 9
pnnx.Input               pnnx_input_0             0 1 0 #0=(1,1,44100)f32
pnnx.Attribute           spectrogram              0 1 1 @data=(1024)f32 #1=(1024)f32
Tensor.reshape           Tensor.reshape_9         1 1 0 2 shape=(1,44100) $input=0 #0=(1,1,44100)f32 #2=(1,44100)f32
torch.stft               torch.stft_20            2 1 2 1 3 center=True hop_length=512 n_fft=1024 normalized=False onesided=True pad_mode=reflect return_complex=True win_length=1024 $input=2 $window=1 #2=(1,44100)f32 #1=(1024)f32 #3=(1,513,87)c64
Tensor.reshape           Tensor.reshape_10        1 1 3 4 shape=(1,1,513,87) $input=3 #3=(1,513,87)c64 #4=(1,1,513,87)c64
pnnx.Expression          pnnx_expr_4              1 1 4 5 expr=pow(abs(@0),2.0) #4=(1,1,513,87)c64 #5=(1,1,513,87)f32
torch.transpose          torch.transpose_13       1 1 5 6 dim0=-1 dim1=-2 $input=5 #5=(1,1,513,87)f32 #6=(1,1,87,513)f32
nn.Linear                F_linear_0               1 1 6 7 bias=False in_features=513 out_features=128 @weight=(128,513)f32 $input=6 #6=(1,1,87,513)f32 #7=(1,1,87,128)f32
torch.transpose          torch.transpose_14       1 1 7 8 dim0=-1 dim1=-2 $input=7 #7=(1,1,87,128)f32 #8=(1,1,128,87)f32
pnnx.Output              pnnx_output_0            1 0 8 #8=(1,1,128,87)f32
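
As a sanity check on the shapes in the pnnx graph, the `(1,513,87)` output of `torch.stft` can be reproduced by hand from the parameters in the log (`n_fft=1024`, `hop_length=512`, `center=True`, `onesided=True`). The helper below is only an illustrative sketch; `stft_output_shape` is a made-up name and not part of torch, torchaudio, or pnnx.

```python
def stft_output_shape(num_samples, n_fft, hop_length, center=True, onesided=True):
    """Return (freq_bins, frames) for an STFT with the given settings.

    Hypothetical helper for illustration; it mirrors the usual STFT
    framing arithmetic rather than calling any library.
    """
    # center=True pads n_fft // 2 samples on both sides of the signal
    padded = num_samples + 2 * (n_fft // 2) if center else num_samples
    # one frame per hop, as long as a full n_fft window still fits
    frames = 1 + (padded - n_fft) // hop_length
    # onesided=True keeps only the non-negative frequencies
    freq_bins = n_fft // 2 + 1 if onesided else n_fft
    return freq_bins, frames


freq_bins, frames = stft_output_shape(44100, n_fft=1024, hop_length=512)
print(freq_bins, frames)
```

With the values from this report the result matches the `#3=(1,513,87)c64` annotation on `torch.stft_20`, and the 513 frequency bins match `in_features=513` of the mel filterbank `nn.Linear`.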

ncnn

7767517
12 12
Input                    in0                      0 1 in0
MemoryData               spectrogram              0 1 1 0=1024
Reshape                  reshape_1                1 1 in0 2 0=44100 1=1
torch.stft               torch.stft_20            2 1 2 1 3
Reshape                  reshape_2                1 1 3 4 0=87 1=513 11=1 2=1
UnaryOp                  abs_0                    1 1 4 5 0=0
UnaryOp                  pow_1                    1 1 5 6 0=4
Permute                  transpose_5              1 1 6 7 0=1
Reshape                  reshape_3                1 1 7 8 0=513 1=87
Gemm                     gemm_0                   1 1 8 9 10=-1 2=0 3=1 4=0 5=1 6=1 7=87 8=128 9=513
Reshape                  reshape_4                1 1 9 10 0=128 1=87 11=1 2=1
Permute                  transpose_6              1 1 10 out0 0=1
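
The ncnn param above still contains a layer whose type is `torch.stft`, which lines up with the `ignore torch.stft ...` messages in the log. A rough way to spot such leftover ops is to scan the param text for layer types that contain a dot. This is a heuristic sketch based only on this dump, where every converted layer has a dot-free type (`Input`, `Reshape`, `Gemm`, ...); `find_unconverted_layers` is a hypothetical helper, not a pnnx or ncnn API.

```python
NCNN_PARAM = """\
7767517
12 12
Input                    in0                      0 1 in0
MemoryData               spectrogram              0 1 1 0=1024
Reshape                  reshape_1                1 1 in0 2 0=44100 1=1
torch.stft               torch.stft_20            2 1 2 1 3
Reshape                  reshape_2                1 1 3 4 0=87 1=513 11=1 2=1
UnaryOp                  abs_0                    1 1 4 5 0=0
UnaryOp                  pow_1                    1 1 5 6 0=4
Permute                  transpose_5              1 1 6 7 0=1
Reshape                  reshape_3                1 1 7 8 0=513 1=87
Gemm                     gemm_0                   1 1 8 9 10=-1 2=0 3=1 4=0 5=1 6=1 7=87 8=128 9=513
Reshape                  reshape_4                1 1 9 10 0=128 1=87 11=1 2=1
Permute                  transpose_6              1 1 10 out0 0=1
"""


def find_unconverted_layers(param_text):
    """Return (layer_type, layer_name) pairs whose type contains a dot.

    Assumption: a dotted type such as "torch.stft" is a pnnx operator
    that was passed through without being mapped to an ncnn layer.
    """
    lines = [ln for ln in param_text.splitlines() if ln.strip()]
    leftovers = []
    # lines[0] is the magic number, lines[1] holds the layer and blob counts
    for ln in lines[2:]:
        fields = ln.split()
        layer_type, layer_name = fields[0], fields[1]
        if "." in layer_type:
            leftovers.append((layer_type, layer_name))
    return leftovers


for layer_type, layer_name in find_unconverted_layers(NCNN_PARAM):
    print(f"not converted: {layer_type} ({layer_name})")
```

To check a file on disk, the same function can be given the contents of `mel_transform.ncnn.param` read as text.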
