MLP intermediate dimension

by shantanuagarwal - opened May 28, 2024

May 28, 2024

Thanks for the great work.
Can you please specify the details of the MLP layer? The paper mentions that "MLP consists of two linear transformations with a GELU activation in between".

What I am unsure about is the value of intermediate_dim in the following pseudo-code for the MLP:

import torch

intermediate_dim = 4096  # ??? 
mlp = torch.nn.Sequential(
    torch.nn.Linear(4096, intermediate_dim),
    torch.nn.GELU(),
    torch.nn.Linear(intermediate_dim, 4096),
)

Is the above pseudo-code similar to what was used in the experiments?

Sorry if this detail is mentioned in the paper and I missed it.

Thanks.

jootanehorror

Jun 14, 2024

Check this file: modeling_nvembed.py

shantanuagarwal

Jun 15, 2024

Thanks @jootanehorror.
For anyone else looking into this, see the class FeedForward in https://huggingface.co/nvidia/NV-Embed-v1/blob/main/modeling_nvembed.py#L244.
Specifically, the intermediate dimension is 4 * 4096 = 16384.
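
For reference, here is a minimal sketch of the MLP from the original question with that intermediate dimension filled in. It assumes only the paper's description (two linear transformations with a GELU in between) plus the 4x multiplier. The names build_mlp, hidden_dim, and mult are illustrative and not taken from modeling_nvembed.py, and the actual FeedForward class may differ in its details, so treat the linked file as authoritative.

```python
import torch


def build_mlp(hidden_dim: int = 4096, mult: int = 4) -> torch.nn.Sequential:
    """Two linear layers with a GELU in between (illustrative sketch only)."""
    # Assumption from this thread: intermediate dim = mult * hidden_dim,
    # i.e. 4 * 4096 = 16384 with the defaults.
    intermediate_dim = mult * hidden_dim
    return torch.nn.Sequential(
        torch.nn.Linear(hidden_dim, intermediate_dim),
        torch.nn.GELU(),
        torch.nn.Linear(intermediate_dim, hidden_dim),
    )
```

Calling build_mlp() with the defaults gives 4096 -> 16384 -> 4096 layers. For a quick shape check without allocating the full-size weights, a small hidden_dim (for example build_mlp(hidden_dim=8)) is enough.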

shantanuagarwal changed discussion status to closed Jun 15, 2024
