Neural Net: Hidden Dimensions
What hidden dimensions are in transformer architectures.
You'll often hear, "X model has Y hidden dimensions."
The hidden dimension (often written d_model) is the size of the vector that represents each token as it passes through the transformer's layers. The feed-forward network inside each layer expands this vector to a larger inner dimension (commonly 4x d_model) and projects it back down. Together, these widths determine how much capacity each layer has to process and store information.
Larger hidden dimension: Increases the model's capacity to learn and represent more intricate patterns in the data, enhancing performance on complex tasks. However, demands more memory and computational power.
Smaller hidden dimension: Reduces the model's capacity to capture complex patterns, potentially hurting accuracy, but makes the model cheaper in memory and compute.
Source: ChatGPT 5/27/24