Neural Net: Hidden Dimensions
What hidden dimensions are in transformer architectures.
You'll often hear, "X model has Y hidden dimensions."
The hidden dimension (often written d_model) is the size of the vector that represents each token as it passes through the transformer's layers. The feed-forward network inside each layer expands this vector to a larger inner dimension (commonly 4x d_model) and projects it back down. Together, these widths determine how much capacity each layer has to process and store information.
Larger hidden dimension: Increases the model's capacity to learn and represent more intricate patterns in the data, enhancing performance on complex tasks. However, demands more memory and computational power.
Smaller hidden dimension: Reduces the model's capacity to capture complex patterns, potentially hurting accuracy, but makes the model cheaper in memory and compute.
Source: ChatGPT 5/27/24