Talk:Gated recurrent unit
This article is rated Start-class on Wikipedia's content assessment scale.
Fully gated unit picture
Unless I am mistaken, the picture given for the fully gated recurrent unit does not match up with the equation in the article for the hidden state. The 1- node should connect to the product of the output of tanh, not the product with the previous hidden state. In other words, instead of the 1- node being on the arrow above z[t], it should be on the arrow to the right.
--ZaneDurante (talk) 18:21, 2 June 2020 (UTC)
- Yes, you are right! I noticed this as early as 2016, when I prepared lecture slides based on the formulas and this picture: they do not match. 193.174.205.82 (talk) 14:56, 18 January 2023 (UTC)
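For concreteness, here is a minimal NumPy sketch of the hidden-state update as the article's equations write it; the weight and bias names (Wz, Uz, bz, etc.) are illustrative, not taken from the article or its figure:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x, h_prev, Wz, Uz, bz, Wr, Ur, br, Wh, Uh, bh):
    """One fully gated GRU step, following the article's equations."""
    z = sigmoid(Wz @ x + Uz @ h_prev + bz)            # update gate
    r = sigmoid(Wr @ x + Ur @ h_prev + br)            # reset gate
    h_hat = np.tanh(Wh @ x + Uh @ (r * h_prev) + bh)  # candidate state
    # In the article's equation, (1 - z) multiplies the previous hidden
    # state and z multiplies the tanh candidate; the comment above is
    # about the figure placing the 1- node on the wrong arrow.
    return (1 - z) * h_prev + z * h_hat
```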
Article requires clarification
The article does not make clear how the cell connects to other cells, to its own layer, or to anything else.
Remove CARU section?
Lots of publicity for a paper by Macao authors from a Macao IP address, with limited relevance for the GRU article. 194.57.247.3 (talk) 11:45, 28 October 2022 (UTC)
- Please describe what $\hat{y}(t)$ is in the figure; it does not appear in the equations. — Preceding unsigned comment added by Geofo (talk • contribs) 11:15, 29 August 2023 (UTC)
$z$ or $1-z$?
Why does this article have $h_t = (1-z_t) \odot h_{t-1} + z_t \odot \hat{h}_t$? The original paper (reference [1]) has $h_t = z_t \odot h_{t-1} + (1-z_t) \odot \hat{h}_t$, which is also the convention used by PyTorch (see this page) and TensorFlow (not documented in the obvious place, but clear if you write some code to test it). — Preceding unsigned comment added by Neil Strickland (talk • contribs) 23:19, 28 January 2024 (UTC)
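For what it is worth, the two conventions differ only by the relabeling $z \mapsto 1 - z$ (equivalently, flipping the sign of the update gate's pre-activation), so neither is mathematically wrong; they just disagree on which quantity keeps the old state. A quick numerical check of that equivalence (function names are illustrative):

```python
import numpy as np

def update_article(z, h_prev, h_hat):
    # Convention in the article: (1 - z) keeps the previous state.
    return (1 - z) * h_prev + z * h_hat

def update_paper(z, h_prev, h_hat):
    # Convention in the original paper (and PyTorch/TensorFlow):
    # z keeps the previous state.
    return z * h_prev + (1 - z) * h_hat

rng = np.random.default_rng(1)
z = rng.random(5)                 # gate values in (0, 1)
h_prev = rng.standard_normal(5)
h_hat = rng.standard_normal(5)

# Identical once the gate is relabeled z -> 1 - z.
assert np.allclose(update_article(z, h_prev, h_hat),
                   update_paper(1 - z, h_prev, h_hat))
```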