If you save numerator_t and denominator_t from a previous computation, then h_t = f( (numerator_(t-1) + z + a) / (denominator_(t-1) + a) ). So it should be linear in t?
The model I used is: h_t = f( (numerator_(t-1) + z * e^a) / (denominator_(t-1) + e^a) )
Equations (7) in the arXiv paper (here is the link again: https://arxiv.org/pdf/1703.01253.pdf) give the update equations. When implementing them, you sometimes have to scale both the numerator and the denominator back by a constant factor to avoid an overflow error.
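To make the overflow point concrete, here is a minimal sketch of one running-weighted-average step. The paper only says to rescale by *some* constant factor; the specific scheme below (keeping the accumulators scaled by e^{-a_max}, where a_max is the largest exponent seen so far) is my own assumption, in the spirit of the log-sum-exp trick.

```python
import numpy as np

def rwa_step(num, den, a_max, z, a):
    """One step of h_t = f(num_t / den_t), where each term z_i is
    weighted by e^{a_i}.  num and den are stored scaled by e^{-a_max}
    so that e^{a} is never computed directly (assumed rescaling scheme,
    not taken from the paper)."""
    new_max = max(a_max, a)
    scale = np.exp(a_max - new_max)   # rescale the old accumulators
    w = np.exp(a - new_max)           # weight of the new term
    num = num * scale + z * w
    den = den * scale + w
    return num, den, new_max

# exponents this large would overflow e^a if computed naively
num, den, a_max = 0.0, 0.0, -np.inf
for z, a in [(1.0, 1000.0), (3.0, 1001.0), (5.0, 999.0)]:
    num, den, a_max = rwa_step(num, den, a_max, z, a)
h = np.tanh(num / den)  # f = tanh, as in the paper
```

Because num and den carry the same scale factor, the ratio num/den is unchanged by the rescaling, so h_t is exactly the weighted average you would get with unbounded precision.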
I don't follow the intuition behind exponentiating a, as in e^a. You refer to this as a "context model" in your paper. Could you please elaborate? Thanks!
The name "concept model" is borrowed from one of the papers I cited. The concept model is just "a(x_i,h_{i-1})". It decides what is important. You can sort of think of it like a filter or gate. When the concept model returns a large value, creating a large exponent, the information encoded by "z(x_i,h_{i-1})" dominates the weighted average.
(where z and a are shorthand for z(x_i, h_(i-1)) and a(x_i, h_(i-1)))
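The gating behavior described above can be seen directly in a small example. This is just an illustrative sketch (the values of z and a are made up, not from the paper): when one term's a is much larger than the others, its e^a dominates the weighted average and the result is pulled almost entirely to that term's z.

```python
import math

def weighted_avg(zs, As):
    # softmax-style weighted average: sum(z_i * e^{a_i}) / sum(e^{a_i})
    m = max(As)                        # shift exponents for stability
    ws = [math.exp(a - m) for a in As]
    return sum(z * w for z, w in zip(zs, ws)) / sum(ws)

# three candidate values; the middle one gets a much larger "importance" score
zs = [0.1, 0.9, 0.2]
As = [0.0, 10.0, 0.0]
result = weighted_avg(zs, As)
print(result)  # close to 0.9: the large exponent dominates
```

With equal scores the average would be about 0.4; the large exponent on the middle term instead drives the output to roughly 0.9, which is the "filter or gate" effect of the concept model.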