Efficient event-based delay learning in spiking neural networks

Theory

Learning weights in networks with delay

We start by defining our two differential equations in the implicit form for the membrane potentials and input currents, respectively.

$${{{{\bf{f}}}}}_{V}\equiv {\tau }_{{{{\rm{m}}}}}\dot{{{{\bf{V}}}}}+{{{\bf{V}}}}-{{{\bf{I}}}}=0$$

(7)

$${{{{\bf{f}}}}}_{I}\equiv {\tau }_{{{{\rm{s}}}}}\dot{{{{\bf{I}}}}}+{{{\bf{I}}}}=0$$

(8)
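Between events, both differential equations can be integrated in closed form, which is what makes event-based simulation efficient. The following Python sketch (our illustration; the function and variable names are not from the paper's code) advances V and I of one neuron exactly over an event-free interval, assuming current-based leaky integrate-and-fire dynamics with τm ≠ τs:

```python
import math

def advance(v, i_syn, dt, tau_m, tau_s):
    """Exact solution of tau_m*dV/dt = -V + I and tau_s*dI/dt = -I
    over an event-free interval of length dt (requires tau_m != tau_s)."""
    a = tau_s / (tau_s - tau_m)           # coefficient of the particular solution
    i_new = i_syn * math.exp(-dt / tau_s)
    v_new = (v - a * i_syn) * math.exp(-dt / tau_m) + a * i_new
    return v_new, i_new
```

A cross-check against a fine-grained Euler integration of the same ODEs confirms the closed form.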

In the following, we will assume that all event times \({{{\mathscr{E}}}}\) are distinct, both in terms of spikes occurring and of spikes arriving. In continuous time, coincident events are unlikely and, as argued in ref. 18, the equations do not break down even if spikes occur or arrive at the same time. Then,

$$\frac{{{{\rm{d}}}}{{{\mathscr{L}}}}}{{{{\rm{d}}}}{w}_{ji}}=\frac{{{{\rm{d}}}}}{{{{\rm{d}}}}{w}_{ji}}\left[{l}_{p}({{{\mathscr{S}}}})+{\sum}_{{t}_{k}^{{{{\rm{event}}}}}\in {{{\mathscr{E}}}}}\int_{{t}_{k}^{{{{\rm{event}}}}}}^{{t}_{k+1}^{{{{\rm{event}}}}}}\left[{l}_{V}({{{\bf{V}}}},t)+{{{{\boldsymbol{\lambda }}}}}_{V}\cdot {{{{\bf{f}}}}}_{V}+{{{{\boldsymbol{\lambda }}}}}_{I}\cdot {{{{\bf{f}}}}}_{I}\right]{{{\rm{d}}}}t\right]$$

(9)

where we have added the product of adjoint variables and dynamics functions to the loss function as the adjoint method dictates. This is possible because for solutions of the forward dynamics, fV and fI are identically zero at all times. Using

$$\frac{\partial {{{{\bf{f}}}}}_{V}}{\partial {w}_{ji}}={\tau }_{{{{\rm{m}}}}}\frac{{{{\rm{d}}}}}{{{{\rm{d}}}}t}\frac{\partial {{{\bf{V}}}}}{\partial {w}_{ji}}+\frac{\partial {{{\bf{V}}}}}{\partial {w}_{ji}}-\frac{\partial {{{\bf{I}}}}}{\partial {w}_{ji}}$$

(10)

$$\frac{\partial {{{{\bf{f}}}}}_{I}}{\partial {w}_{ji}}={\tau }_{{{{\rm{s}}}}}\frac{{{{\rm{d}}}}}{{{{\rm{d}}}}t}\frac{\partial {{{\bf{I}}}}}{\partial {w}_{ji}}+\frac{\partial {{{\bf{I}}}}}{\partial {w}_{ji}},$$

(11)

we can apply the derivative on the right-hand side of (9) to obtain

$$\frac{{{{\rm{d}}}}{{{\mathscr{L}}}}}{{{{\rm{d}}}}{w}_{ji}}= {\sum}_{{t}_{k}^{{{{\rm{spike}}}}}\in {{{\mathscr{S}}}}}\frac{\partial {l}_{p}}{\partial {t}_{k}^{{{{\rm{spike}}}}}}\frac{{{{\rm{d}}}}{t}_{k}^{{{{\rm{spike}}}}}}{{{{\rm{d}}}}{w}_{ji}}\\ +{\sum}_{{t}_{k}^{{{{\rm{event}}}}}\in {{{\mathscr{E}}}}}\int_{{t}_{k}^{{{{\rm{event}}}}}}^{{t}_{k+1}^{{{{\rm{event}}}}}}\left[\frac{\partial {l}_{V}}{\partial {{{\bf{V}}}}}\cdot \frac{\partial {{{\bf{V}}}}}{\partial {w}_{ji}}+{{{{\boldsymbol{\lambda }}}}}_{V}\cdot \left({\tau }_{{{{\rm{m}}}}}\frac{{{{\rm{d}}}}}{{{{\rm{d}}}}t}\frac{\partial {{{\bf{V}}}}}{\partial {w}_{ji}}+\frac{\partial {{{\bf{V}}}}}{\partial {w}_{ji}}-\frac{\partial {{{\bf{I}}}}}{\partial {w}_{ji}}\right)\right.\\ \left.+{{{{\boldsymbol{\lambda }}}}}_{I}\cdot \left({\tau }_{{{{\rm{s}}}}}\frac{{{{\rm{d}}}}}{{{{\rm{d}}}}t}\frac{\partial {{{\bf{I}}}}}{\partial {w}_{ji}}+\frac{\partial {{{\bf{I}}}}}{\partial {w}_{ji}}\right)\right]{{{\rm{d}}}}t\\ +{l}_{V,k+1}^{-}\frac{{{{\rm{d}}}}{t}_{k+1}^{{{{\rm{event}}}}}}{{{{\rm{d}}}}{w}_{ji}}-{l}_{V,k}^{+}\frac{{{{\rm{d}}}}{t}_{k}^{{{{\rm{event}}}}}}{{{{\rm{d}}}}{w}_{ji}}$$

(12)

Using partial integration, we can rewrite

$$\int_{{t}_{k}^{{{{\rm{event}}}}}}^{{t}_{k+1}^{{{{\rm{event}}}}}}{{{{\boldsymbol{\lambda }}}}}_{V}\cdot \frac{{{{\rm{d}}}}}{{{{\rm{d}}}}t}\frac{\partial {{{\bf{V}}}}}{\partial {w}_{ji}}{{{\rm{d}}}}t=-\int_{{t}_{k}^{{{{\rm{event}}}}}}^{{t}_{k+1}^{{{{\rm{event}}}}}}{\dot{{{{\boldsymbol{\lambda }}}}}}_{V}\cdot \frac{\partial {{{\bf{V}}}}}{\partial {w}_{ji}}{{{\rm{d}}}}t+{\left[{{{{\boldsymbol{\lambda }}}}}_{V}\cdot \frac{\partial {{{\bf{V}}}}}{\partial {w}_{ji}}\right]}_{{t}_{k}^{{{{\rm{event}}}}}}^{{t}_{k+1}^{{{{\rm{event}}}}}}$$

(13)

and

$$\int_{{t}_{k}^{{{{\rm{event}}}}}}^{{t}_{k+1}^{{{{\rm{event}}}}}}{{{{\boldsymbol{\lambda }}}}}_{I}\cdot \frac{{{{\rm{d}}}}}{{{{\rm{d}}}}t}\frac{\partial {{{\bf{I}}}}}{\partial {w}_{ji}}{{{\rm{d}}}}t=-\int_{{t}_{k}^{{{{\rm{event}}}}}}^{{t}_{k+1}^{{{{\rm{event}}}}}}{\dot{{{{\boldsymbol{\lambda }}}}}}_{I}\cdot \frac{\partial {{{\bf{I}}}}}{\partial {w}_{ji}}{{{\rm{d}}}}t+{\left[{{{{\boldsymbol{\lambda }}}}}_{I}\cdot \frac{\partial {{{\bf{I}}}}}{\partial {w}_{ji}}\right]}_{{t}_{k}^{{{{\rm{event}}}}}}^{{t}_{k+1}^{{{{\rm{event}}}}}}.$$

(14)

Inserting this into (12), we get

$$\frac{{{{\rm{d}}}}{{{\mathscr{L}}}}}{{{{\rm{d}}}}{w}_{ji}}= {\sum}_{{t}_{k}^{{{{\rm{spike}}}}}\in {{{\mathscr{S}}}}}\frac{\partial {l}_{p}}{\partial {t}_{k}^{{{{\rm{spike}}}}}}\frac{{{{\rm{d}}}}{t}_{k}^{{{{\rm{spike}}}}}}{{{{\rm{d}}}}{w}_{ji}}\\ +{\sum}_{{t}_{k}^{{{{\rm{event}}}}}\in {{{\mathscr{E}}}}}\int_{{t}_{k}^{{{{\rm{event}}}}}}^{{t}_{k+1}^{{{{\rm{event}}}}}}\left[\left(\frac{\partial {l}_{V}}{\partial {{{\bf{V}}}}}-{\tau }_{{{{\rm{m}}}}}{\dot{{{{\boldsymbol{\lambda }}}}}}_{V}+{{{{\boldsymbol{\lambda }}}}}_{V}\right)\cdot \frac{\partial {{{\bf{V}}}}}{\partial {w}_{ji}}+(-{\tau }_{{{{\rm{s}}}}}{\dot{{{{\boldsymbol{\lambda }}}}}}_{I}+{{{{\boldsymbol{\lambda }}}}}_{I}-{{{{\boldsymbol{\lambda }}}}}_{V})\cdot \frac{\partial {{{\bf{I}}}}}{\partial {w}_{ji}}\right]{{{\rm{d}}}}t\\ +{\tau }_{{{{\rm{m}}}}}{\left[{{{{\boldsymbol{\lambda }}}}}_{V}\cdot \frac{\partial {{{\bf{V}}}}}{\partial {w}_{ji}}\right]}_{{t}_{k}^{{{{\rm{event}}}}}}^{{t}_{k+1}^{{{{\rm{event}}}}}}+{\tau }_{{{{\rm{s}}}}}{\left[{{{{\boldsymbol{\lambda }}}}}_{I}\cdot \frac{\partial {{{\bf{I}}}}}{\partial {w}_{ji}}\right]}_{{t}_{k}^{{{{\rm{event}}}}}}^{{t}_{k+1}^{{{{\rm{event}}}}}}\\ +{l}_{V,k+1}^{-}\frac{{{{\rm{d}}}}{t}_{k+1}^{{{{\rm{event}}}}}}{{{{\rm{d}}}}{w}_{ji}}-{l}_{V,k}^{+}\frac{{{{\rm{d}}}}{t}_{k}^{{{{\rm{event}}}}}}{{{{\rm{d}}}}{w}_{ji}}$$

(15)

where the last two terms arise from the derivative of the bounds of the integral in the Leibniz rule. We now define the backwards dynamics of the adjoint variables as usual18,

$${\tau }_{{{{\rm{m}}}}}{{{{\boldsymbol{\lambda }}}}}_{V}^{{\prime} }=-{{{{\boldsymbol{\lambda }}}}}_{V}-\frac{\partial {l}_{V}}{\partial {{{\bf{V}}}}}$$

(16)

$${\tau }_{{{{\rm{s}}}}}{{{{\boldsymbol{\lambda }}}}}_{I}^{{\prime} }=-{{{{\boldsymbol{\lambda }}}}}_{I}+{{{{\boldsymbol{\lambda }}}}}_{V}$$

(17)
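Between events (and with lV vanishing there, as for purely spike-time-dependent losses), these adjoint dynamics also admit a closed-form solution in backward time, mirroring the forward dynamics with λV now driving λI. A possible Python sketch (our own construction, again assuming τm ≠ τs):

```python
import math

def advance_adjoint(lam_v, lam_i, dt, tau_m, tau_s):
    """Exact solution of the free adjoint dynamics (16)-(17) with l_V = 0,
    over an event-free backward-time interval dt (requires tau_m != tau_s)."""
    b = tau_m / (tau_m - tau_s)           # coefficient of the particular solution
    lv_new = lam_v * math.exp(-dt / tau_m)
    li_new = (lam_i - b * lam_v) * math.exp(-dt / tau_s) + b * lv_new
    return lv_new, li_new
```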

which cancels the terms containing \(\frac{\partial {{{\bf{V}}}}}{\partial {w}_{ji}}\) and \(\frac{\partial {{{\bf{I}}}}}{\partial {w}_{ji}}\), so that we get

$$\frac{{{{\rm{d}}}}{{{\mathscr{L}}}}}{{{{\rm{d}}}}{w}_{ji}} = {\sum}_{{t}_{k}^{{{{\rm{spike}}}}}\in {{{\mathscr{S}}}}}\frac{\partial {l}_{p}}{\partial {t}_{k}^{{{{\rm{spike}}}}}}\frac{{{{\rm{d}}}}{t}_{k}^{{{{\rm{spike}}}}}}{{{{\rm{d}}}}{w}_{ji}}+{\sum}_{{t}_{k}^{{{{\rm{event}}}}}\in {{{\mathscr{E}}}}}\left({l}_{V,k}^{-}\frac{{{{\rm{d}}}}{t}_{k}^{{{{\rm{event}}}}}}{{{{\rm{d}}}}{w}_{ji}}-{l}_{V,k}^{+}\frac{{{{\rm{d}}}}{t}_{k}^{{{{\rm{event}}}}}}{{{{\rm{d}}}}{w}_{ji}}\right.\\ +\left.{\left.\left[{\tau }_{{{{\rm{m}}}}}\left({{{{\boldsymbol{\lambda }}}}}_{V}^{-}\cdot \frac{\partial {{{{\bf{V}}}}}^{-}}{\partial {w}_{ji}}-{{{{\boldsymbol{\lambda }}}}}_{V}^{+}\cdot \frac{\partial {{{{\bf{V}}}}}^{+}}{\partial {w}_{ji}}\right)+{\tau }_{{{{\rm{s}}}}}\left({{{{\boldsymbol{\lambda }}}}}_{I}^{-}\cdot \frac{\partial {{{{\bf{I}}}}}^{-}}{\partial {w}_{ji}}-{{{{\boldsymbol{\lambda }}}}}_{I}^{+}\cdot \frac{\partial {{{{\bf{I}}}}}^{+}}{\partial {w}_{ji}}\right)\right]\right\vert }_{{t}_{k}^{{{{\rm{event}}}}}}\right)$$

(18)

The sum over events in \({{{\mathscr{E}}}}\) extends over spike emission times \({t}_{k}^{{{{\rm{spike}}}}}\) and spike arrival times. We first focus on the spike emission times \({t}_{k}^{{{{\rm{spike}}}}}\). Before the jump at \({t}_{k}^{{{{\rm{spike}}}}}\) we have,

$${V}_{n(k)}^{-}-\vartheta=0,$$

(19)

where n(k) denotes the spiking neuron at event k. If we take the derivative of this equation, we get, using the chain rule,

$$\frac{\partial {V}_{n(k)}^{-}}{\partial {w}_{ji}}+{\dot{V}}_{n(k)}^{-}\frac{{{{\rm{d}}}}{t}_{k}^{{{{\rm{spike}}}}}}{{{{\rm{d}}}}{w}_{ji}}=0$$

(20)

$$\Rightarrow \quad \frac{{{{\rm{d}}}}{t}_{k}^{{{{\rm{spike}}}}}}{{{{\rm{d}}}}{w}_{ji}}=-\frac{1}{{\dot{V}}_{n(k)}^{-}}\frac{\partial {V}_{n(k)}^{-}}{\partial {w}_{ji}},$$

(21)
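Equation (21) can be verified numerically on a toy example: a single input spike of weight w at t = 0 drives a LIF neuron from rest, so V is known in closed form and the threshold-crossing time can be bisected. The sketch below (our own; all names and parameter values are illustrative) compares the implicit-function-theorem expression with a finite difference of the spike time:

```python
import math

# Toy setup: V(t) = a*w*(exp(-t/tau_s) - exp(-t/tau_m)), a = tau_s/(tau_s - tau_m)
TAU_M, TAU_S, THETA = 1.0, 0.5, 0.2

def v(t, w):
    a = TAU_S / (TAU_S - TAU_M)
    return a * w * (math.exp(-t / TAU_S) - math.exp(-t / TAU_M))

def v_dot(t, w):
    a = TAU_S / (TAU_S - TAU_M)
    return a * w * (math.exp(-t / TAU_M) / TAU_M - math.exp(-t / TAU_S) / TAU_S)

def spike_time(w, lo=1e-9, hi=0.6):
    # bisect for the first threshold crossing (V rises monotonically here)
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if v(mid, w) < THETA else (lo, mid)
    return 0.5 * (lo + hi)

# eq. (21): dt_spike/dw = -(dV/dw)/V_dot, with dV/dw = V/w since V is linear in w
t0 = spike_time(1.0)
grad = -(v(t0, 1.0) / 1.0) / v_dot(t0, 1.0)
```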

and after the jump,

$${V}_{n(k)}^{+}=0$$

(22)

$$\Rightarrow \quad \frac{\partial {V}_{n(k)}^{+}}{\partial {w}_{ji}}+{\dot{V}}_{n(k)}^{+}\frac{{{{\rm{d}}}}{t}_{k}^{{{{\rm{spike}}}}}}{{{{\rm{d}}}}{w}_{ji}}=0\,.$$

(23)

Inserting (21) into (23) we obtain as usual18

$$\frac{\partial {V}_{n(k)}^{+}}{\partial {w}_{ji}}=\frac{{\dot{V}}_{n(k)}^{+}}{{\dot{V}}_{n(k)}^{-}}\frac{\partial {V}_{n(k)}^{-}}{\partial {w}_{ji}}.$$

(24)

For the current In(k), there is no jump at \({t}_{k}^{{{{\rm{spike}}}}}\), nor in its derivative: \({I}_{n(k)}^{+}={I}_{n(k)}^{-}\) and \({\dot{I}}_{n(k)}^{+}={\dot{I}}_{n(k)}^{-}\) imply

$$\frac{\partial {I}_{n(k)}^{+}}{\partial {w}_{ji}}=\frac{\partial {I}_{n(k)}^{-}}{\partial {w}_{ji}}.$$

(25)

Let us now consider what happens at the spike arrival times, when the spike k at \({t}_{k}^{{{{\rm{spike}}}}}\) is received at all the postsynaptic neurons m at times \({t}_{k}^{{{{\rm{spike}}}}}+{d}_{mn(k)}\) (i.e. we look at \({{{\mathscr{E}}}}\setminus {{{\mathscr{S}}}}\)). Note that this is where EventProp with delays becomes substantially different from standard EventProp, where spike emission and arrival times are the same. At spike arrival, the input current of the receiving neurons jumps,

$${I}_{m}^{+}={I}_{m}^{-}+{w}_{mn(k)}.$$

(26)

By taking the derivative with respect to wji, we get

$$\frac{\partial {I}_{m}^{+}}{\partial {w}_{ji}}+{\dot{I}}_{m}^{+}\frac{{{{\rm{d}}}}{t}_{k}^{{{{\rm{spike}}}}}}{{{{\rm{d}}}}{w}_{ji}}=\frac{\partial {I}_{m}^{-}}{\partial {w}_{ji}}+{\dot{I}}_{m}^{-}\frac{{{{\rm{d}}}}{t}_{k}^{{{{\rm{spike}}}}}}{{{{\rm{d}}}}{w}_{ji}}+{\delta }_{in(k)}{\delta }_{jm},$$

(27)

where we have used that \(\frac{{{{\rm{d}}}}({t}_{k}^{{{{\rm{spike}}}}}+{d}_{mn(k)})}{{{{\rm{d}}}}{w}_{ji}}=\frac{{{{\rm{d}}}}{t}_{k}^{{{{\rm{spike}}}}}}{{{{\rm{d}}}}{w}_{ji}}\). Now, using the dynamics equations for I, we also have

$${\tau }_{{{{\rm{s}}}}}{\dot{I}}_{m}^{+}={\tau }_{{{{\rm{s}}}}}{\dot{I}}_{m}^{-}-{w}_{mn(k)},$$

(28)

and hence,

$$\frac{\partial {I}_{m}^{+}}{\partial {w}_{ji}}= \frac{\partial {I}_{m}^{-}}{\partial {w}_{ji}}+{\tau }_{{{{\rm{s}}}}}^{-1}{w}_{mn(k)}\frac{{{{\rm{d}}}}{t}_{k}^{{{{\rm{spike}}}}}}{{{{\rm{d}}}}{w}_{ji}}+{\delta }_{in(k)}{\delta }_{jm}\\ = \frac{\partial {I}_{m}^{-}}{\partial {w}_{ji}}+{\left.\left[\frac{1}{{\tau }_{{{{\rm{s}}}}}{\dot{V}}_{n(k)}^{-}}{w}_{mn(k)}\frac{\partial {V}_{n(k)}^{-}}{\partial {w}_{ji}}\right]\right\vert }_{{t}_{k}^{{{{\rm{spike}}}}}+{d}_{mn(k)}}+{\delta }_{in(k)}{\delta }_{jm}$$

(29)

where we have used (21) to replace \(\frac{{{{\rm{d}}}}{t}_{k}^{{{{\rm{spike}}}}}}{{{{\rm{d}}}}{w}_{ji}}\). Since we have \({V}_{m}^{+}={V}_{m}^{-}\) for non-spiking neurons,

$$\frac{\partial {V}_{m}^{+}}{\partial {w}_{ji}}+{\dot{V}}_{m}^{+}\frac{{{{\rm{d}}}}{t}_{k}^{{{{\rm{spike}}}}}}{{{{\rm{d}}}}{w}_{ji}}=\frac{\partial {V}_{m}^{-}}{\partial {w}_{ji}}+{\dot{V}}_{m}^{-}\frac{{{{\rm{d}}}}{t}_{k}^{{{{\rm{spike}}}}}}{{{{\rm{d}}}}{w}_{ji}}.$$

(30)

From Eq. (26) and the dynamics equations for V we know

$${\tau }_{{{{\rm{m}}}}}{\dot{V}}_{m}^{+}={\tau }_{{{{\rm{m}}}}}{\dot{V}}_{m}^{-}+{w}_{mn(k)}.$$

(31)

Putting this together, we get

$$\frac{\partial {V}_{m}^{+}}{\partial {w}_{ji}}=\frac{\partial {V}_{m}^{-}}{\partial {w}_{ji}}-{\tau }_{{{{\rm{m}}}}}^{-1}{w}_{mn(k)}\frac{{{{\rm{d}}}}{t}_{k}^{{{{\rm{event}}}}}}{{{{\rm{d}}}}{w}_{ji}}$$

(32)

$$=\frac{\partial {V}_{m}^{-}}{\partial {w}_{ji}}+{\left.\left[\frac{1}{{\tau }_{{{{\rm{m}}}}}{\dot{V}}_{n(k)}^{-}}{w}_{mn(k)}\frac{\partial {V}_{n(k)}^{-}}{\partial {w}_{ji}}\right]\right\vert }_{{t}_{k}^{{{{\rm{spike}}}}}+{d}_{mn(k)}}$$

(33)

We can now insert the expressions (21), (24), (25) and (33) into (18) and reorder the terms according to the spike from which each jump originates, obtaining

$$\frac{{{{\rm{d}}}}{{{\mathscr{L}}}}}{{{{\rm{d}}}}{w}_{ji}} = {\sum}_{{t}_{k}^{{{{\rm{spike}}}}}\in {{{\mathscr{S}}}}}\left[\frac{\partial {V}_{n(k)}^{-}}{\partial {w}_{ji}}\left[{\tau }_{{{{\rm{m}}}}}\left({\lambda }_{V,n(k)}^{-}-\frac{{\dot{V}}_{n(k)}^{+}}{{\dot{V}}_{n(k)}^{-}}{\lambda }_{V,n(k)}^{+}\right)+\frac{1}{{\dot{V}}_{n(k)}^{-}}\left(-\frac{\partial {l}_{p}}{\partial {t}_{k}^{{{{\rm{spike}}}}}}+{l}_{V}^{+}-{l}_{V}^{-}\right)\right]\right.\\ {\left.\left.+{\tau }_{{{{\rm{s}}}}}({\lambda }_{I,n(k)}^{-}-{\lambda }_{I,n(k)}^{+})\frac{\partial {I}_{n(k)}^{-}}{\partial {w}_{ji}}\right]\right\vert }_{{t}_{k}^{{{{\rm{spike}}}}}}\\ +{\sum}_{m}\left[{\tau }_{{{{\rm{m}}}}}({\lambda }_{V,m}^{-}-{\lambda }_{V,m}^{+})\frac{\partial {V}_{m}^{-}}{\partial {w}_{ji}}+{\tau }_{{{{\rm{s}}}}}({\lambda }_{I,m}^{-}-{\lambda }_{I,m}^{+})\frac{\partial {I}_{m}^{-}}{\partial {w}_{ji}}\right]{| }_{{t}_{k}^{{{{\rm{spike}}}}}+{d}_{mn(k)}}\\ +{\left.\left[\frac{\partial {V}_{n(k)}^{-}}{\partial {w}_{ji}}\frac{1}{{\dot{V}}_{n(k)}^{-}}\right]\right\vert }_{{t}_{k}^{{{{\rm{spike}}}}}}{\left.\left[{w}_{mn(k)}({\lambda }_{I,m}^{+}-{\lambda }_{V,m}^{+})\right]\right\vert }_{{t}_{k}^{{{{\rm{spike}}}}}+{d}_{mn(k)}}-{\left.\left[{\tau }_{{{{\rm{s}}}}}{\delta }_{in(k)}{\delta }_{jm}{\lambda }_{I,m}^{+}\right]\right\vert }_{{t}_{k}^{{{{\rm{spike}}}}}+{d}_{mn(k)}}.$$

(34)

Interestingly, after this detailed work, we find that the update of λV of the spiking neuron is the same as without delays, apart from taking the receiving neurons’ corresponding λV and λI at the delayed times:

$${\lambda }_{V,n(k)}^{-} = {\left.\left[\frac{{\dot{V}}_{n(k)}^{+}}{{\dot{V}}_{n(k)}^{-}}{\lambda }_{V,n(k)}^{+}+\frac{1}{{\tau }_{{{{\rm{m}}}}}{\dot{V}}_{n(k)}^{-}}\left(\frac{\partial {l}_{p}}{\partial {t}_{k}^{{{{\rm{spike}}}}}}+{l}_{V}^{-}-{l}_{V}^{+}\right)\right]\right\vert }_{{t}_{k}^{{{{\rm{spike}}}}}}\\ +{\left.\left[\frac{1}{{\tau }_{{{{\rm{m}}}}}{\dot{V}}_{n(k)}^{-}}\right]\right\vert }_{{t}_{k}^{{{{\rm{spike}}}}}}{\sum}_{m}{w}_{mn(k)}{\left.\left[({\lambda }_{V,m}^{+}-{\lambda }_{I,m}^{+})\right]\right\vert }_{{t}_{k}^{{{{\rm{spike}}}}}+{d}_{mn(k)}}$$

(35)

$${\lambda }_{V,m}^{-}={\lambda }_{V,m}^{+},\,{{\mbox{if}}}\,\,m\ne n(k)$$

(36)

$${{{{\boldsymbol{\lambda }}}}}_{I}^{-}={{{{\boldsymbol{\lambda }}}}}_{I}^{+}.$$

(37)

The gradient is then given by

$$\frac{{{{\rm{d}}}}{{{\mathscr{L}}}}}{{{{\rm{d}}}}{w}_{ji}}=-{\tau }_{{{{\rm{s}}}}}{\sum}_{{t}_{k}^{{{{\rm{spike}}}}}\in {{{\mathscr{S}}}}}{\delta }_{in(k)}{\lambda }_{I,j}{| }_{{t}_{k}^{{{{\rm{spike}}}}}+{d}_{jn(k)}}=-{\tau }_{{{{\rm{s}}}}}{\sum}_{\left\{{t}_{k}^{{{{\rm{spike}}}}}\,| n(k)=i\right\}}{\lambda }_{I,j}{| }_{{t}_{k}^{{{{\rm{spike}}}}}+{d}_{ji}}.$$

(38)
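In an implementation, eq. (38) amounts to a simple accumulation over the spikes of the presynaptic neuron. A minimal sketch with hypothetical data layout (`spikes` holds (time, neuron) pairs, `d` maps (post, pre) pairs to delays, and `lam_i_of` stands in for whatever mechanism supplies λI,j at the requested time):

```python
def weight_gradient(i, j, spikes, d, lam_i_of, tau_s):
    """dL/dw_ji per eq. (38): accumulate -tau_s * lambda_I of postsynaptic
    neuron j, sampled at spike time + delay d_ji, over all spikes emitted
    by presynaptic neuron i."""
    return -tau_s * sum(lam_i_of(j, t + d[(j, i)])
                        for t, n in spikes if n == i)
```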

Learning delays

In the following, we will derive the gradients for delays dji similarly to our weight gradient derivations. We start again with the standard approach for the adjoint method,

$$\frac{{{{\rm{d}}}}{{{\mathscr{L}}}}}{{{{\rm{d}}}}{d}_{ji}}=\frac{{{{\rm{d}}}}}{{{{\rm{d}}}}{d}_{ji}}\left[{l}_{p}({{{\mathscr{S}}}})+{\sum}_{{t}_{k}^{{{{\rm{event}}}}}\in {{{\mathscr{E}}}}}\int_{{t}_{k}^{{{{\rm{event}}}}}}^{{t}_{k+1}^{{{{\rm{event}}}}}}\left[{l}_{V}({{{\bf{V}}}},t)+{{{{\boldsymbol{\lambda }}}}}_{V}\cdot {{{{\bf{f}}}}}_{V}+{{{{\boldsymbol{\lambda }}}}}_{I}\cdot {{{{\bf{f}}}}}_{I}\right]{{{\rm{d}}}}t\right]$$

(39)

$$\frac{\partial {{{{\bf{f}}}}}_{V}}{\partial {d}_{ji}}={\tau }_{{{{\rm{m}}}}}\frac{{{{\rm{d}}}}}{{{{\rm{d}}}}t}\frac{\partial {{{\bf{V}}}}}{\partial {d}_{ji}}+\frac{\partial {{{\bf{V}}}}}{\partial {d}_{ji}}-\frac{\partial {{{\bf{I}}}}}{\partial {d}_{ji}}$$

(40)

$$\frac{\partial {{{{\bf{f}}}}}_{I}}{\partial {d}_{ji}}={\tau }_{{{{\rm{s}}}}}\frac{{{{\rm{d}}}}}{{{{\rm{d}}}}t}\frac{\partial {{{\bf{I}}}}}{\partial {d}_{ji}}+\frac{\partial {{{\bf{I}}}}}{\partial {d}_{ji}}.$$

(41)

Therefore,

$$\frac{{{{\rm{d}}}}{{{\mathscr{L}}}}}{{{{\rm{d}}}}{d}_{ji}}= {\sum}_{{t}_{k}^{{{{\rm{spike}}}}}\in {{{\mathscr{S}}}}}\frac{\partial {l}_{p}}{\partial {t}_{k}^{{{{\rm{spike}}}}}}\frac{{{{\rm{d}}}}{t}_{k}^{{{{\rm{spike}}}}}}{{{{\rm{d}}}}{d}_{ji}}\\ +{\sum}_{{t}_{k}^{{{{\rm{event}}}}}\in {{{\mathscr{E}}}}}\int_{{t}_{k}^{{{{\rm{event}}}}}}^{{t}_{k+1}^{{{{\rm{event}}}}}}\left[\frac{\partial {l}_{V}}{\partial {{{\bf{V}}}}}\cdot \frac{\partial {{{\bf{V}}}}}{\partial {d}_{ji}}+{{{{\boldsymbol{\lambda }}}}}_{V}\cdot \left({\tau }_{{{{\rm{m}}}}}\frac{{{{\rm{d}}}}}{{{{\rm{d}}}}t}\frac{\partial {{{\bf{V}}}}}{\partial {d}_{ji}}+\frac{\partial {{{\bf{V}}}}}{\partial {d}_{ji}}-\frac{\partial {{{\bf{I}}}}}{\partial {d}_{ji}}\right)\right.\\ \left.+{{{{\boldsymbol{\lambda }}}}}_{I}\cdot \left({\tau }_{{{{\rm{s}}}}}\frac{{{{\rm{d}}}}}{{{{\rm{d}}}}t}\frac{\partial {{{\bf{I}}}}}{\partial {d}_{ji}}+\frac{\partial {{{\bf{I}}}}}{\partial {d}_{ji}}\right)\right]{{{\rm{d}}}}t\\ +{l}_{V,k+1}^{-}\frac{{{{\rm{d}}}}{t}_{k+1}^{{{{\rm{event}}}}}}{{{{\rm{d}}}}{d}_{ji}}-{l}_{V,k}^{+}\frac{{{{\rm{d}}}}{t}_{k}^{{{{\rm{event}}}}}}{{{{\rm{d}}}}{d}_{ji}}.$$

(42)

Then, using partial integration,

$$\int_{{t}_{k}^{{{{\rm{event}}}}}}^{{t}_{k+1}^{{{{\rm{event}}}}}}{{{{\boldsymbol{\lambda }}}}}_{V}\cdot \frac{{{{\rm{d}}}}}{{{{\rm{d}}}}t}\frac{\partial {{{\bf{V}}}}}{\partial {d}_{ji}}{{{\rm{d}}}}t=-\int_{{t}_{k}^{{{{\rm{event}}}}}}^{{t}_{k+1}^{{{{\rm{event}}}}}}{\dot{{{{\boldsymbol{\lambda }}}}}}_{{{{\bf{V}}}}}\cdot \frac{\partial {{{\bf{V}}}}}{\partial {d}_{ji}}{{{\rm{d}}}}t+{\left[{{{{\boldsymbol{\lambda }}}}}_{V}\cdot \frac{\partial {{{\bf{V}}}}}{\partial {d}_{ji}}\right]}_{{t}_{k}^{{{{\rm{event}}}}}}^{{t}_{k+1}^{{{{\rm{event}}}}}}$$

(43)

$$\int_{{t}_{k}^{{{{\rm{event}}}}}}^{{t}_{k+1}^{{{{\rm{event}}}}}}{{{{\boldsymbol{\lambda }}}}}_{I}\cdot \frac{{{{\rm{d}}}}}{{{{\rm{d}}}}t}\frac{\partial {{{\bf{I}}}}}{\partial {d}_{ji}}{{{\rm{d}}}}t=-\int_{{t}_{k}^{{{{\rm{event}}}}}}^{{t}_{k+1}^{{{{\rm{event}}}}}}{\dot{{{{\boldsymbol{\lambda }}}}}}_{I}\cdot \frac{\partial {{{\bf{I}}}}}{\partial {d}_{ji}}{{{\rm{d}}}}t+{\left[{{{{\boldsymbol{\lambda }}}}}_{I}\cdot \frac{\partial {{{\bf{I}}}}}{\partial {d}_{ji}}\right]}_{{t}_{k}^{{{{\rm{event}}}}}}^{{t}_{k+1}^{{{{\rm{event}}}}}}$$

(44)

and hence,

$$\frac{{{{\rm{d}}}}{{{\mathscr{L}}}}}{{{{\rm{d}}}}{d}_{ji}}= {\sum}_{{t}_{k}^{{{{\rm{spike}}}}}\in {{{\mathscr{S}}}}}\frac{\partial {l}_{p}}{\partial {t}_{k}^{{{{\rm{spike}}}}}}\frac{{{{\rm{d}}}}{t}_{k}^{{{{\rm{spike}}}}}}{{{{\rm{d}}}}{d}_{ji}}\\ +{\sum}_{{t}_{k}^{{{{\rm{event}}}}}\in {{{\mathscr{E}}}}}\left[\int_{{t}_{k}^{{{{\rm{event}}}}}}^{{t}_{k+1}^{{{{\rm{event}}}}}}\left(\frac{\partial {l}_{V}}{\partial {{{\bf{V}}}}}-{\tau }_{{{{\rm{m}}}}}{\dot{{{{\boldsymbol{\lambda }}}}}}_{V}+{{{{\boldsymbol{\lambda }}}}}_{V}\right)\cdot \frac{\partial {{{\bf{V}}}}}{\partial {d}_{ji}}+(-{\tau }_{{{{\rm{s}}}}}{\dot{{{{\boldsymbol{\lambda }}}}}}_{I}+{{{{\boldsymbol{\lambda }}}}}_{I}-{{{{\boldsymbol{\lambda }}}}}_{V})\cdot \frac{\partial {{{\bf{I}}}}}{\partial {d}_{ji}}\right]{{{\rm{d}}}}t\\ +{\tau }_{{{{\rm{m}}}}}{\left[{{{{\boldsymbol{\lambda }}}}}_{V}\cdot \frac{\partial {{{\bf{V}}}}}{\partial {d}_{ji}}\right]}_{{t}_{k}^{{{{\rm{event}}}}}}^{{t}_{k+1}^{{{{\rm{event}}}}}}+{\tau }_{{{{\rm{s}}}}}{\left[{{{{\boldsymbol{\lambda }}}}}_{I}\cdot \frac{\partial {{{\bf{I}}}}}{\partial {d}_{ji}}\right]}_{{t}_{k}^{{{{\rm{event}}}}}}^{{t}_{k+1}^{{{{\rm{event}}}}}}+{l}_{V,k+1}^{-}\frac{{{{\rm{d}}}}{t}_{k+1}^{{{{\rm{event}}}}}}{{{{\rm{d}}}}{d}_{ji}}-{l}_{V,k}^{+}\frac{{{{\rm{d}}}}{t}_{k}^{{{{\rm{event}}}}}}{{{{\rm{d}}}}{d}_{ji}}.$$

(45)

If we now define the adjoint dynamics as usual, the terms in the integral disappear, and we are left with

$$\frac{{{{\rm{d}}}}{{{\mathscr{L}}}}}{{{{\rm{d}}}}{d}_{ji}}= {\sum}_{{t}_{k}^{{{{\rm{spike}}}}}\in {{{\mathscr{S}}}}}\frac{\partial {l}_{p}}{\partial {t}_{k}^{{{{\rm{spike}}}}}}\frac{{{{\rm{d}}}}{t}_{k}^{{{{\rm{spike}}}}}}{{{{\rm{d}}}}{d}_{ji}}\\ +{\sum}_{{t}_{k}^{{{{\rm{event}}}}}\in {{{\mathscr{E}}}}}{l}_{V,k}^{-}\frac{{{{\rm{d}}}}{t}_{k}^{{{{\rm{event}}}}}}{{{{\rm{d}}}}{d}_{ji}}-{l}_{V,k}^{+}\frac{{{{\rm{d}}}}{t}_{k}^{{{{\rm{event}}}}}}{{{{\rm{d}}}}{d}_{ji}} \\ +{\left.\left[{\tau }_{{{{\rm{m}}}}}\left({{{{\boldsymbol{\lambda }}}}}_{V}^{-}\cdot \frac{\partial {{{{\bf{V}}}}}^{-}}{\partial {d}_{ji}}-{{{{\boldsymbol{\lambda }}}}}_{V}^{+}\cdot \frac{\partial {{{{\bf{V}}}}}^{+}}{\partial {d}_{ji}}\right)+{\tau }_{{{{\rm{s}}}}}\left({{{{\boldsymbol{\lambda }}}}}_{I}^{-}\cdot \frac{\partial {{{{\bf{I}}}}}^{-}}{\partial {d}_{ji}}-{{{{\boldsymbol{\lambda }}}}}_{I}^{+}\cdot \frac{\partial {{{{\bf{I}}}}}^{+}}{\partial {d}_{ji}}\right)\right]\right\vert }_{{t}_{k}^{{{{\rm{event}}}}}}.$$

(46)

Let us now first consider the spike emission times \({t}_{k}^{{{{\rm{spike}}}}}\) and the spiking neuron n(k) again. Before the jump:

$$\frac{\partial {V}_{n(k)}^{-}}{\partial {d}_{ji}}+{\dot{V}}_{n(k)}^{-}\frac{{{{\rm{d}}}}{t}_{k}^{{{{\rm{spike}}}}}}{{{{\rm{d}}}}{d}_{ji}}=0$$

(47)

$$\Rightarrow \quad \frac{{{{\rm{d}}}}{t}_{k}^{{{{\rm{spike}}}}}}{{{{\rm{d}}}}{d}_{ji}}=-\frac{1}{{\dot{V}}_{n(k)}^{-}}\frac{\partial {V}_{n(k)}^{-}}{\partial {d}_{ji}},$$

(48)

and after the jump:

$$\frac{\partial {V}_{n(k)}^{+}}{\partial {d}_{ji}}+{\dot{V}}_{n(k)}^{+}\frac{{{{\rm{d}}}}{t}_{k}^{{{{\rm{spike}}}}}}{{{{\rm{d}}}}{d}_{ji}}=0$$

(49)

$$\Rightarrow \quad \frac{\partial {V}_{n(k)}^{+}}{\partial {d}_{ji}}=\frac{{\dot{V}}_{n(k)}^{+}}{{\dot{V}}_{n(k)}^{-}}\frac{\partial {V}_{n(k)}^{-}}{\partial {d}_{ji}}.$$

(50)

There is no jump in In(k) or its time derivative at \({t}_{k}^{{{{\rm{spike}}}}}\), which, analogously to the above, implies

$$\frac{\partial {I}_{n(k)}^{+}}{\partial {d}_{ji}}=\frac{\partial {I}_{n(k)}^{-}}{\partial {d}_{ji}}.$$

(51)

Turning to spike arrival times \({t}_{k}^{{{{\rm{event}}}}}\in {{{\mathscr{E}}}}\backslash {{{\mathscr{S}}}}\), when the spike at \({t}_{k}^{{{{\rm{spike}}}}}\) arrives at the post-synaptic neurons m, we get

$${I}_{m}^{+}={I}_{m}^{-}+{w}_{mn(k)},$$

(52)

and hence,

$$\frac{\partial {I}_{m}^{+}}{\partial {d}_{ji}}+{\dot{I}}_{m}^{+}\frac{{{{\rm{d}}}}{t}_{k}^{{{{\rm{event}}}}}}{{{{\rm{d}}}}{d}_{ji}}=\frac{\partial {I}_{m}^{-}}{\partial {d}_{ji}}+{\dot{I}}_{m}^{-}\frac{{{{\rm{d}}}}{t}_{k}^{{{{\rm{event}}}}}}{{{{\rm{d}}}}{d}_{ji}}.$$

(53)

Using the dynamics of I, (52) implies

$${\tau }_{{{{\rm{s}}}}}{\dot{I}}_{m}^{+}={\tau }_{{{{\rm{s}}}}}{\dot{I}}_{m}^{-}-{w}_{mn(k)},$$

(54)

and hence

$$\frac{\partial {I}_{m}^{+}}{\partial {d}_{ji}}=\frac{\partial {I}_{m}^{-}}{\partial {d}_{ji}}+{\tau }_{{{{\rm{s}}}}}^{-1}{w}_{mn(k)}\frac{{{{\rm{d}}}}{t}_{k}^{{{{\rm{event}}}}}}{{{{\rm{d}}}}{d}_{ji}}$$

(55)

$$=\frac{\partial {I}_{m}^{-}}{\partial {d}_{ji}}-\frac{1}{{\tau }_{{{{\rm{s}}}}}{\dot{V}}_{n(k)}^{-}}{w}_{mn(k)}\frac{\partial {V}_{n(k)}^{-}}{\partial {d}_{ji}}+{\delta }_{in(k)}{\delta }_{jm}\frac{{w}_{mn(k)}}{{\tau }_{{{{\rm{s}}}}}},$$

(56)

where the term involving the spiking neuron n(k) stems from the derivative of the spike time \({t}_{k}^{{{{\rm{event}}}}}\) with respect to dji using (48), and the last term stems from the derivative of the delay itself (since \(\frac{\partial {t}_{k}^{{{{\rm{event}}}}}}{\partial {d}_{ji}}=\frac{\partial ({t}_{k}^{{{{\rm{spike}}}}}+{d}_{ji})}{\partial {d}_{ji}}=\frac{\partial {t}_{k}^{{{{\rm{spike}}}}}}{\partial {d}_{ji}}+1\)). Note that this is where the derivations begin to differ from those for the derivative with respect to wji. For the voltages,

$$\frac{\partial {V}_{m}^{+}}{\partial {d}_{ji}}+{\dot{V}}_{m}^{+}\frac{{{{\rm{d}}}}{t}_{k}^{{{{\rm{event}}}}}}{{{{\rm{d}}}}{d}_{ji}}=\frac{\partial {V}_{m}^{-}}{\partial {d}_{ji}}+{\dot{V}}_{m}^{-}\frac{{{{\rm{d}}}}{t}_{k}^{{{{\rm{event}}}}}}{{{{\rm{d}}}}{d}_{ji}},$$

(57)

and using the dynamics of V and (52),

$${\tau }_{{{{\rm{m}}}}}{\dot{V}}_{m}^{+}={\tau }_{{{{\rm{m}}}}}{\dot{V}}_{m}^{-}+{w}_{mn(k)},$$

(58)

which put together gives

$$\frac{\partial {V}_{m}^{+}}{\partial {d}_{ji}}=\frac{\partial {V}_{m}^{-}}{\partial {d}_{ji}}-{\tau }_{{{{\rm{m}}}}}^{-1}{w}_{mn(k)}\frac{{{{\rm{d}}}}{t}_{k}^{{{{\rm{event}}}}}}{{{{\rm{d}}}}{d}_{ji}}$$

(59)

$$=\frac{\partial {V}_{m}^{-}}{\partial {d}_{ji}}+\frac{1}{{\tau }_{{{{\rm{m}}}}}{\dot{V}}_{n(k)}^{-}}{w}_{mn(k)}\frac{\partial {V}_{n(k)}^{-}}{\partial {d}_{ji}}-{\delta }_{in(k)}{\delta }_{jm}\frac{{w}_{mn(k)}}{{\tau }_{{{{\rm{m}}}}}},$$

(60)

where the last term again arises from the derivative of the delay dmn(k) in \({t}_{k}^{{{{\rm{event}}}}}\) with respect to dji. Taking everything together, we get

$$\frac{d{{{\mathscr{L}}}}}{d{d}_{ji}}={\sum}_{{t}_{k}^{{{{\rm{spike}}}}}\in {{{\mathscr{S}}}}}\left[\frac{\partial {V}_{n(k)}^{-}}{\partial {d}_{ji}}\left[{\tau }_{{{{\rm{m}}}}}\left({\lambda }_{V,n(k)}^{-}-\frac{{\dot{V}}_{n(k)}^{+}}{{\dot{V}}_{n(k)}^{-}}{\lambda }_{V,n(k)}^{+}\right)+\frac{1}{{\dot{V}}_{n(k)}^{-}}\left(-\frac{\partial {l}_{p}}{\partial {t}_{k}^{{{{\rm{spike}}}}}}+{l}_{V}^{+}-{l}_{V}^{-}\right)\right]\right.$$

(61)

$${\left.\left.+{\tau }_{{{{\rm{s}}}}}({\lambda }_{I,n(k)}^{-}-{\lambda }_{I,n(k)}^{+})\frac{\partial {I}_{n(k)}^{-}}{\partial {d}_{ji}}\right]\right\vert }_{{t}_{k}^{{{{\rm{spike}}}}}}$$

(62)

$$+{\sum}_{m}{\left.\left[{\tau }_{{{{\rm{m}}}}}({\lambda }_{V,m}^{-}-{\lambda }_{V,m}^{+})\frac{\partial {V}_{m}^{-}}{\partial {d}_{ji}}+{\tau }_{{{{\rm{s}}}}}({\lambda }_{I,m}^{-}-{\lambda }_{I,m}^{+})\frac{\partial {I}_{m}^{-}}{\partial {d}_{ji}}\right]\right\vert }_{{t}_{k}^{{{{\rm{spike}}}}}+{d}_{mn(k)}}$$

(63)

$$+{\left.\left[\frac{\partial {V}_{n(k)}^{-}}{\partial {d}_{ji}}\frac{1}{{\dot{V}}_{n(k)}^{-}}\right]\right\vert }_{{t}_{k}^{{{{\rm{spike}}}}}}{\left.\left[{w}_{mn(k)}({\lambda }_{I,m}^{+}-{\lambda }_{V,m}^{+})\right]\right\vert }_{{t}_{k}^{{{{\rm{spike}}}}}+{d}_{mn(k)}}$$

(64)

$$-{\left.\left[{w}_{mn(k)}{\delta }_{in(k)}{\delta }_{jm}({\lambda }_{I,m}^{+}-{\lambda }_{V,m}^{+})\right]\right\vert }_{{t}_{k}^{{{{\rm{spike}}}}}+{d}_{mn(k)}}.$$

(65)

So, using the usual identity (which follows from the reset: at the spike, \({V}_{n(k)}\) jumps from \(\vartheta\) to 0 while \({I}_{n(k)}\) is continuous, so \({\tau }_{{{{\rm{m}}}}}({\dot{V}}_{n(k)}^{+}-{\dot{V}}_{n(k)}^{-})=\vartheta\)),

$$\frac{{\dot{V}}_{n(k)}^{+}}{{\dot{V}}_{n(k)}^{-}}=\frac{\vartheta }{{\tau }_{{{{\rm{m}}}}}{\dot{V}}_{n(k)}^{-}}+1,$$

(66)

we again arrive at the same jump conditions as usual,

$$\begin{array}{rcl}{\lambda }_{V,n(k)}^{-}&=&{\left.\left[{\lambda }_{V,n(k)}^{+}+\frac{1}{{\tau }_{{{{\rm{m}}}}}{\dot{V}}_{n(k)}^{-}}\left[\vartheta \cdot {\lambda }_{V,n(k)}^{+}+\frac{\partial {l}_{p}}{\partial {t}_{k}^{{{{\rm{spike}}}}}}+{l}_{V}^{-}-{l}_{V}^{+}\right]\right]\right\vert }_{{t}_{k}^{{{{\rm{spike}}}}}}\\ &&+{\left.\left[\frac{1}{{\tau }_{{{{\rm{m}}}}}{\dot{V}}_{n(k)}^{-}}\right]\right\vert }_{{t}_{k}^{{{{\rm{spike}}}}}}{\sum}_{m}{w}_{mn(k)}{\left.\left[{\lambda }_{V,m}^{+}-{\lambda }_{I,m}^{+}\right]\right\vert }_{{t}_{k}^{{{{\rm{spike}}}}}+{d}_{mn(k)}}\end{array}$$

(67)

$${\lambda }_{V,m}^{-}={\lambda }_{V,m}^{+},\,{{\mbox{if}}}\,\,m\ne n(k)$$

(68)

$${{{{\boldsymbol{\lambda }}}}}_{I}^{-}={{{{\boldsymbol{\lambda }}}}}_{I}^{+},$$

(69)

but the gradient updates take the form

$$\frac{{{{\rm{d}}}}{{{\mathscr{L}}}}}{{{{\rm{d}}}}{d}_{ji}}=-{\sum}_{{t}_{k}^{{{{\rm{spike}}}}}\in {{{\mathscr{S}}}}}{w}_{ji}{\delta }_{in(k)}{\left.({\lambda }_{I,j}-{\lambda }_{V,j})\right\vert }_{{t}_{k}^{{{{\rm{spike}}}}}+{d}_{jn(k)}} \\=-{w}_{ji}{\sum}_{\left\{{t}_{k}^{{{{\rm{spike}}}}}\,| n(k)=i\right\}}{\left.({\lambda }_{I,j}-{\lambda }_{V,j})\right\vert }_{{t}_{k}^{{{{\rm{spike}}}}}+{d}_{ji}}.$$

(70)
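Analogously to the weight case, eq. (70) can be transcribed directly (same hypothetical conventions as for the weight gradient: `spikes` holds (time, neuron) pairs, `w` and `d` map (post, pre) pairs to weights and delays, and the `lam_*_of` callables supply the adjoint variables of neuron j at the requested time):

```python
def delay_gradient(i, j, spikes, w, d, lam_i_of, lam_v_of):
    """dL/dd_ji per eq. (70): accumulate -w_ji * (lambda_I,j - lambda_V,j),
    sampled at spike time + d_ji, over all spikes of presynaptic neuron i."""
    return -w[(j, i)] * sum(
        lam_i_of(j, t + d[(j, i)]) - lam_v_of(j, t + d[(j, i)])
        for t, n in spikes if n == i)
```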

Time-invariant mean squared error loss

Following Göltz et al.39, we use the time-invariant mean squared error loss of output spike times for the Yin-Yang benchmark

$${{{{\mathscr{L}}}}}_{\Delta {{{\rm{MSE}}}}}=\frac{1}{2}{\sum}_{i\ne c}{\left({t}_{i}-{t}_{c}-{\Delta }_{t}\right)}^{2},$$

(71)

where c is the true class of the current input and ti, tc denote the first spike time in the respective output neurons. In the EventProp formalism, this is a spike-time dependent loss lp and, therefore, drives jumps in λV,i in output neuron i at spike times \({t}_{k}^{{{{\rm{spike}}}}}\) in the backward pass (see Table 1) by

$$\frac{\partial {l}_{p}}{\partial {t}_{k}^{{{{\rm{spike}}}}}}=\left\{\begin{array}{ll}({t}_{i}-{t}_{c}-{\Delta }_{t})&{{{\rm{if}}}}\,n(k)=i,{t}_{k}^{{{{\rm{spike}}}}}={t}_{i},i\ne c\\ {\sum}_{i\ne c}-({t}_{i}-{t}_{c}-{\Delta }_{t})&{{{\rm{if}}}}\,n(k)=c,{t}_{k}^{{{{\rm{spike}}}}}={t}_{c}\\ 0&\,{\mbox{otherwise}}\,\end{array}\right.$$

(72)
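Equation (72) translates into a small helper; in this sketch, `t_first` (a hypothetical name) maps each output neuron to its first spike time:

```python
def dlp_dtspike(n_k, t_k, t_first, c, delta_t):
    """Eq. (72): derivative of the Delta-MSE loss w.r.t. an output spike at
    time t_k emitted by output neuron n_k; c is the true class. Only first
    spikes of output neurons contribute."""
    if n_k != c and t_k == t_first[n_k]:
        return t_first[n_k] - t_first[c] - delta_t
    if n_k == c and t_k == t_first[c]:
        return sum(-(t_first[i] - t_first[c] - delta_t)
                   for i in t_first if i != c)
    return 0.0
```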

Implementation

We implemented all of our work in the mlGeNN framework40,41 to exploit the efficiency of event-based learning. In all of our experiments, we used the parameters from previous EventProp work21, apart from the spike regularisation strengths, the number of hidden layers and the recurrent connections. We did not implement heterogeneous or trainable time constants, so that the effect of delays on their own would be clearer. For our experiments on the SHD and SSC datasets, we adopted the data augmentation approaches described by Nowotny, Turner, and Knight21, which were designed to improve generalisation. Specifically, we implemented the following augmentations:

  • Input Shifting: We randomly shifted all inputs by a value within the range of (−40, 40).

  • Input Blending: We blended two inputs from the same class by aligning their centres of mass and randomly selecting spikes from each input with a probability of 0.5.

For SSC we only used the shift augmentation. For the Yin-Yang dataset we decreased the learning rate on both weights and delays at the end of each epoch. On SHD and SSC, we implemented an “ease-in” scheduler on the weight learning rate, starting at 0.001 times the final learning rate and increasing it at the end of each batch until it reached its final value. For our chosen hyperparameters, see Tables 2–5. GeNN already provided an efficient implementation of spike transmission with per-synapse delays43, allowing the EventProp forward pass to be implemented efficiently. However, the λV transitions in the backward pass require access to postsynaptic λ values with a per-synapse delay (\(\left[{\lambda }_{V,m}^{+}-{\lambda }_{I,m}^{+}\right]{| }_{{t}_{k}^{{{{\rm{spike}}}}}+{d}_{mn(k)}}\) in equation (35)). This required a small extension to GeNN’s existing system for providing delayed access to postsynaptic variables from a synapse model42, enabling it to use the per-synapse delays that serve spike transmission in the forward pass.
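The delayed access required in the backward pass can be pictured with a ring buffer (our illustration only, not GeNN's actual data structure): because the backward sweep runs from the end of the trial towards its start, λ at forward time t + d was computed d steps earlier in the sweep and can simply be read back by its delay.

```python
import numpy as np

class DelayedLambdaBuffer:
    """Ring buffer of recent adjoint values in a time-stepped backward pass.
    push() is called once per backward step; read() retrieves the value a
    given number of steps back, i.e. at a later forward time."""
    def __init__(self, n_neurons, max_delay_steps):
        self.buf = np.zeros((max_delay_steps + 1, n_neurons))
        self.head = 0

    def push(self, lam):
        self.head = (self.head + 1) % self.buf.shape[0]
        self.buf[self.head] = lam

    def read(self, neuron, delay_steps):
        return self.buf[(self.head - delay_steps) % self.buf.shape[0], neuron]
```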

Table 2 Yin-Yang parameters
Table 5 Braille reading parameters
