Probability Seminar Essen
Summer term 2022
Most of the talks take place via Zoom at this link.
Apr 12 |
Stefan Ankirchner (University of Jena) Approximating stochastic gradient descent with diffusions: error expansions and impact of learning rate schedules Applying a stochastic gradient descent method to minimize an objective gives rise to a discrete-time process of estimated parameter values. In order to better understand the dynamics of the estimated values, it can make sense to approximate the discrete-time process with a continuous-time diffusion. We refine some results on the weak error of diffusion approximations. In particular, we explicitly compute the leading term in the error expansion of an ODE approximation with respect to a parameter discretizing the learning rate schedule. The leading term changes if one extends the ODE with a Brownian diffusion component. Finally, we show that if the learning rate is time-varying, then its rate of change needs to enter the drift coefficient in order to obtain an approximation of order 2. |
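As an illustration of the kind of comparison the abstract describes, here is a minimal numerical sketch (not taken from the talk; the toy objective f(theta) = theta^2/2, the schedule, and the noise level are all illustrative assumptions): SGD with a decaying learning-rate schedule, compared against the ODE approximation theta'(t) = -theta(t) run in the time scale of the cumulative learning rate.

import numpy as np

# Toy objective f(theta) = 0.5 * theta^2, so grad f(theta) = theta.
rng = np.random.default_rng(0)
theta = 1.0        # SGD iterate, started at theta_0 = 1
t = 0.0            # cumulative learning rate = time scale of the ODE
sigma = 0.1        # illustrative gradient-noise level

for k in range(1000):
    eta = 0.1 / (1.0 + 0.01 * k)                     # decaying learning-rate schedule
    noisy_grad = theta + sigma * rng.standard_normal()
    theta -= eta * noisy_grad                        # SGD step with noisy gradient
    t += eta

theta_ode = np.exp(-t)  # exact solution of theta'(t) = -theta(t) with theta(0) = 1
print(f"SGD iterate: {theta:.4f}, ODE approximation: {theta_ode:.4f}")

For a time-varying schedule, the talk's point is that a second-order approximation must additionally feed the schedule's rate of change into the drift; the plain ODE above is only the first-order picture.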
May 10 |
Postponed. |
May 17 |
Steffen Dereich (University of Münster) On the existence of optimal shallow networks In this talk we discuss the existence of global minima in optimisation problems over shallow neural networks. More explicitly, the function class over which we minimise is the family of all functions that can be expressed as artificial neural networks with one hidden layer featuring a specified number of neurons with ReLU (or Leaky ReLU) activation and one linear neuron (without activation function). We give existence results. Moreover, we provide counterexamples that illustrate the relevance of the assumptions imposed in the theorems. |
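For concreteness, a minimal sketch (illustrative, not from the talk; all names and parameter values are made up) of one element of the function class in question: one hidden ReLU layer with a specified number of neurons, followed by a single linear neuron.

import numpy as np

def shallow_relu_net(x, W, b, a, c):
    # f(x) = a . ReLU(W x + b) + c: one hidden ReLU layer, one linear output neuron
    return a @ np.maximum(W @ x + b, 0.0) + c

d, width = 3, 5                       # input dimension, number of hidden neurons
rng = np.random.default_rng(1)
W, b = rng.standard_normal((width, d)), rng.standard_normal(width)
a, c = rng.standard_normal(width), 0.0
x = rng.standard_normal(d)
print(shallow_relu_net(x, W, b, a, c))

The optimisation problem of the talk ranges over all such (W, b, a, c); the question is whether a minimiser within this class actually exists.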
May 24 |
Tobias Werner (University of Kassel) Deep neural networks overcome the curse of dimensionality in the numerical approximation of semilinear partial differential equations Deep-learning-based algorithms are employed in a wide range of real-world applications and are nowadays the standard approach for most machine-learning problems. They are used extensively in face and speech recognition, fraud detection, function approximation, and the numerical solution of partial differential equations (PDEs). The latter are utilized to model numerous phenomena in nature, medicine, economics, and physics. Often, these PDEs are nonlinear and high-dimensional. For instance, in the famous Black-Scholes model the PDE dimension $d \in \mathbb{N}$ corresponds to the number of stocks considered in the model. Relaxing the rather unrealistic assumptions made in the model results in a loss of linearity in the corresponding Black-Scholes PDE. |
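To make the role of the dimension d concrete, here is an illustrative Monte Carlo sketch (not the method of the talk; all parameters are made-up examples): in the linear d-dimensional Black-Scholes model with independent stocks, the Feynman-Kac formula expresses the PDE solution at time 0 as a discounted expected payoff, which can be estimated by simulation.

import numpy as np

d, n_paths = 10, 100_000           # d = number of stocks in the model
r, sigma, T, K = 0.02, 0.2, 1.0, 1.0
s0 = np.ones(d)                    # initial stock prices

rng = np.random.default_rng(2)
Z = rng.standard_normal((n_paths, d))          # terminal Brownian increments
ST = s0 * np.exp((r - 0.5 * sigma**2) * T + sigma * np.sqrt(T) * Z)
payoff = np.maximum(ST.mean(axis=1) - K, 0.0)  # basket call payoff
print("u(0, s0) ~", np.exp(-r * T) * payoff.mean())

Once the model assumptions are relaxed and the PDE becomes semilinear, such direct stochastic representations no longer suffice, which is where the deep-learning-based approximation results of the talk enter.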
May 31 |
Josué Nussbaumer (Université Gustave Eiffel) Algebraic two-level measure trees Wolfgang Löhr and Anita Winter introduced algebraic trees, which generalize the notion of graph-theoretic trees to potentially uncountable structures. They equipped the space of binary algebraic measure trees with a topology that relies on the Gromov-weak convergence of particular metric representations. They showed that this topology is compact and, on the subspace of binary algebraic measure trees, equivalent to sample shape convergence, by encoding such trees with triangulations of the circle. We extend these results to a two-level setup, where algebraic trees are equipped with a probability measure on the set of probability measures on the tree. To do so, we encode algebraic two-level measure trees by triangulations of the circle together with a two-level measure on the circle. As an application, we construct the algebraic nested Kingman coalescent. |
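As background for the application mentioned at the end, a minimal sketch (single-level and purely illustrative; the nested, algebraic-tree-valued analogue constructed in the talk is substantially more involved) of simulating the merger times of the ordinary Kingman n-coalescent:

import numpy as np

rng = np.random.default_rng(3)
n = 8
t = 0.0
blocks = [{i} for i in range(n)]                   # partition blocks, initially singletons
while len(blocks) > 1:
    k = len(blocks)
    t += rng.exponential(2.0 / (k * (k - 1)))      # exponential waiting time, rate k*(k-1)/2
    i, j = sorted(rng.choice(k, size=2, replace=False))
    blocks[i] |= blocks.pop(j)                     # merge two uniformly chosen blocks
    print(f"t = {t:.3f}: {len(blocks)} blocks remain")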
Jun 07 | Katharina Pohl (University of Duisburg-Essen) |
Jun 14 | |
Jun 21 | |
Jun 28 | |
Jul 05 | Barbara Rüdiger-Mastandrea (University of Wuppertal) |
Jul 12 | |
Talks of previous terms.