[13] A. G. Baydin, B. A. Pearlmutter, A. A. Radul, J. M. Siskind, Automatic
differentiation in machine learning: a survey, arXiv preprint
arXiv:1502.05767 (2015).
[14] C. Basdevant, M. Deville, P. Haldenwang, J. Lacroix, J. Ouazzani,
R. Peyret, P. Orlandi, A. Patera, Spectral and finite difference solutions
of the Burgers equation, Computers & Fluids 14 (1986) 23–41.
[15] S. H. Rudy, S. L. Brunton, J. L. Proctor, J. N. Kutz, Data-driven
discovery of partial differential equations, Science Advances 3 (2017).
[16] M. Abadi, A. Agarwal, P. Barham, E. Brevdo, Z. Chen, C. Citro, G. S.
Corrado, A. Davis, J. Dean, M. Devin, et al., TensorFlow: Large-scale
machine learning on heterogeneous distributed systems, arXiv preprint
arXiv:1603.04467 (2016).
[17] D. C. Liu, J. Nocedal, On the limited memory BFGS method for large
scale optimization, Mathematical Programming 45 (1989) 503–528.
[18] I. Goodfellow, Y. Bengio, A. Courville, Deep learning, MIT Press, 2016.
[19] D. Kingma, J. Ba, Adam: A method for stochastic optimization, arXiv
preprint arXiv:1412.6980 (2014).
[20] A. Choromanska, M. Henaff, M. Mathieu, G. B. Arous, Y. LeCun, The
loss surfaces of multilayer networks, in: Artificial Intelligence and
Statistics, pp. 192–204.
[21] R. Shwartz-Ziv, N. Tishby, Opening the black box of deep neural
networks via information, arXiv preprint arXiv:1703.00810 (2017).
[22] T. A. Driscoll, N. Hale, L. N. Trefethen, Chebfun guide, 2014.
[23] M. Stein, Large sample properties of simulations using Latin hypercube
sampling, Technometrics 29 (1987) 143–151.
[24] A. Iserles, A first course in the numerical analysis of differential
equations, 44, Cambridge University Press, 2009.