|
1 | 1 | Modern machine learning relies heavily on rapidly training and evaluating neural networks in problems ranging from image classification \cite{He++2016} to robotic control \cite{Siekmann++2021a}. Most neural network architectures have no robustness certificates, and can be sensitive to adversarial attacks and other input perturbations \cite{Huang++2017}. Many approaches that address this brittle behaviour rely on explicitly enforcing constraints during training to smooth or stabilize the network response \cite{Pauli++2022,Junnarkar++2023}. While effective on small-scale problems, these methods are computationally expensive, making them slow and difficult to scale up to complex real-world problems. |
2 | 2 |
|
3 | | -Recently, we proposed the \textit{Recurrent Equilibrium Network} (REN) \cite{Revay++2023} and \textit{Lipschitz-Bounded Deep Network} (LBDN) or \textit{sandwich layer} \cite{Wang+Manchester2023} model classes as computationally efficient solutions to these problems. The REN architecture is flexible in that it includes all common neural network models, such as multi-layer-perceptrons (MLPs), convolutional neural networks (CNNs), and recurrent neural networks (RNNs). The weights and biases in RENs are directly parameterized to \textit{naturally satisfy} behavioural constraints chosen by the user. For example, we can build a REN with a given Lipschitz constant to ensure its output is quantifiably less sensitive to input perturbations. LBDNs are specializations of RENs with the specific feed-forward structure of deep neural networks like MLPs or CNNs and built-in guarantees on the Lipschitz bound. |
| 3 | +Recently, we proposed the \textit{Recurrent Equilibrium Network} (REN) \cite{Revay++2023} and \textit{Lipschitz-Bounded Deep Network} (LBDN) or \textit{sandwich layer} \cite{Wang+Manchester2023} model classes as computationally efficient solutions to these problems. RENs are flexible in that they include many common neural network models, such as multi-layer perceptrons (MLPs), convolutional neural networks (CNNs), and recurrent neural networks (RNNs). Their weights and biases are parameterized to naturally satisfy a set of user-defined robustness metrics constraining the internal stability and input-output sensitivity of the network. When a network is guaranteed to satisfy a robustness metric, we call this guarantee a \textit{robustness certificate}. An example is a Lipschitz bound, which restricts the network's amplification of input perturbations in its outputs \cite{Pauli++2022}. LBDNs are specializations of RENs with the specific feed-forward structure of deep neural networks like MLPs or CNNs, and built-in restrictions on the Lipschitz bound. |
4 | 4 |
|
5 | | -The direct parameterization of RENs and LBDNs means that we can train models with standard, unconstrained optimization methods (such as stochastic gradient descent) while also guaranteeing their robustness. Achieving the “best of both worlds” in this way is the main advantage of the REN and LBDN model classes, and allows the user to freely train robust models for many common machine learning problems, as well as for more challenging real-world applications where safety is critical. |
| 5 | +This special parameterization of RENs and LBDNs means that we can train models with standard, unconstrained optimization methods (such as stochastic gradient descent) while also guaranteeing their robustness. Achieving the “best of both worlds” in this way is the main advantage of the REN and LBDN model classes, and allows the user to freely train robust models for many common machine learning problems, as well as for more challenging real-world applications where safety is critical. |
6 | 6 |
|
7 | | -This papers presents \verb|RobustNeuralNetworks.jl|: a package for neural network models that naturally satisfy robustness constraints. The package contains implementations of the REN and LBDN model classes introduced in \cite{Revay++2023} and \cite{Wang+Manchester2023}, respectively, and relies heavily on key features of the Julia language \cite{Bezanson++2017} (such as multiple dispatch) for an efficient implementation of these models. The purpose of \verb|RobustNeuralNetworks.jl| is to make our recent research in robust machine learning easily accessible to users in the scientific and machine learning communities. With this in mind, we have designed the package to interface directly with \verb|Flux.jl| \cite{Innes2018}, Julia's most widely-used machine learning package, making it straightforward to incorporate our robust neural networks into existing Julia code. |
| 7 | +This paper presents \verb|RobustNeuralNetworks.jl|: a package for neural networks with built-in robustness certificates. The package contains implementations of the REN and LBDN model classes introduced in \cite{Revay++2023} and \cite{Wang+Manchester2023}, respectively, and relies heavily on key features of the Julia language \cite{Bezanson++2017} (such as multiple dispatch) for an efficient implementation of these models. The purpose of \verb|RobustNeuralNetworks.jl| is to make our recent research in robust machine learning easily accessible to users in the scientific and machine learning communities. We have therefore designed the package to interface directly with \verb|Flux.jl| \cite{Innes2018}, Julia's most widely used machine learning package, making it straightforward to incorporate our robust neural networks into existing Julia code. |
8 | 8 |
|
9 | 9 | The paper is structured as follows. Section \ref{sec:overview} provides an overview of the \verb|RobustNeuralNetworks.jl| package, including a brief introduction to the model classes (Sec. \ref{sec:model-structures}), their robustness certificates (Sec. \ref{sec:robustness}), and their implementation (Sec. \ref{sec:parameterizations}). Section \ref{sec:examples} guides the reader through a tutorial with three examples to demonstrate the use of RENs and LBDNs in machine learning: image classification (Sec. \ref{sec:mnist}), reinforcement learning (Sec. \ref{sec:rl}), and nonlinear state-observer design (Sec. \ref{sec:observer}). Section \ref{sec:conc} offers some concluding remarks and future directions for robust machine learning with \verb|RobustNeuralNetworks.jl|. For more detail on the theory behind RENs and LBDNs, and for examples comparing their performance to current state-of-the-art methods on a range of problems, we refer the reader to \cite{Revay++2023} and \cite{Wang+Manchester2023} (respectively). |