CROP
A unified framework for certifying robustness of RL against test-time attacks

The goal of CROP is to systematically certify the robustness of different RL algorithms based on certification criteria including per-state action stability and the lower bound of cumulative reward. Specifically, we propose three novel methods (LoAct, GRe, and LoRe) to achieve certification.

In CROP-leaderboard, we present the certification results in four RL environments under two certification criteria via three certification methods. Notably, we compare our certification with empirical results under attack to show the tightness of our certification.

The related paper can be found here.

Available Leaderboards
CartPole-v0
Leaderboard: CartPole-v0 (LoAct)

Robustness certification for per-state action in terms of certified radius r at all time steps. Each column corresponds to one smoothing variance σ and each row corresponds to one RL algorithm. For each figure, the x-axis is the time step t, and the y-axis is the certified radius r_t. The shaded area represents the standard deviation. The benign performance of the locally smoothed policy under different smoothing variances σ can be found here.
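As a rough illustration of how a per-state certified radius of this kind can be computed, the sketch below assumes a randomized-smoothing style bound: the smoothed Q-values are normalized to (0, 1) using a known value range, and the radius is half of σ times the gap between the inverse Gaussian CDF of the top two actions. The function name certified_radius, the arguments v_min and v_max, and the clipping constant are assumptions made for this example and are not taken from this page; the paper gives the exact statement used by LoAct.

```python
from statistics import NormalDist


def certified_radius(smoothed_q, sigma, v_min, v_max):
    """Hypothetical per-state radius from smoothed Q-values (illustrative only).

    smoothed_q: smoothed Q-value estimates, one per action, at the current state.
    sigma:      standard deviation of the Gaussian smoothing noise.
    v_min/v_max: assumed known bounds on the Q-values.
    """
    if len(smoothed_q) < 2:
        raise ValueError("need at least two actions")
    if v_max <= v_min:
        raise ValueError("v_max must be greater than v_min")

    span = v_max - v_min
    tiny = 1e-12  # keep normalized values strictly inside (0, 1) for inv_cdf
    normalized = sorted(
        (min(max((q - v_min) / span, tiny), 1.0 - tiny) for q in smoothed_q),
        reverse=True,
    )
    best, runner_up = normalized[0], normalized[1]

    inv = NormalDist().inv_cdf
    # A larger gap between the top two actions gives a larger radius.
    return 0.5 * sigma * (inv(best) - inv(runner_up))


if __name__ == "__main__":
    print(certified_radius([0.8, 0.2, 0.1], sigma=1.0, v_min=0.0, v_max=1.0))
```

Under this assumed form, the radius grows linearly with σ for a fixed normalized gap and drops to zero when the two best actions are tied, which matches the intuition that a tie can be flipped by an arbitrarily small perturbation.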

Leaderboard: CartPole-v0 (GRe-mean)

Robustness certification as cumulative reward in terms of the expectation bound J_E. Each column corresponds to one smoothing variance. Solid lines represent the certified reward bounds of different methods, and dashed lines show the empirical performance under PGD attack.
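To make the idea of an expectation bound concrete, the sketch below assumes that the Gaussian-smoothed cumulative reward is Lipschitz in the concatenated perturbation with constant (J_max - J_min) / σ · sqrt(2/π), and that a per-step ℓ2 budget ε over H steps gives a total budget of ε · sqrt(H). The function name, the argument names, and the omission of any finite-sample confidence correction are assumptions of this example, not details stated on this page.

```python
import math
from statistics import fmean


def expectation_bound(returns, sigma, eps, horizon, j_min, j_max):
    """Hypothetical lower bound on expected return under attack (illustrative).

    returns: cumulative rewards sampled with the smoothed policy, no attack.
    sigma:   standard deviation of the smoothing noise.
    eps:     assumed per-step l2 attack budget.
    horizon: number of time steps H in a trajectory.
    j_min/j_max: assumed known bounds on the cumulative reward.
    """
    if sigma <= 0 or horizon <= 0:
        raise ValueError("sigma and horizon must be positive")

    # Assumed Lipschitz constant of the Gaussian-smoothed return.
    lipschitz = (j_max - j_min) / sigma * math.sqrt(2.0 / math.pi)
    # Total l2 norm of H per-step perturbations, each of norm at most eps.
    total_budget = eps * math.sqrt(horizon)

    # The bound can never be lower than the smallest achievable return.
    return max(j_min, fmean(returns) - lipschitz * total_budget)


if __name__ == "__main__":
    print(expectation_bound([100.0, 120.0], 1.0, 0.001, 100, 0.0, 200.0))
```

With no attack (ε = 0) the sketch returns the sample mean, and the bound decays linearly in ε until it is clipped at j_min. A practical certificate would also subtract a concentration term for the finite number of sampled trajectories.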

Leaderboard: CartPole-v0 (GRe-median)

Robustness certification as cumulative reward in terms of the percentile bound J_P (p = 50%). Each column corresponds to one smoothing variance. Solid lines represent the certified reward bounds of different methods, and dashed lines show the empirical performance under PGD attack.
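A percentile bound of this kind can be sketched in the style of percentile smoothing: the p-th percentile of the attacked return is bounded below by a lower percentile p' of the benign smoothed returns, where p' = Φ(Φ⁻¹(p) - ε · sqrt(H) / σ). The sketch below assumes that form, along with a per-step ℓ2 budget ε and horizon H. The function name, the conservative choice of order statistic, and the lack of a confidence correction for sampling error are all assumptions of this example.

```python
import math
from statistics import NormalDist


def percentile_bound(returns, sigma, eps, horizon, p=0.5):
    """Hypothetical lower bound on the p-th percentile of return (illustrative).

    returns: cumulative rewards sampled with the smoothed policy, no attack.
    sigma:   standard deviation of the smoothing noise.
    eps:     assumed per-step l2 attack budget.
    horizon: number of time steps H in a trajectory.
    p:       target percentile as a fraction in (0, 1); 0.5 is the median.
    """
    if not returns:
        raise ValueError("returns must be non-empty")
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")

    normal = NormalDist()
    shift = eps * math.sqrt(horizon) / sigma
    # Shifted percentile: the attack can move probability mass at most this far.
    p_low = normal.cdf(normal.inv_cdf(p) - shift)

    ordered = sorted(returns)
    # Take the lower order statistic so the empirical estimate errs downward.
    index = max(0, math.ceil(p_low * len(ordered)) - 1)
    return ordered[index]


if __name__ == "__main__":
    print(percentile_bound([5, 1, 4, 2, 3], sigma=1.0, eps=0.0, horizon=10))
```

With ε = 0 the sketch returns the sample median for p = 0.5, and as the attack budget grows the shifted percentile p' moves toward 0, so the bound falls toward the smallest sampled return.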

Leaderboard: CartPole-v0 (LoRe)

Robustness certification as cumulative reward in terms of the absolute lower bound J. Each column corresponds to one smoothing variance. Solid lines represent the certified reward bounds of different methods, and dashed lines show the empirical performance under PGD attack.

PongNoFrameskip-v4
Leaderboard: PongNoFrameskip-v4 (LoAct)

Robustness certification for per-state action in terms of certified radius r (time steps = 500). Each column corresponds to one smoothing variance σ and each row corresponds to one RL algorithm. For each figure, the x-axis is the time step t, and the y-axis is the certified radius r_t. The shaded area represents the standard deviation. The benign performance of the locally smoothed policy under different smoothing variances σ can be found here.

Leaderboard: PongNoFrameskip-v4 (GRe-mean)

Robustness certification as cumulative reward in terms of the expectation bound J_E (time steps = 500). Each column corresponds to one smoothing variance. Solid lines represent the certified reward bounds of different methods, and dashed lines show the empirical performance under PGD attack.

Leaderboard: PongNoFrameskip-v4 (GRe-median)

Robustness certification as cumulative reward in terms of the percentile bound J_P (p = 50%, time steps = 500). Each column corresponds to one smoothing variance. Solid lines represent the certified reward bounds of different methods, and dashed lines show the empirical performance under PGD attack.

Leaderboard: PongNoFrameskip-v4 (LoRe)

Robustness certification as cumulative reward in terms of the absolute lower bound J (time steps = 200). Each column corresponds to one smoothing variance. Solid lines represent the certified reward bounds of different methods, and dashed lines show the empirical performance under PGD attack.

FreewayNoFrameskip-v4
Leaderboard: FreewayNoFrameskip-v4 (LoAct)

Robustness certification for per-state action in terms of certified radius r (time steps = 500). Each column corresponds to one smoothing variance σ and each row corresponds to one RL algorithm. For each figure, the x-axis is the time step t, and the y-axis is the certified radius r_t. The shaded area represents the standard deviation. The benign performance of the locally smoothed policy under different smoothing variances σ can be found here.

Leaderboard: FreewayNoFrameskip-v4 (GRe-mean)

Robustness certification as cumulative reward in terms of the expectation bound J_E (time steps = 500). Each column corresponds to one smoothing variance. Solid lines represent the certified reward bounds of different methods, and dashed lines show the empirical performance under PGD attack.

Leaderboard: FreewayNoFrameskip-v4 (GRe-median)

Robustness certification as cumulative reward in terms of the percentile bound J_P (p = 50%, time steps = 500). Each column corresponds to one smoothing variance. Solid lines represent the certified reward bounds of different methods, and dashed lines show the empirical performance under PGD attack.

Leaderboard: FreewayNoFrameskip-v4 (LoRe)

Robustness certification as cumulative reward in terms of the absolute lower bound J (time steps = 200). Each column corresponds to one smoothing variance. Solid lines represent the certified reward bounds of different methods, and dashed lines show the empirical performance under PGD attack.

highway-fast-v0
Leaderboard: highway-fast-v0 (LoAct)

Robustness certification for per-state action in terms of certified radius r (time steps = 30). Each column corresponds to one smoothing variance σ and each row corresponds to one RL algorithm. For each figure, the x-axis is the time step t, and the y-axis is the certified radius r_t. The shaded area represents the standard deviation. The benign performance of the locally smoothed policy under different smoothing variances σ can be found here.

Leaderboard: highway-fast-v0 (GRe-mean)

Robustness certification as cumulative reward in terms of the expectation bound J_E (time steps = 30). Each column corresponds to one smoothing variance. Solid lines represent the certified reward bounds of different methods, and dashed lines show the empirical performance under PGD attack.

Leaderboard: highway-fast-v0 (GRe-median)

Robustness certification as cumulative reward in terms of the percentile bound J_P (p = 50%, time steps = 30). Each column corresponds to one smoothing variance. Solid lines represent the certified reward bounds of different methods, and dashed lines show the empirical performance under PGD attack.

Leaderboard: highway-fast-v0 (LoRe)

Robustness certification as cumulative reward in terms of the absolute lower bound J (time steps = 30). Each column corresponds to one smoothing variance. Solid lines represent the certified reward bounds of different methods, and dashed lines show the empirical performance under PGD attack.