CarliniWagnerL2Attack on MNIST does NOT work #101

@wenhaoyong

Thanks for this awesome toolbox.
When I attack an MNIST classifier with CarliniWagnerL2Attack, the attack does not succeed: the predictions on the adversarial examples are the same as on the clean inputs.
Here is the code:

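    from random import randint

    import torch
    import torchvision
    # Assumed import paths (AdverTorch package layout); adjust if yours differ.
    from advertorch.attacks import CarliniWagnerL2Attack
    from advertorch.context import ctx_noparamgrad_and_eval

    # target_model, device, batch_size and transform_test are defined elsewhere.
    testset = torchvision.datasets.MNIST(root='./dataset', train=False, download=True, transform=transform_test)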
    testloader = torch.utils.data.DataLoader(testset, batch_size=batch_size, shuffle=False, num_workers=4)

    cw_attack = CarliniWagnerL2Attack(predict=target_model,
                                      num_classes=10,
                                      confidence=2.0,
                                      targeted=True,
                                      learning_rate=0.001,
                                      binary_search_steps=5,
                                      max_iterations=1000,
                                      abort_early=True,
                                      clip_min=0.0,
                                      clip_max=1.0)

    # construct adversarial samples
    for i, data in enumerate(testloader, 0):
        x, y = data
        x, y = x.to(device), y.to(device)
        y_pred = target_model(x).argmax(dim=1)
        print("y_pred:", y_pred)

        # Random target construction: draw a class different from the true label
        targets = []
        for index in range(y.numel()):
            target = randint(0, 9)
            while target == y[index].item():
                target = randint(0, 9)
            targets.append(target)
        # Build the target tensor once, after all targets are drawn
        attack_target = torch.tensor(targets).to(device)
        print("attack_target:", attack_target)

        # C&W
        with ctx_noparamgrad_and_eval(target_model):
            x_adv = cw_attack.perturb(x, attack_target)
        y_pred_adv = target_model(x_adv).argmax(dim=1)
        print("y_pred_adv:", y_pred_adv)
        break  # stop after the first batch
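
The random target construction above redraws until the target differs from the true label. A minimal sketch of an equivalent draw without the retry loop is given here, assuming integer labels in `[0, num_classes)`. `pick_random_targets` is a hypothetical helper written for illustration, not part of the toolbox.

```python
import random


def pick_random_targets(labels, num_classes=10, rng=random):
    """Return one target per label, drawn uniformly from the other classes."""
    targets = []
    for label in labels:
        # An offset in [1, num_classes - 1] can never map back to the label itself.
        offset = rng.randint(1, num_classes - 1)
        targets.append((label + offset) % num_classes)
    return targets


labels = [7, 2, 1, 0, 4]
targets = pick_random_targets(labels, rng=random.Random(0))
print(targets)
```

With tensors, the same helper could be used as `torch.tensor(pick_random_targets(y.tolist())).to(device)`.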

The output for the first batch was:

    y_pred: tensor([7, 2, 1, 0, 4, 1, 4, 9, 5, 9, 0, 6, 9, 0, 1, 5], device='cuda:0')
    attack_target: tensor([6, 8, 3, 7, 8, 2, 5, 6, 0, 2, 2, 8, 4, 6, 2, 2], device='cuda:0')
    y_pred_adv: tensor([7, 2, 1, 0, 4, 1, 4, 9, 5, 9, 0, 6, 9, 0, 1, 5], device='cuda:0')

The predicted labels after the C&W attack are identical to the predictions on the clean inputs, so none of the 16 samples reached its target class.
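
To put a number on this, here is a small sketch that computes the targeted success rate and the fraction of unchanged predictions from the printed values. The two function names are hypothetical helpers, and the lists are copied from the output above.

```python
def targeted_success_rate(adv_preds, targets):
    """Fraction of samples whose adversarial prediction equals the target class."""
    if len(adv_preds) != len(targets):
        raise ValueError("adv_preds and targets must have the same length")
    hits = sum(1 for p, t in zip(adv_preds, targets) if p == t)
    return hits / len(targets)


def unchanged_rate(clean_preds, adv_preds):
    """Fraction of samples whose prediction did not change under the attack."""
    if len(clean_preds) != len(adv_preds):
        raise ValueError("clean_preds and adv_preds must have the same length")
    same = sum(1 for c, a in zip(clean_preds, adv_preds) if c == a)
    return same / len(clean_preds)


y_pred = [7, 2, 1, 0, 4, 1, 4, 9, 5, 9, 0, 6, 9, 0, 1, 5]
attack_target = [6, 8, 3, 7, 8, 2, 5, 6, 0, 2, 2, 8, 4, 6, 2, 2]
y_pred_adv = [7, 2, 1, 0, 4, 1, 4, 9, 5, 9, 0, 6, 9, 0, 1, 5]

print(targeted_success_rate(y_pred_adv, attack_target))  # 0.0
print(unchanged_rate(y_pred, y_pred_adv))                # 1.0
```

For tensors, `.tolist()` converts the predictions and targets into the plain lists these helpers expect.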
Any tips would be appreciated.
