Fast Certification of Vision-Language Models Using Incremental Randomized Smoothing

Ashutosh Nirala, Ameya Joshi, Soumik Sarkar, Chinmay Hegde

    Research output: Chapter in Book/Report/Conference proceeding (Conference contribution)

    Abstract

    A key benefit of deep vision-language models such as CLIP is that they enable zero-shot open-vocabulary classification: the user can define novel class labels via natural-language prompts at inference time. However, while CLIP-based zero-shot classifiers have demonstrated competitive performance across a range of domain shifts, they remain highly vulnerable to adversarial attacks. Ensuring the robustness of such models is therefore crucial for their reliable deployment in the wild. In this work, we introduce Open Vocabulary Certification (OVC), a fast certification method designed for open-vocabulary models like CLIP via randomized smoothing techniques. Given a base "training" set of prompts and their corresponding certified CLIP classifiers, OVC relies on the observation that a classifier with a novel prompt can be viewed as a perturbed version of nearby classifiers in the base training set. OVC can therefore rapidly certify the novel classifier using a variation of incremental randomized smoothing. By using a caching trick, we achieve approximately two orders of magnitude acceleration in the certification process for novel prompts. To achieve further (heuristic) speedups, OVC approximates the embedding space at a given input using a multivariate normal distribution, bypassing the need for sampling via forward passes through the vision backbone. We demonstrate the effectiveness of OVC through experimental evaluation using multiple vision-language backbones on the CIFAR-10 and ImageNet test datasets.
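    The certification step that OVC accelerates is standard Gaussian randomized smoothing. A minimal sketch of that baseline certification routine (in the style of Cohen et al.) is shown below, using hypothetical Monte Carlo vote counts; this is the generic procedure only, not the paper's caching or incremental machinery.

    ```python
    import numpy as np
    from scipy.stats import norm, binomtest

    def certify_smoothed(counts, sigma, alpha=0.001):
        """Certify a Gaussian-smoothed classifier from Monte Carlo vote counts.

        counts: votes per class from noisy forward passes (hypothetical data).
        sigma: standard deviation of the Gaussian smoothing noise.
        Returns (predicted_class, certified_L2_radius), or (None, 0.0) on abstain.
        """
        n = int(counts.sum())
        top = int(np.argmax(counts))
        # One-sided lower confidence bound on the top-class probability,
        # taken as the lower end of a two-sided (1 - 2*alpha) exact interval.
        p_lower = binomtest(int(counts[top]), n).proportion_ci(
            confidence_level=1 - 2 * alpha, method="exact"
        ).low
        if p_lower <= 0.5:
            return None, 0.0  # cannot certify: abstain
        # Certified L2 radius for Gaussian smoothing: sigma * Phi^{-1}(p_lower)
        return top, sigma * norm.ppf(p_lower)
    ```

    With a clear majority (e.g. 990 of 1000 votes) this returns the top class and a positive radius; with a near-tie it abstains. OVC's contribution is to avoid recomputing these noisy forward passes from scratch for each novel prompt.
    
    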

    Original language: English (US)
    Title of host publication: Proceedings - IEEE Conference on Safe and Trustworthy Machine Learning, SaTML 2024
    Publisher: Institute of Electrical and Electronics Engineers Inc.
    Pages: 252-271
    Number of pages: 20
    ISBN (Electronic): 9798350349504
    DOIs
    State: Published - 2024
    Event: 2024 IEEE Conference on Safe and Trustworthy Machine Learning, SaTML 2024 - Toronto, Canada
    Duration: Apr 9 2024 - Apr 11 2024

    Publication series

    Name: Proceedings - IEEE Conference on Safe and Trustworthy Machine Learning, SaTML 2024

    Conference

    Conference: 2024 IEEE Conference on Safe and Trustworthy Machine Learning, SaTML 2024
    Country/Territory: Canada
    City: Toronto
    Period: 4/9/24 - 4/11/24

    Keywords

    • certified robustness
    • CLIP
    • randomized smoothing
    • vision-language models

    ASJC Scopus subject areas

    • Artificial Intelligence
    • Safety, Risk, Reliability and Quality
    • Modeling and Simulation

