Improve model card: Add Hear-Your-Click context and refined metadata

#63

by nielsr HF Staff - opened Jul 15, 2025

base: refs/heads/main ← from: refs/pr/63

+32 -7

nielsr

Jul 15, 2025

This PR updates the model card for openai/clip-vit-base-patch32. It clarifies that this CLIP model serves as a key component (the visual encoder) of the "Hear-Your-Click: Interactive Object-Specific Video-to-Audio Generation" framework.

The changes include:

- Retaining the detailed description of the openai/clip-vit-base-patch32 model.
- Adding a new section that introduces "Hear-Your-Click", with its abstract, a link to its paper (2507.04959), and a link to its GitHub repository (https://github.com/SynapGrid/Hear-Your-Click-2024).
- Updating the metadata with license: mit and library_name: transformers, and confirming pipeline_tag: zero-shot-image-classification.
- Adding tags such as clip and video-to-audio for better discoverability and context.
- Including the BibTeX citation for the "Hear-Your-Click" paper.
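
For readers unfamiliar with model card metadata, the fields listed above would typically live in the YAML front matter at the top of the README. The following is only a rough sketch of what that block might look like, built with a small hypothetical helper (`render_front_matter` is not part of any library). The actual diff is not shown in this discussion, so the field order and the complete tag list are assumptions.

```python
# Hypothetical sketch: the values come from the PR description above,
# but the exact layout and the full tag list in the real diff are assumptions.
metadata = {
    "license": "mit",
    "library_name": "transformers",
    "pipeline_tag": "zero-shot-image-classification",
    "tags": ["clip", "video-to-audio"],
}


def render_front_matter(meta: dict) -> str:
    """Render a flat dict (string or list-of-string values) as YAML front matter."""
    lines = ["---"]
    for key, value in meta.items():
        if isinstance(value, list):
            # Lists become a YAML block sequence under the key.
            lines.append(f"{key}:")
            lines.extend(f"- {item}" for item in value)
        else:
            lines.append(f"{key}: {value}")
    lines.append("---")
    return "\n".join(lines)


front_matter = render_front_matter(metadata)
print(front_matter)
```

The helper only handles plain strings and lists of strings, which is enough for the fields named in this PR; a real card with nested metadata would be better served by a proper YAML library.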

This update provides valuable context for users interested in the applications of this foundational CLIP model.

Improve model card: Add Hear-Your-Click context and refined metadata (4c4a3e8b)

Ready to merge

This branch is ready to get merged automatically.