To rename for future compatibility with transformers
#71 opened by xiaohei66
Our vision encoder is a heavily modified version of SigLIP, featuring a dynamic resolution mechanism and 2D RoPE instead of the original’s fixed resolution and learnable absolute position embeddings.
This makes our implementation fundamentally different from the standard SigLIP in libraries like Transformers. To avoid future naming conflicts and confusion, we must move away from the Siglip* name.
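For readers unfamiliar with the distinction, here is a minimal, illustrative sketch of 2D RoPE on a patch grid. It is not the code from this repository: the function names `rope_angles_2d` and `apply_rope`, the even split of rotary pairs between the row and column axes, and the `base=10000.0` frequency base are all assumptions made for the example. It shows why no fixed-size position table is needed: the angles are computed from each patch's (row, column) index, so any grid size works.

```python
import math

def rope_angles_2d(height, width, head_dim, base=10000.0):
    """Rotation angles for every patch of a height x width grid.

    Assumption for this sketch: half of the rotary pairs encode the row
    index and the other half encode the column index.
    Returns a list of height*width rows, each with head_dim // 2 angles.
    """
    assert head_dim % 4 == 0, "head_dim must be divisible by 4"
    n_freq = head_dim // 4  # rotary pairs per axis
    inv_freq = [base ** (-i / n_freq) for i in range(n_freq)]
    angles = []
    for r in range(height):
        for c in range(width):
            angles.append([r * f for f in inv_freq] + [c * f for f in inv_freq])
    return angles

def apply_rope(vec, angles):
    """Rotate each consecutive pair (vec[2k], vec[2k+1]) by angles[k]."""
    out = []
    for k, theta in enumerate(angles):
        x, y = vec[2 * k], vec[2 * k + 1]
        cos_t, sin_t = math.cos(theta), math.sin(theta)
        out.append(x * cos_t - y * sin_t)
        out.append(x * sin_t + y * cos_t)
    return out

# The same function serves any resolution; nothing is resized or interpolated.
small_grid = rope_angles_2d(2, 3, head_dim=8)   # 6 patches
large_grid = rope_angles_2d(4, 5, head_dim=8)   # 20 patches
```

By contrast, a learnable absolute position embedding is a table with one row per position of a fixed grid, so changing the input resolution means interpolating or retraining that table. With the rotation above, the dot product between a rotated query and a rotated key depends only on the row and column offsets between the two patches.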
xiaohei66 changed pull request title from "rename" to "To rename for future compatibility with transformers"
xiaohei66 changed pull request status to open
xiaohei66 changed pull request status to merged