DriveTiTok Concept Picture

DriveTiTok-s-640

DriveTiTok-s-640 is an image tokenizer for driving-scene images, based on the TiTok architecture. It generates seed tokens from past frames and applies self-attention to model temporal differences while integrating global context, which enables efficient compression. The model is trained on driving data collected in Tokyo.
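
To make the description above more concrete, here is a minimal, dependency-free sketch of the general idea: seed tokens are concatenated with current-frame patch tokens, self-attention mixes the joint sequence, and only the seed positions are kept as the compact representation. This is a toy illustration, not the DriveTiTok implementation. The function names (`self_attention`, `encode_frame`), the token counts, the embedding size, and the use of identity projections (Q = K = V) are all assumptions chosen for brevity; the real model uses learned projections and multiple transformer layers.

```python
import math
import random


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Toy single-head self-attention with identity projections (Q = K = V = tokens)."""
    dim = len(tokens[0])
    scale = 1.0 / math.sqrt(dim)
    out = []
    for q in tokens:
        # Scaled dot-product score of this query against every token.
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in tokens]
        weights = softmax(scores)
        # Weighted sum of the value vectors.
        out.append([sum(w * v[d] for w, v in zip(weights, tokens)) for d in range(dim)])
    return out


def encode_frame(seed_tokens, patch_tokens):
    """Attend jointly over seed + patch tokens, then keep only the seed positions."""
    mixed = self_attention(seed_tokens + patch_tokens)
    return mixed[: len(seed_tokens)]


rng = random.Random(0)
dim = 8  # hypothetical embedding size
# Hypothetical stand-ins: seeds would come from past frames, patches from the current frame.
seed_tokens = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(4)]
patch_tokens = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(16)]

latents = encode_frame(seed_tokens, patch_tokens)
print(len(latents), len(latents[0]))  # 4 8
```

In this sketch, 16 patch tokens are summarized by 4 latent tokens. Each latent starts from a seed and absorbs information from the whole joint sequence through attention, which is one way to read "integrating global context" in the description above.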

Installation

uv sync --all-extras

Usage

We provide sample code for reconstructing images using DriveTiTok-s-640 and visualizing the reconstructed images. See tutorial.ipynb.

License

This project is licensed under the Apache License, Version 2.0. See the LICENSE file for details.

Acknowledgement

This model was developed as part of the project JPNP20017, which is subsidized by the New Energy and Industrial Technology Development Organization (NEDO), Japan.

The training and inference code is based on ByteDance's 1d-tokenizer (https://github.com/bytedance/1d-tokenizer), released under the Apache-2.0 license. The tutorial.ipynb in this repository is also derived from that repository. We acknowledge and appreciate their contribution as the foundation of our development.
