# Contributing to ULTRATHINK
Thank you for your interest in contributing to ULTRATHINK! This document provides guidelines and information for contributors.
## How to Contribute

### Reporting Issues
Before creating an issue, please:
- Search existing issues to avoid duplicates
- Use the issue templates when available
- Provide detailed information, including (a snippet for gathering these follows the list):
  - Python version and OS
  - PyTorch version
  - Full error traceback
  - Steps to reproduce
  - Expected vs. actual behavior
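Most of these environment details can be collected in one step; a minimal sketch (assuming PyTorch is installed):

```python
# Print environment details to paste into a bug report.
import platform
import sys

import torch

print(f"Python : {sys.version.split()[0]}")
print(f"OS     : {platform.platform()}")
print(f"PyTorch: {torch.__version__}")
print(f"CUDA   : {torch.version.cuda} (available: {torch.cuda.is_available()})")
```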
### Suggesting Features
We welcome feature suggestions! Please:
- Check existing feature requests first
- Describe the use case and motivation
- Provide implementation ideas if possible
- Consider backwards compatibility
### Pull Requests

#### Before You Start
- Fork the repository and create a feature branch
- Check existing PRs to avoid duplicate work
- Discuss major changes in an issue first
#### Development Setup

```bash
# Clone your fork
git clone https://github.com/yourusername/ultrathink.git
cd ultrathink

# Create a virtual environment
python -m venv .venv
source .venv/bin/activate  # Windows: .venv\Scripts\activate

# Install in development mode
pip install -e .
pip install -r requirements.txt

# Install development tools
pip install pre-commit pytest pytest-cov black flake8 mypy

# Set up pre-commit hooks
pre-commit install
```
### Code Standards
We maintain high code quality standards:
#### Code Formatting

```bash
# Format code with Black
black src/ tests/ scripts/

# Check formatting
black --check src/ tests/ scripts/
```
#### Linting

```bash
# Run flake8
flake8 src/ tests/ scripts/

# Run mypy for type checking
mypy src/
```
#### Testing

```bash
# Run all tests
python -m pytest tests/

# Run with coverage
python -m pytest tests/ --cov=src --cov-report=html

# Run the smoke test
python -m tests.smoke_test
```
#### Code Style Guidelines
- Follow PEP 8 with Black formatting
- Use type hints for all functions and methods
- Write docstrings for public APIs (Google style)
- Keep functions focused and reasonably sized
- Use meaningful variable names
- Add comments for complex logic
Example:

```python
from typing import Dict

import torch
from torch.utils.data import DataLoader


def train_model(
    model: torch.nn.Module,
    dataloader: DataLoader,
    optimizer: torch.optim.Optimizer,
    device: torch.device,
    epochs: int = 10,
) -> Dict[str, float]:
    """Train a PyTorch model.

    Args:
        model: The model to train
        dataloader: Training data loader
        optimizer: Optimizer for training
        device: Device to train on
        epochs: Number of training epochs

    Returns:
        Dictionary containing training metrics

    Raises:
        ValueError: If epochs is not positive
    """
    if epochs <= 0:
        raise ValueError("Epochs must be positive")
    # Training implementation...
    return {"loss": final_loss, "accuracy": final_acc}
```
### Commit Guidelines

#### Commit Message Format

```
<type>(<scope>): <description>

[optional body]

[optional footer]
```
Types:

- `feat`: New feature
- `fix`: Bug fix
- `docs`: Documentation changes
- `style`: Code style changes (formatting, etc.)
- `refactor`: Code refactoring
- `test`: Adding or updating tests
- `chore`: Maintenance tasks
Examples:

```
feat(models): add flash attention support
fix(training): resolve gradient accumulation bug
docs(readme): update installation instructions
test(models): add unit tests for MoE routing
```
### Pull Request Process

1. Create a feature branch from `main`: `git checkout -b feature/your-feature-name`
2. Make your changes following the guidelines above
3. Write or update tests for your changes
4. Update documentation if needed
5. Run the full test suite: `python -m pytest tests/` and `python -m tests.smoke_test`
6. Run pre-commit checks: `pre-commit run --all-files`
7. Push to your fork and create a pull request
8. Fill out the PR template completely
9. Respond to review feedback promptly
### PR Review Process
- All PRs require at least one review
- Automated checks must pass
- Documentation must be updated for user-facing changes
- Breaking changes require discussion and migration guide
## Architecture Guidelines

### Adding New Models

When adding new model components (a schematic sketch follows this list):

- Follow the existing patterns in `src/models/`
- Inherit from appropriate base classes
- Add comprehensive docstrings
- Include configuration classes
- Add unit tests
- Update integration tests
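A schematic sketch of the config-plus-module pattern described above; `NewAttentionConfig` and `NewAttention` are illustrative names, not ULTRATHINK's actual API, so check `src/models/` for the real base classes to inherit from:

```python
from dataclasses import dataclass

import torch
import torch.nn as nn


@dataclass
class NewAttentionConfig:
    """Configuration for the new component (field names are illustrative)."""
    n_embd: int = 768
    n_head: int = 12
    dropout: float = 0.1


class NewAttention(nn.Module):
    """A new attention variant built from its config, mirroring existing modules."""

    def __init__(self, config: NewAttentionConfig) -> None:
        super().__init__()
        self.attn = nn.MultiheadAttention(
            config.n_embd, config.n_head, dropout=config.dropout, batch_first=True
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Self-attention over an input of shape (batch, seq, n_embd).
        out, _ = self.attn(x, x, x)
        return out
```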
### Adding New Training Features

For training enhancements (a configuration sketch follows this list):
- Consider backwards compatibility
- Add configuration options
- Include proper logging
- Add evaluation metrics
- Document hyperparameter effects
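As a sketch of how a new feature might honor these points (all names here are hypothetical): gate it behind a config option that defaults to the old behavior, and log when it is active so experiment records stay interpretable.

```python
import logging
from dataclasses import dataclass

logger = logging.getLogger(__name__)


@dataclass
class TrainerConfig:
    """Illustrative trainer config: new features default to off."""
    learning_rate: float = 3e-4
    use_grad_checkpointing: bool = False  # new option; False preserves old behavior


def configure_trainer(config: TrainerConfig) -> None:
    if config.use_grad_checkpointing:
        # Record the choice so runs can be compared against the baseline.
        logger.info("Gradient checkpointing enabled (trades compute for memory)")
```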
### Adding New Data Loaders

For data pipeline additions (a minimal sketch follows this list):
- Support streaming when possible
- Include quality filtering options
- Add proper error handling
- Support multiple formats
- Include data validation
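A minimal sketch of the streaming-plus-filtering shape using PyTorch's `IterableDataset`; the record format and filter predicate are assumptions:

```python
from typing import Callable, Dict, Iterable, Iterator

from torch.utils.data import IterableDataset


class StreamingTextDataset(IterableDataset):
    """Streams records lazily and drops any that fail a quality filter."""

    def __init__(
        self,
        records: Iterable[Dict[str, str]],
        quality_filter: Callable[[Dict[str, str]], bool] = lambda record: True,
    ) -> None:
        self.records = records
        self.quality_filter = quality_filter

    def __iter__(self) -> Iterator[Dict[str, str]]:
        for record in self.records:
            if "text" not in record:
                continue  # basic validation: skip malformed records
            if self.quality_filter(record):
                yield record
```

For instance, `StreamingTextDataset(docs, lambda r: len(r["text"]) > 100)` would keep only longer documents.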
## Testing Guidelines

### Test Structure

```
tests/
├── unit/           # Unit tests for individual components
├── integration/    # Integration tests for workflows
├── fixtures/       # Test data and fixtures
└── conftest.py     # Pytest configuration
```
### Writing Tests

- Use descriptive test names:

  ```python
  def test_transformer_block_forward_pass_with_attention_mask(): ...
  ```

- Test edge cases and error conditions:

  ```python
  def test_model_raises_error_with_invalid_vocab_size(): ...
  ```

- Use fixtures for common setup:

  ```python
  @pytest.fixture
  def small_model_config():
      return ModelConfig(vocab_size=1000, n_embd=128, n_layer=2)
  ```

- Mock external dependencies:

  ```python
  @patch('wandb.init')
  def test_training_without_wandb(mock_wandb): ...
  ```
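Putting those patterns together, a complete test might look like this sketch; the dict returned by `small_model_config` stands in for the project's real config class:

```python
from unittest.mock import MagicMock, patch

import pytest


@pytest.fixture
def small_model_config() -> dict:
    # Small stand-in config so the test runs quickly.
    return {"vocab_size": 1000, "n_embd": 128, "n_layer": 2}


@patch("wandb.init")
def test_training_without_wandb(mock_init: MagicMock, small_model_config: dict) -> None:
    # wandb.init is mocked out, so the test makes no network calls.
    mock_init.return_value = MagicMock()
    assert small_model_config["vocab_size"] == 1000
```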
### Performance Tests

Include performance benchmarks (a timing sketch follows this list) for:
- Model forward/backward pass timing
- Memory usage patterns
- Throughput measurements
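A simple timing harness along these lines might be used (the model and batch here are placeholders; real benchmarks would live under `tests/`):

```python
import time

import torch


def benchmark_step(model: torch.nn.Module, batch: torch.Tensor, iters: int = 10) -> float:
    """Return mean seconds per forward+backward pass over `iters` runs."""
    model.train()
    # Warm-up pass so one-time setup costs don't skew the timing.
    model(batch).sum().backward()
    if torch.cuda.is_available():
        torch.cuda.synchronize()  # wait for queued GPU work before timing
    start = time.perf_counter()
    for _ in range(iters):
        model.zero_grad(set_to_none=True)
        model(batch).sum().backward()
    if torch.cuda.is_available():
        torch.cuda.synchronize()
    return (time.perf_counter() - start) / iters
```

For example, `benchmark_step(torch.nn.Linear(128, 128), torch.randn(32, 128))` times a toy layer.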
## Documentation

### Code Documentation
- Docstrings: Use Google style for all public APIs
- Type hints: Required for all function signatures
- Comments: Explain complex algorithms and business logic
### User Documentation
- README updates: For user-facing changes
- Configuration docs: For new parameters
- Examples: Include usage examples
- Migration guides: For breaking changes
## Release Process

### Version Numbering

We follow Semantic Versioning (`MAJOR.MINOR.PATCH`):

- Major: Breaking changes
- Minor: New features (backwards compatible)
- Patch: Bug fixes
### Release Checklist

- Update version in `setup.py`
- Update `CHANGELOG.md`
- Run full test suite
- Update documentation
- Create release PR
- Tag release after merge
- Update GitHub release notes
## Priority Areas

We're particularly interested in contributions to:

**Performance optimizations**

- Memory efficiency improvements
- Training speed optimizations
- Inference acceleration

**New model architectures**

- Novel attention mechanisms
- Advanced MoE strategies
- Multi-modal improvements

**Training improvements**

- Better data loading
- Advanced RLHF techniques
- Distributed training optimizations

**Evaluation and benchmarking**

- New benchmark integrations
- Evaluation metrics
- Analysis tools

**Documentation and examples**

- Tutorial notebooks
- Use case examples
- API documentation
## Questions?
- General questions: Open a Discussion
- Bug reports: Open an Issue
- Feature requests: Open an Issue with the feature template
- Security issues: Email [email protected]
## Recognition
Contributors will be:
- Listed in the README
- Mentioned in release notes
- Invited to the contributors team (for significant contributions)
Thank you for helping make ULTRATHINK better!