initial commit
Init commit
Update use case
Add prefix prompt
Fix use_cache=False
Upload pytorch_model.bin with huggingface_hub
Update quantized gemm kernel
Fix tokenizer for transformers 4.34
Upload tokenizer.model with huggingface_hub