fixed readme.
README.md
CHANGED
@@ -16,7 +16,7 @@ The inference speed of lyraChatGLM has achieved **10x** acceleration upon the or
 Among its main features are:
 
 - weights: original ChatGLM-6B weights released by THUDM.
-- device: lyraChatGLM is mainly based on
+- device: lyraChatGLM is mainly based on TensorRT compiled for SM=80 (A100, for example).
 - batch_size: compiled with dynamic batch size, max batch_size = 8
 
 ## Speed