Update README.md
README.md
CHANGED
@@ -57,7 +57,7 @@ The following books are used to develop text corpus:
 
 Corpus has total 1078389 word tokens.
 
-## Datasets
+## Datasets Preprocessing
 
 - Header text are removed manually.
 - Using sent_tokenize() function from NLTK python library, extra spaces and new-lines were removed programmatically.
@@ -93,6 +93,17 @@ The following hyperparameters were used during training:
 | 2.3842 | 2.51 | 1000 | 2.5738 |
 
 
+## Sample Code Using Transformers Pipeline
+
+```
+from transformers import pipeline
+
+story = pipeline('text-generation',model='./gpt2-shakespeare', tokenizer='gpt2', max_length = 300)
+story("how art thou")
+
+```
+
+
 ### Framework versions
 
 - Transformers 4.26.1
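
The README's preprocessing section describes removing extra spaces and new-lines programmatically around sentence tokenization. A minimal sketch of what such a step could look like follows. It is not the author's script: `normalize_whitespace`, `split_sentences`, and `preprocess` are hypothetical helper names, and the regex-based splitter is a naive standard-library stand-in for NLTK's `sent_tokenize()`, which the README says was actually used.

```python
import re
from typing import List


def normalize_whitespace(text: str) -> str:
    """Collapse runs of spaces, tabs, and new-lines into single spaces."""
    return re.sub(r"\s+", " ", text).strip()


def split_sentences(text: str) -> List[str]:
    """Naive splitter: break after '.', '!' or '?' when whitespace follows.

    Assumption: this only approximates nltk.tokenize.sent_tokenize and will
    mis-split abbreviations such as "Mr. Smith".
    """
    return [s for s in re.split(r"(?<=[.!?])\s+", text) if s]


def preprocess(raw: str) -> List[str]:
    """Clean the whitespace first, then split the text into sentences."""
    return split_sentences(normalize_whitespace(raw))


sample = "How  art\nthou?   I am\n\nwell.  Farewell!"
sentences = preprocess(sample)
print(sentences)
```

Swapping `split_sentences` for `sent_tokenize` would bring the sketch closer to what the README describes, at the cost of an NLTK dependency and its tokenizer data download.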