| <!--Copyright 2020 The HuggingFace Team. All rights reserved. | |
| Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with | |
| the License. You may obtain a copy of the License at | |
| http://www.apache.org/licenses/LICENSE-2.0 | |
| Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on | |
| an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the | |
| specific language governing permissions and limitations under the License. | |
| --> | |
| # Utilities for Generation | |
| This page lists all the utility functions used by [`~generation.GenerationMixin.generate`], | |
| [`~generation.GenerationMixin.greedy_search`], | |
| [`~generation.GenerationMixin.contrastive_search`], | |
| [`~generation.GenerationMixin.sample`], | |
| [`~generation.GenerationMixin.beam_search`], | |
| [`~generation.GenerationMixin.beam_sample`], | |
| [`~generation.GenerationMixin.group_beam_search`], and | |
| [`~generation.GenerationMixin.constrained_beam_search`]. | |
| Most of those are only useful if you are studying the code of the generate methods in the library. | |
| ## Generate Outputs | |
| The output of [`~generation.GenerationMixin.generate`] is an instance of a subclass of | |
| [`~utils.ModelOutput`]. This output is a data structure containing all the information returned | |
| by [`~generation.GenerationMixin.generate`], and it can also be used as a tuple or a dictionary. | |
| Here's an example: | |
| ```python | |
| from transformers import GPT2Tokenizer, GPT2LMHeadModel | |
| tokenizer = GPT2Tokenizer.from_pretrained("gpt2") | |
| model = GPT2LMHeadModel.from_pretrained("gpt2") | |
| inputs = tokenizer("Hello, my dog is cute and ", return_tensors="pt") | |
| generation_output = model.generate(**inputs, return_dict_in_generate=True, output_scores=True) | |
| ``` | |
| The `generation_output` object is a [`~generation.GreedySearchDecoderOnlyOutput`]. As the documentation of that | |
| class below shows, it has the following attributes: | |
| - `sequences`: the generated sequences of tokens | |
| - `scores` (optional): the prediction scores of the language modeling head, for each generation step | |
| - `hidden_states` (optional): the hidden states of the model, for each generation step | |
| - `attentions` (optional): the attention weights of the model, for each generation step | |
| Here we have the `scores` since we passed along `output_scores=True`, but we don't have `hidden_states` and | |
| `attentions` because we didn't pass `output_hidden_states=True` or `output_attentions=True`. | |
| You can access each attribute as you normally would; if an attribute has not been returned by the model, you | |
| get `None`. Here, for instance, `generation_output.scores` holds the prediction scores of the language modeling | |
| head for each generation step, and `generation_output.attentions` is `None`. | |
| When using our `generation_output` object as a tuple, it only keeps the attributes that don't have `None` values. | |
| Here, for instance, it has two elements, `sequences` then `scores`, so | |
| ```python | |
| generation_output[:2] | |
| ``` | |
| returns the tuple `(generation_output.sequences, generation_output.scores)`. | |
| When using our `generation_output` object as a dictionary, it only keeps the attributes that don't have `None` | |
| values. Here, for instance, it has two keys that are `sequences` and `scores`. | |
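The tuple and dictionary behavior described above can be illustrated without loading a model. The following is a minimal sketch in plain Python, assuming a hypothetical `ToyGenerationOutput` dataclass that stands in for the real output classes; it is not the library's `ModelOutput` implementation, only an imitation of the "skip `None` attributes" rule.

```python
from dataclasses import dataclass, fields
from typing import Any, List, Optional, Tuple


@dataclass
class ToyGenerationOutput:
    """Hypothetical stand-in for a generate() output; not the library class."""

    sequences: Any = None
    scores: Optional[Any] = None
    attentions: Optional[Any] = None
    hidden_states: Optional[Any] = None

    def keys(self) -> List[str]:
        # Only fields that were actually populated count as keys.
        return [f.name for f in fields(self) if getattr(self, f.name) is not None]

    def to_tuple(self) -> Tuple[Any, ...]:
        # Same filtering, in field declaration order.
        return tuple(getattr(self, name) for name in self.keys())

    def __getitem__(self, key):
        if isinstance(key, str):
            # Dictionary-style access.
            if key not in self.keys():
                raise KeyError(key)
            return getattr(self, key)
        # Tuple-style access (integer index or slice).
        return self.to_tuple()[key]


# Made-up values: only `sequences` and `scores` are populated.
output = ToyGenerationOutput(sequences=[[15496, 11, 616]], scores=([0.1, 0.9],))

print(output.keys())        # populated fields only
print(output[:2])           # tuple-style slice
print(output["sequences"])  # dictionary-style lookup
print(output.attentions)    # attribute access still works for unset fields
```

With the real `generation_output` object, the same three access styles apply: attribute access, integer or slice indexing, and string keys.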
| All output types are documented below. | |
| ### GreedySearchOutput | |
| [[autodoc]] generation.GreedySearchDecoderOnlyOutput | |
| [[autodoc]] generation.GreedySearchEncoderDecoderOutput | |
| [[autodoc]] generation.FlaxGreedySearchOutput | |
| ### SampleOutput | |
| [[autodoc]] generation.SampleDecoderOnlyOutput | |
| [[autodoc]] generation.SampleEncoderDecoderOutput | |
| [[autodoc]] generation.FlaxSampleOutput | |
| ### BeamSearchOutput | |
| [[autodoc]] generation.BeamSearchDecoderOnlyOutput | |
| [[autodoc]] generation.BeamSearchEncoderDecoderOutput | |
| ### BeamSampleOutput | |
| [[autodoc]] generation.BeamSampleDecoderOnlyOutput | |
| [[autodoc]] generation.BeamSampleEncoderDecoderOutput | |
| ## LogitsProcessor | |
| A [`LogitsProcessor`] can be used to modify the prediction scores of a language model head for | |
| generation. | |
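To make the idea concrete, here is a minimal sketch of the processor contract in plain Python: each processor receives the tokens generated so far and the scores for the next token, and returns modified scores. The helper names (`temperature_warper`, `top_k_warper`, `min_length_processor`, `apply_processors`) are hypothetical and written for this illustration. The library classes documented below operate on batched tensors, whereas this sketch handles a single sequence with Python lists.

```python
import math
from typing import Callable, List, Sequence

Scores = List[float]
Processor = Callable[[Sequence[int], Scores], Scores]


def temperature_warper(temperature: float) -> Processor:
    """Divide every score by the temperature (flattens or sharpens the distribution)."""
    def process(input_ids: Sequence[int], scores: Scores) -> Scores:
        return [s / temperature for s in scores]
    return process


def top_k_warper(top_k: int) -> Processor:
    """Keep the top_k highest scores and mask the rest with -inf (assumes top_k >= 1)."""
    def process(input_ids: Sequence[int], scores: Scores) -> Scores:
        k = min(top_k, len(scores))
        threshold = sorted(scores, reverse=True)[k - 1]
        return [s if s >= threshold else -math.inf for s in scores]
    return process


def min_length_processor(min_length: int, eos_token_id: int) -> Processor:
    """Forbid the EOS token until the sequence has reached min_length tokens."""
    def process(input_ids: Sequence[int], scores: Scores) -> Scores:
        if len(input_ids) < min_length:
            scores = list(scores)
            scores[eos_token_id] = -math.inf
        return scores
    return process


def apply_processors(processors: List[Processor], input_ids: Sequence[int], scores: Scores) -> Scores:
    """Apply processors in order, feeding each one the output of the previous one."""
    for processor in processors:
        scores = processor(input_ids, scores)
    return scores


processors = [
    min_length_processor(5, eos_token_id=0),
    temperature_warper(2.0),
    top_k_warper(2),
]
# Token 0 plays the role of EOS; the sequence so far has only two tokens.
warped = apply_processors(processors, input_ids=[7, 8], scores=[4.0, 2.0, 1.0, 0.5])
print(warped)
```

Order matters in the chain: here the EOS score is masked first, so the top-k step selects among the remaining tokens only.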
| [[autodoc]] LogitsProcessor | |
| - __call__ | |
| [[autodoc]] LogitsProcessorList | |
| - __call__ | |
| [[autodoc]] LogitsWarper | |
| - __call__ | |
| [[autodoc]] MinLengthLogitsProcessor | |
| - __call__ | |
| [[autodoc]] MinNewTokensLengthLogitsProcessor | |
| - __call__ | |
| [[autodoc]] TemperatureLogitsWarper | |
| - __call__ | |
| [[autodoc]] RepetitionPenaltyLogitsProcessor | |
| - __call__ | |
| [[autodoc]] TopPLogitsWarper | |
| - __call__ | |
| [[autodoc]] TopKLogitsWarper | |
| - __call__ | |
| [[autodoc]] TypicalLogitsWarper | |
| - __call__ | |
| [[autodoc]] NoRepeatNGramLogitsProcessor | |
| - __call__ | |
| [[autodoc]] NoBadWordsLogitsProcessor | |
| - __call__ | |
| [[autodoc]] PrefixConstrainedLogitsProcessor | |
| - __call__ | |
| [[autodoc]] HammingDiversityLogitsProcessor | |
| - __call__ | |
| [[autodoc]] ForcedBOSTokenLogitsProcessor | |
| - __call__ | |
| [[autodoc]] ForcedEOSTokenLogitsProcessor | |
| - __call__ | |
| [[autodoc]] InfNanRemoveLogitsProcessor | |
| - __call__ | |
| [[autodoc]] TFLogitsProcessor | |
| - __call__ | |
| [[autodoc]] TFLogitsProcessorList | |
| - __call__ | |
| [[autodoc]] TFLogitsWarper | |
| - __call__ | |
| [[autodoc]] TFTemperatureLogitsWarper | |
| - __call__ | |
| [[autodoc]] TFTopPLogitsWarper | |
| - __call__ | |
| [[autodoc]] TFTopKLogitsWarper | |
| - __call__ | |
| [[autodoc]] TFMinLengthLogitsProcessor | |
| - __call__ | |
| [[autodoc]] TFNoBadWordsLogitsProcessor | |
| - __call__ | |
| [[autodoc]] TFNoRepeatNGramLogitsProcessor | |
| - __call__ | |
| [[autodoc]] TFRepetitionPenaltyLogitsProcessor | |
| - __call__ | |
| [[autodoc]] TFForcedBOSTokenLogitsProcessor | |
| - __call__ | |
| [[autodoc]] TFForcedEOSTokenLogitsProcessor | |
| - __call__ | |
| [[autodoc]] FlaxLogitsProcessor | |
| - __call__ | |
| [[autodoc]] FlaxLogitsProcessorList | |
| - __call__ | |
| [[autodoc]] FlaxLogitsWarper | |
| - __call__ | |
| [[autodoc]] FlaxTemperatureLogitsWarper | |
| - __call__ | |
| [[autodoc]] FlaxTopPLogitsWarper | |
| - __call__ | |
| [[autodoc]] FlaxTopKLogitsWarper | |
| - __call__ | |
| [[autodoc]] FlaxForcedBOSTokenLogitsProcessor | |
| - __call__ | |
| [[autodoc]] FlaxForcedEOSTokenLogitsProcessor | |
| - __call__ | |
| [[autodoc]] FlaxMinLengthLogitsProcessor | |
| - __call__ | |
| ## StoppingCriteria | |
| A [`StoppingCriteria`] can be used to change when to stop generation (other than the EOS token). | |
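The sketch below shows the general shape of this mechanism in plain Python, under the assumption that a criterion is a callable returning `True` when generation should stop and that a list of criteria stops as soon as any one of them fires. The names `max_length_criterion`, `max_time_criterion`, `should_stop`, and `toy_generate` are hypothetical, and the "model" is a dummy that emits the current sequence length as the next token.

```python
import time
from typing import Callable, List, Optional, Sequence

Criterion = Callable[[Sequence[int]], bool]


def max_length_criterion(max_length: int) -> Criterion:
    """Stop once the sequence holds max_length tokens (prompt included)."""
    return lambda input_ids: len(input_ids) >= max_length


def max_time_criterion(max_time: float, start: Optional[float] = None) -> Criterion:
    """Stop once more than max_time seconds have elapsed since start."""
    started = time.monotonic() if start is None else start
    return lambda input_ids: time.monotonic() - started > max_time


def should_stop(criteria: List[Criterion], input_ids: Sequence[int]) -> bool:
    """A list of criteria fires as soon as any single criterion does."""
    return any(criterion(input_ids) for criterion in criteria)


def toy_generate(prompt: Sequence[int], criteria: List[Criterion], eos_token_id: int) -> List[int]:
    """Dummy generation loop: checks the criteria before producing each token."""
    input_ids = list(prompt)
    while not should_stop(criteria, input_ids):
        next_token = len(input_ids)  # stand-in for a model prediction
        input_ids.append(next_token)
        if next_token == eos_token_id:
            break  # EOS still ends generation independently of the criteria
    return input_ids


criteria = [max_length_criterion(6), max_time_criterion(60.0)]
generated = toy_generate([0, 1], criteria, eos_token_id=99)
print(generated)
```

In this run the length criterion fires first, so the time limit never comes into play.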
| [[autodoc]] StoppingCriteria | |
| - __call__ | |
| [[autodoc]] StoppingCriteriaList | |
| - __call__ | |
| [[autodoc]] MaxLengthCriteria | |
| - __call__ | |
| [[autodoc]] MaxTimeCriteria | |
| - __call__ | |
| ## Constraints | |
| A [`Constraint`] can be used to force the generation to include specific tokens or sequences in the output. | |
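As a rough illustration of what such a constraint has to track, the following plain-Python sketch follows progress through a required token sequence as tokens are generated. `ToyPhrasalConstraint` and its methods are hypothetical names for this example. It is a simplified stand-in that only records state, and it does not reproduce how the library's classes interact with constrained beam search.

```python
from typing import List, Optional, Sequence, Tuple


class ToyPhrasalConstraint:
    """Hypothetical tracker for one required phrase (a fixed token sequence)."""

    def __init__(self, token_ids: Sequence[int]) -> None:
        if not token_ids:
            raise ValueError("token_ids must contain at least one token")
        self.token_ids: List[int] = list(token_ids)
        self.position = 0  # number of phrase tokens matched so far

    @property
    def completed(self) -> bool:
        return self.position == len(self.token_ids)

    def next_required(self) -> Optional[int]:
        """Token that would make progress on the phrase, or None when it is fulfilled."""
        return None if self.completed else self.token_ids[self.position]

    def update(self, token_id: int) -> Tuple[bool, bool]:
        """Consume one generated token and return (stepped, completed)."""
        if self.completed:
            return False, True
        if token_id == self.token_ids[self.position]:
            self.position += 1
            return True, self.completed
        # Simplification: any mismatch discards partial progress.
        self.position = 0
        return False, False

    def remaining(self) -> int:
        """Number of phrase tokens still needed."""
        return len(self.token_ids) - self.position


constraint = ToyPhrasalConstraint([5, 6, 7])
# The phrase is interrupted once by token 9, then completed.
history = [constraint.update(token) for token in [5, 9, 5, 6, 7]]
print(history, constraint.completed, constraint.remaining())
```

A decoding procedure could use `next_required()` to favor candidates that advance the phrase; that scoring step is left out of this sketch.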
| [[autodoc]] Constraint | |
| [[autodoc]] PhrasalConstraint | |
| [[autodoc]] DisjunctiveConstraint | |
| [[autodoc]] ConstraintListState | |
| ## BeamSearch | |
| [[autodoc]] BeamScorer | |
| - process | |
| - finalize | |
| [[autodoc]] BeamSearchScorer | |
| - process | |
| - finalize | |
| [[autodoc]] ConstrainedBeamSearchScorer | |
| - process | |
| - finalize | |
| ## Utilities | |
| [[autodoc]] top_k_top_p_filtering | |
| [[autodoc]] tf_top_k_top_p_filtering | |
| ## Streamers | |
| [[autodoc]] TextStreamer | |
| [[autodoc]] TextIteratorStreamer | |