The pre-trained model to use for compression.
The pre-trained tokenizer to use for compression.
Function to get the pure token from a token. This is used to normalize tokens before processing.
Function to check if a token is the beginning of a new word. This is used to determine how to merge tokens into words.
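As a rough illustration of how these two functions can work together, the sketch below assumes a WordPiece-style vocabulary in which a leading "##" marks a continuation piece. The marker and the helper names (`getPureToken`, `isBeginOfNewWord`, `mergeTokensIntoWords`) are assumptions made for this example, not the library's actual defaults; other tokenizers use different markers.

```typescript
// Assumption: "##" marks a piece that continues the previous word.
const CONTINUATION_PREFIX = "##";

// Strip the continuation marker so only the surface text remains.
function getPureToken(token: string): string {
  return token.startsWith(CONTINUATION_PREFIX)
    ? token.slice(CONTINUATION_PREFIX.length)
    : token;
}

// A token starts a new word when it does not carry the continuation marker.
function isBeginOfNewWord(token: string): boolean {
  return !token.startsWith(CONTINUATION_PREFIX);
}

// Merge sub-word tokens back into whole words using the two helpers above.
function mergeTokensIntoWords(tokens: string[]): string[] {
  const words: string[] = [];
  for (const token of tokens) {
    if (isBeginOfNewWord(token) || words.length === 0) {
      words.push(getPureToken(token));
    } else {
      words[words.length - 1] += getPureToken(token);
    }
  }
  return words;
}
```

Under these assumptions, `mergeTokensIntoWords(["com", "##press", "##ion", "rate"])` yields the two words "compression" and "rate".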
The tokenizer to use for calculating the compression rate.
Configuration for LLMLingua2.
Maximum batch size for processing prompts. This is used to limit the number of prompts processed in a single batch.
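A batch-size limit of this kind is commonly applied by slicing the input into fixed-size groups. The helper below is a minimal sketch of that idea; `toBatches` is a hypothetical name chosen for this example and is not part of the documented API.

```typescript
// Split a list of items into consecutive batches of at most maxBatchSize.
function toBatches<T>(items: T[], maxBatchSize: number): T[][] {
  if (!Number.isInteger(maxBatchSize) || maxBatchSize < 1) {
    throw new RangeError("maxBatchSize must be a positive integer");
  }
  const batches: T[][] = [];
  for (let start = 0; start < items.length; start += maxBatchSize) {
    batches.push(items.slice(start, start + maxBatchSize));
  }
  return batches;
}
```

With five prompts and a maximum batch size of 2, this produces three batches of sizes 2, 2, and 1.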
Maximum number of tokens to force in the compression. This is used to ensure that certain tokens are always included in the compressed prompt.
Maximum sequence length for the model. This is used to limit the length of the input sequences to the model.
Logger function to log messages.
Compresses a prompt based on the given options.
Compresses a prompt based on the given options. Alias for compress, but uses snake_case for options.
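One plausible way to support a snake_case alias is to convert option keys to camelCase before delegating to the main method. The sketch below shows only that key conversion; the helper names (`snakeToCamel`, `camelizeKeys`) and the option names in the usage note are illustrative assumptions and are not taken from the library.

```typescript
// Convert a single snake_case key to camelCase, e.g. "target_token" -> "targetToken".
function snakeToCamel(key: string): string {
  return key.replace(/_([a-z0-9])/g, (_match: string, ch: string) => ch.toUpperCase());
}

// Return a shallow copy of the options object with camelCase keys.
function camelizeKeys(options: Record<string, unknown>): Record<string, unknown> {
  const result: Record<string, unknown> = {};
  for (const key of Object.keys(options)) {
    result[snakeToCamel(key)] = options[key];
  }
  return result;
}
```

For example, `camelizeKeys({ target_token: 100, rate: 0.5 })` returns an object with the keys `targetToken` and `rate`, leaving the values unchanged.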
The TypeScript implementation of the original PromptCompressor, which is a class for compressing prompts using a language model. See Original Implementation.