vllm@0.5.4 vulnerabilities

A high-throughput and memory-efficient inference and serving engine for LLMs

Direct Vulnerabilities

Known vulnerabilities in the vllm package. This does not include vulnerabilities belonging to this package’s dependencies.

Vulnerability | Vulnerable Version
  • High severity
Deserialization of Untrusted Data


Affected versions of this package are vulnerable to Deserialization of Untrusted Data in the MessageQueue.dequeue() API function. An attacker can execute arbitrary code by sending a malicious payload to the message queue.

How to fix Deserialization of Untrusted Data?

There is no fixed version for vllm.

[0,)
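The root issue in this and the other pickle-based findings is that `pickle.loads()` will invoke arbitrary callables embedded in the payload. A minimal, self-contained illustration, using a benign `eval` call as a stand-in for attacker code:

```python
import pickle

class Payload:
    """Any code that calls pickle.loads() on this blob will execute
    the callable returned by __reduce__ during deserialization."""
    def __reduce__(self):
        # Benign stand-in: a real attack would typically return
        # something like (os.system, ("...",)).
        return (eval, ("2 + 2",))

blob = pickle.dumps(Payload())
result = pickle.loads(blob)  # the eval call runs here, yielding 4
```

This is why a message queue that unpickles data from untrusted peers is equivalent to remote code execution.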
  • Critical severity
Deserialization of Untrusted Data


Affected versions of this package are vulnerable to Deserialization of Untrusted Data in the _make_handler_coro() function used by the async RPC server for communication between nodes, which uses pickle for deserialization. An attacker can execute arbitrary code by sending serialized data from a malicious RPC client.

How to fix Deserialization of Untrusted Data?

Upgrade vllm to version 0.6.2 or higher.

[,0.6.2)
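Until an upgrade is possible, one common mitigation (a general sketch, not vllm's actual fix) is to restrict which globals the unpickler may resolve, via the `find_class` hook documented in the `pickle` module:

```python
import io
import pickle

# Only these (module, name) pairs may be resolved during unpickling.
ALLOWED_GLOBALS = {
    ("builtins", "list"),
    ("builtins", "dict"),
    ("builtins", "str"),
    ("builtins", "int"),
}

class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        if (module, name) in ALLOWED_GLOBALS:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")

def restricted_loads(data: bytes):
    """Like pickle.loads, but refuses to resolve unlisted globals."""
    return RestrictedUnpickler(io.BytesIO(data)).load()
```

This blocks payloads that smuggle in callables such as `eval` or `os.system`, though the Python documentation cautions that an allow-list unpickler is still not a substitute for avoiding pickle on untrusted input.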
  • High severity
Allocation of Resources Without Limits or Throttling


Affected versions of this package are vulnerable to Allocation of Resources Without Limits or Throttling in the outlines_logits_processors.py module, which uses a local cache that is unbounded by default. An attacker can exhaust storage on the target system by sending a stream of decoding requests with different schemas, growing the outlines cache indefinitely.

How to fix Allocation of Resources Without Limits or Throttling?

Upgrade vllm to version 0.8.0 or higher.

[,0.8.0)
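The general remedy for an unbounded cache is an eviction policy. A minimal LRU sketch (illustrative only; not the actual change made to the outlines cache):

```python
from collections import OrderedDict

class LRUCache:
    """Evicts the least-recently-used entry once maxsize is exceeded,
    so the cache cannot grow without bound."""
    def __init__(self, maxsize: int = 128):
        self.maxsize = maxsize
        self._data: OrderedDict = OrderedDict()

    def get(self, key, default=None):
        if key not in self._data:
            return default
        self._data.move_to_end(key)  # mark as most recently used
        return self._data[key]

    def put(self, key, value):
        self._data[key] = value
        self._data.move_to_end(key)
        while len(self._data) > self.maxsize:
            self._data.popitem(last=False)  # drop the oldest entry
```

With a bound in place, a stream of never-repeating schemas can at worst churn the cache, not fill the disk.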
  • Critical severity
Deserialization of Untrusted Data


Affected versions of this package are vulnerable to Deserialization of Untrusted Data in the MooncakePipe class, which relies on pickle for serialization and deserialization in recv_tensor(). An attacker can execute arbitrary code on the host by sending malicious payloads using the ZMQ over TCP interface, which is exposed by the Mooncake integration by default.

How to fix Deserialization of Untrusted Data?

Upgrade vllm to version 0.8.0 or higher.

[,0.8.0)
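A pickle-free wire format avoids this class of bug entirely: send a JSON metadata header plus raw bytes, the same idea safetensors uses. A simplified framing sketch (not the Mooncake code):

```python
import json
import struct

def encode_tensor(shape, dtype: str, raw: bytes) -> bytes:
    """Frame layout: 4-byte little-endian header length, JSON header,
    then the raw tensor bytes."""
    header = json.dumps({"shape": list(shape), "dtype": dtype}).encode()
    return struct.pack("<I", len(header)) + header + raw

def decode_tensor(buf: bytes):
    """Inverse of encode_tensor; parsing JSON and slicing bytes can
    never execute code from the buffer, unlike pickle."""
    (hlen,) = struct.unpack_from("<I", buf)
    meta = json.loads(buf[4:4 + hlen].decode())
    return tuple(meta["shape"]), meta["dtype"], buf[4 + hlen:]
```

Even a malicious peer can at worst send malformed metadata, which fails parsing instead of running attacker code.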
  • Low severity
Use of Weak Hash


Affected versions of this package are vulnerable to Use of Weak Hash due to the use of Python's built-in hash() function, which as of Python 3.12 returns a predictable constant for certain values. An attacker can interfere with subsequent responses and cause unintended behavior by exploiting predictable hash collisions to populate the cache with prompts known to collide with another prompt in use.

How to fix Use of Weak Hash?

Upgrade vllm to version 0.7.2 or higher.

[,0.7.2)
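CPython's built-in hash() is trivially predictable for common inputs (small integers hash to themselves, for example), so it is unsuitable for cache keys an attacker can target. A cryptographic digest is collision-resistant; a sketch of keying a cache on a token sequence (illustrative, not vllm's exact fix):

```python
import hashlib

# Predictability of the builtin: small ints hash to themselves in CPython.
assert hash(1) == 1

def prompt_cache_key(token_ids) -> str:
    """Collision-resistant cache key for a sequence of token ids."""
    h = hashlib.sha256()
    for t in token_ids:
        h.update(int(t).to_bytes(8, "little", signed=True))
    return h.hexdigest()
```

With SHA-256, finding two prompts that collide is computationally infeasible, so an attacker cannot deliberately poison another user's cache entry.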
  • High severity
Deserialization of Untrusted Data


Affected versions of this package are vulnerable to Deserialization of Untrusted Data via the hf_model_weights_iterator process due to the usage of the torch.load function with the weights_only parameter set to False, which is considered insecure. An attacker can execute arbitrary code during the unpickling process by supplying malicious pickle data.

How to fix Deserialization of Untrusted Data?

Upgrade vllm to version 0.7.0 or higher.

[,0.7.0)
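Setting `weights_only=True` makes `torch.load` refuse to unpickle arbitrary objects, restricting deserialization to tensors and a small set of primitive types. A sketch of the two call forms (`"model.bin"` is a hypothetical path):

```python
import torch

# Unsafe: full unpickling. A malicious checkpoint can execute
# arbitrary code while it is being loaded.
# state = torch.load("model.bin")

# Safer: only tensors and primitive container types are allowed,
# so a crafted pickle payload is rejected instead of executed.
state = torch.load("model.bin", weights_only=True)
```

This only mitigates pickle-based code execution; a checkpoint from an untrusted source should still be treated with care.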
  • Medium severity
Uncontrolled Resource Consumption ('Resource Exhaustion')


Affected versions of this package are vulnerable to Uncontrolled Resource Consumption ('Resource Exhaustion') via the best_of parameter. An attacker can cause the system to become unresponsive and prevent legitimate users from accessing the service by consuming excessive system resources.

How to fix Uncontrolled Resource Consumption ('Resource Exhaustion')?

Upgrade vllm to version 0.6.3 or higher.

[,0.6.3)
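The standard mitigation pattern is to validate sampling parameters server-side before any work is scheduled. A hypothetical sketch (`MAX_BEST_OF` and `validate_sampling_params` are illustrative names, not vllm's API):

```python
MAX_BEST_OF = 8  # illustrative server-side limit

def validate_sampling_params(params: dict) -> dict:
    """Reject requests whose best_of would allocate an excessive
    number of candidate sequences per request."""
    best_of = params.get("best_of", 1)
    if not isinstance(best_of, int) or best_of < 1:
        raise ValueError("best_of must be a positive integer")
    if best_of > MAX_BEST_OF:
        raise ValueError(f"best_of may not exceed {MAX_BEST_OF}")
    return params
```

Rejecting the request at the API boundary is far cheaper than allowing the engine to allocate resources for hundreds of parallel candidate sequences.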
  • High severity
Improper Validation of Syntactic Correctness of Input


Affected versions of this package are vulnerable to Improper Validation of Syntactic Correctness of Input in the process_model_inputs() and process_model_inputs_async() functions, accessible through the completions API. An attacker can crash the server by sending a request with an empty prompt, if the server is using a model that does not prepend any data to the prompt (such as gpt2). The crash only occurs if the processed prompt passed to these functions is still empty.

How to fix Improper Validation of Syntactic Correctness of Input?

Upgrade vllm to version 0.5.5 or higher.

[,0.5.5)
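This class of crash is avoided by validating the processed prompt before it reaches the engine. A hypothetical guard (the function name is illustrative, not vllm's API):

```python
def ensure_non_empty_prompt(prompt_token_ids: list) -> list:
    """Reject prompts that are empty after preprocessing, returning a
    clean client error instead of letting an empty sequence crash the
    engine."""
    if not prompt_token_ids:
        raise ValueError("prompt must contain at least one token")
    return prompt_token_ids
```

Raising a ValueError here can be translated into an HTTP 400 response, keeping one malformed request from taking down the server for everyone.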