Troubleshooting and best practices for dead-letter queues in Amazon SQS - Amazon Simple Queue Service

This section provides an overview of common dead-letter queue (DLQ) issues and how to resolve them.

Create alarms for dead-letter queues using Amazon CloudWatch

You can use Amazon CloudWatch to configure an alarm on the ApproximateNumberOfMessagesVisible metric so that you are notified when messages are moved to a dead-letter queue. For more information, see Creating CloudWatch alarms for Amazon SQS metrics. After you receive an alert that messages have been sent to the dead-letter queue, you can poll the dead-letter queue to receive and review them.
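As a minimal sketch, such an alarm could be defined with the AWS SDK for Python (boto3). The helper names, the default alarm name, the 300-second period, and the threshold of zero are illustrative assumptions, not values prescribed by Amazon SQS. The CloudWatch client is passed in, so the code makes no assumption about how credentials or Regions are configured.

```python
def build_dlq_alarm_params(dlq_name, alarm_name=None, sns_topic_arn=None):
    """Build keyword arguments for the CloudWatch put_metric_alarm call.

    The alarm goes into ALARM state when the dead-letter queue reports
    more than zero visible messages.
    """
    params = {
        "AlarmName": alarm_name or f"{dlq_name}-has-messages",  # hypothetical naming scheme
        "Namespace": "AWS/SQS",
        "MetricName": "ApproximateNumberOfMessagesVisible",
        "Dimensions": [{"Name": "QueueName", "Value": dlq_name}],
        "Statistic": "Maximum",
        "Period": 300,             # seconds; an assumed evaluation window
        "EvaluationPeriods": 1,
        "Threshold": 0,            # assumed: alert on the first message
        "ComparisonOperator": "GreaterThanThreshold",
        "TreatMissingData": "notBreaching",
    }
    if sns_topic_arn:
        # Optional: notify an SNS topic when the alarm fires.
        params["AlarmActions"] = [sns_topic_arn]
    return params


def create_dlq_alarm(cloudwatch, dlq_name, **options):
    """Create (or update) the alarm using a boto3 CloudWatch client."""
    cloudwatch.put_metric_alarm(**build_dlq_alarm_params(dlq_name, **options))
```

With boto3 installed, a call might look like `create_dlq_alarm(boto3.client("cloudwatch"), "my-dlq")`, where `my-dlq` is a placeholder queue name.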

Viewing messages using the console might cause messages to be moved to a dead-letter queue

Amazon SQS counts each viewing of a message in the console as a receive against the source queue's redrive policy. Thus, if you view a message in the console the number of times specified in that redrive policy, the message is moved to the queue's dead-letter queue.

To adjust this behavior, you can do one of the following:

  • Increase the Maximum Receives setting for the corresponding queue's redrive policy.

  • Avoid viewing the corresponding queue's messages in the console.
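The first option above can also be applied programmatically. The following sketch assumes the AWS SDK for Python (boto3) and that the console's Maximum Receives setting corresponds to the `maxReceiveCount` field of the queue's `RedrivePolicy` attribute. The function names and the ARN and URL arguments are hypothetical.

```python
import json


def build_redrive_policy(dlq_arn, max_receive_count):
    """Return the RedrivePolicy attribute value as a JSON string."""
    if not isinstance(max_receive_count, int) or max_receive_count < 1:
        raise ValueError("max_receive_count must be a positive integer")
    return json.dumps(
        {"deadLetterTargetArn": dlq_arn, "maxReceiveCount": max_receive_count}
    )


def set_max_receives(sqs, source_queue_url, dlq_arn, max_receive_count):
    """Update the source queue's redrive policy using a boto3 SQS client."""
    sqs.set_queue_attributes(
        QueueUrl=source_queue_url,
        Attributes={"RedrivePolicy": build_redrive_policy(dlq_arn, max_receive_count)},
    )
```

A higher value leaves more room for console views before a message is moved, at the cost of more retries for messages that genuinely fail.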

The NumberOfMessagesSent and NumberOfMessagesReceived for a dead-letter queue don't match

If you send a message to a dead-letter queue manually, it is captured by the NumberOfMessagesSent metric. However, if a message is sent to a dead-letter queue as a result of a failed processing attempt, it isn't captured by this metric. Thus, it is possible for the values of NumberOfMessagesSent and NumberOfMessagesReceived to be different.

Creating and configuring a dead-letter queue redrive

Dead-letter queue redrive requires you to set appropriate permissions for Amazon SQS to receive messages from the dead-letter queue and send messages to the destination queue. If you do not have the correct permissions, the dead-letter queue redrive task can fail. You can view the status of your message redrive task to remediate the issues and try again.
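As a rough illustration, a redrive can be started and its status checked with the AWS SDK for Python (boto3). The sketch assumes an SQS client whose caller already has the required permissions. The function names are hypothetical, and the code assumes that the first entry returned by `list_message_move_tasks` is the most recent task.

```python
def start_redrive(sqs, dlq_arn, destination_arn=None, max_per_second=None):
    """Start moving messages out of the dead-letter queue.

    When destination_arn is omitted, the request leaves the destination
    unspecified and Amazon SQS chooses where the messages go.
    """
    kwargs = {"SourceArn": dlq_arn}
    if destination_arn:
        kwargs["DestinationArn"] = destination_arn
    if max_per_second:
        kwargs["MaxNumberOfMessagesPerSecond"] = max_per_second
    return sqs.start_message_move_task(**kwargs)["TaskHandle"]


def latest_redrive_status(sqs, dlq_arn):
    """Return (status, failure_reason) for the latest task, or None."""
    response = sqs.list_message_move_tasks(SourceArn=dlq_arn, MaxResults=1)
    results = response.get("Results", [])
    if not results:
        return None
    task = results[0]
    # FailureReason is only expected when the task did not succeed.
    return task["Status"], task.get("FailureReason")
```

If the returned status indicates a failure, the failure reason is a reasonable first place to look for a permissions problem before retrying.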

Standard and FIFO queues handle message failure differently

Standard queues keep processing messages until the retention period expires. This continuous processing minimizes the chances of the queue being blocked by unconsumed messages. Having a large number of messages that the consumer repeatedly fails to delete can increase costs and place extra load on the hardware. To keep costs down, move failed messages to the dead-letter queue.

Standard queues also allow a high number of in-flight messages. If the majority of your messages cannot be consumed and are not sent to a dead-letter queue, your rate of processing messages can slow down. To maintain the efficiency of your queue, make sure that your application correctly handles message processing.
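One way to read "correctly handles message processing" is that the consumer deletes each message it processes successfully and leaves failed messages in the queue, so that the redrive policy can eventually move them to the dead-letter queue. The following sketch assumes the AWS SDK for Python (boto3). `process_batch` and `handler` are hypothetical names, and the batch size and wait time are illustrative.

```python
def process_batch(sqs, queue_url, handler, max_messages=10, wait_seconds=20):
    """Receive one batch, delete what succeeds, and leave failures in the queue.

    Returns a (processed, failed) tuple of counts.
    """
    response = sqs.receive_message(
        QueueUrl=queue_url,
        MaxNumberOfMessages=max_messages,
        WaitTimeSeconds=wait_seconds,  # long polling
    )
    processed = failed = 0
    for message in response.get("Messages", []):
        try:
            handler(message["Body"])
        except Exception:
            # Not deleted: the message becomes visible again after the
            # visibility timeout and counts toward the redrive policy.
            failed += 1
            continue
        sqs.delete_message(
            QueueUrl=queue_url, ReceiptHandle=message["ReceiptHandle"]
        )
        processed += 1
    return processed, failed
```

Deleting only after the handler returns means a crash in the middle of processing results in a retry instead of a lost message.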

FIFO queues provide exactly-once processing by consuming messages in sequence from a message group. Thus, although the consumer can continue to retrieve ordered messages from another message group, the first message group remains unavailable until the message blocking the queue is processed successfully or moved to a dead-letter queue.
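The blocking behavior can be illustrated with a toy, in-memory model. This is not the Amazon SQS implementation: it ignores visibility timeouts and simplifies the redrive rule so that a message is dead-lettered once it has failed `max_receive_count` times. The function name and the data shapes are assumptions made for the example.

```python
from collections import deque


def drain_fifo(messages, handler, max_receive_count):
    """Simulate consuming (group_id, body) pairs from a FIFO queue.

    handler(body) returns True on success. Only the head of each message
    group is available, so a failing head blocks the rest of its group
    until it is dead-lettered. Returns (processed, dead_lettered).
    """
    groups = {}
    for group_id, body in messages:
        groups.setdefault(group_id, deque()).append(body)

    attempts = {group_id: 0 for group_id in groups}  # receives of each head
    processed, dead_lettered = [], []

    while any(groups.values()):
        for group_id, queue in groups.items():
            if not queue:
                continue
            attempts[group_id] += 1
            if handler(queue[0]):
                processed.append(queue.popleft())
                attempts[group_id] = 0
            elif attempts[group_id] >= max_receive_count:
                dead_lettered.append(queue.popleft())  # unblocks the group
                attempts[group_id] = 0
            # Otherwise the head stays in place and its group stays blocked.
    return processed, dead_lettered


processed, dead_lettered = drain_fifo(
    [("A", "a1"), ("A", "a2"), ("B", "b1")],
    handler=lambda body: body != "a1",  # "a1" always fails
    max_receive_count=2,
)
print(processed, dead_lettered)  # ['b1', 'a2'] ['a1']
```

In this run, group B is consumed while group A is blocked by `a1`, and `a2` is only processed after `a1` has been moved to the dead-letter list.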

Additionally, FIFO queues allow a lower number of in-flight messages. To keep your FIFO queue from getting blocked by a message, make sure that your application correctly handles message processing.

For more information, see Amazon SQS message quotas and Working with Amazon SQS messages.