[Messenger] ErrorDetailsStamp +RedeliveryStamp can lead to an oversize message for SQS · Issue #44943 · symfony/symfony · GitHub

[Messenger] ErrorDetailsStamp +RedeliveryStamp can lead to an oversize message for SQS #44943


Open
magnetik opened this issue Jan 7, 2022 · 23 comments

Comments

@magnetik
Contributor
magnetik commented Jan 7, 2022

Symfony version(s) affected

5.4

Description

SQS has a limit of 262,144 bytes (256 KB) message size.

When there is an error during message handling, the ErrorDetailsStamp adds a stacktrace that can lead to a message bigger than this limit.

How to reproduce

  • Dispatch messages through SQS with a retry policy that retries them a few times (a minimal transport config is sketched after this list)
  • Throw an exception with a super big stack trace
  • See it fail
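For reference, a minimal transport configuration along these lines might look like this (the queue URL and retry values are placeholders, not taken from the actual setup):

framework:
    messenger:
        transports:
            async:
                dsn: 'https://sqs.eu-central-1.amazonaws.com/123456789012/my-queue'
                retry_strategy:
                    max_retries: 3
                    delay: 1000      # ms before the first retry
                    multiplier: 2    # exponential backoff between retries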

Possible Solution

Not sure; maybe an option to remove the stack trace from ErrorDetailsStamp.

Additional Context

No response

@Jean85
Contributor
Jean85 commented Apr 7, 2022

👍 on this one. I had my whole worker setup crash in a loop due to this.

This is super critical, I had to:

  • scale the workers x5 to mitigate the issue
  • troubleshoot blindly to find the issue that was making the message fail
  • deploy a hotfix

The failure is completely outside my code, all inside the framework, so I had no easy way to understand which message it was, even with Sentry:

AsyncAws\Core\Exception\Http\ClientException: HTTP 400 returned for "https://sqs.eu-central-1.amazonaws.com/".

Code:    InvalidParameterValue
Message: One or more parameters are invalid. Reason: Message must be shorter than 262144 bytes.
Type:    Sender
Detail:  

#22 /vendor/async-aws/core/src/Response.php(406): AsyncAws\Core\Response::AsyncAws\Core\{closure}
#21 /vendor/async-aws/core/src/Response.php(423): AsyncAws\Core\Response::getResolveStatus
#20 /vendor/async-aws/core/src/Response.php(160): AsyncAws\Core\Response::resolve
#19 /vendor/async-aws/core/src/Response.php(105): AsyncAws\Core\Response::__destruct
#18 /vendor/symfony/amazon-sqs-messenger/Transport/Connection.php(363): Symfony\Component\Messenger\Bridge\AmazonSqs\Transport\Connection::send
#17 /vendor/symfony/amazon-sqs-messenger/Transport/AmazonSqsSender.php(65): Symfony\Component\Messenger\Bridge\AmazonSqs\Transport\AmazonSqsSender::send
#16 /vendor/symfony/amazon-sqs-messenger/Transport/AmazonSqsTransport.php(81): Symfony\Component\Messenger\Bridge\AmazonSqs\Transport\AmazonSqsTransport::send
#15 /vendor/symfony/messenger/EventListener/SendFailedMessageForRetryListener.php(80): Symfony\Component\Messenger\EventListener\SendFailedMessageForRetryListener::onMessageFailed
#14 /vendor/symfony/event-dispatcher/EventDispatcher.php(270): Symfony\Component\EventDispatcher\EventDispatcher::Symfony\Component\EventDispatcher\{closure}
#13 /vendor/symfony/event-dispatcher/EventDispatcher.php(230): Symfony\Component\EventDispatcher\EventDispatcher::callListeners
#12 /vendor/symfony/event-dispatcher/EventDispatcher.php(59): Symfony\Component\EventDispatcher\EventDispatcher::dispatch
#11 /vendor/symfony/messenger/Worker.php(267): Symfony\Component\Messenger\Worker::dispatchEvent
#10 /vendor/symfony/messenger/Worker.php(194): Symfony\Component\Messenger\Worker::ack
#9 /vendor/symfony/messenger/Worker.php(170): Symfony\Component\Messenger\Worker::handleMessage
#8 /vendor/symfony/messenger/Worker.php(107): Symfony\Component\Messenger\Worker::run
#7 /vendor/symfony/messenger/Command/ConsumeMessagesCommand.php(225): Symfony\Component\Messenger\Command\ConsumeMessagesCommand::execute
#6 /vendor/symfony/console/Command/Command.php(298): Symfony\Component\Console\Command\Command::run
#5 /vendor/symfony/console/Application.php(1023): Symfony\Component\Console\Application::doRunCommand
#4 /vendor/symfony/framework-bundle/Console/Application.php(96): Symfony\Bundle\FrameworkBundle\Console\Application::doRunCommand
#3 /vendor/symfony/console/Application.php(299): Symfony\Component\Console\Application::doRun
#2 /vendor/symfony/framework-bundle/Console/Application.php(82): Symfony\Bundle\FrameworkBundle\Console\Application::doRun
#1 /vendor/symfony/console/Application.php(171): Symfony\Component\Console\Application::run
#0 /bin/console(42): null

Luckily I had verbose logs on, so I was able to see from the crashing worker which exception was tripping everything. But this has to be solved in some way, because Messenger is making itself trip due to too much stamp data.


@Jean85
Contributor
Jean85 commented Apr 11, 2022

Pinging @Nyholm @jderusse because this is tightly related to their Async-AWS package solution.

@Nyholm
Member
Nyholm commented Apr 11, 2022

Thank you for the ping.

I don't think this is an AsyncAws issue. All transports have issues/limitations. What do you think about listening for a Symfony\Component\Messenger\Exception\TransportException in SendFailedMessageForRetryListener? But I'm not sure what the preferred action would be if we cannot write to SQS.

@Jean85
Contributor
Jean85 commented Apr 11, 2022

That's probably the core of the issue: it's hard to pin down a possible approach or workaround, because the problem arises at the end of the process, after serialization has already happened, so it's hard to find a way to "slim down" the envelope.

@BPavol
BPavol commented Sep 21, 2022

I have the same problem while using SQS. I checked the stack trace, and the main reason for oversized messages is that function arguments are serialized too.

The aws/aws-sdk-php package contains several functions that receive the whole serialized message as a string. If those functions are called 4 times in one run, and 4 more times after each retry, the size of the envelope grows exponentially.

I can think of one solution: you can override Symfony\Component\Messenger\EventListener\AddErrorDetailsStampListener and create a custom Symfony\Component\Messenger\Stamp\ErrorDetailsStamp with the exception but without function arguments. This can downsize the message drastically.

It is only a theory, I still have to try it. With this solution you lose the function arguments in the stack trace, but that's better than an oversized message.
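A rough sketch of such a listener (untested; the class name is made up, and the built-in messenger.failure.add_error_details_stamp_listener service would still have to be removed or replaced so that it does not re-add the full stamp):

<?php

namespace App\Messenger;

use Symfony\Component\ErrorHandler\Exception\FlattenException;
use Symfony\Component\EventDispatcher\EventSubscriberInterface;
use Symfony\Component\Messenger\Event\WorkerMessageFailedEvent;
use Symfony\Component\Messenger\Stamp\ErrorDetailsStamp;

/**
 * Adds an ErrorDetailsStamp without the (potentially huge) stack trace,
 * so repeated redeliveries cannot blow the envelope past the SQS limit.
 */
final class AddSlimErrorDetailsStampListener implements EventSubscriberInterface
{
    public function onMessageFailed(WorkerMessageFailedEvent $event): void
    {
        $throwable = $event->getThrowable();

        // Flatten the exception but drop the trace (and with it the serialized call arguments).
        $flattened = FlattenException::createFromThrowable($throwable);
        $flattened->setTrace([], $flattened->getFile(), $flattened->getLine());

        $stamp = new ErrorDetailsStamp(
            \get_class($throwable),
            $throwable->getCode(),
            $throwable->getMessage(),
            $flattened
        );

        // Same guard as the built-in listener: do not append the same stamp twice.
        $previous = $event->getEnvelope()->last(ErrorDetailsStamp::class);
        if (null === $previous || !$previous->equals($stamp)) {
            $event->addStamps($stamp);
        }
    }

    public static function getSubscribedEvents(): array
    {
        return [WorkerMessageFailedEvent::class => 'onMessageFailed'];
    }
}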

@Rastusik
Rastusik commented Sep 21, 2022

Our team just had a similar issue. I think that the right solution would be to remove method arguments from the stack trace of FlattenException (not in general, just for the Messenger use case somehow, maybe by using a different exception with similar code). In our case the problem was that there was a lot of data in the stack trace arguments, and it was duplicated because some data was propagated through several function calls. And each retry appended new duplicate data to the end.

@dbu
Contributor
dbu commented Sep 26, 2022

With RabbitMQ and the amqp PHP library, the header limit is even just 128 KB (overall, not per header). I adjusted our application to remove the error details stamp altogether (which would not be a good general solution, of course). How about a configuration option for the allowed header size (which could maybe default to the correct value when known), and, in addition to reducing the verbosity of the stack trace, also shortening the trace to strictly prevent errors even on very deep traces with long namespaces?

It's tricky that the limit is overall. Maybe serialize everything, then check the length, and if it is too long, shorten the error details by the difference plus 100 bytes of safety margin?

btw: I am surprised that the amqp library allows sending messages that break the system; they could check the header limit on sending as well, and not only on receiving...
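A rough sketch of the serialize-then-check idea above could be a decorator around the transport serializer (the class name, size limit, and wiring are assumptions, and it simply drops the error details instead of trimming them byte-exactly):

<?php

namespace App\Messenger;

use Symfony\Component\Messenger\Envelope;
use Symfony\Component\Messenger\Stamp\ErrorDetailsStamp;
use Symfony\Component\Messenger\Transport\Serialization\SerializerInterface;

/**
 * Decorates the transport serializer: if the encoded envelope would exceed
 * the transport's size limit, drop the ErrorDetailsStamp and encode again.
 */
final class SizeLimitingSerializer implements SerializerInterface
{
    private SerializerInterface $inner;
    private int $maxBytes;

    public function __construct(SerializerInterface $inner, int $maxBytes = 261120) // 256 KB minus ~1 KB safety margin
    {
        $this->inner = $inner;
        $this->maxBytes = $maxBytes;
    }

    public function decode(array $encodedEnvelope): Envelope
    {
        return $this->inner->decode($encodedEnvelope);
    }

    public function encode(Envelope $envelope): array
    {
        $encoded = $this->inner->encode($envelope);

        if ($this->size($encoded) > $this->maxBytes) {
            // Too big: remove the error details (stack trace) and encode again.
            $encoded = $this->inner->encode($envelope->withoutAll(ErrorDetailsStamp::class));
        }

        return $encoded;
    }

    private function size(array $encoded): int
    {
        return \strlen($encoded['body'] ?? '') + \strlen(implode('', $encoded['headers'] ?? []));
    }
}

It would then have to be wired as the serializer of the affected transport (the transports.<name>.serializer option).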

@Jean85
Contributor
Jean85 commented Sep 28, 2022

btw: I am surprised that the amqp library allows sending messages that break the system; they could check the header limit on sending as well, and not only on receiving...

That's not what's happening: the envelope gets bigger before being sent back again, and goes over the limit.

@dbu
Contributor
dbu commented Sep 29, 2022

That's not what's happening: the envelope gets bigger before being sent back again, and goes over the limit.

Are you saying the message we send gets very close to the limit, and then RabbitMQ adds something of its own to the header that pushes it over the limit? That would either mean Rabbit adds quite some volume of data, or it must be extremely unlucky to cross the limit exactly if only a few dozen bytes get added. Though it's possible: we process several 100k messages a day in our system, so even statistically unlikely things are bound to happen sometimes...

How much below the absolute limit should the header then be, to avoid problems?

@Jean85
Contributor
Jean85 commented Sep 29, 2022

What I'm saying is that I'm seeing the issue only when sending the message (and Rabbit/SQS rejects it); there's no overhead added. But even when rejecting/retrying a message, the transport is in fact sending the message again, adding a new stamp (normally the RedeliveryStamp), so that's what's happening to me.

@mtanasiewicz

It looks like we're experiencing the same issue in our project. More debugging needs to be done, but everything looks like it's the same.

@mtanasiewicz

Just in case anyone needs a workaround: https://github.com/mtanasiewicz/symfony-messenger-sqs-workaround

@tsantos84
Contributor
tsantos84 commented Mar 9, 2023

An option at the transport level to entirely disable the ErrorDetailsStamp, or at least let us limit the size of the trace, could work. I'm facing the same issue, btw. In my case, I'm using the beanstalkd transport.

framework:
    messenger:
        transports:
            async:
                dsn: '%env(MESSENGER_TRANSPORT_DSN)%'
                attach_error_header_on_failure: true|false
                error_header_size: 10 # attach only the last 10 items of the stack trace array

@tsantos84
Contributor
tsantos84 commented Mar 9, 2023

Another workaround to completely disable the default behavior without additional classes is to remove the AddErrorDetailsStampListener from the container:

<?php

namespace App;

use Symfony\Bundle\FrameworkBundle\Kernel\MicroKernelTrait;
use Symfony\Component\DependencyInjection\Compiler\CompilerPassInterface;
use Symfony\Component\DependencyInjection\ContainerBuilder;
use Symfony\Component\HttpKernel\Kernel as BaseKernel;

class Kernel extends BaseKernel implements CompilerPassInterface
{
    use MicroKernelTrait;

    public function process(ContainerBuilder $container): void
    {
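        // Drop the built-in listener that attaches the ErrorDetailsStamp (and its stack trace) on failure.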
        $container->removeDefinition('messenger.failure.add_error_details_stamp_listener');
    }
}
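Because the kernel itself implements CompilerPassInterface, it is registered as a compiler pass automatically, so no extra wiring is needed. The obvious trade-off is that failed messages then carry no error details at all.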

@alexbaileyuk

As an alternative to completely disabling the stamp, perhaps some stamp configuration is a possibility? Either something like include_stack_trace which defaults to true, or something more configurable like include_exception_stack, include_exception_code, etc. This would need some way to configure stamps, I guess.

For now, the workaround of implementing a custom listener and stamp class seems to work. However, I'd say that this is less than ideal considering that this feels like it should be configurable.

Alternatively, perhaps a step in the right direction would be a parameter like error_details_stamp_class, which would allow overriding the stamp class without having to implement custom listeners as well?
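To make that concrete, a mock-up of what such configuration could look like (none of these options exist in Symfony today; the names are simply the ones suggested above, and the stamp class is hypothetical):

framework:
    messenger:
        error_details_stamp:
            include_stack_trace: false
            include_exception_code: true
        error_details_stamp_class: App\Messenger\SlimErrorDetailsStamp # hypothetical custom stamp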

@carsonbot

Hey, thanks for your report!
There has not been a lot of activity here for a while. Is this bug still relevant? Have you managed to find a workaround?

@carsonbot

Could I get a reply or should I close this?

@Nyholm
Member
Nyholm commented Feb 24, 2024

Keep this open please

@carsonbot

Hey, thanks for your report!
There has not been a lot of activity here for a while. Is this bug still relevant? Have you managed to find a workaround?

@dbu
Contributor
dbu commented Aug 25, 2024

it is still an issue

@carsonbot carsonbot removed the Stalled label Aug 25, 2024
@carsonbot

Hey, thanks for your report!
There has not been a lot of activity here for a while. Is this bug still relevant? Have you managed to find a workaround?

@dbu
Contributor
dbu commented Feb 26, 2025

this has not been addressed afaik

@carsonbot carsonbot removed the Stalled label Feb 26, 2025
@LewisW
LewisW commented Mar 7, 2025

The real issue is that the error isn't handled gracefully and prevents the message from reaching the failed/dead-letter queue, so it can easily go unnoticed or data can be lost.
