8000 Escape terminator characters in filter arguments by depau · Pull Request #25 · kkroening/ffmpeg-python · GitHub
[go: up one dir, main page]

Skip to content

Escape terminator characters in filter arguments #25

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Jul 12, 2017
Merged
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
18 changes: 16 additions & 2 deletions ffmpeg/nodes.py
Original file line number Diff line number Diff line change
@@ -1,6 +1,7 @@
from __future__ import unicode_literals

from builtins import object
from past.builtins import basestring
import hashlib
import json

Expand Down Expand Up @@ -51,8 +52,21 @@ class FilterNode(Node):
"""FilterNode"""
def _get_filter(self):
params_text = self._name
arg_params = ['{}'.format(arg) for arg in self._args]
kwarg_params = ['{}={}'.format(k, self._kwargs[k]) for k in sorted(self._kwargs)]

# Helper function to escape uncomfortable characters
def escape_chars(seq, keys):
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we pull this out to a global function we can put a test around it:

def _escape_chars(text):
    """Helper function to escape uncomfortable characters."""
    for ch in '\\[]=;:,':
        text = text.replace(ch, '\\' + ch)
    return text

# in test_ffmpeg.py:
def test_escape_chars():
    assert ffmpeg.nodes._escape_chars('a[b]c=d;e,\\f') == 'a\\[b\\]c\\=d\\;e\\,\\\\f'

Then FilterNode._get_filter becomes..

def _get_filter(self):
    params_text = self._name
    _args = [_escape_chars(str(x)) for x in self._args]
    _kwargs = {k: _escape_chars(str(v)) for k, v in self._kwargs.items()}
    ...

Copy link
Owner
@kkroening kkroening Jul 11, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow, I ran a few scenarios against ffmpeg using drawtext, and ffmpeg's string unescaping is super bizarre. Some chars only need a single backslash before them, some require two backslashes, and some require eight backslashes.

Here's an example that draws the string a\b:

ffmpeg -i ffmpeg/tests/sample_data/dummy.mp4 -filter_complex '[0]drawtext=x=0:y=100:text=a\\\\\\\\b:fontfile=/Library/Fonts/Arial.ttf[1]' -map '[1]' out.mp4

Yup, eight backslashes.

I found this stackoverflow article talking about it:

Special character escapes are like violence: if they're not solving your problem, you're not using enough.
ffmpeg -i test.mpg -vf drawtext=text="It\\\\\'s so easy"

Ahaha.

Anyways I'm gonna try to see if there's any method to the madness and whether there's a way to completely correctly escape. It's nice to have this done in a tool like ffmpeg-python so that hopefully nobody else has to deal with ffmpeg's insanity.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OMG this is crazy. In fact I did notice that I only needed one backslash when running it on the shell, and two with python, which resulted in 4 backslashes.

I'm currently running my code in my testing branch and I'm keeping it for the moment as it works for timestamps (which are basically the only thing that needs escaping in our project, though we could avoid it using SS.fraction only).

Let me know if I should commit the change you proposed or if you're going to do it some other way.

The text is being parsed a couple times, so you not only have to escape the quote, you also have to escape the slash escaping the quote. Twice.

If what the StackOverflow article says is right, we should do a few real-world tests with ffmpeg, find out how many backslashes are required for a particular filter or character, then make an intelligent escaper out of these numbers. The escaper would escape characters in trim arguments with r'\\', text argument of drawtext with r'\\\, etc., or fallback to some default.

Copy link
Owner
@kkroening kkroening Jul 11, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here are my science results:

`'`: 7 backslashes in drawtext text param, 3 in fontfile; 3 in param names; 1 in filter name
`\`: 7 backslashes in drawtext text param, 3 in fontfile; 3 in param names; 1 in filter name
`%`: 4 backslashes in drawtext text param; unknown elsewhere
`:`: 2 backslashes
`,`: 1 backslash
`[`: 1 backslash
`]`: 1 backslash
`=`: 1 backslash
newline: 0 backslash; use raw newline char

I was thinking something like this:

args = [escape_chars(str(x, '=:')) for x in self._args]
kwargs = {}
for k, v in self._kwargs.items():
    k = escape_chars(k, '\\\'=:')
    v = escape_chars(str(v), '\\\'=:')
    kwargs[k] = v

arg_params = ['{}'.format(arg) for arg in args]
kwarg_params = ['{}={}'.format(k, kwargs[k]) for k in sorted(kwargs)]
params = ':'.join(arg_params + kwarg_params)

params_text = self._name
if params:
    params_text += '={}'.format(':'.join(params))
return escape_chars(params_text, '\\\'[],;')

And then another escape_chars for rendering the text param of drawtext. I think this produces similar results to what's above. But need more science

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if there's any way to encode unicode chars in ffmpeg args. You'd think there would be, since people might want to render Chinese text over a video

for k in keys:
if not isinstance(seq[k], basestring):
continue
for ch in "[]=;:,":
Copy link
Owner
@kkroening kkroening Jul 11, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should also escape \ chars, e.g. \ => \\\\\\\\ (see comment above)

seq[k] = seq[k].replace(ch, "\\\\"+ch)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be "\\"+ch?

return seq

_args = escape_chars(self._args[:], range(len(self._args)))
_kwargs = escape_chars(self._kwargs.copy(), self._kwargs.keys())

arg_params = ['{}'.format(arg) for arg in _args]
kwarg_params = ['{}={}'.format(k, _kwargs[k]) for k in sorted(_kwargs)]
params = arg_params + kwarg_params
if params:
params_text += '={}'.format(':'.join(params))
Expand Down
0