8000 Autocommit when executing a COPY command by berdario · Pull Request #63 · sqlalchemy-redshift/sqlalchemy-redshift · GitHub
[go: up one dir, main page]

Skip to content

Autocommit when executing a COPY command #63

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

berdario
Copy link
Contributor
@berdario berdario commented Oct 30, 2015

This would help with this issue, I think:

http://stackoverflow.com/questions/28271049/redshift-copy-operation-doesnt-work-in-sqlalchemy#

I'm still new with Redshift, and I haven't checked if there are any Redshift-specific commands aside from COPY (which is the only one that I've used as for now)

Todos

  • MIT compatible
  • Tests
  • Documentation
  • Updated CHANGES.rst

@berdario
Copy link
Contributor Author

The linter is failing due to a line being 81 characters long... personally in my project I disable the E501 since I find it too restrictive... any hints at how to get that line to fit?

(without doing stuff like: import sqlalchemy.dialects.postgresql.psycopg2 as pgdialect and then using pgdialect.PGDialect_psycopg2, pgdialect.PGExecutionContext_psycopg2 ?)

@jklukas
Copy link
Member
jklukas commented Oct 31, 2015

I'd recommend:

from sqlalchemy.dialects.postgresql.psycopg2 import (
    PGDialect_psycopg2, PGExecutionContext_psycopg2)

@@ -301,6 +307,11 @@ def _fetch_redshift_column_attributes(self, column):
return text


class RedshiftExecutionCntext(PGExecutionContext_psycopg2):
def should_autocommit_text(self, statement):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd get this to call super and only check for 'COPY' if the result is false

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would indeed be a more robust solution, allowing us to automatically stay compatible with future changes to the psycopg2 dialect.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that's better... I don't think that the psycopg2 dialect might need to add keywords very often, but it's the cleaner way

I'll see to do that later (before tomorrow? I've been away in the weekend, and now I'm a bit busy again)

@graingert
Copy link
Member

Can we get a test and changelog for this? Does this also cause problems with the regular postgres COPY command?

@graingert
Copy link
Member

(without doing stuff like: import sqlalchemy.dialects.postgresql.psycopg2 as pgdialect and then using pgdialect.PGDialect_psycopg2, pgdialect.PGExecutionContext_psycopg2 ?)

I'd prefer

from sqlalchemy.dialects.postgresql import pyscopg2 as pgdialect

@berdario
Copy link
Contributor Author
berdario commented Dec 4, 2015

Sorry for the delay. Anyhow now I added those small fixes.

I'd like to add a test, but I wouldn't be able to test it running in Travis (not easily... and also I soon won't have access anymore to the redshift cluster that I was using for testing), sorry

I was not aware of the Postgres COPY command, and it feels weird that the default pgdialect doesn't recognize it as a data-changing operation.... maybe this PR should actually be sent to sqlalchemy itself instead?

@graingert
Copy link
Member

@berdario yeah submit it upstream and see what @zzzeek thinks

@berdario
Copy link
Contributor Author

https://bitbucket.org/zzzeek/sqlalchemy/pull-requests/68/

It probably won't be merged upstream, zzzeek says

I'd seek to get it merged to the redshift side regardless, it looks like COPY has a more prominent role in Redshift and there might be other RS-specific commands to support in the future.

@kerryshireman
Copy link

Hey all, just read the SQLAlchemy thread as well as this PR. Do you plan to accept this PR here in sqlalchemy-redshift?

class RedshiftExecutionContext(pgdialect.PGExecutionContext_psycopg2):
def should_autocommit_text(self, statement):
return super(RedshiftExecutionContext, self).should_autocommit_text() \
or AUTOCOMMIT_REGEXP.match(statement)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I generally prefer to use parentheses for line continuations rather than \:

        return (super(RedshiftExecutionContext, self).should_autocommit_text()
                    or AUTOCOMMIT_REGEXP.match(statement))

@jklukas
Copy link
Member
jklukas commented Feb 3, 2016

I really appreciate that @berdario made an attempt to get this included upstream. Given the proposal was rejected, I'd be happy to see this merged here.

@berdario - You mentioned adding a test in https://bitbucket.org/zzzeek/sqlalchemy/pull-requests/68/ ; would you be able to port that here?

@graingert - Are you okay with adding this here considering it's not going into sqlalchemy?

@kerryshireman
Copy link

Thanks for following up, @jklukas.

Adding

.execution_options(autocommit=True)

to my instance of CopyCommand works in the meantime.

Side question: When running COPY directly from psql or through a JDBC connection, status & rows loaded are sent to stdout. Ex:

INFO:  Load into table 'table_name' completed, 48351 record(s) loaded successfully.

I could find nothing in any attributes of the result object when running CopyCommand. Is there a way to capture this output?

@jklukas
Copy link
Member
jklukas commented Feb 3, 2016

@kerryshireman - I know that psycopg2 makes this available as cursor.rowcount. I suspect sqlalchemy also makes this available, though I don't know what the method is offhand.

@graingert
Copy link
Member

@kerryshireman
Copy link

Thanks guys, though I tried accessing that attribute before and it is always returned as -1 directly after the COPY command is executed.

@graingert
Copy link
Member

@kerryshireman that's nasty, that's probably another bug.

@berdario
Copy link
Contributor Author
berdario commented Feb 4, 2016

@jklukas
Unfortunately, as I mentioned here, I cannot run the tests on TravisCI, and I haven't touched redshift in 2 months, so it's unpractical for me to write a test here. The one that I wrote for the sqlalchemy's PR is substantially different, since it creates a file locally (not on S3)

@mtrbean
Copy link
Contributor
mtrbean commented May 8, 2016

I feel strongly against hard-wiring autocommit to COPY commands because it looks like we are losing the ability to rollback a transaction that contains a COPY statement. Thoughts?

@jklukas
Copy link
Member
jklukas commented May 9, 2016

it looks like we are losing the ability to rollback a transaction that contains a COPY statement

I'm having trouble locating specific SQLAlchemy documentation that speaks to this, but I think the should_autocommit setting only affects behavior when your connection is configured to be in autocommit mode and you're not inside of an explicit transaction. So, I don't think this prevents you from putting COPY inside a transaction and rolling back, but it let's SQLAlchemy know that a COPY statement is something that changes data and that a commit is needed in contexts where autocommit is enabled.

@mtrbean
Copy link
Contributor
mtrbean commented May 9, 2016

@jklukas Thanks for the clarification! I was misled by the subject of this PR into thinking that every COPY statement is going to be committed automatically 😛

@timsavage
Copy link

I just ran into this exact issue, glad to see a solution is already in progress. For now we add an explicit commit after COPY commands, however this does leave us with a situation where BEGIN/ROLLBACK (which we make heavy use of during unit testing) will not apply for COPY operations.

@jklukas
Copy link
Member
jklukas commented Jun 22, 2016

I think lack of a test was the only thing holding this up.

@graingert - Would you consider merging this as-is?

@graingert
Copy link
Member

@jklukas tests, squash, rebase, changelog

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants
0