8000 File objects for publish workbook by nnevalainen · Pull Request #704 · tableau/server-client-python · GitHub
[go: up one dir, main page]

Skip to content

File objects for publish workbook #704

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

nnevalainen
Copy link
Contributor

This is a proposal for an enhancement for workbook.publish to support a file object parameter.

User story

As a TSC user I want to publish workbooks by providing a file object, so that I don't have to store temporary files to a file system

Specific use case

  1. Ad-hoc conversion of a non-packaged workbook to a packaged workbook in order to embed supporting files
  2. Ad-hoc conversion is needed because workbook is produced dynamically by populating parameters for a parametrised template workbook

Acceptance criteria

  • Workbook can be published by providing either a file path or a file object
  • Parameter type (file path vs file object) is automatically determined
  • When providing a file object
    • Workbook name must be always provided (can't be interpreted from file name)
    • Workbook type (.twb vs .twbx) is automatically determined by sniffing the beginning of the file object

@nnevalainen nnevalainen changed the title Feature/file objects for publish workbook File objects for publish workbook Oct 13, 2020
@nnevalainen
Copy link
Contributor Author

@t8y8 @shinchris How do you feel about the concept/implementation?

@t8y8
Copy link
Collaborator
t8y8 commented Oct 21, 2020

I'm not opposed to the idea, but I need to take a deeper look.

Do I understand the primary use case as being "I need to hold a workbook xml in memory, possibly modifying it, and then eventually publish it"?

@nnevalainen
Copy link
Contributor Author

Yes that's about it. Modified xml could then be published either packaged or non-packaged without writing it to the filesystem. Thanks for looking into this!

Copy link
Collaborator
@t8y8 t8y8 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great idea — I understand the use case and think it’s worth pursuing.

I left some specific comments, mostly a small nit.

I do have a larger concern about how we would sprinkle a bunch of if path checks in the already somewhat if statement heavy code. I’m wondering if there’s a way to avoid that?

if is_file_path:
file_size = os.path.getsize(file)
else:
file.seek(0, os.SEEK_END)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had to look this up, but I see what’s going on here — is it possible to safely use os.fstat(fd).st_size? If not, can you extract this into a little utility function named get_file_size or something similar?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I understood it, in memory binary streams such as BytesIO() don't have a file descriptor that is needed for os.fstat. I'll extract the logic to a utility function

from io import BytesIO()
import os

with open('file.txt', 'wb') as f:
    f.write('Hello World!'.encode())

f1 = open('file.txt')

f2 = BytesIO()
f2.write('Hello World!'.encode())

print('f1 size', os.fstat(f1.fileno()).st_size)
print('f2 size', os.fstat(f2.fileno()).st_size)

> f1 size 12
> Traceback (most recent call last):
>  File "<stdin>", line 1, in <module>
> io.UnsupportedOperation: fileno

@t8y8
Copy link
Collaborator
t8y8 commented Oct 21, 2020

@nnevalainen Alright, a quick review! I need to ponder a bit on the if checks question, but overall it feels close/the right direction.

@nnevalainen
Copy link
Contributor Author

I do have a larger concern about how we would sprinkle a bunch of if path checks in the already somewhat if statement heavy code. I’m wondering if there’s a way to avoid that?

Should we try to get rid of the "if path" checks altogether and adopt duck typing approach? We could group all operations using file path together like

try:
    filename = os.path.basename(file)
    file_extension = os.path.splitext(filename)[1][1:]
    file_size = os.path.getsize(file)
    ...
except SpecificException:
    file_size = get_file_size() # utility function returning file object size
    ...

# Perform common oprations

@t8y8
Copy link
Collaborator
t8y8 commented Oct 21, 2020

I do have a larger concern about how we would sprinkle a bunch of if path checks in the already somewhat if statement heavy code. I’m wondering if there’s a way to avoid that?

Should we try to get rid of the "if path" checks altogether and adopt duck typing approach? We could group all operations using file path together like

try:
    filename = os.path.basename(file)
    file_extension = os.path.splitext(filename)[1][1:]
    file_size = os.path.getsize(file)
    ...
except SpecificException:
    file_size = get_file_size() # utility function returning file object size
    ...

# Perform common oprations

Yes, I like that approach, if we need to add another method in the future, or update something, we can keep it contained to the try block. Ship it!

@shinchris
Copy link
Contributor

Other than the requested changes from @t8y8, this looks good to me.
Would there be use cases for allowing this for datasource.publish as well?

@t8y8
Copy link
Collaborator
t8y8 commented Oct 22, 2020

Yeah, I think the use case is still valid for TDS/TDSX, and maybe even flows (TFL/TFLX).

@nnevalainen would you be interested in doing that as well?
You can say no, or at a minimum defer it to a second PR :)

@nnevalainen
Copy link
Contributor Author

I can take a look at that as well in a separate PR after finishing this one

@nnevalainen
Copy link
Contributor Author

Alright finished with the the changes. Let me know what you think

@nnevalainen nnevalainen requested a review from t8y8 October 25, 2020 12:52
Copy link
Collaborator
@t8y8 t8y8 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚀

@t8y8 t8y8 merged commit 1f4e27f into tableau:development Oct 30, 2020
@shinchris
Copy link
Contributor

@nnevalainen forgot to ask earlier, but would you be able to read and sign our CLA?

This was referenced Nov 6, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants
0