Redesign of make_request and the returned response object. And use of xmltodict. #75

elcolumbio · 2018-05-27T13:49:32Z

deleted Datawrapper see below
renamed DictWrapper to DataWrapper; introduced functionality for text (reports)
use xmltodict
introduced DotDict, which replaces ObjectDict. The difference: it is recursive by design, so you can use methods on complete nested structures. For example, at any point of your parsing you can convert it to a 100% Python data structure built up of dicts and lists. DotDict has useful functionality: it enables deepcopy for DictWrapper (#63), has a get method, uses pprint as the default printout, and handles cdata correctly.
removed some of the confusing implicit try/except parts
use chardet to guess the encoding (the same way requests does, but better :) )
correctly handles cdata in xml:
Example old for actual cdata: 'Length': {'Units': {'value': 'inches'}, 'value': '9.00'},
Example new for actual cdata: 'Length': {'#text': '2.80', '@Units': 'inches'}
The other 'value' tags are removed.
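
A minimal sketch of reading a node in the new layout, where XML attributes sit under '@'-prefixed keys and the element's own text sits under '#text'. The helper name `split_node` is hypothetical and not part of this pull request or of xmltodict; the sample data is the Length example above.

```python
def split_node(node):
    """Split a parsed node into (text, attributes).

    Hypothetical helper for illustration only.
    """
    if not isinstance(node, dict):
        # An element without attributes is parsed to a plain string.
        return node, {}
    text = node.get('#text')
    # Strip the '@' marker so callers get clean attribute names.
    attrs = {key[1:]: value for key, value in node.items()
             if key.startswith('@')}
    return text, attrs


new_style = {'Length': {'#text': '2.80', '@Units': 'inches'}}
value, attrs = split_node(new_style['Length'])
print(value, attrs['Units'])
```

Calling code can then treat attribute-bearing and plain-text elements the same way instead of special-casing the old 'value' keys.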

What should be the response?
sensible defaults are in the parsed property (it seems the name parsed is confusing people):
for XML: DotDict
for Text: decoded text

reverted pull requests #72 and #68; those expected functionality which I have implemented by default.
in the case of #72 it's better to be explicit, which is possible right now:
check if parsed_response.parsed is None and then fall back to parsed_response.response.content
For XML reports it makes sense to always fall back to the dot_dict as default. I introduced a flag for this.
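
The explicit fallback described above could look roughly like this. It is only a sketch: the function name `parsed_or_raw` is hypothetical, and the objects below are stand-ins that mimic just the `parsed` and `response.content` attributes mentioned in this comment, not real DataWrapper instances.

```python
from types import SimpleNamespace


def parsed_or_raw(parsed_response):
    """Return the parsed payload, or the raw body when nothing was parsed."""
    if parsed_response.parsed is not None:
        return parsed_response.parsed
    return parsed_response.response.content


# Stand-in objects that only mimic the two attributes used above.
raw = SimpleNamespace(content=b'sku\tquantity\n')
not_parsed = SimpleNamespace(parsed=None, response=raw)
was_parsed = SimpleNamespace(parsed='sku\tquantity\n', response=raw)

print(parsed_or_raw(not_parsed))   # falls back to the raw bytes
print(parsed_or_raw(was_parsed))   # uses the decoded text
```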

#68 assumes we would parse, encode or unzip text, which we have never done. In this pull request it's implemented by design, via the request text method.

The now unified DataWrapper Object allows you to access:

a python dictionary only for XML: pydict
the headers: headers
the request.response object: original

Try it out :)

Also I am happy that I found this as a logical base for the DotDict.
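
For readers who want a feel for the idea, recursive dot access can be sketched roughly as follows. This is not the DotDict from this pull request; the names `_wrap` and `to_plain` are assumptions made for illustration.

```python
from collections.abc import Mapping


class DotDict(dict):
    """Dict whose keys can also be read as attributes, at any depth."""

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        try:
            return self._wrap(self[name])
        except KeyError:
            raise AttributeError(name) from None

    def get(self, key, default=None):
        # Nested results keep dot access, like attribute reads do.
        return self._wrap(super().get(key, default))

    @classmethod
    def _wrap(cls, value):
        if isinstance(value, Mapping):
            return cls(value)
        if isinstance(value, list):
            return [cls._wrap(item) for item in value]
        return value


def to_plain(value):
    """Convert any nested structure back to plain dicts and lists."""
    if isinstance(value, Mapping):
        return {key: to_plain(item) for key, item in value.items()}
    if isinstance(value, list):
        return [to_plain(item) for item in value]
    return value


data = DotDict({'Product': {'Identifiers': {'ASIN': 'B0107YJU8Q'}},
                'Offers': [{'Sku': 'a'}]})
print(data.Product.Identifiers.ASIN)
print(data.Offers[0].Sku)
print(type(to_plain(data.Product)))
```

One limitation of this kind of wrapper: keys that collide with dict method names (for example `items` or `get`) are only reachable with `data['items']`, not with dot notation.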

elcolumbio · 2018-05-28T03:44:44Z

removed not used namespaces

If we use xmltodict we can make our codebase much cleaner.
For example, I would remove everything about namespaces.
We parse them from the XML, and so can anyone who wants to parse the XML themselves.
I don't understand namespaces, but I question whether we ever helped anybody with those constants.

So far we have removed all namespaces without any marker. It looks like that worked, so we can rule out naming collisions and the need for namespaces.

codecov-io · 2018-05-28T04:37:20Z

Codecov Report

❗ No coverage uploaded for pull request base (develop@0300af9).
The diff coverage is 68.42%.

@@            Coverage Diff             @@
##             develop      #75   +/-   ##
==========================================
  Coverage           ?   81.16%           
==========================================
  Files              ?       18           
  Lines              ?      924           
  Branches           ?       94           
==========================================
  Hits               ?      750           
  Misses             ?      165           
  Partials           ?        9

Impacted Files	Coverage Δ
mws/apis/products.py	`100% <ø> (ø)`
mws/apis/orders.py	`100% <ø> (ø)`
mws/apis/inbound_shipments.py	`100% <ø> (ø)`
mws/apis/inventory.py	`100% <ø> (ø)`
mws/apis/recommendations.py	`100% <ø> (ø)`
mws/apis/sellers.py	`100% <ø> (ø)`
mws/apis/subscriptions.py	`23.18% <0%> (ø)`
mws/apis/reports.py	`100% <100%> (ø)`
mws/utils.py	`82.14% <54.54%> (ø)`
mws/mws.py	`78.91% <88.05%> (ø)`

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 0300af9...d37de8c.

nateshmbhat · 2018-06-29T06:07:14Z

Sorry if it's a basic question. How do I clone this pull request?

GriceTurrble · 2018-08-02T19:50:54Z

Changes merged from #84.

elcolumbio · 2018-12-18T11:19:23Z

Just looked over everything again.
I only changed the guess_encoding function and some comments.
Now it is nearly impossible to run into encoding errors; I even tested some byte strings with mixed encodings (malformed).

Now I am quite happy with the code. But most likely I don't see the critical things.

elcolumbio · 2019-01-15T20:19:33Z

Should I make a new clean pull request?

GriceTurrble · 2019-01-15T20:51:29Z

No need, I just need to find the time to get into it. I'll set time aside this week and get cracking.

elcolumbio · 2019-01-29T00:34:04Z

setup.py

    description=short_description,
    long_description=long_description,
-    packages=['mws'],
+    packages=['mws', 'mws.apis'],


This fixes a bug that has existed since we introduced the additional folder apis.
It is recommended to do it like this for simple projects, see e.g. https://setuptools.readthedocs.io/en/latest/python3.html. Note that the subpackage is written with a dot instead of a backslash.

mws/mws.py

Bobspadger · 2019-12-24T11:37:35Z

So just doing a bit of testing by importing it into my product code.

I've only spent 15 minutes so far (and I'm about to leave the office for Christmas!), but this seems to have broken the expected behaviour of being able to do object.ASIN etc., as it now returns a DotDict which no longer returns attributes as before.

I recovered this by using the .parsed property to get the individual objects.

I've also tried with the .pydict option and this too is now nested in a GetCompetitivePricingForASINResult when calling the GetCompetitivePricing api.

I think the old interface was much more user friendly - are we able to refactor this to allow access in the old way so as not to break too much for v1.0 ?

Other than that, good work. Merry Christmas!

elcolumbio · 2019-12-26T16:53:32Z

mws/mws.py

-class DictWrapper(object):
-    """
-    Main class that converts XML data to a parsed response object as a tree of ObjectDicts,
-    stored in the .parsed property.


It was supposed to be the .parsed property before too. And the syntax: response.parsed reads nice.

here's a quick example of what I mean:

    import mws

    SECRET_KEY = ""
    AWS_ACCESS_KEY = ""
    SELLER_ID = ""
    MARKETPLACE_ID = ""

    # Setup - the same for both
    products_api = mws.Products(access_key=AWS_ACCESS_KEY, secret_key=SECRET_KEY,
                                account_id=SELLER_ID, region='UK')

    asin = 'B0107YJU8Q'
    competitive_prices = products_api.get_competitive_pricing_for_asin(
        asins=[asin], marketplace_id=MARKETPLACE_ID)
    product = competitive_prices.parsed

    # For the current version, including the develop branch, this will return the ASIN of the product by using the dot
    # notation, the .get() method and a deeper dot notation

    # Access the ASIN:
    try:
        print(product.ASIN)
    except KeyError as exc:
        print('Cannot access this by dot notation')
    print(product.get('ASIN'))  # this returns the ASIN
    print(product.Product.Identifiers.MarketplaceASIN.ASIN)  # This returns the ASIN

    # xmltodict version
    # Sadly, this version does not return the ASIN as it is now @ASIN
    try:
        print(product.ASIN)  # This is returning None, should return the ASIN
    except Exception as exc:
        print('Cannot access the .ASIN')
    print(product.get('ASIN'))  # This returns None, should return the ASIN
    print(product.Product.Identifiers.MarketplaceASIN.ASIN)
    print(product.get('@ASIN'))  # Returns the ASIN, it's not very 'clean'

    ## There are other areas that the same @ symbol is prefixing keys, such as offer listings etc

We are now getting elements returned prefixed with @, which breaks the API.
I feel it should keep the consistent dot notation so it works with all nodes, not just some.

So you should be able to access any node using the dot notation, and not have to change back to .get('@ASIN'), .get('@belongsToRequester') etc.:

{'CompetitivePrice': {'@belongsToRequester': 'false', '@condition': 'New', '@subcondition': 'New', '@xmlns': OrderedDict([('ns2', 'http://mws.amazonservices.com/schema/Products/2011-10-01/default.xsd')]), 'CompetitivePriceId': '1', 'Price': {'LandedPrice': {'CurrencyCode': 'GBP', 'Amount': '99.00'}, 'ListingPrice': {'CurrencyCode': 'GBP', 'Amount': '99.00'}, 'Shipping': {'CurrencyCode': 'GBP', 'Amount': '0.00'}}}}

I think that the dot notation either works fully (so you can get access to whichever bit of the returned data you want), or gets dropped completely and the dictionary method is used (so .get() and data['key'] etc.).

Does that make sense?

I've done a bit of digging and it looks like the xmltodict library won't let us achieve this:

https://github.com/martinblech/xmltodict/blob/master/xmltodict.py#L204

Which is a shame.

I still think it's worth a discussion on whether this is a breaking change we are happy to use.

GriceTurrble · 2020-02-25T18:59:19Z

Hi folks,

I've been holding on this one for a long while now, but I'd like to bring it into the larger project for further testing and tweaking.

Internal branch feature-xmltodict-revamp is a recent copy of develop, so it serves as a clean starting point we can build on.

@elcolumbio , thank you for the initial contribution here. You are welcome to keep contributing on the internal feature branch so we can get a good final candidate before merging into develop.

Bobspadger · 2020-02-26T10:32:19Z

> Hi folks,
>
> I've been holding on this one for a long while now, but I'd like to bring it into the larger project for further testing and tweaking.
>
> Internal branch feature-xmltodict-revamp is a recent copy of develop, so it serves as a clean starting point we can build on.
>
> @elcolumbio , thank you for the initial contribution here. You are welcome to keep contributing on the internal feature branch so we can get a good final candidate before merging into develop.

I think this makes perfect sense.

I think we need to think about how xmltodict should work here, as the new implementation has some fairly big changes (making some of the dot access almost pointless / redundant).

Mitalee · 2020-02-27T05:33:10Z

> (making some of the dot access almost pointless / redundant)

Could you elaborate?

I have noticed different responses for single object versus multiple object xml responses - in the parsed version, the single object response is converted to a simple dict, whereas the multi object response is converted to a list. So I had to code in different actions to both events (whether the parsed response was a dict or a list). Maybe this could be taken into account while implementing a new format?

Bobspadger · 2020-02-27T09:58:22Z

> > (making some of the dot access almost pointless / redundant)
>
> Could you elaborate?
>
> I have noticed different responses for single object versus multiple object xml responses - in the parsed version, the single object response is converted to a simple dict, whereas the multi object response is converted to a list. So I had to code in different actions to both events (whether the parsed response was a dict or a list). Maybe this could be taken into account while implementing a new format?

This is something we have discussed about the original implementation a few times.
My in house solution was to make EVERYTHING into a list, even if it was only 1 item long, meaning the rest of my code dealt with the response correctly.

I believe treating the results this way in the module would be beneficial and save people hours of their life testing :)
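
The "everything is a list" normalisation described above can be sketched in a few lines. The helper name `ensure_list` and the sample order data are hypothetical; this is not code from the module.

```python
def ensure_list(value):
    """Normalise a parsed node so callers can always iterate over it."""
    if value is None:
        # A missing node becomes an empty list.
        return []
    if isinstance(value, list):
        return value
    # A single object is wrapped so it looks like a one-item result.
    return [value]


single = {'Order': {'AmazonOrderId': '1'}}
multiple = {'Order': [{'AmazonOrderId': '1'}, {'AmazonOrderId': '2'}]}

for response in (single, multiple):
    for order in ensure_list(response.get('Order')):
        print(order['AmazonOrderId'])
```

With a helper like this, downstream code no longer needs separate branches for the dict and list cases.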

As for dot access being redundant: in the new xmltodict system, response.ASIN is no longer accessible via dot notation, because the key is now @ASIN and response.@ASIN is not valid Python.

I feel this is a really important function to have, as when you are retrieving long lists of results of unknown products, being able to quickly and easily access the ASIN is incredibly important.

Just my feelings on the subject :)
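
One possible way to keep dot access working for '@'-prefixed keys is to fall back to the prefixed name when the plain name is missing. The class below is a hypothetical sketch of that rule, not something this pull request implements; the sample data is trimmed from the CompetitivePrice output shown earlier in the thread.

```python
class AttrFallbackDict(dict):
    """Dict with dot access that also finds '@'-prefixed attribute keys."""

    def __getattr__(self, name):
        # Prefer the element key; fall back to the XML attribute key.
        for key in (name, '@' + name):
            if key in self:
                value = self[key]
                if isinstance(value, dict):
                    return AttrFallbackDict(value)
                return value
        raise AttributeError(name)


price = AttrFallbackDict({
    '@belongsToRequester': 'false',
    '@condition': 'New',
    'CompetitivePriceId': '1',
    'Price': {'LandedPrice': {'CurrencyCode': 'GBP', 'Amount': '99.00'}},
})

print(price.belongsToRequester)
print(price.Price.LandedPrice.Amount)
```

The trade-off is that an element and an attribute with the same name can no longer be told apart by dot access; in this sketch the element wins.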

elcolumbio · 2020-02-27T20:27:52Z

That is fine, it was my first bigger contribution and I still have to learn a lot of things, like not changing too many different things at once or making the process easier to follow.
@Bobspadger I very much agree with you on the dot accessibility. If I remember correctly those are used because it could be a list, and that is not achievable with a normal dictionary. Those conflicts also appear some levels down. Most of the time it works, but there are definitely cases in our production code where it fails (I found: the finance API and the different transaction types, where the lists get overwritten by unique dictionary keys).
But I believe there is a pragmatic solution for dot access, with one additional rule like Bobspadger mentioned.

Bobspadger · 2020-02-28T10:23:57Z

So we have two discussions going on here it looks like.

Dot access to the root of the response for things like .ASIN
Always returning a list of items

Does that sound fair?

Bobspadger · 2020-02-28T10:25:51Z

> That is fine, it was my first bigger contribution and I still have to learn a lot of things, like not changing too many different things at once or making the process easier to follow.
> @Bobspadger I very much agree with you on the dot accessibility. If I remember correctly those are used because it could be a list, and that is not achievable with a normal dictionary. Those conflicts also appear some levels down. Most of the time it works, but there are definitely cases in our production code where it fails (I found: the finance API and the different transaction types, where the lists get overwritten by unique dictionary keys).
> But I believe there is a pragmatic solution for dot access, with one additional rule like Bobspadger mentioned.

It's all good @elcolumbio, it's fantastic work, keep it up.

As long as we can all agree on the route forward that's best for everyone, then everything will be awesome!

https://youtu.be/StTqXEQ2l-Y

Florian Benkö and others added 7 commits May 27, 2018 15:35

use xmltodict

aad50d7

Merge branch 'master' into develop

43c6fec

use xmltodict

09ffff4

remove dict2xml by DotDict

fed8d48

Merge remote-tracking branch 'origin/develop' into develop

9873263

cleaned up some stuff from my commit python-amazon-mws#65

14efe58

adjust docs, we moved functionality list comprehension seems nicer

removed docs by self documenting code

19a7aa8

removed the illusion that we care too much about namespaces

a57ddf6

now we parse them from the response; before, often the first thing we did was to remove the namespace without any marker

elcolumbio changed the title ~~first version of xmltodict~~ replace parsing of xml with the library xmltodict May 28, 2018

elcolumbio changed the title ~~replace parsing of xml with the library xmltodict~~ Replace parsing of xml with the library xmltodict May 28, 2018

add dependency xmltodict

ab4c87c

Florian Benkö added 11 commits May 28, 2018 10:23

raw design of datawrapper withot dictwrapper

af161cc

just to get an overview it seems to me we often did not parse anything

bug fixes

358d530

moved validate hash, bug fixes

847bd58

parsed and header work for xml files

5ac6b7e

use a dotdict inspired from FrozenJson

3038d64

see FrozenJson from the fluent python book https://github.com/fluentpython/example-code/blob/master/19-dyn-attr-prop/oscon/explore0.py

a DotDict(obj).get(key, default) now returns a DotDict object

0790415

fixed encoding issue python-amazon-mws#76 , fixed hash function

5622c36

I don't understand the hash function; just by inference I could see that the docs and design are wrong: it does not take a string, it takes a bytes object, and you don't need a class for that one string.
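
To illustrate the bytes-versus-string point, here is a minimal sketch. It assumes the hash in question is a base64-encoded MD5 digest of the payload (a Content-MD5 style check); the commit does not spell out the algorithm, and the function name `content_md5` is hypothetical.

```python
import base64
import hashlib


def content_md5(body):
    """Return the base64-encoded MD5 digest of a bytes payload."""
    if not isinstance(body, bytes):
        # Hashing works on bytes; text must be encoded by the caller first.
        raise TypeError('body must be bytes, got %s' % type(body).__name__)
    return base64.b64encode(hashlib.md5(body).digest())


print(content_md5('some report text'.encode('utf-8')))
```

A plain function is enough here, which matches the remark that no class is needed for a single value.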

conformity for accessing response attributes

e48d5d7

test the hash function

c2b6f99

If you think it's too ugly, there are a bunch of other ways to do it. I think it's OK.

python 2.7 support for collection import

e6656c0

fix previous

05fd2d8

elcolumbio mentioned this pull request May 29, 2018

report enum and get_reportid #74

Merged

Florian Benkö added 5 commits May 29, 2018 16:51

Testing the hash function and a fake request.response

6a65756

commented in this pull request

tests for the new datawrapper

4ea6469

python 2

3c2dbd2

python2

d17da86

python2

c0de832

Merge branch 'develop' into develop

d37de8c

GriceTurrble mentioned this pull request Aug 2, 2018

Add marketplace ID enums #19

Closed

GriceTurrble mentioned this pull request Dec 3, 2018

character encoding #100

Closed

Florian Benkö added 3 commits December 18, 2018 10:43

guess encoding rewritten

cb4e9cd

It now uses incremental reading; the old version needed some refactoring anyway. See https://chardet.readthedocs.io/en/latest/usage.html#basic-usage. If this is too slow we can go back to the old approach, which ran detection on all lines at once.

typos, unused import counter, docs

3816db1

better comments

7e387b0

elcolumbio commented Jan 29, 2019

elcolumbio commented Jul 1, 2019

elcolumbio mentioned this pull request Jul 1, 2019

Remove rouge "values" key when building ObjectDict from xml #121

Closed

elcolumbio mentioned this pull request Jul 22, 2019

Add __getitem__ method to DictWrapper #126

Closed

Bobspadger mentioned this pull request Dec 18, 2019

Is this a dead project? #137

Closed

elcolumbio changed the title ~~Very much needed redesign of make_request and the returned response object. And use of xmltodict.~~ Redesign of make_request and the returned response object. And use of xmltodict. Dec 20, 2019

Bobspadger reviewed Dec 24, 2019

elcolumbio commented Dec 26, 2019

GriceTurrble changed the base branch from develop to feature-xmltodict-revamp February 25, 2020 18:52

GriceTurrble merged commit 3f12b76 into python-amazon-mws:feature-xmltodict-revamp Feb 25, 2020

GriceTurrble mentioned this pull request Aug 28, 2020

Naming convention #77

Closed
