-
Notifications
You must be signed in to change notification settings - Fork 207
Redesign of make_request and the returned response object. And use of xmltodict. #75
Redesign of make_request and the returned response object. And use of xmltodict. #75
Conversation
adjust docs, we moved functionality list comprehension seems nicer
removed not used namespacesIf we use xmltodict we can make our codebase much cleaner. Right now we removed all namespaces, without any marker. It looks like it worked, so we can exclude naming cullusions and the need of namespaces. |
now we parse them from the response before often the first thing we did was to remove the namespace without any marker
Codecov Report
@@ Coverage Diff @@
## develop #75 +/- ##
==========================================
Coverage ? 81.16%
==========================================
Files ? 18
Lines ? 924
Branches ? 94
==========================================
Hits ? 750
Misses ? 165
Partials ? 9
Continue to review full report at Codecov.
|
just to get an overview it seems to me we often did not parse anything
see FrozenJson from the fluent python book https://github.com/fluentpython/example-code/blob/master/19-dyn-attr-prop/oscon/explore0.py
i dont understand the hashfunction just by inference i could see the docs and design are just wrong. - it takes not a string, it takes a bytes object - you don't need a class for that one string
if you think its too ugly. there a bunch of other ways to do it. I think its ok.
commented in this pull request
|
sorry if its a basic question. How do I clone this pull request ? |
|
Changes merged from #84 . |
it now uses the incremental reading the old version needed some refactoring anyway. https://chardet.readthedocs.io/en/latest/usage.html#basic-usage if this is too slow we can use the old approach besides we use detection on all lines at once.
|
Just looked over everything again. Now I am quite happy with the code. But most likely i dont see the critical things. |
|
Should i make a new clean pull request? |
|
No need, I just need to find the time to get into it. I'll set time aside this week and get cracking. |
| description=short_description, | ||
| long_description=long_description, | ||
| packages=['mws'], | ||
| packages=['mws', 'mws.apis'], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is fixing a bug since we introduced the additional folder apis.
It is recommended to do it like this for simple projects, see for e.g. here https://setuptools.readthedocs.io/en/latest/python3.html, its also done with a point instead of a backslash.
|
So just doing a bit of testing by importing it into my product code. I've only spent 15 minutes so far, (and i'm about to leave the office for Christmas!) but this seems to have broken then exected behaviour of being able to do I recovered this by using the I've also tried with the I think the old interface was much more user friendly - are we able to refactor this to allow access in the old way so as not to break too much for v1.0 ? Other than that good work , Merry Christmas! |
| class DictWrapper(object): | ||
| """ | ||
| Main class that converts XML data to a parsed response object as a tree of ObjectDicts, | ||
| stored in the .parsed property. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was supposed to be the .parsed property before too. And the syntax: response.parsed reads nice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
here's a quick example of what I mean:
import mws
SECRET_KEY = ""
AWS_ACCESS_KEY = ""
SELLER_ID = ""
MARKETPLACE_ID = ""
# Setup - the same for both
products_api = mws.Products(access_key=AWS_ACCESS_KEY, secret_key=SECRET_KEY, account_id=SELLER_ID, region='UK')
asin = 'B0107YJU8Q'
competitive_prices = products_api.get_competitive_pricing_for_asin(
asins=[asin], marketplace_id=MARKETPLACE_ID)
product = competitive_prices.parsed
# For the current version, including the develop branch, this will return the ASIN of the product by using the dot
# notation, the .get() method and a deeper dot notation
# Access the ASIN:
try:
print(product.ASIN)
except KeyError as exc:
print('Cannot access this by dot notation')
print(product.get('ASIN')) # this returns the ASIN
print(product.Product.Identifiers.MarketplaceASIN.ASIN) # This returns the ASIN
# xmltodict version
# Sadly, this version does not return the ASIN as it is now @ASIN
try:
print(product.ASIN) # This is returning None, should retrun the ASIN
except Exception as exc:
print('Cannot access the .ASIN')
print(product.get('ASIN')) # This returns None, should retrun the ASIN
print(product.Product.Identifiers.MarketplaceASIN.ASIN)
print(product.get('@ASIN')) # Returns the ASIN, its not very 'clean'
## There are other areas that the same @ symbol is prefixing keys, such as offer listings etcWe are now getting elements returned prefiex with @ which breaks the API.
I feel it should keep the consistent dot notation so it works with all nodes, not just some.
so you should be able to access any node using the dot notation, and not have to change back to the .get('@ASIN) , .get('@belongsToRequester) etc:
{'CompetitivePrice': {'@belongsToRequester': 'false', '@condition': 'New', '@subcondition': 'New', '@xmlns': OrderedDict([('ns2', 'http://mws.amazonservices.com/schema/Products/2011-10-01/default.xsd')]), 'CompetitivePriceId': '1', 'Price': {'LandedPrice': {'CurrencyCode': 'GBP', 'Amount': '99.00'}, 'ListingPrice': {'CurrencyCode': 'GBP', 'Amount': '99.00'}, 'Shipping': {'CurrencyCode': 'GBP', 'Amount': '0.00'}}}}
I think that the dot notation either works fully (so you can get access to whichever bit of the returned data you want, or gets dropped completely and the dictionary method is used (so .get() and data['key'] etc)
Does that make sense?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've done a bit of digging and it looks like the xmltodict library won't let us achieve this:
https://github.com/martinblech/xmltodict/blob/master/xmltodict.py#L204
Which is a shame.
I still think its worth a discussion on if this is a breaking change we are happy to use.
|
Hi folks, I've been holding on this one for a long while now, but I'd like to bring it into the larger project for further testing and tweaking. Internal branch @elcolumbio , thank you for the initial contribution here. You are welcome to keep contributing on the internal feature branch so we can get a good final candidate before merging into |
I think this makes perfect sense. I think we need to think how the xml2dict should work - as the new implementation has some fairly big changes (making some of the dot access almost pointless / redundant) |
I have noticed different responses for single object versus multiple object xml responses - in the parsed version, the single object response is converted to a simple dict, whereas the multi object response is converted to a list. So I had to code in different actions to both events (whether the parsed response was a dict or a list). Maybe this could be taken into account while implementing a new format? |
This is something we have discussed about the original implementation a few times. I believe treating the results this way in the module would be beneficial and save people hours of their life testing :) As for the dot access redundant, on the new I feel this is a really important function to have, as when you are retrieving long lists of results of unknown products, being able to quickly and easily access the ASIN is incredibly important. Just my feelings on the subject :) |
|
That is fine, it was my first bigger contribution and i still have to learn a lot of things. Like not changing too many different things or make the process easier to follow. |
|
So we have two discussions going on here it looks like.
Does that sound fair? |
Its all good @elcolumbio its fantastic works , keep it up. As long as we can all agree on the route forward thats the best for everyone then everything will be awesome! |
Example old for actual cdata: 'Length': {'Units': {'value': 'inches'}, 'value': '9.00'},
Example new for actual cdata: 'Length': {'#text': '2.80', '@Units': 'inches'}
The other 'value' tags are removed.
What should be the response?
sensible defaults are in the parsed property (it seems the name parsed is confusing people):
for XML: DotDict
for Text: decoded text
reverted pull requests #72 #68 those expected a functionality which i have implemented by default.
in case of #72 its better to be explicit, this is possible right now.
check if parsed_response.parsed is None and then fallback to parsed_response.response.content
For XML reports it makes sense to always fallback to the dot_dict as default. I introduced a flag for this.
#68 assumes we would parse or encode or unzip text which we have never done. In this pull request its implemented by design, via the request text method.
The now unified DataWrapper Object allows you to access:
Try it out :)
Also i am happy that i found this as logical base for the DotDict.