BUG: join on column with index by debnathshoham · Pull Request #49360 · pandas-dev/pandas · GitHub

BUG: join on column with index #49360

Closed
35 commits
4bf58bd
initial commit
debnathshoham Oct 26, 2022
f4f84f4
Merge branch 'main' of https://github.com/pandas-dev/pandas into gh28…
debnathshoham Oct 27, 2022
00e87ba
test tweak
debnathshoham Oct 27, 2022
a0f6d76
lots of change in test_merge.TestMerge.test_merge_left_empty_right_no…
debnathshoham Oct 27, 2022
e89cacb
multiple tests changed in test_merge.py
debnathshoham Oct 27, 2022
35a4b5d
precommit changes
debnathshoham Oct 27, 2022
20add1c
Merge branch 'main' into gh28243_bug_join_leftindex_righton
debnathshoham Oct 27, 2022
a324be7
Merge branch 'main' of https://github.com/pandas-dev/pandas into gh28…
debnathshoham Oct 27, 2022
eeae11b
tweaked test_join.py
debnathshoham Oct 28, 2022
f824d73
unsure changes in test_merge_asof.py
debnathshoham Oct 28, 2022
d95515c
precommit changes
debnathshoham Oct 28, 2022
a8c78a0
Merge remote-tracking branch 'origin/gh28243_bug_join_leftindex_right…
debnathshoham Oct 28, 2022
751e882
Merge branch 'main' into gh28243_bug_join_leftindex_righton
debnathshoham Oct 28, 2022
4110e2b
Merge branch 'main' of https://github.com/pandas-dev/pandas into gh28…
debnathshoham Oct 28, 2022
66dfff4
test_merge_index_as_string.py tweaks
debnathshoham Oct 28, 2022
c3e5fd2
precommit clean
debnathshoham Oct 28, 2022
86fc699
Merge branch 'main' of https://github.com/pandas-dev/pandas into gh28…
debnathshoham Oct 28, 2022
6f6021d
Merge remote-tracking branch 'origin/gh28243_bug_join_leftindex_right…
debnathshoham Oct 28, 2022
f18b2d2
updated asof join_index
debnathshoham Oct 28, 2022
15f3c9f
added another test and issue
debnathshoham Oct 28, 2022
a9760ff
cleanup precommit
debnathshoham Oct 28, 2022
f096c0c
test cleanup
debnathshoham Oct 28, 2022
aa8ac6d
Merge branch 'main' into gh28243_bug_join_leftindex_righton
debnathshoham Oct 29, 2022
b1b2ac7
Update test_merge_asof.py
debnathshoham Oct 29, 2022
f37d3b4
cosmetic undo
debnathshoham Oct 29, 2022
1cfa627
undo unnecessary cast from tests
debnathshoham Oct 29, 2022
8dcb0bc
Merge remote-tracking branch 'origin/gh28243_bug_join_leftindex_right…
debnathshoham Oct 29, 2022
a9b8412
cosmetic change
debnathshoham Oct 29, 2022
95741d7
updated whatsnew
debnathshoham Oct 29, 2022
ca1e1ed
Merge branch 'main' into gh28243_bug_join_leftindex_righton
debnathshoham Oct 31, 2022
ced3c5b
whatsnew to 2.0.0
debnathshoham Oct 31, 2022
76b9310
Merge branch 'main' into gh28243_bug_join_leftindex_righton
debnathshoham Nov 1, 2022
1c83099
Merge branch 'main' into gh28243_bug_join_leftindex_righton
debnathshoham Nov 1, 2022
9c28ecf
Merge branch 'main' into gh28243_bug_join_leftindex_righton
debnathshoham Nov 8, 2022
81ace15
Merge branch 'main' into gh28243_bug_join_leftindex_righton
debnathshoham Dec 14, 2022
precommit changes
debnathshoham committed Oct 28, 2022
commit d95515cbf44d9dfb4d9f27d873200775110faabe
10 changes: 5 additions & 5 deletions pandas/tests/reshape/merge/test_join.py
@@ -416,7 +416,7 @@ def test_join_inner_multiindex(self, lexsorted_two_level_string_multiindex):
expected = expected.drop(["first", "second"], axis=1)
expected.index = joined.index

-#assert joined.index.is_monotonic_increasing
+# assert joined.index.is_monotonic_increasing
Member

Can you revert this?

Member Author

This would fail: since to_join is being joined on its index, the final DataFrame would carry that index.

This behaviour is mentioned in the docs for merge, but since join uses most of the same machinery, I think the two should be consistent.

Member

Hmm, not sure I follow. What does this produce now? Why does the monotonicity of the index change?

Member Author
debnathshoham Nov 1, 2022

Previously, joined inherited its index from the data DataFrame. After this change, it inherits its index from to_join.

Why I think this is the expected behaviour:

  • In merge, with a column on one side and the index on the other, the result inherits its index from the DataFrame that is merged on its index (edit: this behaviour is as per the docs).
  • Since join uses merge, I think their outputs should be consistent with each other.
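
To make the comparison above concrete, here is a minimal sketch that runs the same column-to-index match through both join and merge. The frames left and right and their values are hypothetical, chosen only for illustration; they are not the fixtures used in the tests in this PR. The sketch prints the resulting index instead of asserting it, since which index the result should carry is the point under discussion and may differ between pandas versions.

```python
import pandas as pd

# Hypothetical frames for illustration; not the fixtures used in the PR's tests.
left = pd.DataFrame({"key": ["b", "a", "c"], "v1": [1, 2, 3]}, index=[10, 20, 30])
right = pd.DataFrame({"v2": [100, 200, 300]}, index=["a", "b", "c"])

# join: match left's "key" column against right's index (left join by default).
joined = left.join(right, on="key")

# merge: the same column-to-index match, spelled out explicitly.
merged = left.merge(right, left_on="key", right_index=True, how="left")

# Which index the result carries is what this thread is about, so print it
# rather than assume it.
print("join index: ", joined.index.tolist())
print("merge index:", merged.index.tolist())

# The matched values follow left's row order regardless of the index chosen.
print(joined["v2"].tolist())
print(merged["v2"].tolist())
```

Comparing the two printed indexes on a given pandas version shows directly whether join and merge agree for this case.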

tm.assert_frame_equal(joined, expected)

# _assert_same_contents(expected, expected2.loc[:, expected.columns])
@@ -662,9 +662,9 @@ def test_join_multi_to_multi(self, join_type):
right = DataFrame({"v2": [100 * i for i in range(1, 7)]}, index=rightindex)

result = left.join(right, on=["abc", "xy"], how=join_type)
-expected = (
-left.reset_index()
-.merge(right.reset_index(), on=["abc", "xy"], how=join_type))
+expected = left.reset_index().merge(
+right.reset_index(), on=["abc", "xy"], how=join_type
+)
if join_type == "left":
expected = expected.set_index(["abc", "xy", "num"])
else:
@@ -728,7 +728,7 @@ def test_join_datetime_string(self):
],
index=[2, 4],
columns=["x", "y", "z", "a"],
-).astype({"x":"datetime64[ns]"})
+).astype({"x": "datetime64[ns]"})
tm.assert_frame_equal(result, expected)

def test_join_with_categorical_index(self):
16 changes: 9 additions & 7 deletions pandas/tests/reshape/merge/test_merge_asof.py
@@ -453,7 +453,7 @@ def test_multiby_indexed(self):
result = merge_asof(
left, right, left_index=True, right_index=True, by=["k1", "k2"]
)
-expected.index= result.index
+expected.index = result.index
tm.assert_frame_equal(expected, result)

with pytest.raises(
@@ -716,7 +716,7 @@ def test_index_tolerance(self, trades, quotes, tolerance):
by="ticker",
tolerance=Timedelta("1day"),
)
-expected.index= result.index
+expected.index = result.index
tm.assert_frame_equal(result, expected)

def test_allow_exact_matches(self, trades, quotes, allow_exact_matches):
@@ -1291,7 +1291,7 @@ def test_merge_by_col_tz_aware(self):
expected = pd.DataFrame(
[[pd.Timestamp("2018-01-01", tz="UTC"), 2, "a", "b"]],
columns=["by_col", "on_col", "values_x", "values_y"],
-).astype({"by_col":"datetime64[ns, UTC]"})
+).astype({"by_col": "datetime64[ns, UTC]"})
tm.assert_frame_equal(result, expected)

def test_by_mixed_tz_aware(self):
@@ -1316,7 +1316,7 @@ def test_by_mixed_tz_aware(self):
expected = pd.DataFrame(
[[pd.Timestamp("2018-01-01", tz="UTC"), "HELLO", 2, "a"]],
columns=["by_col1", "by_col2", "on_col", "value_x"],
-).astype({"by_col1":"datetime64[ns, UTC]"})
+).astype({"by_col1": "datetime64[ns, UTC]"})
expected["value_y"] = np.array([np.nan], dtype=object)
tm.assert_frame_equal(result, expected)

@@ -1544,7 +1544,7 @@ def test_merge_asof_array_as_on():
"a": [2, 6],
"ts": [pd.Timestamp("2021/01/01 00:37"), pd.Timestamp("2021/01/01 01:40")],
}
-).astype({"ts":"datetime64[ns]"})
+).astype({"ts": "datetime64[ns]"})
ts_merge = pd.date_range(
start=pd.Timestamp("2021/01/01 00:00"), periods=3, freq="1h"
)
@@ -1557,7 +1557,9 @@ def test_merge_asof_array_as_on():
allow_exact_matches=False,
direction="backward",
)
-expected = pd.DataFrame({"b": [4, 8, 7], "a": [np.nan, 2, 6], "ts": ts_merge}).astype({"ts":"datetime64[ns]"})
+expected = pd.DataFrame(
+{"b": [4, 8, 7], "a": [np.nan, 2, 6], "ts": ts_merge}
+).astype({"ts": "datetime64[ns]"})
tm.assert_frame_equal(result, expected)

result = merge_asof(
@@ -1574,5 +1576,5 @@ def test_merge_asof_array_as_on():
"ts": [pd.Timestamp("2021/01/01 00:37"), pd.Timestamp("2021/01/01 01:40")],
"b": [4, 8],
}
-).astype({"ts":"datetime64[ns]"})
+).astype({"ts": "datetime64[ns]"})
tm.assert_frame_equal(result, expected)