API: make selecting coordinate / masking a bit easier in HDFStore · Issue #4467 · pandas-dev/pandas · GitHub

API: make selecting coordinate / masking a bit easier in HDFStore #4467

Closed
jreback opened this issue Aug 5, 2013 · 3 comments · Fixed by #4470
Labels
API Design IO HDF5 read_hdf, HDFStore
Comments

@jreback
Contributor
jreback commented Aug 5, 2013

From the PyTables mailing list:

Select the rows where month == 5 from the index
(this could maybe be done internally)

The big issue is that Coordinates is sort of 'private' here;
make where accept a boolean array / coordinates directly

# create a frame (assumes DataFrame, date_range and randn are already in the namespace)
In [45]: df = DataFrame(randn(1000,2),index=date_range('20000101',periods=1000))

In [53]: df
Out[53]: 
<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 1000 entries, 2000-01-01 00:00:00 to 2002-09-26 00:00:00
Freq: D
Data columns (total 2 columns):
0    1000  non-null values
1    1000  non-null values
dtypes: float64(2)

# store it as a table
In [46]: store = pd.HDFStore('test.h5',mode='w')

In [47]: store.append('df',df)

# select out the index (a datetimeindex in this case)
In [48]: c = store.select_column('df','index')

# get the coordinates of matching index
In [49]: coords = c[pd.DatetimeIndex(c).month==5]

# select those rows
In [51]: from pandas.io.pytables import Coordinates

In [50]: store.select('df',where=Coordinates(coords.index,None,None))
Out[50]: 
<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 93 entries, 2000-05-01 00:00:00 to 2002-05-31 00:00:00
Data columns (total 2 columns):
0    93  non-null values
1    93  non-null values
dtypes: float64(2)
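
The mask-to-coordinates step in the session above can be illustrated without pandas or an HDF5 file. This is only a dependency-free sketch: the plain list `index` is a stand-in for the stored DatetimeIndex, and the names `mask` and `coords` are illustrative, not part of the pandas API.

```python
from datetime import date, timedelta

# Stand-in for the stored index: 1000 daily dates starting 2000-01-01,
# mirroring date_range('20000101', periods=1000) in the session above.
index = [date(2000, 1, 1) + timedelta(days=i) for i in range(1000)]

# Boolean mask: True where the row falls in May.
mask = [d.month == 5 for d in index]

# Coordinates: the integer row positions where the mask is True.
coords = [i for i, keep in enumerate(mask) if keep]

# Number of matching rows, plus the first and last matching dates.
print(len(coords), index[coords[0]], index[coords[-1]])
```

The count and date range agree with the 93-row result shown by `store.select` above (three complete Mays of 31 days each).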
@cpcloud
Member
cpcloud commented Aug 5, 2013

What about making tuples mean Coordinates? This would be an API change, since right now tuples are assumed to be a sequence of "Termables"; maybe deprecate and fully remove in 0.14?

@jreback
Contributor Author
jreback commented Aug 5, 2013

no... Coordinates is an internal class that 'looks' like a mask

e.g.

store.select('df',where=np.arange(5))
store.select('df',where=[True, True, False, False, False])

should work

right now you would have to specify that as

store.select('df',where=Coordinates(np.arange(5),None,None))
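
The proposed behaviour (accept either integer positions or a boolean mask and treat both as row coordinates) could be sketched roughly as below. `normalize_where` is a hypothetical helper written for illustration; it is not the pandas implementation and assumes plain Python sequences rather than NumPy arrays.

```python
def normalize_where(where, nrows):
    """Hypothetical helper: map a boolean mask or integer positions to row coordinates.

    Assumes plain Python values; NumPy booleans are not handled here.
    """
    values = list(where)
    # A sequence made up entirely of bools is treated as a mask over all rows.
    if values and all(isinstance(v, bool) for v in values):
        if len(values) != nrows:
            raise ValueError("boolean mask must have one entry per row")
        return [i for i, keep in enumerate(values) if keep]
    # Anything else is taken to be integer row positions already.
    return [int(v) for v in values]


# Integer positions pass through unchanged.
from_positions = normalize_where(range(5), nrows=5)

# A mask is converted to the positions of its True entries.
from_mask = normalize_where([True, True, False, False, False], nrows=5)
```

Under this sketch the two `where` forms in the comment above would select rows 0 to 4 and rows 0 to 1 respectively.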

@cpcloud
Member
cpcloud commented Aug 5, 2013

ah ok
