chore: fix typos (#844) · PhVHoang/datafusion-python@57eb959

Commit 57eb959

chore: fix typos (apache#844)
- run [codespell](https://github.com/codespell-project/codespell) on the source code
- change name of parameter in db-benchmark.dockerfile based on spelling suggestion and the documentation: https://www.rdocumentation.org/packages/utils/versions/3.6.2/topics/install.packages
1 parent 90f5b5b commit 57eb959

23 files changed: 32 additions and 32 deletions

benchmarks/db-benchmark/db-benchmark.dockerfile

Lines changed: 1 addition & 1 deletion
@@ -58,7 +58,7 @@ RUN cd pandas && \
 RUN cd modin && \
 virtualenv py-modin --python=/usr/bin/python3.10

-RUN Rscript -e 'install.packages(c("jsonlite","bit64","devtools","rmarkdown"), dependecies=TRUE, repos="https://cloud.r-project.org")'
+RUN Rscript -e 'install.packages(c("jsonlite","bit64","devtools","rmarkdown"), dependencies=TRUE, repos="https://cloud.r-project.org")'

 SHELL ["/bin/bash", "-c"]

docs/mdbook/src/index.md

Lines changed: 1 addition & 1 deletion
@@ -18,7 +18,7 @@

 DataFusion is a blazing fast query engine that lets you run data analyses quickly and reliably.

-DataFusion is written in Rust, but also exposes Python and SQL bindings, so you can easily query data in your langauge of choice. You don't need to know any Rust to be a happy and productive user of DataFusion.
+DataFusion is written in Rust, but also exposes Python and SQL bindings, so you can easily query data in your language of choice. You don't need to know any Rust to be a happy and productive user of DataFusion.

 DataFusion lets you run queries faster than pandas. Let's compare query runtimes for a 5GB CSV file with 100 million rows of data.

docs/source/_static/theme_overrides.css

Lines changed: 1 addition & 1 deletion
@@ -56,7 +56,7 @@ a.navbar-brand img {


 /* This is the bootstrap CSS style for "table-striped". Since the theme does
-not yet provide an easy way to configure this globaly, it easier to simply
+not yet provide an easy way to configure this globally, it easier to simply
 include this snippet here than updating each table in all rst files to
 add ":class: table-striped" */

docs/source/conf.py

Lines changed: 1 addition & 1 deletion
@@ -15,7 +15,7 @@
 # specific language governing permissions and limitations
 # under the License.

-"""Documenation generation."""
+"""Documentation generation."""

 # Configuration file for the Sphinx documentation builder.
 #

docs/source/user-guide/common-operations/expressions.rst

Lines changed: 1 addition & 1 deletion
@@ -21,7 +21,7 @@ Expressions
 ===========

 In DataFusion an expression is an abstraction that represents a computation.
-Expressions are used as the primary inputs and ouputs for most functions within
+Expressions are used as the primary inputs and outputs for most functions within
 DataFusion. As such, expressions can be combined to create expression trees, a
 concept shared across most compilers and databases.
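
The passage touched by this hunk says that expressions combine into expression trees. A minimal pure-Python sketch of that idea follows. The `Column`, `Literal`, and `BinaryExpr` classes and the `evaluate` helper are hypothetical names invented for illustration; they are not DataFusion's API.

```python
from dataclasses import dataclass
from typing import Union


# Hypothetical node types for a toy expression tree (not DataFusion classes).
@dataclass(frozen=True)
class Column:
    name: str


@dataclass(frozen=True)
class Literal:
    value: float


@dataclass(frozen=True)
class BinaryExpr:
    op: str
    left: "Expr"
    right: "Expr"


Expr = Union[Column, Literal, BinaryExpr]


def evaluate(expr, row):
    """Recursively evaluate a toy expression tree against one row (a dict)."""
    if isinstance(expr, Column):
        return row[expr.name]
    if isinstance(expr, Literal):
        return expr.value
    left = evaluate(expr.left, row)
    right = evaluate(expr.right, row)
    if expr.op == "+":
        return left + right
    if expr.op == "*":
        return left * right
    raise ValueError(f"unsupported operator: {expr.op}")


# (a + 1) * b is a tree: a BinaryExpr whose left child is another BinaryExpr.
tree = BinaryExpr("*", BinaryExpr("+", Column("a"), Literal(1)), Column("b"))
result = evaluate(tree, {"a": 2, "b": 5})
```

The point of the sketch is only that small expressions nest inside larger ones, so a single tree can describe a whole computation before any data is touched.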

examples/export.py

Lines changed: 1 addition & 1 deletion
@@ -48,6 +48,6 @@
 pylist = df.to_pylist()
 assert pylist == [{"a": 1, "b": 4}, {"a": 2, "b": 5}, {"a": 3, "b": 6}]

-# export to Pyton dictionary of columns
+# export to Python dictionary of columns
 pydict = df.to_pydict()
 assert pydict == {"a": [1, 2, 3], "b": [4, 5, 6]}
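
The two assertions in this hunk show the same data in a row-oriented and a column-oriented shape. As a rough illustration of how the two relate, the sketch below converts one into the other in plain Python. `rows_to_columns` is a hypothetical helper, not part of DataFusion, and it says nothing about how `to_pylist` or `to_pydict` are implemented.

```python
def rows_to_columns(rows):
    """Pivot a list of row dicts into a dict of column lists."""
    columns = {}
    for row in rows:
        for key, value in row.items():
            columns.setdefault(key, []).append(value)
    return columns


# Same values as the assertions in examples/export.py.
rows = [{"a": 1, "b": 4}, {"a": 2, "b": 5}, {"a": 3, "b": 6}]
columns = rows_to_columns(rows)
```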

examples/python-udf-comparisons.py

Lines changed: 1 addition & 1 deletion
@@ -28,7 +28,7 @@
 # question "return all of the rows that have a specific combination of these
 # values". We have the combinations we care about provided as a python
 # list of tuples. There is no built in function that supports this operation,
-# but it can be explicilty specified via a single expression or we can
+# but it can be explicitly specified via a single expression or we can
 # use a user defined function.

 ctx = SessionContext()
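
The comment in this hunk contrasts two ways of keeping rows that match one of several value combinations: one explicit boolean expression, or a user defined function. The plain-Python sketch below illustrates the distinction only. The column names `x` and `y`, the sample rows, and both helper functions are assumptions made up for this example and do not come from the source file.

```python
# Hypothetical combinations of interest, given as a list of tuples.
wanted = [(1, "a"), (2, "b")]

# Hypothetical sample rows.
rows = [
    {"x": 1, "y": "a"},
    {"x": 1, "y": "b"},
    {"x": 2, "y": "b"},
    {"x": 3, "y": "c"},
]


def matches_explicit(row):
    """One explicit expression: (x == 1 and y == 'a') or (x == 2 and y == 'b')."""
    return any(row["x"] == x and row["y"] == y for x, y in wanted)


def matches_lookup(row, combos=frozenset(wanted)):
    """Function-style check: look the (x, y) pair up in a set of combinations."""
    return (row["x"], row["y"]) in combos


explicit_rows = [row for row in rows if matches_explicit(row)]
lookup_rows = [row for row in rows if matches_lookup(row)]
```

Both approaches select the same rows; they differ in whether the condition is spelled out term by term or hidden inside a function.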

examples/tpch/q02_minimum_cost_supplier.py

Lines changed: 1 addition & 1 deletion
@@ -96,7 +96,7 @@
 # create a column of that value. We can then filter down any rows for which the cost and
 # minimum do not match.

-# The default window frame as of 5/6/2024 is from unbounded preceeding to the current row.
+# The default window frame as of 5/6/2024 is from unbounded preceding to the current row.
 # We want to evaluate the entire data frame, so we specify this.
 window_frame = datafusion.WindowFrame("rows", None, None)
 df = df.with_column(
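
The comment in this hunk explains why the frame is widened: a frame running from unbounded preceding to the current row yields a running minimum, while a frame covering the whole partition yields the overall minimum on every row. The pure-Python sketch below shows the difference. `windowed_min` and the sample `costs` are hypothetical and do not use DataFusion.

```python
def windowed_min(values, whole_partition):
    """Minimum over each row's window frame.

    whole_partition=False mimics "unbounded preceding to current row";
    whole_partition=True mimics a frame unbounded in both directions.
    """
    result = []
    for i in range(len(values)):
        frame = values if whole_partition else values[: i + 1]
        result.append(min(frame))
    return result


costs = [5, 3, 4, 1]  # hypothetical supply costs in row order
running = windowed_min(costs, whole_partition=False)  # [5, 3, 3, 1]
overall = windowed_min(costs, whole_partition=True)   # [1, 1, 1, 1]
```

Only the whole-partition result is suitable for the "filter rows whose cost equals the minimum" step described in the surrounding comments, since the running minimum would also match early rows that are not the true minimum.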

examples/tpch/q04_order_priority_checking.py

Lines changed: 2 additions & 2 deletions
@@ -53,9 +53,9 @@

 # Limit results to cases where commitment date before receipt date
 # Aggregate the results so we only get one row to join with the order table.
-# Alterately, and likely more idomatic is instead of `.aggregate` you could
+# Alternately, and likely more idiomatic is instead of `.aggregate` you could
 # do `.select_columns("l_orderkey").distinct()`. The goal here is to show
-# mulitple examples of how to use Data Fusion.
+# multiple examples of how to use Data Fusion.
 df_lineitem = df_lineitem.filter(col("l_commitdate") < col("l_receiptdate")).aggregate(
     [col("l_orderkey")], []
 )
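
The comment in this hunk notes that grouping by a key with no aggregate functions and selecting distinct keys give the same set of rows. A small plain-Python sketch of that equivalence follows. The helpers and the sample `order_keys` are hypothetical, and results are sorted only to make the comparison deterministic.

```python
def group_by_no_aggregates(keys):
    """Group on the key with no aggregate functions: one output row per group."""
    groups = {}
    for key in keys:
        groups.setdefault(key, []).append(key)
    return sorted(groups)


def select_distinct(keys):
    """Keep each key once."""
    return sorted(set(keys))


order_keys = [3, 1, 3, 2, 1]  # hypothetical l_orderkey values with repeats
grouped = group_by_no_aggregates(order_keys)
distinct = select_distinct(order_keys)
```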

examples/tpch/q06_forecasting_revenue_change.py

Lines changed: 1 addition & 1 deletion
@@ -82,5 +82,5 @@

 revenue = df.collect()[0]["revenue"][0].as_py()

-# Note: the output value from this query may be dependant on the size of the database generated
+# Note: the output value from this query may be dependent on the size of the database generated
 print(f"Potential lost revenue: {revenue:.2f}")
