examples/training/0-Core Concepts.ipynb
+1 -1 (1 addition & 1 deletion)
@@ -187,7 +187,7 @@
 " - `Image`\n",
 " - `PageBreak`\n",
 "\n",
-"Other element types that we will add in the future include tables and figures. Different partioning functions use different methods for determining the element type and extracting the associated content. Document elements have a `str` representation. You can print them using the snippet below."
+"Other element types that we will add in the future include tables and figures. Different partitioning functions use different methods for determining the element type and extracting the associated content. Document elements have a `str` representation. You can print them using the snippet below."
 ]
 },
 {
examples/training/1-Intro to Bricks.ipynb
+2 -2 (2 additions & 2 deletions)
@@ -143,7 +143,7 @@
 "id": "e3a8e7f4",
 "metadata": {},
 "source": [
-"The `unstructured` library also includes partitioning bricks targeted at specific document types. The `partition` brick uses these document-specific partitioning bricks under the hood. There are a few reasons you may want to use a document-specific partioning brick instead of `partition`:\n",
+"The `unstructured` library also includes partitioning bricks targeted at specific document types. The `partition` brick uses these document-specific partitioning bricks under the hood. There are a few reasons you may want to use a document-specific partitioning brick instead of `partition`:\n",
 "\n",
 "1. If you already know the document type, filetype detection is unnecessary. Using the document-specific brick directly will make your program run faster.\n",
 "2. Fewer dependencies. You don't need to install `libmagic` for filetype detection if you're only using document-specific bricks.\n",
@@ -312,7 +312,7 @@
 "id": "358e149b",
 "metadata": {},
 "source": [
-"Since a cleaning brick is just a `str -> str` function, users can also easily include their own cleaning bricks for custom data preparation tasks. In the example below, we partition a Russian offensive campaign assessment from the institute of the study of war and remove citations, which are not natural language text that we want to inclue for model training purposes."
+"Since a cleaning brick is just a `str -> str` function, users can also easily include their own cleaning bricks for custom data preparation tasks. In the example below, we partition a Russian offensive campaign assessment from the institute of the study of war and remove citations, which are not natural language text that we want to include for model training purposes."
examples/training/2-File Exploration.ipynb
+1 -1 (1 addition & 1 deletion)
@@ -7,7 +7,7 @@
 "source": [
 "# File Exploration\n",
 "\n",
-"In addition to core document processing capabilities, the `unstructured` library includes utilities for summarizing information about raw doucments. We will cover how to use these utilities in this notebook. At the conclusion of this notebook, you should understand:\n",
+"In addition to core document processing capabilities, the `unstructured` library includes utilities for summarizing information about raw documents. We will cover how to use these utilities in this notebook. At the conclusion of this notebook, you should understand:\n",
 "\n",
 "- [Filetype detection in `unstructured`](#filetype)\n",
 "- [How to generate summary statistics about documents](#summary)"
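
For readers skimming this diff without the notebooks open: the `1-Intro to Bricks.ipynb` text touched above describes a cleaning brick as just a `str -> str` function. A minimal sketch of that idea follows, using only the Python standard library. The function name `remove_citations`, the regular expression, and the sample string are all hypothetical illustrations and are not taken from the notebook or from the `unstructured` API.

```python
import re

# Assumption: citations look like bracketed numbers, e.g. "[1]" or "[23]".
# The notebook's real citation format may differ.
CITATION_PATTERN = re.compile(r"\s*\[\d+\]")


def remove_citations(text: str) -> str:
    """Hypothetical custom cleaning brick: a plain str -> str function
    that strips bracketed numeric citations from a piece of text."""
    return CITATION_PATTERN.sub("", text)


# Made-up sample text, for illustration only.
sample = "The first claim.[1] The second claim. [23]"
print(remove_citations(sample))  # prints "The first claim. The second claim."
```

Because the brick takes a string and returns a string, it can presumably be applied to the text of each partitioned element in the same way as the library's built-in cleaning bricks.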