ajaycode
diff --git a/README.md b/README.md
diff --git a/docs/source/bricks.rst b/docs/source/bricks.rst
diff --git a/docs/source/elements.rst b/docs/source/elements.rst
diff --git a/examples/argilla-summarization/README.md b/examples/argilla-summarization/README.md
diff --git a/examples/sec-sentiment-analysis/README.md b/examples/sec-sentiment-analysis/README.md
diff --git a/examples/sec-sentiment-analysis/fetch.py b/examples/sec-sentiment-analysis/fetch.py
diff --git a/examples/training/0-Core Concepts.ipynb b/examples/training/0-Core Concepts.ipynb
diff --git a/examples/training/1-Intro to Bricks.ipynb b/examples/training/1-Intro to Bricks.ipynb
diff --git a/examples/training/2-File Exploration.ipynb b/examples/training/2-File Exploration.ipynb
diff --git a/requirements/test.in b/requirements/test.in
@@ -228,7 +228,7 @@ The output will look the same as the example from the document parsing section a
 ### E-mail Parsing
 
 The `partition_email` function within `unstructured` is helpful for parsing `.eml` files. Common
-e-mail clients such as Microsoft Outlook and Gmail support exproting e-mails as `.eml` files.
+e-mail clients such as Microsoft Outlook and Gmail support exporting e-mails as `.eml` files.
 `partition_email` accepts filenames, file-like object, and raw text as input. The following
 three snippets for parsing `.eml` files are equivalent:
 
 
@@ -20,7 +20,7 @@ titles, narrative text, and tables.
 The ``partition`` brick is the simplest way to partition a document in ``unstructured``.
 If you call the ``partition`` function, ``unstructured`` will attempt to detect the
 file type and route it to the appropriate partitioning brick. All partitioning bricks
-called within ``partition`` are called using the defualt kwargs. Use the document-type
+called within ``partition`` are called using the default kwargs. Use the document-type
 specific bricks if you need to apply non-default settings.
 ``partition`` currently supports ``.docx``, ``.doc``, ``.pptx``, ``.ppt``, ``.eml``, ``.html``, ``.pdf``,
 ``.png``, ``.jpg``, and ``.txt`` files.
@@ -539,7 +539,7 @@ Examples:
 ``clean_ordered_bullets``
 -------------------------
 
-Remove alpha-numeric bullets from the beginning of text up to three “sub-section” levels.
+Remove alphanumeric bullets from the beginning of text up to three “sub-section” levels.
 
 Examples:
 
@@ -687,7 +687,7 @@ Extracts text that occurs before the specified pattern.
 
 Options:
 
-* If ``index`` is set, extract before the ``(index + 1)``th occurence of the pattern. The default is ``0``.
+* If ``index`` is set, extract before the ``(index + 1)``th occurrence of the pattern. The default is ``0``.
 * Strips leading whitespace if ``strip`` is set to ``True``. The default is ``True``.
 
 
@@ -710,7 +710,7 @@ Extracts text that occurs after the specified pattern.
 
 Options:
 
-* If ``index`` is set, extract after the ``(index + 1)``th occurence of the pattern. The default is ``0``.
+* If ``index`` is set, extract after the ``(index + 1)``th occurrence of the pattern. The default is ``0``.
 * Strips trailing whitespace if ``strip`` is set to ``True``. The default is ``True``.
 
 
@@ -834,7 +834,7 @@ Examples:
 ``extract_ordered_bullets``
 ---------------------------
 
-Extracts alpha-numeric bullets from the beginning of text up to three “sub-section” levels.
+Extracts alphanumeric bullets from the beginning of text up to three “sub-section” levels.
 
 Examples:
 
 
@@ -2,7 +2,7 @@ Elements
 --------
 
 The following are the structured page elements that are available within the ``unstructured``
-package. Partioning bricks convert raw documents to this common set of elements. If you need
+package. Partitioning bricks convert raw documents to this common set of elements. If you need
 a custom element, the recommended approach is to create a sub-class of one of the default
 elements.
 
 
@@ -8,7 +8,7 @@ complete a data science project in hours that previously would have taken weeks.
 To get started, use the following steps:
 
 - Ensure you have Python 3.8 or higher installed on your system
-- Create a new Python virtual enviornment
+- Create a new Python virtual environment
 - Run `pip install -r requirements.txt` to install the dependencies
 - Run `PYTHONPATH=. jupyter notebook` from this directory to launch the notebook
 
 
@@ -5,7 +5,7 @@ and several bricks from the `unstructured` library to train a sentiment analysis
 risk factors section of S-1 filings. To get started, use the following steps:
 
 - Ensure you have Python 3.8 or higher installed on your system
-- Create a new Python virtual enviornment
+- Create a new Python virtual environment
 - Run `pip install -r requirements.txt` to install the dependencies
 - Run `PYTHONPATH=. jupyter notebook` from this directory to launch the notebook
 
 
@@ -125,7 +125,7 @@ def get_form_by_ticker(
 
 
 def _form_types(form_type: str, allow_amended_filing: Optional[bool] = True):
-    """Potentialy expand to include amended filing, e.g.:
+    """Potentially expand to include amended filing, e.g.:
     "10-Q" -> "10-Q/A"
     """
     assert form_type in VALID_FILING_TYPES
@@ -144,7 +144,7 @@ def get_form_by_cik(
 ) -> str:
     """For a given CIK, returns the most recent form of a given form_type. By default
     an amended version of the form_type may be retrieved (allow_amended_filing=True).
-    E.g., if form_type is "10-Q", the retrived form could be a 10-Q or 10-Q/A.
+    E.g., if form_type is "10-Q", the retrieved form could be a 10-Q or 10-Q/A.
     """
     session = _get_session(company, email)
     acc_num, _ = _get_recent_acc_num_by_cik(
 
@@ -187,7 +187,7 @@
     "    - `Image`\n",
     "    - `PageBreak`\n",
     "    \n",
-    "Other element types that we will add in the future include tables and figures. Different partioning functions use different methods for determining the element type and extracting the associated content. Document elements have a `str` representation. You can print them using the snippet below."
+    "Other element types that we will add in the future include tables and figures. Different partitioning functions use different methods for determining the element type and extracting the associated content. Document elements have a `str` representation. You can print them using the snippet below."
    ]
   },
   {
 
@@ -143,7 +143,7 @@
    "id": "e3a8e7f4",
    "metadata": {},
    "source": [
-    "The `unstructured` library also includes partitioning bricks targeted at specific document types. The `partition` brick uses these document-specific partitioning bricks under the hood. There are a few reasons you may want to use a document-specific partioning brick instead of `partition`:\n",
+    "The `unstructured` library also includes partitioning bricks targeted at specific document types. The `partition` brick uses these document-specific partitioning bricks under the hood. There are a few reasons you may want to use a document-specific partitioning brick instead of `partition`:\n",
     "\n",
     "1. If you already know the document type, filetype detection is unnecessary. Using the document-specific brick directly will make your program run faster.\n",
     "2. Fewer dependencies. You don't need to install `libmagic` for filetype detection if you're only using document-specific bricks.\n",
@@ -312,7 +312,7 @@
    "id": "358e149b",
    "metadata": {},
    "source": [
-    "Since a cleaning brick is just a `str -> str` function, users can also easily include their own cleaning bricks for custom data preparation tasks. In the example below, we partition a Russian offensive campaign assessment from the institute of the study of war and remove citations, which are not natural language text that we want to inclue for model training purposes."
+    "Since a cleaning brick is just a `str -> str` function, users can also easily include their own cleaning bricks for custom data preparation tasks. In the example below, we partition a Russian offensive campaign assessment from the institute of the study of war and remove citations, which are not natural language text that we want to include for model training purposes."
    ]
   },
   {
 
@@ -7,7 +7,7 @@
    "source": [
     "# File Exploration\n",
     "\n",
-    "In addition to core document processing capabilities, the `unstructured` library includes utilities for summarizing information about raw doucments. We will cover how to use these utilities in this notebook. At the conclusion of this notebook, you should understand:\n",
+    "In addition to core document processing capabilities, the `unstructured` library includes utilities for summarizing information about raw documents. We will cover how to use these utilities in this notebook. At the conclusion of this notebook, you should understand:\n",
     "\n",
     "- [Filetype detection in `unstructured`](#filetype)\n",
     "- [How to generate summary statistics about documents](#summary)"
 
@@ -15,5 +15,5 @@ types-requests
 vcrpy
 
 # NOTE(robinson) - The following pins are to address
-# vulernabilities in dependency scans
+# vulnerabilities in dependency scans
 certifi>=2022.12.07
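
Reviewer note: the `bricks.rst` hunks above touch the wording of the ``index`` option ("extract before/after the ``(index + 1)``th occurrence of the pattern"). The sketch below illustrates that zero-based semantics in plain Python. It is a minimal illustration only, not the library's implementation: the names `text_before` and `text_after` are hypothetical, and the ``strip`` option is deliberately left out.

```python
import re


def text_before(text: str, pattern: str, index: int = 0) -> str:
    """Hypothetical helper: text preceding the (index + 1)th match of pattern."""
    matches = list(re.finditer(pattern, text))
    if index >= len(matches):
        raise ValueError(f"only {len(matches)} match(es) found; index {index} is out of range")
    # Everything up to the start of the selected match.
    return text[: matches[index].start()]


def text_after(text: str, pattern: str, index: int = 0) -> str:
    """Hypothetical helper: text following the (index + 1)th match of pattern."""
    matches = list(re.finditer(pattern, text))
    if index >= len(matches):
        raise ValueError(f"only {len(matches)} match(es) found; index {index} is out of range")
    # Everything after the end of the selected match.
    return text[matches[index].end():]


sample = "a: b: c"
first_before = text_before(sample, ":")            # index=0 selects the 1st occurrence
second_before = text_before(sample, ":", index=1)  # index=1 selects the 2nd occurrence
second_after = text_after(sample, ":", index=1)
```

With the default `index=0` the first occurrence is used; `index=1` selects the second, which is why the docs phrase it as the ``(index + 1)``th occurrence.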