10000 Fix fewest modules OOM by steinybot · Pull Request #4986 · scala-js/scala-js · GitHub
[go: up one dir, main page]

Skip to content

Fix fewest modules OOM #4986

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 16 commits into
base: main
Choose a base branch
from
Open
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Prev Previous commit
Next Next commit
Update comment
  • Loading branch information
steinybot committed May 22, 2024
commit f72c81a520bdc9c9faeebc01a2f4c85776dfed8b
8000
Original file line number Diff line number Diff line change
Expand Up @@ -24,138 +24,139 @@ import scala.collection.{immutable, mutable}

/** Tagger groups classes into coarse modules.
*
* It is the primary mechanism for the FewestModulesAnalyzer but also used
* by the SmallModulesForAnalyzer.
* It is the primary mechanism for the FewestModulesAnalyzer but also used
* by the SmallModulesForAnalyzer.
*
* To group classes into modules appropriately, we want to know for
* each class, "how" it can be reached. In practice, this means we
* record the path from the original public module and every
* dynamic import hop we made.
* To group classes into modules appropriately, we want to know for
* each class, "how" it can be reached. In practice, this means we
* record the path from the original public module and every
* dynamic import hop we made.
*
* Of all these paths, we only care about the "simplest" ones. Or
* more formally, the minimum prefixes of all paths. For example,
* if a class is reachable by the following paths:
* Of all these paths, we only care about the "simplest" ones. Or
* more formally, the minimum prefixes of all paths. For example,
* if a class is reachable by the following paths:
*
* - a -> b
* - a -> b -> c
* - d -> c
* - d
*
* We really only care about:
* We really only care about:
*
* - a -> b
* - d
*
* Because if we reach the class through a path that goes through
* `c`, it is necessarily already loaded.
* Because if we reach the class through a path that goes through
* `c`, it is necessarily already loaded.
*
* Once we have obtained this minimal set of paths, we use the last
* element of each path to determine the final module
* grouping. This is because these have an actual static dependency
* on the node in question.
* Once we have obtained this minimal set of paths, we use the last
* element of each path to determine the final module
* grouping. This is because these have an actual static dependency
* on the node in question.
*
* Merging these tags into a single `ModuleID` is delegated to the
* caller.
* Merging these tags into a single `ModuleID` is delegated to the
* caller.
*
* == Class Exclusion ==
* Classes can be excluded from the modules generated by Tagger.
* == Class Exclusion ==
* Classes can be excluded from the modules generated by Tagger.
*
* For excluded classes, the Tagger assumes that they are in a module
* provided by a different part of the overall analysis.
* For excluded classes, the Tagger assumes that they are in a module
* provided by a different part of the overall analysis.
*
* The (transitive) dependencies of the class are nevertheless taken into
* account and tagged as appropriate.
* In particular, to avoid cycles and excessive splitting alike (see #4835),
* we need to introduce an additional tagging mechanism.
* The (transitive) dependencies of the class are nevertheless taken into
* account and tagged as appropriate.
* In particular, to avoid cycles and excessive splitting alike (see #4835),
* we need to introduce an additional tagging mechanism.
*
* To illustra 8000 te the problem, take the following dependency graph as an example
* To illustrate the problem, take the following dependency graph as an example
*
* a -> b -> c
* a -> b -> c
*
* where B is excluded. Naively, we would want to group a and c together into a'.
* However, this would lead to a circular dependency between a' and c.
* where B is excluded. Naively, we would want to group a and c together into a'.
* However, this would lead to a circular dependency between a' and c.
*
* Nevertheless, in the absence of b, or if b is not an excluded class, we'd
* want to perform the grouping to avoid unnecessary splitting.
* Nevertheless, in the absence of b, or if b is not an excluded class, we'd
* want to perform the grouping to avoid unnecessary splitting.
*
* We achieve this by tracking an additional tag, representing the maximum
* number of hops from an excluded class (aka fine) to a non-excluded class
* (aka coarse) class for any path from an entrypoint to the given class.
* We achieve this by tracking an additional tag, representing the maximum
* number of hops from an excluded class (aka fine) to a non-excluded class
* (aka coarse) class for any path from an entrypoint to the given class.
*
* We then only permit grouping coarse classes with the same tag. This avoids
* the creation of cycles.
* We then only permit grouping coarse classes with the same tag. This avoids
* the creation of cycles.
*
* The following is a proof that this strategy avoids cycles.
* The following is a proof that this strategy avoids cycles.
*
* Given
* Given
*
* G = (V, E), acyclic, V = F ∪ C, F ∩ C = ∅
* the original dependency graph,
* F: set of fine classes,
* C: set of coarse classes
* G = (V, E), acyclic, V = F ∪ C, F ∩ C = ∅
* the original dependency graph,
* F: set of fine classes,
* C: set of coarse classes
*
* t : V → ℕ (the maxExcludedHopCount tag)
* ∀ (v1, v2) ∈ E : t(v1) ≤ t(v2)
* ∀ (f, c) ∈ E : f ∈ F, c ∈ C ⇒ t(f) < t(c)
* t : V → ℕ (the maxExcludedHopCount tag)
* ∀ (v1, v2) ∈ E : t(v1) ≤ t(v2)
* ∀ (f, c) ∈ E : f ∈ F, c ∈ C ⇒ t(f) < t(c)
*
* Define
* Define
*
* G' = (V', E'), V' = F ∪ C' (the new grouped graph)
* G' = (V', E'), V' = F ∪ C' (the new grouped graph)
*
* C' = { n ∈ ℕ | ∃ c ∈ C : t(c) = n }
* C' = { n ∈ ℕ | ∃ c ∈ C : t(c) = n }
*
* E' = { (f1, f2) ∈ E | f1, f2 ∈ F } ∪
* { (f, n) | f ∈ F, ∃ c ∈ C : (f, c) ∈ E : t(c) = n } ∪
* { (n, f) | f ∈ F, ∃ c ∈ C : (c, f) ∈ E : t(c) = n } ∪
* { (n, m) | n ≠ m, ∃ c1, c2 ∈ C : (c1, c2) ∈ E : t(c1) = n, t(c2) = m }
* E' = { (f1, f2) ∈ E | f1, f2 ∈ F } ∪
* { (f, n) | f ∈ F, ∃ c ∈ C : (f, c) ∈ E : t(c) = n } ∪
* { (n, f) | f ∈ F, ∃ c ∈ C : (c, f) ∈ E : t(c) = n } ∪
* { (n, m) | n ≠ m, ∃ c1, c2 ∈ C : (c1, c2) ∈ E : t(c1) = n, t(c2) = m }
*
* t' : V' → ℕ:
* t' : V' → ℕ:
*
* t'(f) = t(f) (if f ∈ F)
* t'(n) = n (if n ∈ C')
* t'(f) = t(f) (if f ∈ F)
* t'(n) = n (if n ∈ C')
*
* Lemma 1 (unproven)
* Lemma 1 (unproven)
*
* ∀ (v1, v2) ∈ E' : t'(v1) ≤ t'(v2)
* ∀ (v1, v2) ∈ E' : t'(v1) ≤ t'(v2)
*
* Lemma 2 (unproven)
* Lemma 2 (unproven)
*
* ∀ (f, n) ∈ E' : f ∈ F, n ∈ C' : t'(f) < t'(n)
* ∀ (f, n) ∈ E' : f ∈ F, n ∈ C' : t'(f) < t'(n)
*
* Lemma 3
* Lemma 3
*
* ∀ (n, m) ∈ E' : n,m ∈ C' ⇒ t'(n) < t'(m)
* ∀ (n, m) ∈ E' : n,m ∈ C' ⇒ t'(n) < t'(m)
*
* Follows from Lemma 1 and (n, m) ∈ E' ⇒ n ≠ m (by definition).
* Follows from Lemma 1 and (n, m) ∈ E' ⇒ n ≠ m (by definition).
*
* Theorem
* Theorem
*
* G' is acyclic
* G' is acyclic
*
* Proof by contradiction.
* Proof by contradiction.
*
* Assume ∃ p = x1, ..., xn (x1 = xn, n > 1, xi ∈ V)
* Assume ∃ p = x1, ..., xn (x1 = xn, n > 1, xi ∈ V)
*
* ∃ xi ∈ C' by contradiction: ∀ xi ∈ F ⇒ p is a cycle in G
* ∃ xi ∈ C' by contradiction: ∀ xi ∈ F ⇒ p is a cycle in G
*
* ∃ xi ∈ F by contradiction: ∀ xi ∈ C' ⇒
* t'(xi) increases strictly monotonically (by Lemma 3),
* but x1 = xn ⇒ t'(x1) = t'(xn)
* ∃ xi ∈ F by contradiction: ∀ xi ∈ C' ⇒
* t'(xi) increases strictly monotonically (by Lemma 3),
* but x1 = xn ⇒ t'(x1) = t'(xn)
*
* Therefore,
* Therefore,
*
* ∃ (xi, xj) ∈ p : xi ∈ F, xj ∈ C'
* ∃ (xi, xj) ∈ p : xi ∈ F, xj ∈ C'
*
* Therefore (by Lemma 1)
* Therefore (by Lemma 1)
*
* t'(x1) ≤ ... ≤ t'(xi) < t'(xj) ≤ ... ≤ t'(xn) ⇒ t'(x1) < t'(xn)
* t'(x1) ≤ ... ≤ t'(xi) < t'(xj) ≤ ... ≤ t'(xn) ⇒ t'(x1) < t'(xn)
*
* But x1 = xn ⇒ t'(x1) = t'(xn), which is a contradiction.
* But x1 = xn ⇒ t'(x1) = t'(xn), which is a contradiction.
*
* Therefore, G' is acyclic.
* Therefore, G' is acyclic.
*/
private class Tagger(infos: ModuleAnalyzer.DependencyInfo,
excludedClasses: scala.collection.Set[ClassName] = Set.empty) {
excludedClasses: scala.collection.Set[ClassName] = Set.empty) {

import Tagger._

private[this] val allPaths = mutable.Map.empty[ClassName, Paths]
Expand All @@ -180,9 +181,9 @@ private class Tagger(infos: ModuleAnalyzer.DependencyInfo,
* also re-tagged where appropriate.
*
* @param classNames the starting set of classes to tag
* @param pathRoot the root of the path
* @param pathSteps the steps that make up path
* @param nextSteps the accumulated result
* @param pathRoot the root of the path
* @param pathSteps the steps that make up path
* @param nextSteps the accumulated result
* @return the next steps to traverse
*/
@tailrec
Expand All @@ -192,8 +193,8 @@ private class Tagger(infos: ModuleAnalyzer.DependencyInfo,
case Some(className) => allPaths.get(className) match {
case None =>
val paths = new Paths(trackExcludedHopCounts = excludedClasses.nonEmpty)
paths.put(pathRoot, pathSteps)
allPaths.update(className, paths)
paths.put(pathRoot, pathSteps)
// Consider dependencies the first time we encounter them as this is the shortest path there will be.
val classInfo = infos.classDependencies(className)
tag(classNames.tail ++ classInfo.staticDependencies, pathRoot, pathSteps,
Expand All @@ -202,10 +203,10 @@ private class Tagger(infos: ModuleAnalyzer.DependencyInfo,
val moduleIDChanged = paths.put(pathRoot, pathSteps)
val nextClassNames =
if (moduleIDChanged) {
// TODO: What about dynamic dependencies? I think we need to revisit them too
// Revisit static dependencies again when the module id changes.
// We do not need to revisit dynamic dependencies because by definition they are not used by public
// modules, the first time we tag them is the shortest path so that path end never changes
// modules, the first time we tag them is the shortest path so that path end never changes and only the
// ends are used for the module ids.
classNames.tail ++ infos.classDependencies(className).staticDependencies
} else {
// Otherwise do not consider dependencies again as there is no more information to find.
Expand All @@ -224,9 +225,9 @@ private class Tagger(infos: ModuleAnalyzer.DependencyInfo,
* dependencies have been tagged.
*
* @param classNames the steps to tag and traverse
* @param pathRoot the root of the path
* @param pathSteps the steps that make up path
* @param acc the accumulator
* @param pathRoot the root of the path
* @param pathSteps the steps that make up path
* @param acc the accumulator
*/
@tailrec
private def tagNextSteps(classNames: Set[ClassName], pathRoot: ModuleID, pathSteps: List[ClassName],
Expand All @@ -250,9 +251,9 @@ private class Tagger(infos: ModuleAnalyzer.DependencyInfo,
*
* This will traverse dependencies repeatedly if a prefix is found with a larger excluded hop count.
*
* @param className the starting class to tag
* @param className the starting class to tag
* @param excludedHopCount the excluded hop count so far
* @param fromExcluded whether the previous step was excluded
* @param fromExcluded whether the previous step was excluded
*/
private def updateExcludedHopCounts(className: ClassName, excludedHopCount: Int, fromExcluded: Boolean): Unit = {
val isExcluded = excludedClasses.contains(className)
Expand Down Expand Up @@ -294,7 +295,7 @@ private object Tagger {

/** "Interesting" paths that can lead to a given class.
*
* "Interesting" in this context means:
* "Interesting" in this context means:
* - All direct paths from a public dependency.
* - All non-empty, mutually prefix-free paths of dynamic import hops.
*/
Expand All @@ -317,6 +318,9 @@ private object Tagger {
}
}

/**
* @return `true` if the moduleID for this class changed
*/
def updateExcludedHopCount(excludedHopCount: Int): Boolean = {
val hopCountsChanged = excludedHopCount > maxExcludedHopCount
if (hopCountsChanged)
Expand Down
0