Produce a smaller SDK #41128


Open
richlander opened this issue May 23, 2024 · 10 comments
Labels: Area-Install, untriaged (Request triage from a team member)

Comments

@richlander
Member

The SDK could be a lot smaller. A smaller SDK would significantly benefit CI and similar environments. We typically think of the SDK as being installed on a developer machine, persisting there, and only being updated once a month (at most). I'm guessing that most SDK installs are not in persistent environments but in disposable/temporary ones.

We're always looking for ways to reduce the size of containers. The fewer bytes we transmit over the wire, the better. The difference between the compressed and uncompressed SDK sizes is telling: the compression is very good.

Related issues:

I propose we do the following:

  • Remove unused assemblies by producing RID-specific SDK builds.
  • Remove duplicate assemblies via some kind of sharing mechanism (which might require help from the runtime team).
  • Establish multiple supported layers of the SDK.

On the last point, I'm interested in producing two different flavors of the SDK in containers, a core layer and a tools layer. Perhaps there are other splits that would be more compelling / complementary.

For containers, I see a lot of value in a -tools layer that contains all the existing dotnet- tools (like dotnet-watch), adds tools like dotnet-trace, and also moves PowerShell into that layer. It could be equally interesting to make dotnet-* tools available in a layer on top of aspnet, but that's a different topic.

My hypothesis is that we can remove more than 50 MB of compressed size from our SDK container images (for the core layer), with no loss of functionality for typical needs. That would be huge.

Ideally, we can do this for .NET 10.
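
As a rough illustration of the core/tools split (not a committed design; the image tags and tool list below are placeholders), the layering could look something like this:

```dockerfile
# Illustrative sketch only: approximates a "tools" layer on top of a core SDK
# layer using today's images and standalone tool packages. A real split would
# be produced as first-class images rather than assembled like this.
FROM mcr.microsoft.com/dotnet/sdk:8.0 AS sdk-core

FROM sdk-core AS sdk-tools
# dotnet-trace and dotnet-counters ship as global tools today; PowerShell would
# also move into this layer.
RUN dotnet tool install --global dotnet-trace && \
    dotnet tool install --global dotnet-counters
ENV PATH="${PATH}:/root/.dotnet/tools"
```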

@am11
Member
am11 commented May 24, 2024

Splitting SDK into smaller packages and restoring them as-needed would be neat.

Establish multiple supported layers of the SDK.

Would it be better to move them into NuGet packages? Granted, there would be more moving parts, but caching of NuGet packages can come to the rescue there, e.g. https://docs.github.com/en/actions/automating-builds-and-tests/building-and-testing-net#caching-dependencies. Options like dotnet publish --packages <cachable-reusable-path> help in containers and other kinds of CI environments which otherwise repeatedly hit the network to download the exact same set of packages several times a day.

Improving the NuGet package caching experience (with guidance and/or tooling support) would address the 'slow restores' issue more broadly, in Docker (dotnet/dotnet-docker#2457) and non-Docker CI use cases.
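
Roughly the pattern I have in mind, sketched as a Dockerfile (the paths here are arbitrary placeholders):

```dockerfile
# Sketch: restore into an explicit folder that the CI system can persist and
# reuse across builds, instead of re-downloading the same packages every run.
FROM mcr.microsoft.com/dotnet/sdk:8.0 AS build
WORKDIR /src
COPY *.csproj ./
# --packages points restore at a cacheable/reusable path
RUN dotnet restore --packages /nuget/packages
COPY . .
RUN dotnet publish -c Release -o /app --no-restore
```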

@richlander
Member Author

Most of the tools are already packages and should work well with that type of scheme. See: https://www.nuget.org/packages/dotnet-dump.

Using containers and using tools directly on the Actions host are two different approaches that are equally valid/valuable, but I don't think they're closely related. If we make the SDK container image smaller (or bigger), it won't affect or benefit from the Actions caching. Similarly, if you are going the Actions route, you'll likely write more YAML. If you are going the container route, you'll likely enhance a multi-stage build Dockerfile.

Here's a sample that uses bind mounts: https://github.com/dotnet/dotnet-docker/blob/main/samples/releasesapi/Dockerfile.ubuntu-chiseled. I've been intending to use those more broadly. The syntax is a bit ugly, so my motivation has been a bit low. The docker init Dockerfiles use this syntax, however. I should develop a good performance test for it, to better describe the benefit.
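
For reference, the mount syntax I'm referring to looks roughly like this (a paraphrased sketch, not a copy of the linked sample; the cache id and target path are assumptions):

```dockerfile
# Sketch of the BuildKit mount pattern: the NuGet package cache is mounted at
# build time rather than baked into an image layer, so repeated builds on the
# same machine reuse already-downloaded packages.
FROM mcr.microsoft.com/dotnet/sdk:8.0 AS build
WORKDIR /src
COPY . .
RUN --mount=type=cache,id=nuget,target=/root/.nuget/packages \
    dotnet publish -c Release -o /app
```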

@DamianEdwards
Member

Just to be clear, the commands included in the SDK, like dotnet watch, are not tools; they are SDK commands. dotnet-dump and dotnet-ef are tools that must be installed and aren't part of the SDK at all, and thus don't contribute to its size. I assumed this issue was more about tackling deduplication of files for the included commands and other infrastructure in the SDK, e.g. multiple copies of Roslyn including all locale resources, etc.

@richlander
Member Author

That's a good point. In terms of layering, I'd like to see a build of the SDK that only has components needed for builds, meaning not dotnet watch.

These commands have been the source of all of our false positives. They are the most likely to produce false positives in the future unless we radically change the way they are built (which may well be necessary).

dotnet/dotnet-docker#5325

@DamianEdwards
Member

While I'm sympathetic to the motivations here, extra layering in our SDK SKUs will of course increase the complexity that end users are required to understand: I have an SDK, but what kind of SDK do I have? Can I "upgrade" or "downgrade" from one SDK type to another?

I think workloads were originally partially motivated by a desire to better factor the SDK and optionally allow the acquisition of parts of it to be delayed. Perhaps some kind of "acquire on demand" capability is better suited here, and/or an expansion of the workloads feature to enable bringing in SDK commands, locales, etc.

@richlander
Member Author

We can do it just for containers (to start). I don't think users will be confused. We already have layering in place for other scenarios and people have been able to grasp it. In fact, it has been quite successful. Also, as I say, I expect that this is a 90/10 thing. 90% of users won't even notice.

Related: dotnet/dotnet-docker#4821

I think catering the SDK for the persistent installation model is a mistake. We should make it super easy for people to install fewer bits for disposable environments.

@DamianEdwards
Member

I think catering the SDK for the persistent installation model is a mistake. We should make it super easy for people to install fewer bits for disposable environments.

I agree with this, but of course trade-offs still need to be made. Perhaps a good next step would be a straw-man proposal of what to introduce that we can then evolve from, e.g. a new SDK container type, call it "SDK Slim", that contains only enough to build a .NET console app project. The following components would not be included:

  • dotnet watch
  • Web SDK
  • Razor SDK (including the Razor compiler)
  • Worker SDK
  • ASP.NET Core ref/runtime packs
  • ASP.NET Core templates

Some of these might actually already support acquire-on-demand semantics, e.g. MSBuild SDKs via the NuGet SDK resolver, and ref and runtime packs via core SDK functionality. It would be good to do some exploration here to see how close it already is.
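
To make the straw man concrete, usage of such an image could look like the sketch below (the slim tag is hypothetical and does not exist today):

```dockerfile
# Hypothetical: an "SDK Slim" image with only what a console app build needs.
# dotnet watch, the Web/Razor/Worker SDKs, and the ASP.NET Core packs/templates
# would be absent and, where supported, acquired on demand instead.
# (The :10.0-slim tag below is a placeholder, not a real tag.)
FROM mcr.microsoft.com/dotnet/sdk:10.0-slim AS build
WORKDIR /src
COPY . .
RUN dotnet publish -c Release -o /app
```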

@richlander
Member Author

On-demand semantics are a good approach. The real question is where the minimum line is. Today, you can build ASP.NET Core apps (with all versions matching) without downloading parts of the platform (runtime or tooling). I think that's very good, and most developers will want that to continue. It's also where we offer value compared to the Node.js ecosystem.

Container layers naturally cache. Downloadable content can be cached, but it requires complicated opt-in patterns to do well. When I say cache, I mean that building multiple ASP.NET Core app images within the same CI leg should share as much content as possible.

In the ideal world, we'd offer significant functionality in the base SDK image and then show folks how to opt-in to more sharing on top of that. That's what is happening here: https://github.com/dotnet/dotnet-docker/blob/7d4d56941607d8521d500be152d66bb7d9e3dbf0/samples/releasesapi/Dockerfile.ubuntu-chiseled#L10-L12. I plan to expand that pattern to other samples where there is a benefit (anything self-contained).

After writing all that (and thinking a bit more), console + web API could be a good baseline for the min SDK. It's really a question of whether we can come up with good patterns for ensuring users can cache the content they want. This includes users who want an air-gapped experience.

I've often wondered whether some users might benefit from a cached runtime pack in our images. We've never done it because it is too big. If we reduce the size of the SDK, we may be able to develop a pattern that makes caching runtime packs (while still being servicing friendly) more workable.
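
One hypothetical shape for that (just a sketch; the RID and the throwaway project are placeholders, and whether restore pulls the runtime pack this way depends on SDK version and settings):

```dockerfile
# Hypothetical sketch: warm the image's NuGet cache with a runtime pack so
# later self-contained publishes in the same CI leg avoid re-downloading it.
FROM mcr.microsoft.com/dotnet/sdk:8.0 AS warm
WORKDIR /warmup
RUN dotnet new console -o . && \
    dotnet restore -r linux-x64 /p:SelfContained=true
```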

@ViktorHofer removed their assignment Jan 13, 2025
@ericstj
Member
ericstj commented May 12, 2025

I was running an experiment and shared the results with @marcpopMSFT the other day. Running a tool that captures duplication identified by >= AssemblyVersion, == Name+Culture, and == TFM, here are the results:

===== Deduplication Summary =====
Total files analyzed: 2689
Files identified for removal: 832
Original size: 299.41 MB
Size after deduplication: 178.99 MB
Reduction: 120.42 MB (40.2%)

net10.p4.csv

@richlander
Member Author

Nicely done.

Files identified for removal: 832

Wow!

@marcpopMSFT added this to the 11.0.1xx milestone May 13, 2025