llama.cpp/examples/perplexity at BambaAbstractMemory · gabe-l-hart/llama.cpp
# Perplexity

The `perplexity` example can be used to calculate the so-called perplexity value of a language model over a given text corpus.
Perplexity measures how well the model can predict the next token, with lower values being better.
Note that perplexity is **not** directly comparable between models, especially if they use different tokenizers.
Also note that finetunes typically result in a higher perplexity value even though the human-rated quality of outputs increases.

Within llama.cpp the perplexity of base models is used primarily to judge the quality loss from e.g. quantized models vs. FP16.
The convention among contributors is to use the Wikitext-2 test set for testing unless noted otherwise (can be obtained with `scripts/get-wikitext-2.sh`).
When numbers are listed all command line arguments and compilation options are left at their defaults unless noted otherwise.
llama.cpp numbers are **not** directly comparable to those of other projects because the exact values depend strongly on the implementation details.

By default only the mean perplexity value and the corresponding uncertainty is calculated.
The uncertainty is determined empirically by assuming a Gaussian distribution of the "correct" logits per token and then applying error propagation.

More statistics can be obtained by recording the logits from the FP16 version of a model.
To do this, supply `perplexity` with `--kl-divergence-base path/to/logit/binary/file.kld`.
The program will then record all logits and save them to the provided path in binary format.
**The logit file will be very large, 11 GiB for LLaMA 2 or 37 GiB for LLaMA 3 when using the Wikitext-2 test set.**
Once you have the file, supply `perplexity` with the quantized model, the logits file via `--kl-divergence-base`,
and finally the `--kl-divergence` argument to indicate that the program should calculate the so-called Kullback-Leibler divergence.
This is a measure of how similar the FP16 and the quantized logit distributions are, with a value of 0 indicating that the distributions are the same.
The uncertainty on the mean KL divergence is calculated by assuming the KL divergence per token follows a Gaussian distribution.

In addition to the KL divergence the following statistics are calculated with `--kl-divergence`:

* Ratio of mean FP16 PPL and quantized PPL. Uncertainty is estimated on logits, then propagated. The logarithm of this metric is also calculated and printed; it is 0 if the logit distributions are the same.
* Difference of mean FP16 PPL and quantized PPL. Uncertainty is estimated on logits, then propagated.
* Mean change in "correct" token probability. Positive values mean the model gets better at prediction, negative values mean it gets worse.
* Pearson correlation coefficient of the "correct" token probabilities between models.
* Percentiles of change in "correct" token probability. Positive values mean the model gets better at prediction, negative values mean it gets worse. Can be used to judge noise vs. quality loss from quantization. If the percentiles are symmetric then the quantization is essentially just adding noise. If the negative values are significantly larger than the positive values then this indicates that the model is actually becoming worse from the quantization.
* The root mean square of the change in token probabilities. If you were to assume that the quantization simply causes Gaussian noise on the token probabilities then this would be the standard deviation of said noise. The uncertainty on the value is calculated by assuming that the change in token probabilities follows a Gaussian distribution. Related discussion: [ggml-org#2875](https://github.com/ggml-org/llama.cpp/discussions/2875).
* Same top p: Percentage of how often the token was assigned the highest probability by both models. The uncertainty is calculated from the Gaussian approximation of the binomial distribution.
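As a worked illustration of the headline numbers (a minimal Python sketch under the assumptions above, not the actual implementation in `perplexity.cpp`): the mean perplexity is the exponential of the mean negative log-likelihood of the "correct" tokens, and its uncertainty is the standard error of that mean propagated through the exponential.

```python
import math

def perplexity(logprobs):
    """Mean PPL and its uncertainty from per-token natural-log
    probabilities of the "correct" tokens (needs at least two tokens).

    The uncertainty assumes the per-token negative log-likelihoods are
    Gaussian-distributed and propagates the standard error of their
    mean through exp(x), whose derivative is exp(x) itself."""
    n = len(logprobs)
    nll = [-lp for lp in logprobs]
    mean_nll = sum(nll) / n
    # sample variance and standard error of the mean NLL
    var = sum((x - mean_nll) ** 2 for x in nll) / (n - 1)
    sem = math.sqrt(var / n)
    ppl = math.exp(mean_nll)
    return ppl, ppl * sem
```

For example, a model that assigns probability 0.5 to every correct token has a perplexity of exactly 2 with zero spread.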
## LLaMA 3 8b Scoreboard

| Revision | f364eb6f |
|:---------|:---------|
| Backend  | CUDA |
| CPU      | AMD Epyc 7742 |
| GPU      | 1x NVIDIA RTX 4090 |

Results were generated using the CUDA backend and are sorted by Kullback-Leibler divergence relative to FP16.
The "WT" importance matrices were created using varying numbers of Wikitext tokens and can be found [here](https://huggingface.co/JohannesGaessler/llama.cpp_importance_matrices/blob/main/imatrix-llama_3-8b-f16-2.7m_tokens.dat).
Note: the FP16 logits used for the calculation of all metrics other than perplexity are stored in a binary file between runs.
In order to save space this file does **not** contain the exact same FP32 logits but instead casts them to 16 bit unsigned integers (with some scaling).
So the "f16" results are to be understood as the difference resulting only from this downcast.

| Quantization | imatrix | Model size [GiB] | PPL | ΔPPL | KLD | Mean Δp | RMS Δp |
|--------------|---------|------------------|-----|------|-----|---------|--------|
| f16 | None | 14.97 | 6.233160 ± 0.037828 | 0.001524 ± 0.000755 | 0.000551 ± 0.000002 | 0.001 ± 0.002 % | 0.787 ± 0.004 % |
| q8_0 | None | 7.96 | 6.234284 ± 0.037878 | 0.002650 ± 0.001006 | 0.001355 ± 0.000006 | -0.019 ± 0.003 % | 1.198 ± 0.007 % |
| q6_K | None | 6.14 | 6.253382 ± 0.038078 | 0.021748 ± 0.001852 | 0.005452 ± 0.000035 | -0.007 ± 0.006 % | 2.295 ± 0.019 % |
| q5_K_M | None | 5.33 | 6.288607 ± 0.038338 | 0.056974 ± 0.002598 | 0.010762 ± 0.000079 | -0.114 ± 0.008 % | 3.160 ± 0.031 % |
| q5_K_S | None | 5.21 | 6.336598 ± 0.038755 | 0.104964 ± 0.003331 | 0.016595 ± 0.000122 | -0.223 ± 0.010 % | 3.918 ± 0.036 % |
| q5_1 | None | 5.65 | 6.337857 ± 0.038677 | 0.106223 ± 0.003476 | 0.018045 ± 0.000139 | -0.287 ± 0.011 % | 4.123 ± 0.039 % |
| q5_0 | None | 5.21 | 6.363224 ± 0.038861 | 0.131591 ± 0.003894 | 0.022239 ± 0.000166 | -0.416 ± 0.012 % | 4.634 ± 0.043 % |
| q4_K_M | WT 10m | 4.58 | 6.382937 ± 0.039055 | 0.151303 ± 0.004429 | 0.028152 ± 0.000240 | -0.389 ± 0.014 % | 5.251 ± 0.049 % |
| q4_K_M | None | 4.58 | 6.407115 ± 0.039119 | 0.175482 ± 0.004620 | 0.031273 ± 0.000238 | -0.596 ± 0.014 % | 5.519 ± 0.050 % |
| q4_K_S | WT 10m | 4.37 | 6.409697 ± 0.039189 | 0.178064 ± 0.004744 | 0.031951 ± 0.000259 | -0.531 ± 0.015 % | 5.645 ± 0.051 % |
| iq4_NL | WT 10m | 4.35 | 6.455593 ± 0.039630 | 0.223959 ± 0.005201 | 0.035742 ± 0.000288 | -0.590 ± 0.016 % | 5.998 ± 0.054 % |
| iq4_XS | WT 10m | 4.14 | 6.459705 ± 0.039595 | 0.228071 ± 0.005207 | 0.036334 ± 0.000284 | -0.668 ± 0.016 % | 6.044 ± 0.054 % |
| q4_K_S | None | 4.37 | 6.500529 ± 0.039778 | 0.268895 ± 0.005638 | 0.043136 ± 0.000314 | -0.927 ± 0.017 % | 6.562 ± 0.055 % |
| q4_1 | None | 4.78 | 6.682737 ± 0.041285 | 0.451103 ± 0.008030 | 0.071683 ± 0.000505 | -0.927 ± 0.017 % | 8.512 ± 0.063 % |
| q4_0 | None | 4.34 | 6.700147 ± 0.041226 | 0.468514 ± 0.007951 | 0.071940 ± 0.000491 | -1.588 ± 0.022 % | 8.434 ± 0.061 % |
| q3_K_L | WT 10m | 4.03 | 6.671223 ± 0.041427 | 0.439590 ± 0.008154 | 0.073077 ± 0.000529 | -0.940 ± 0.023 % | 8.662 ± 0.064 % |
| q3_K_M | WT 10m | 3.74 | 6.734255 ± 0.041838 | 0.502622 ± 0.008901 | 0.084358 ± 0.000588 | -1.198 ± 0.024 % | 9.292 ± 0.065 % |
| q3_K_L | None | 4.03 | 6.787876 ± 0.042104 | 0.556242 ± 0.009171 | 0.087176 ± 0.000614 | -1.532 ± 0.025 % | 9.432 ± 0.067 % |
| q3_K_M | None | 3.74 | 6.888498 ± 0.042669 | 0.656864 ± 0.010071 | 0.101913 ± 0.000677 | -1.990 ± 0.026 % | 10.203 ± 0.068 % |
| iq3_M | WT 10m | 3.53 | 6.898327 ± 0.041643 | 0.666694 ± 0.009449 | 0.102534 ± 0.000663 | -3.178 ± 0.026 % | 10.513 ± 0.066 % |
| iq3_S | WT 10m | 3.42 | 6.965501 ± 0.042406 | 0.733867 ± 0.010245 | 0.111278 ± 0.000710 | -3.066 ± 0.027 % | 10.845 ± 0.068 % |
| iq3_XS | WT 10m | 3.28 | 7.163043 ± 0.043772 | 0.931409 ± 0.012084 | 0.138693 ± 0.000857 | -3.667 ± 0.031 % | 12.148 ± 0.070 % |
| iq3_XXS | WT 10m | 3.05 | 7.458436 ± 0.046404 | 1.226803 ± 0.015234 | 0.183625 ± 0.001042 | -3.918 ± 0.035 % | 13.836 ± 0.074 % |
| q3_K_S | WT 10m | 3.41 | 7.602878 ± 0.046848 | 1.371244 ± 0.015688 | 0.199821 ± 0.001008 | -5.046 ± 0.037 % | 14.980 ± 0.070 % |
| q3_K_S | None | 3.41 | 7.863786 ± 0.048885 | 1.632152 ± 0.017733 | 0.228217 ± 0.001079 | -5.604 ± 0.038 % | 15.541 ± 0.070 % |
| iq2_M | WT 10m | 2.74 | 8.600799 ± 0.055124 | 2.369166 ± 0.025244 | 0.325989 ± 0.00160 | -6.463 ± 0.046 % | 18.519 ± 0.080 % |
| q2_K | WT 10k | 2.96 | 8.652290 ± 0.055572 | 2.420657 ± 0.025587 | 0.331393 ± 0.001562 | -6.606 ± 0.046 % | 18.790 ± 0.078 % |
| q2_K | WT 100k | 2.96 | 8.641993 ± 0.055406 | 2.410359 ± 0.025495 | 0.331672 ± 0.001569 | -6.628 ± 0.047 % | 18.856 ± 0.078 % |
| q2_K | WT 10m | 2.96 | 8.647825 ± 0.055610 | 2.416191 ± 0.025683 | 0.332223 ± 0.001572 | -6.500 ± 0.047 % | 18.881 ± 0.078 % |
| q2_K | WT 1m | 2.96 | 8.674365 ± 0.055743 | 2.442732 ± 0.025843 | 0.335308 ± 0.001576 | -6.634 ± 0.047 % | 19.009 ± 0.079 % |
| q2_K | WT 1k | 2.96 | 8.682605 ± 0.055916 | 2.450972 ± 0.026069 | 0.337093 ± 0.001596 | -6.596 ± 0.047 % | 18.977 ± 0.079 % |
| q2_K_S | WT 10m | 2.96 | 9.323778 ± 0.061551 | 3.092145 ± 0.031914 | 0.403360 ± 0.001787 | -7.131 ± 0.049 % | 20.050 ± 0.081 % |
| q2_K_S | WT 1m | 2.96 | 9.329321 ± 0.061378 | 3.097688 ± 0.031816 | 0.403590 ± 0.001797 | -7.289 ± 0.049 % | 20.123 ± 0.081 % |
| q2_K_S | WT 100k | 2.96 | 9.362973 ± 0.061740 | 3.131339 ± 0.032169 | 0.408367 ± 0.001802 | -7.198 ± 0.050 % | 20.132 ± 0.081 % |
| q2_K_S | WT 10k | 2.96 | 9.376479 ± 0.062045 | 3.144846 ± 0.032464 | 0.408662 ± 0.001819 | -7.141 ± 0.050 % | 20.120 ± 0.081 % |
| q2_K_S | WT 1k | 2.96 | 9.415200 ± 0.062475 | 3.183567 ± 0.032993 | 0.415865 ± 0.001846 | -7.153 ± 0.050 % | 20.311 ± 0.082 % |
| iq2_S | WT 10m | 2.56 | 9.650781 ± 0.063209 | 3.419148 ± 0.034017 | 0.439197 ± 0.001976 | -8.319 ± 0.052 % | 21.491 ± 0.083 % |
| q2_K | None | 2.96 | 9.751568 ± 0.063312 | 3.519934 ± 0.033863 | 0.445132 ± 0.001835 | -9.123 ± 0.051 % | 21.421 ± 0.079 % |
| iq2_XS | WT 10m | 2.43 | 10.761424 ± 0.071056 | 4.529791 ± 0.042229 | 0.546290 ± 0.002133 | -10.576 ± 0.056 % | 23.872 ± 0.082 % |
| iq2_XXS | WT 10m | 2.24 | 14.091782 ± 0.098396 | 7.860148 ± 0.070752 | 0.812022 ± 0.002741 | -14.363 ± 0.065 % | 28.576 ± 0.084 % |
| iq1_M | WT 10m | 2.01 | 25.493722 ± 0.177903 | 19.262089 ± 0.152396 | 1.393084 ± 0.003529 | -24.672 ± 0.077 % | 38.287 ± 0.084 % |
| iq1_S | WT 1m | 1.88 | 58.097760 ± 0.438604 | 51.866126 ± 0.416604 | 2.211278 ± 0.004688 | -32.471 ± 0.087 % | 46.418 ± 0.085 % |
| iq1_S | WT 1k | 1.88 | 58.267851 ± 0.446208 | 52.036218 ± 0.424373 | 2.214858 ± 0.004778 | -31.880 ± 0.089 % | 46.330 ± 0.086 % |
| iq1_S | WT 100k | 1.88 | 58.581498 ± 0.453145 | 52.349864 ± 0.431360 | 2.220834 ± 0.004818 | -32.261 ± 0.089 % | 46.002 ± 0.086 % |
| iq1_S | WT 10m | 1.88 | 60.694593 ± 0.471290 | 54.462959 ± 0.449644 | 2.254554 ± 0.004868 | -31.973 ± 0.088 % | 46.271 ± 0.086 % |
| iq1_S | WT 10k | 1.88 | 63.221324 ± 0.493077 | 56.989691 ± 0.471423 | 2.293527 ± 0.004885 | -32.261 ± 0.089 % | 46.562 ± 0.086 % |

There seems to be no consistent improvement from using more Wikitext tokens for the importance matrix.
K-quants score better on mean Δp relative to the legacy quants than e.g. the KL divergence would suggest.
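For reference, the KLD column is the mean over token positions of the Kullback-Leibler divergence between the FP16 and quantized next-token distributions. A minimal sketch of the per-position quantity (illustrative only, not the code used to generate the tables):

```python
import math

def softmax(logits):
    """Convert raw logits to a probability distribution (max-shifted
    for numerical stability)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def kl_divergence(logits_f16, logits_quant):
    """KL(P || Q) for one token position, where P is the FP16 model's
    next-token distribution and Q the quantized model's. Zero iff the
    two distributions are identical, positive otherwise."""
    p = softmax(logits_f16)
    q = softmax(logits_quant)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
```

The table's KLD value would then be the mean of `kl_divergence` over all evaluated token positions.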
## LLaMA 2 vs. LLaMA 3 Quantization comparison

| Revision | f364eb6f |
|:---------|:---------|
| Backend  | CUDA |
| CPU      | AMD Epyc 7742 |
| GPU      | 1x NVIDIA RTX 4090 |

| Metric | L2 7b q2_K | L3 8b q2_K | L2 7b q4_K_M | L3 8b q4_K_M | L2 7b q6_K | L3 8b q6_K | L2 7b q8_0 | L3 8b q8_0 |
|--------|------------|------------|--------------|--------------|------------|------------|------------|------------|
| Mean PPL | 5.794552 ± 0.032298 | 9.751568 ± 0.063312 | 5.877078 ± 0.032781 | 6.407115 ± 0.039119 | 5.808494 ± 0.032425 | 6.253382 ± 0.038078 | 5.798542 ± 0.032366 | 6.234284 ± 0.037878 |
| Mean PPL ratio | 1.107955 ± 0.001427 | 1.564849 ± 0.004525 | 1.014242 ± 0.000432 | 1.028160 ± 0.000723 | 1.002406 ± 0.000191 | 1.003490 ± 0.000296 | 1.000689 ± 0.000107 | 1.000425 ± 0.000161 |
| Mean ΔPPL | 0.625552 ± 0.008725 | 3.519934 ± 0.033863 | 0.082526 ± 0.002530 | 0.175482 ± 0.004620 | 0.013941 ± 0.001110 | 0.021748 ± 0.001852 | 0.003990 ± 0.000624 | 0.002650 ± 0.001006 |
| PPL correlation | 97.36% | 89.62% | 99.71% | 99.34% | 99.94% | 99.88% | 99.98% | 99.96% |
| Mean KLD | 0.108903 ± 0.000645 | 0.445132 ± 0.001835 | 0.012686 ± 0.000079 | 0.031273 ± 0.000238 | 0.002098 ± 0.000014 | 0.005452 ± 0.000035 | 0.000369 ± 0.000007 | 0.001355 ± 0.000006 |
| Mean Δp | -2.710 ± 0.023 % | -9.123 ± 0.051 % | -0.416 ± 0.008 % | -0.596 ± 0.014 % | -0.035 ± 0.003 % | -0.007 ± 0.006 % | -0.005 ± 0.002 % | -0.019 ± 0.003 % |
| Maximum Δp | 85.136% | 94.268% | 45.209% | 95.054% | 23.593% | 53.601% | 43.925% | 28.734% |
| 99.9% Δp | 37.184% | 50.003% | 17.461% | 27.084% | 7.798% | 13.613% | 3.387% | 6.402% |
| 99.0% Δp | 18.131% | 25.875% | 7.798% | 12.084% | 3.838% | 6.407% | 1.867% | 3.544% |
| Median Δp | -0.391% | -2.476% | -0.026% | -0.024% | -0.001% | 0.000% | -0.000% | -0.000% |
| 1.0% Δp | -39.762% | -87.173% | -11.433% | -19.567% | -4.222% | -6.767% | -1.862% | -3.698% |
| 0.1% Δp | -79.002% | -98.897% | -26.433% | -56.054% | -9.091% | -16.584% | -3.252% | -6.579% |
Δp\u003c/td\u003e\n\u003ctd\u003e-99.915%\u003c/td\u003e\n\u003ctd\u003e-99.965%\u003c/td\u003e\n\u003ctd\u003e-83.383%\u003c/td\u003e\n\u003ctd\u003e-98.699%\u003c/td\u003e\n\u003ctd\u003e-43.142%\u003c/td\u003e\n\u003ctd\u003e-68.487%\u003c/td\u003e\n\u003ctd\u003e-9.343%\u003c/td\u003e\n\u003ctd\u003e-24.301%\u003c/td\u003e\n\u003c/tr\u003e\n\u003ctr\u003e\n\u003ctd\u003eRMS Δp\u003c/td\u003e\n\u003ctd\u003e9.762 ± 0.053 %\u003c/td\u003e\n\u003ctd\u003e21.421 ± 0.079 %\u003c/td\u003e\n\u003ctd\u003e3.252 ± 0.024 %\u003c/td\u003e\n\u003ctd\u003e5.519 ± 0.050 %\u003c/td\u003e\n\u003ctd\u003e1.339 ± 0.010 %\u003c/td\u003e\n\u003ctd\u003e2.295 ± 0.019 %\u003c/td\u003e\n\u003ctd\u003e0.618 ± 0.011 %\u003c/td\u003e\n\u003ctd\u003e1.198 ± 0.007 %\u003c/td\u003e\n\u003c/tr\u003e\n\u003ctr\u003e\n\u003ctd\u003eSame top p\u003c/td\u003e\n\u003ctd\u003e85.584 ± 0.086 %\u003c/td\u003e\n\u003ctd\u003e71.138 ± 0.119 %\u003c/td\u003e\n\u003ctd\u003e94.665 ± 0.055 %\u003c/td\u003e\n\u003ctd\u003e91.901 ± 0.072 %\u003c/td\u003e\n\u003ctd\u003e97.520 ± 0.038 %\u003c/td\u003e\n\u003ctd\u003e96.031 ± 0.051 %\u003c/td\u003e\n\u003ctd\u003e98.846 ± 0.026 %\u003c/td\u003e\n\u003ctd\u003e97.674 ± 0.040 %\u003c/td\u003e\n\u003c/tr\u003e\n\u003c/tbody\u003e\n\u003c/table\u003e\u003c/markdown-accessiblity-table\u003e\n\u003cdiv class=\"markdown-heading\" dir=\"auto\"\u003e\u003ch2 tabindex=\"-1\" class=\"heading-element\" dir=\"auto\"\u003eLLaMA 3 BF16 vs. FP16 comparison\u003c/h2\u003e\u003ca id=\"user-content-llama-3-bf16-vs-fp16-comparison\" class=\"anchor\" aria-label=\"Permalink: LLaMA 3 BF16 vs. 
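The PPL, KLD, and Δp columns above are all derived from per-token probabilities. A minimal sketch of the underlying definitions, using made-up illustrative probabilities (this is not the tool's actual implementation, which works on full logit dumps):

```python
import math

def perplexity(correct_token_probs):
    """PPL = exp(-mean log p(correct token)) over the evaluated tokens."""
    n = len(correct_token_probs)
    return math.exp(-sum(math.log(p) for p in correct_token_probs) / n)

def kl_divergence(p_base, p_quant):
    """KLD = sum_i p_base[i] * log(p_base[i] / p_quant[i]) over the vocab
    (assumes p_quant[i] > 0 wherever p_base[i] > 0)."""
    return sum(p * math.log(p / q) for p, q in zip(p_base, p_quant) if p > 0)

# Hypothetical probabilities each model assigns to the correct next token:
base_probs  = [0.81, 0.45, 0.90, 0.62]   # base (e.g. fp16) model
quant_probs = [0.78, 0.41, 0.89, 0.60]   # quantized model

ppl_base  = perplexity(base_probs)
ppl_quant = perplexity(quant_probs)
# "Δp" per token: change in probability of the correct token, in percent
delta_p   = [100 * (q - p) for p, q in zip(base_probs, quant_probs)]

print(f"PPL(base)  = {ppl_base:.4f}")
print(f"PPL(quant) = {ppl_quant:.4f}")
print(f"PPL ratio  = {ppl_quant / ppl_base:.4f}")
print(f"mean Δp    = {sum(delta_p) / len(delta_p):.2f} %")
```

A negative mean Δp, as in most columns of the table, simply means the quantized model tends to put less probability mass on the correct token than the base model does.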
FP16 comparison\" href=\"#llama-3-bf16-vs-fp16-comparison\"\u003e\u003csvg class=\"octicon octicon-link\" viewBox=\"0 0 16 16\" version=\"1.1\" width=\"16\" height=\"16\" aria-hidden=\"true\"\u003e\u003cpath d=\"m7.775 3.275 1.25-1.25a3.5 3.5 0 1 1 4.95 4.95l-2.5 2.5a3.5 3.5 0 0 1-4.95 0 .751.751 0 0 1 .018-1.042.751.751 0 0 1 1.042-.018 1.998 1.998 0 0 0 2.83 0l2.5-2.5a2.002 2.002 0 0 0-2.83-2.83l-1.25 1.25a.751.751 0 0 1-1.042-.018.751.751 0 0 1-.018-1.042Zm-4.69 9.64a1.998 1.998 0 0 0 2.83 0l1.25-1.25a.751.751 0 0 1 1.042.018.751.751 0 0 1 .018 1.042l-1.25 1.25a3.5 3.5 0 1 1-4.95-4.95l2.5-2.5a3.5 3.5 0 0 1 4.95 0 .751.751 0 0 1-.018 1.042.751.751 0 0 1-1.042.018 1.998 1.998 0 0 0-2.83 0l-2.5 2.5a1.998 1.998 0 0 0 0 2.83Z\"\u003e\u003c/path\u003e\u003c/svg\u003e\u003c/a\u003e\u003c/div\u003e\n\u003cmarkdown-accessiblity-table\u003e\u003ctable\u003e\n\u003cthead\u003e\n\u003ctr\u003e\n\u003cth align=\"left\"\u003eRevision\u003c/th\u003e\n\u003cth align=\"left\"\u003e83330d8c\u003c/th\u003e\n\u003c/tr\u003e\n\u003c/thead\u003e\n\u003ctbody\u003e\n\u003ctr\u003e\n\u003ctd align=\"left\"\u003eBackend\u003c/td\u003e\n\u003ctd align=\"left\"\u003eCPU\u003c/td\u003e\n\u003c/tr\u003e\n\u003ctr\u003e\n\u003ctd align=\"left\"\u003eCPU\u003c/td\u003e\n\u003ctd align=\"left\"\u003eAMD Epyc 7742\u003c/td\u003e\n\u003c/tr\u003e\n\u003ctr\u003e\n\u003ctd align=\"left\"\u003eGPU\u003c/td\u003e\n\u003ctd align=\"left\"\u003eN/A\u003c/td\u003e\n\u003c/tr\u003e\n\u003c/tbody\u003e\n\u003c/table\u003e\u003c/markdown-accessiblity-table\u003e\n\u003cp dir=\"auto\"\u003eResults were calculated with LLaMA 3 8b BF16 as \u003ccode\u003e--kl-divergence-base\u003c/code\u003e and LLaMA 3 8b FP16 as the \u003ccode\u003e--model\u003c/code\u003e for 
comparison.\u003c/p\u003e\n\u003cmarkdown-accessiblity-table\u003e\u003ctable\u003e\n\u003cthead\u003e\n\u003ctr\u003e\n\u003cth\u003eMetric\u003c/th\u003e\n\u003cth\u003eValue\u003c/th\u003e\n\u003c/tr\u003e\n\u003c/thead\u003e\n\u003ctbody\u003e\n\u003ctr\u003e\n\u003ctd\u003eMean PPL(Q)\u003c/td\u003e\n\u003ctd\u003e6.227711 ± 0.037833\u003c/td\u003e\n\u003c/tr\u003e\n\u003ctr\u003e\n\u003ctd\u003eMean PPL(base)\u003c/td\u003e\n\u003ctd\u003e6.225194 ± 0.037771\u003c/td\u003e\n\u003c/tr\u003e\n\u003ctr\u003e\n\u003ctd\u003eCor(ln(PPL(Q)), ln(PPL(base)))\u003c/td\u003e\n\u003ctd\u003e99.990%\u003c/td\u003e\n\u003c/tr\u003e\n\u003ctr\u003e\n\u003ctd\u003eMean ln(PPL(Q)/PPL(base))\u003c/td\u003e\n\u003ctd\u003e0.000404 ± 0.000086\u003c/td\u003e\n\u003c/tr\u003e\n\u003ctr\u003e\n\u003ctd\u003eMean PPL(Q)/PPL(base)\u003c/td\u003e\n\u003ctd\u003e1.000404 ± 0.000086\u003c/td\u003e\n\u003c/tr\u003e\n\u003ctr\u003e\n\u003ctd\u003eMean PPL(Q)-PPL(base)\u003c/td\u003e\n\u003ctd\u003e0.002517 ± 0.000536\u003c/td\u003e\n\u003c/tr\u003e\n\u003ctr\u003e\n\u003ctd\u003eMean KLD\u003c/td\u003e\n\u003ctd\u003e0.00002515 ± 0.00000020\u003c/td\u003e\n\u003c/tr\u003e\n\u003ctr\u003e\n\u003ctd\u003eMaximum KLD\u003c/td\u003e\n\u003ctd\u003e0.012206\u003c/td\u003e\n\u003c/tr\u003e\n\u003ctr\u003e\n\u003ctd\u003e99.9% KLD\u003c/td\u003e\n\u003ctd\u003e0.000799\u003c/td\u003e\n\u003c/tr\u003e\n\u003ctr\u003e\n\u003ctd\u003e99.0% KLD\u003c/td\u003e\n\u003ctd\u003e0.000222\u003c/td\u003e\n\u003c/tr\u003e\n\u003ctr\u003e\n\u003ctd\u003e99.0% KLD\u003c/td\u003e\n\u003ctd\u003e0.000222\u003c/td\u003e\n\u003c/tr\u003e\n\u003ctr\u003e\n\u003ctd\u003eMedian KLD\u003c/td\u003e\n\u003ctd\u003e0.000013\u003c/td\u003e\n\u003c/tr\u003e\n\u003ctr\u003e\n\u003ctd\u003e10.0% KLD\u003c/td\u003e\n\u003ctd\u003e-0.000002\u003c/td\u003e\n\u003c/tr\u003e\n\u003ctr\u003e\n\u003ctd\u003e5.0% KLD\u003c/td\u003e\n\u003ctd\u003e-0.000008\u003c/td\u003e\n\u003c/tr\u003e\n\u003ctr\u003e\n\u003ctd\u003e1.0% 
KLD\u003c/td\u003e\n\u003ctd\u003e-0.000023\u003c/td\u003e\n\u003c/tr\u003e\n\u003ctr\u003e\n\u003ctd\u003eMinimum KLD\u003c/td\u003e\n\u003ctd\u003e-0.000059\u003c/td\u003e\n\u003c/tr\u003e\n\u003ctr\u003e\n\u003ctd\u003eMean Δp\u003c/td\u003e\n\u003ctd\u003e-0.0000745 ± 0.0003952 %\u003c/td\u003e\n\u003c/tr\u003e\n\u003ctr\u003e\n\u003ctd\u003eMaximum Δp\u003c/td\u003e\n\u003ctd\u003e4.186%\u003c/td\u003e\n\u003c/tr\u003e\n\u003ctr\u003e\n\u003ctd\u003e99.9% Δp\u003c/td\u003e\n\u003ctd\u003e1.049%\u003c/td\u003e\n\u003c/tr\u003e\n\u003ctr\u003e\n\u003ctd\u003e99.0% Δp\u003c/td\u003e\n\u003ctd\u003e0.439%\u003c/td\u003e\n\u003c/tr\u003e\n\u003ctr\u003e\n\u003ctd\u003e95.0% Δp\u003c/td\u003e\n\u003ctd\u003e0.207%\u003c/td\u003e\n\u003c/tr\u003e\n\u003ctr\u003e\n\u003ctd\u003e90.0% Δp\u003c/td\u003e\n\u003ctd\u003e0.125%\u003c/td\u003e\n\u003c/tr\u003e\n\u003ctr\u003e\n\u003ctd\u003e75.0% Δp\u003c/td\u003e\n\u003ctd\u003e0.029%\u003c/td\u003e\n\u003c/tr\u003e\n\u003ctr\u003e\n\u003ctd\u003eMedian Δp\u003c/td\u003e\n\u003ctd\u003e0.000%\u003c/td\u003e\n\u003c/tr\u003e\n\u003ctr\u003e\n\u003ctd\u003e25.0% Δp\u003c/td\u003e\n\u003ctd\u003e-0.030%\u003c/td\u003e\n\u003c/tr\u003e\n\u003ctr\u003e\n\u003ctd\u003e10.0% Δp\u003c/td\u003e\n\u003ctd\u003e-0.126%\u003c/td\u003e\n\u003c/tr\u003e\n\u003ctr\u003e\n\u003ctd\u003e5.0% Δp\u003c/td\u003e\n\u003ctd\u003e-0.207%\u003c/td\u003e\n\u003c/tr\u003e\n\u003ctr\u003e\n\u003ctd\u003e1.0% Δp\u003c/td\u003e\n\u003ctd\u003e-0.434%\u003c/td\u003e\n\u003c/tr\u003e\n\u003ctr\u003e\n\u003ctd\u003e0.1% Δp\u003c/td\u003e\n\u003ctd\u003e-1.016%\u003c/td\u003e\n\u003c/tr\u003e\n\u003ctr\u003e\n\u003ctd\u003eMinimum Δp\u003c/td\u003e\n\u003ctd\u003e-4.672%\u003c/td\u003e\n\u003c/tr\u003e\n\u003ctr\u003e\n\u003ctd\u003eRMS Δp\u003c/td\u003e\n\u003ctd\u003e0.150 ± 0.001 %\u003c/td\u003e\n\u003c/tr\u003e\n\u003ctr\u003e\n\u003ctd\u003eSame top p\u003c/td\u003e\n\u003ctd\u003e99.739 ± 0.013 
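The PPL rows in the BF16 vs. FP16 table are internally consistent, which is a quick way to spot transcription errors. A sanity check on the reported values (note the tool averages per-chunk, so these identities only hold approximately):

```python
import math

# Values copied from the table above.
ppl_q, ppl_base = 6.227711, 6.225194   # Mean PPL(Q), Mean PPL(base)

ln_ratio = math.log(ppl_q / ppl_base)  # compare with the table's 0.000404
diff     = ppl_q - ppl_base            # compare with the table's 0.002517
ratio    = math.exp(0.000404)          # compare with the table's 1.000404

print(f"ln(PPL(Q)/PPL(base)) = {ln_ratio:.6f}")
print(f"PPL(Q) - PPL(base)   = {diff:.6f}")
print(f"exp(0.000404)        = {ratio:.6f}")
```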
## Old Numbers

<details>
<summary>Llama 2 70B Scoreboard</summary>

| Quantization | Model size (GiB) | Perplexity | Delta to fp16 |
|---|---|---|---|
| Q4_0 | 36.20 | 3.5550 | 3.61% |
| Q4_1 | 40.20 | 3.5125 | 2.37% |
| Q5_0 | 44.20 | 3.4744 | 1.26% |
| Q2_K | 27.27 | 3.7339 | 8.82% |
| Q3_K_S | 27.86 | 3.7019 | 7.89% |
| Q3_K_M | 30.83 | 3.5932 | 4.72% |
| Q3_K_L | 33.67 | 3.5617 | 3.80% |
| Q4_K_S | 36.39 | 3.4852 | 1.57% |
| Q4_K_M | 38.54 | 3.4725 | 1.20% |
| Q5_K_S | 44.20 | 3.4483 | 0.50% |
| Q5_K_M | 45.41 | 3.4451 | 0.40% |
| Q6_K | 52.70 | 3.4367 | 0.16% |
| fp16 | 128.5 | 3.4313 | - |

</details>
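The "Delta to fp16" column in the scoreboard appears to be the relative perplexity increase over the fp16 baseline, i.e. `100 * (PPL(quant) / PPL(fp16) - 1)`. A quick check against a few rows (values copied from the table):

```python
# Reproduce the "Delta to fp16" column from the Perplexity column.
ppl_fp16 = 3.4313

for name, ppl in [("Q4_0", 3.5550), ("Q4_K_M", 3.4725), ("Q6_K", 3.4367)]:
    delta = 100 * (ppl / ppl_fp16 - 1)
    print(f"{name}: {delta:.2f}%")
```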