forked from jehna/humanify

Deobfuscate Javascript code using ChatGPT


igorleonir/humanify

Humanify

Deobfuscate JavaScript code using LLMs ("AI")

This tool uses large language models (like ChatGPT and llama2) together with other tools to deobfuscate, unminify, transpile, decompile and unpack JavaScript code. Note that the LLMs don't perform any structural changes; they only provide hints for renaming variables and functions. The heavy lifting is done by Babel at the AST level to ensure the output stays 1-1 equivalent to the input.

➡️ Check out the introduction blog post for in-depth explanation!

Example

Given the following minified code:

function a(e,t){var n=[];var r=e.length;var i=0;for(;i<r;i+=t){if(i+t<r){n.push(e.substring(i,i+t))}else{n.push(e.substring(i,r))}}return n}

The tool will output a human-readable version:

function splitString(inputString, chunkSize) {
  var chunks = [];
  var stringLength = inputString.length;
  var startIndex = 0;
  for (; startIndex < stringLength; startIndex += chunkSize) {
    if (startIndex + chunkSize < stringLength) {
      chunks.push(inputString.substring(startIndex, startIndex + chunkSize));
    } else {
      chunks.push(inputString.substring(startIndex, stringLength));
    }
  }
  return chunks;
}
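
As a quick sanity check of the 1-1 equivalence claim, the two versions above can be run side by side. The following is a minimal sketch, not part of the tool: the helper name sameOutput and the sample inputs are made up for illustration.

```javascript
// Minified input, copied from the example above.
function a(e,t){var n=[];var r=e.length;var i=0;for(;i<r;i+=t){if(i+t<r){n.push(e.substring(i,i+t))}else{n.push(e.substring(i,r))}}return n}

// Human-readable output, copied from the example above.
function splitString(inputString, chunkSize) {
  var chunks = [];
  var stringLength = inputString.length;
  var startIndex = 0;
  for (; startIndex < stringLength; startIndex += chunkSize) {
    if (startIndex + chunkSize < stringLength) {
      chunks.push(inputString.substring(startIndex, startIndex + chunkSize));
    } else {
      chunks.push(inputString.substring(startIndex, stringLength));
    }
  }
  return chunks;
}

// Hypothetical helper: true when both versions return the same chunks.
// chunkSize must be positive, otherwise neither loop terminates.
function sameOutput(text, chunkSize) {
  return JSON.stringify(a(text, chunkSize)) === JSON.stringify(splitString(text, chunkSize));
}

// Illustrative inputs only; they are not taken from the project.
const samples = [["", 3], ["abc", 1], ["abcdef", 4], ["hello world", 5]];
const allEqual = samples.every(([text, size]) => sameOutput(text, size));
console.log(allEqual ? "outputs match" : "outputs differ");
```

Since only identifiers were renamed, every sample should produce identical chunks from both functions.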

🚨 NOTE: 🚨

Large files may take some time to process and use a lot of tokens if you use ChatGPT. For a rough estimate, the tool takes about 2 tokens per character to process a file:

echo "$((2 * $(wc -c < yourscript.min.js)))"

So for reference: a minified bootstrap.min.js would cost about $0.50 to un-minify using ChatGPT.
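
The same estimate can be sketched in JavaScript if you also want a rough dollar figure. The function names and the usdPer1kTokens parameter are assumptions for illustration; the actual price depends on the model and current OpenAI pricing, neither of which is stated here.

```javascript
const fs = require("node:fs");

// Rough ratio quoted above: about 2 tokens per character of input.
const TOKENS_PER_CHARACTER = 2;

// Hypothetical helper: estimated token count for a file of the given size.
function estimateTokens(byteCount) {
  return TOKENS_PER_CHARACTER * byteCount;
}

// Hypothetical helper: estimated cost for a price per 1000 tokens.
// usdPer1kTokens is an assumed input, not a value provided by the tool.
function estimateCostUsd(byteCount, usdPer1kTokens) {
  return (estimateTokens(byteCount) / 1000) * usdPer1kTokens;
}

// Like `wc -c`, this uses the file size in bytes as the character count.
function estimateFile(path, usdPer1kTokens) {
  const { size } = fs.statSync(path);
  return {
    tokens: estimateTokens(size),
    costUsd: estimateCostUsd(size, usdPer1kTokens),
  };
}
```

For example, estimateFile("yourscript.min.js", price) returns both figures once you fill in the per-1000-token price for the model you plan to use.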

Using the --local flag is of course free, but it may take more time, be less accurate, and may not be possible on your existing hardware.

Getting started

First install the dependencies:

npm install

Next you'll need to decide whether to use ChatGPT or llama2. In a nutshell:

  • ChatGPT
    • Runs on someone else's computer that's specifically optimized for this kind of thing
    • Costs money depending on the length of your code
    • Is more accurate
    • Is (probably) faster
  • llama2
    • Runs locally
    • Is free
    • Is less accurate
    • Needs a local GPU with ~60 GB of RAM (M1 Mac works just fine)
    • Runs as fast as your GPU does

See instructions below for each option:

ChatGPT

You'll need an OpenAI API key to use ChatGPT. You can get one by signing up at https://openai.com/.

There are several ways to provide the API key to the tool:

echo "OPENAI_TOKEN=your-token" > .env && npm start -- -o deobfuscated.js obfuscated-file.js
export OPENAI_TOKEN="your-token" && npm start -- -o deobfuscated.js obfuscated-file.js
OPENAI_TOKEN=your-token npm start -- -o deobfuscated.js obfuscated-file.js
npm start -- --key="your-token" -o deobfuscated.js obfuscated-file.js

Use your preferred way to provide the API key. Use npm start -- --help to see all available options.
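
To illustrate how a key could be picked up from these sources, here is a minimal sketch. It is not the tool's actual implementation: the function name resolveApiKey and the precedence (the --key flag first, then the OPENAI_TOKEN environment variable) are assumptions. Reading the .env file is left out; the sketch assumes its contents have already been loaded into the environment.

```javascript
// Hypothetical helper, not taken from the project's source.
// Assumed precedence: --key=<token> on the command line, then OPENAI_TOKEN.
function resolveApiKey(argv, env) {
  const flag = argv.find((arg) => arg.startsWith("--key="));
  if (flag !== undefined) {
    return flag.slice("--key=".length);
  }
  if (env.OPENAI_TOKEN) {
    return env.OPENAI_TOKEN;
  }
  throw new Error("No API key found: pass --key or set OPENAI_TOKEN");
}
```

A caller would typically pass process.argv.slice(2) and process.env.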

llama2

Prerequisites:

  • You'll need to have a Python 3 environment with conda installed.
  • You need a Hugging Face account with access to the llama-2-7b-chat-hf model. Make sure to read the instructions on the model page about how to access the model.

Run the following command to install the required Python packages and activate the environment:

conda env create -f environment.yaml
conda activate humanify

You can now run the tool with:

npm start -- --local -o deobfuscated.js obfuscated-file.js

Note: this downloads ~13 GB of model data to your computer on the first run.

Features

The main features of the tool are:

  • Uses ChatGPT functions/llama2 to get smart suggestions for renaming variables and functions
  • Uses custom and off-the-shelf Babel plugins to perform AST-level unmangling
  • Uses Webcrack to unbundle Webpack bundles
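
To make the split between "LLM suggests names" and "a code transform applies them" concrete, here is a toy sketch. It deliberately does not use Babel: it swaps in a naive regex-based rename that ignores scopes, string literals and comments, so it is only safe on simple inputs like the README example. The rename hints are hypothetical, written by hand to match the example output, and applyRenameHints is an illustrative name.

```javascript
// Hypothetical rename hints, shaped like what an LLM might suggest.
const renameHints = {
  a: "splitString",
  e: "inputString",
  t: "chunkSize",
  n: "chunks",
  r: "stringLength",
  i: "startIndex",
};

// Toy stand-in for an AST-level rename. It replaces whole identifiers that
// are not property accesses (not preceded by "." or an identifier character).
// Unlike a scope-aware AST transform, it does not understand shadowing,
// strings or comments.
function applyRenameHints(source, hints) {
  return source.replace(/(?<![.\w$])[A-Za-z_$][\w$]*/g, (name) =>
    Object.prototype.hasOwnProperty.call(hints, name) ? hints[name] : name
  );
}

const minified =
  "function a(e,t){var n=[];var r=e.length;var i=0;for(;i<r;i+=t){if(i+t<r){n.push(e.substring(i,i+t))}else{n.push(e.substring(i,r))}}return n}";

const renamed = applyRenameHints(minified, renameHints);
console.log(renamed);
```

Only the identifiers change; the statements and their order stay the same. Doing the rename on the AST instead of on raw text is what keeps this property for code that uses shadowed names, string literals or comments.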

Contributing

If you'd like to contribute, please fork the repository and use a feature branch. Pull requests are warmly welcome.

Licensing

The code in this project is licensed under the MIT license.

