ShapeSecurity's JavaScript VM: Part 2
Intro
In the previous part, we talked about the VM internals, specifically the VM Machinery. In this second part we are going to talk about the VM Data and how it works together with the VM Machinery.
VM Data
The VM Data contains 11 data objects that hold, among other things, the configuration for starting new threads, the mapping that decides which op to execute next, the bytecode, and the op functions.
You can think of the VM Data parts as the things that hold mutable and immutable data, control-flow logic and the ops used by ShapeSecurity's VM. Once again, the names that I gave to these 11 variables do not always reflect their actual meaning; the names were kept for legacy purposes.
1. XOR_MAP
Every single string that is constructed inside ShapeSecurity's VM is stored in this map. The strings are decrypted using an xor byte inside ShapeSecurity's VM. Every time the VM needs to construct a string, it does so by xoring two strings from the array OBFUSCATED using the getXorValue() function.
The getXorValue function was replaced from this:
b1: {
var K = G;
var P = K + "," + E;
var s = p[P];
if (typeof s !== "undefined") {
var y = s;
break b1
}
var S = i[E];
var w = qp(S);
var X = qp(K);
var c = w[0] + X[0] & 255;
var J = "";
for (var b = 1; b < w.length; ++b) {
J += e(X[b] ^ w[b] ^ c)
}
var y = p[P] = J
}
var o = q.A.length;
into this:
function getXorValue(xorMap, obfuscatedStrings, strA, strB) {
const fullKeyName = strA + ',' + strB;
let value = xorMap[fullKeyName];
if (typeof value !== 'undefined') {
return value;
}
const strBDecoded = base64Decoder(obfuscatedStrings[strB]);
const strADecoded = base64Decoder(strA);
const thirdXor = strBDecoded[0] + strADecoded[0] & 255;
value = '';
for (let i = 1; i < strBDecoded.length; ++i) {
value += String.fromCharCode(strADecoded[i] ^ strBDecoded[i] ^ thirdXor);
}
xorMap[fullKeyName] = value;
return value;
}
Whenever getXorValue() is called inside ShapeSecurity's VM, XOR_MAP is passed as the first parameter, OBFUSCATED (an array containing only strings) as the second parameter, and strA and strB as the third and fourth parameters.
strA is the actual raw string from one of the items of OBFUSCATED, while strB is not the raw string of another item but the index where that string lies inside OBFUSCATED.
The fullKeyName is a combined key name built from strA + ',' + strB, where strB is the index number. It is used as a cache key to avoid wasting time xoring the same pair twice.
The xoring mechanism is relatively straightforward:
- Convert strA and strB to bytes by base64-decoding them.
- Create an array of bytes named strADecoded from the string strA.
- Create an array of bytes named strBDecoded from the string found at index strB of OBFUSCATED.
- Create a third xor value, thirdXor, by adding the first byte of strADecoded to the first byte of strBDecoded and applying & 255 to the result.
- Create a string named value to store the xored text.
- Start from index 1 and iterate through all the bytes of the array strBDecoded.
- For each iteration, use the index to access one byte from strADecoded and one byte from strBDecoded, then xor them together with the thirdXor value.
- Convert the result of the previous step to a character and concatenate it onto value.
- At the end of the loop, store the final value in XOR_MAP under fullKeyName. This avoids double computation when xoring the same strA and strB together.
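To make the steps above concrete, here is a small round-trip sketch. getXorValue() is copied from the converted version shown earlier; everything else is my own scaffolding and not part of ShapeSecurity's VM. I am assuming base64Decoder() is a plain base64-to-bytes decoder, and makePair() and bytesToBase64() are hypothetical helpers that build an input pair which decodes to a chosen string.

```javascript
// Assumption: base64Decoder turns a base64 string into an array of byte values.
function base64Decoder(str) {
  const binary = typeof atob === 'function'
    ? atob(str)
    : Buffer.from(str, 'base64').toString('binary');
  return Array.from(binary, (ch) => ch.charCodeAt(0));
}

// Same logic as the converted getXorValue() shown earlier.
function getXorValue(xorMap, obfuscatedStrings, strA, strB) {
  const fullKeyName = strA + ',' + strB;
  let value = xorMap[fullKeyName];
  if (typeof value !== 'undefined') {
    return value; // cache hit: this pair was already xored
  }
  const strBDecoded = base64Decoder(obfuscatedStrings[strB]);
  const strADecoded = base64Decoder(strA);
  const thirdXor = strBDecoded[0] + strADecoded[0] & 255;
  value = '';
  for (let i = 1; i < strBDecoded.length; ++i) {
    value += String.fromCharCode(strADecoded[i] ^ strBDecoded[i] ^ thirdXor);
  }
  xorMap[fullKeyName] = value;
  return value;
}

// Hypothetical helper: encode an array of bytes as base64.
function bytesToBase64(bytes) {
  const binary = String.fromCharCode(...bytes);
  return typeof btoa === 'function'
    ? btoa(binary)
    : Buffer.from(binary, 'binary').toString('base64');
}

// Hypothetical inverse of getXorValue(): builds a pair that decodes to `plain`.
// keyBytes needs plain.length + 1 entries; byte 0 only feeds thirdXor.
function makePair(plain, keyBytes) {
  const other = [0x2a]; // arbitrary first byte
  const thirdXor = (keyBytes[0] + other[0]) & 255;
  for (let i = 0; i < plain.length; ++i) {
    other.push(plain.charCodeAt(i) ^ keyBytes[i + 1] ^ thirdXor);
  }
  return { strA: bytesToBase64(keyBytes), encodedB: bytesToBase64(other) };
}

const pair = makePair('navigator', [7, 11, 13, 17, 19, 23, 29, 31, 37, 41]);
const XOR_MAP = {};
const OBFUSCATED = [pair.encodedB]; // strB will be the index 0
const decoded = getXorValue(XOR_MAP, OBFUSCATED, pair.strA, 0);
console.log(decoded); // "navigator"
```

After the call, XOR_MAP holds the decoded string under the key strA + ',0', so a second call with the same pair returns straight from the cache.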
2. OPS_FUNCTIONS
OPS_FUNCTIONS was originally an array of op functions, but in the pretty version it was converted to an object with each key corresponding to the index in the original array. It would have been too difficult to visually find each op in an array of hundreds of ops when working along ShapeSecurity's VM. For this reason alone it was converted into an object instead of an array.
The functions inside OPS_FUNCTIONS were cleaned up to follow a simple convention: 1 statement (ReturnStatement, ExpressionStatement, IfStatement, etc.) represents 1 action. For the most part, the statements found inside each individual op already contained 1 action per statement, with the exception of a few:
throwIfTypeError()
//ORIGINAL
if (!(c in I)) {
throw new qd(c + " is not defined.")
}
//CONVERTED
function throwIfTypeError(_$A) {
if (!(_$A in window)) {
throw new ReferenceError(_$A + " is not defined.");
}
}
throwIfIsNotAnObject()
//ORIGINAL
if (w.A[w.A.length - 1] === null || w.A[w.A.length - 1] === void 0) {
throw new qJ(w.A[w.A.length - 1] + " is not an object")
}
//CONVERTED
function throwIfIsNotAnObject(_$A) {
if (_$A === null || _$A === void 0) {
throw new TypeError(_$A + " is not an object");
}
}
forInFunc()
//ORIGINAL
var q = [];
for (var s in w.A[w.A.length - 1]) {
f(q, s)
}
//CONVERTED
function forInFunc(stackValue) {
var arr = [];
for (var i in stackValue) {
arr.push(i);
}
return arr;
}
pushWasExceptionHandled()
//ORIGINAL
var q = w.M.N();
var s = {
d: false,
Q: w.f,
r: w.r
};
w.x.q(s);
w.f = q.W;
w.r = q.r
//CONVERTED
function pushWasExceptionHandled(_vmContext) {
var errors = _vmContext.errors.pop();
var errorObj = {
wasExceptionHandled: false,
_errorOryIndex: _vmContext.yIndex,
_xIndex: _vmContext.xIndex
};
_vmContext.errorTracker.push(errorObj);
_vmContext.yIndex = errors._yIndex;
_vmContext.xIndex = errors._xIndex;
}
errorTrackerPopWithThrow()
//ORIGINAL
var q = w.x.N();
if (q.d) {
throw q.Q
}
w.f = q.Q;
w.r = q.r
//CONVERTED
function errorTrackerPopWithThrow(_vmContext) {
var errorTrack = _vmContext.errorTracker.pop();
if (errorTrack.wasExceptionHandled) {
throw errorTrack._errorOryIndex;
}
_vmContext.yIndex = errorTrack._errorOryIndex;
_vmContext.xIndex = errorTrack._xIndex;
}
The only argument that each op inside OPS_FUNCTIONS takes is an instance of a vmContext(). Each op function uses the vmContext() instance to do one or more of these actions:
1. Accesses items from the HEAP, OBFUSCATED, ARRAY_OF_NUMBERS and/or ARRAY_OF_MAP_FILTER_FOR_EACH
2. Increments the yIndex value by the number of uniquely referenced yIndex instances
3. Modifies the stack array by adding and/or removing items from it
4. Writes new keys into _vmMemory and/or reads keys from _vmMemory by using the getKey() and setKey() methods
5. Makes conditional jumps by setting yIndex and xIndex inside the consequent side of an IfStatement
6. Creates new threads by calling createThread()
7. Creates a new string using getXorValue()
8. Starts a new "try catch mode" and/or ends a current "try catch mode"
9. Throws an exception
10. Defines a new value on an object and/or array by using Object.defineProperty
11. Sets a new xyIndex from the current yIndex and xIndex values and/or sets yIndex to xyIndex.yIndex and xIndex to xyIndex.xIndex
12. Ends the thread execution by setting the returnValue to something other than explicitReturn
13. Decreases the stack length
We can break these actions into three groups:
- Actions 1 thru 2 are always done first
- Actions 3 thru 12 are done second
- Action 13 is done at the end
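As a rough illustration of that ordering, here is a made-up op that touches all three groups. Nothing in it is taken from the real ops: exampleOp(), the toy HEAP and OBFUSCATED contents, and the shape of _vmMemory are all assumptions for the sake of the sketch.

```javascript
// Toy data, invented for this sketch.
const HEAP = [2, 0];                 // two operand bytes for the op below
const OBFUSCATED = ['a', 'b', 'c'];

// Hypothetical vmContext with only the fields this sketch needs.
const vmContext = {
  yIndex: 0,
  stack: [],
  _vmMemory: {
    keys: {},
    getKey(key) { return this.keys[key]; },
    setKey(key, value) { this.keys[key] = value; },
  },
};

// Hypothetical op, not one of the real ops.
function exampleOp(ctx) {
  // Actions 1 thru 2: read operands from the HEAP into temporaries, then
  // advance yIndex once per byte that was referenced.
  const _$A = HEAP[ctx.yIndex];
  const _$B = HEAP[ctx.yIndex + 1];
  ctx.yIndex += 2;

  // Actions 3 thru 12: the actual work (here a stack push and a memory write).
  ctx.stack.push(OBFUSCATED[_$A]);
  ctx._vmMemory.setKey(_$B, ctx.stack[ctx.stack.length - 1]);

  // Action 13: shrink the stack once the work is done.
  ctx.stack.length -= 1;
}

exampleOp(vmContext);
console.log(vmContext.yIndex, vmContext._vmMemory.getKey(0)); // 2 "c"
```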
It is important to understand that not all ops contain all of these types of actions. Some ops only contain actions 3 thru 12, some only contain actions 1 thru 2, and some contain all of them in a single op function.
Note: There are too many permutations of function ops to identify, but there is a limited number of lines that are used inside each function op.
In a later part, we will talk about ShapeSecurity's VM ops in more detail, since they require a full part to fully explain their inner workings.
3. OPS_SEQUENCE
ShapeSecurity's VM uses a dynamic mapping for determining two things:
- The function op to execute, based on its index in OPS_FUNCTIONS
- The xIndex value for the next op (unless it is set inside the op by some action)
OPS_SEQUENCE is a two-dimensional array: you can think of the first dimension as rows and the second dimension as columns.
When the method next() is called on a vmContext instance, it does 4 things:
- Uses the current xIndex value as the first-dimension index into OPS_SEQUENCE
- Uses the yIndex value to read a byte (0-255) from the HEAP array as the second-dimension index, incrementing yIndex afterwards
- Sets the next xIndex value to the first element of the returned array
- Returns the second element of the returned array, which corresponds to the index of the op in OPS_FUNCTIONS
Object.defineProperty(L, "next", {
value: function () {
{
var w = OPS_SEQUENCE[this.xIndex][HEAP[this.yIndex++]];
this.xIndex = w[0];
return w[1];
}
}
});
Once next() has set it, the xIndex value is not changed again unless a specific action inside the executed op sets xIndex to another value. This only occurs in the specific actions that are used to make a conditional jump.
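Here is a toy run of that lookup. The next() body mirrors the snippet above, but the HEAP bytes and the OPS_SEQUENCE table are invented for this sketch (the real ones are far larger), and the returned numbers are just placeholder op indexes.

```javascript
// Toy bytecode and table, invented for this sketch.
const HEAP = [1, 1, 0];
const OPS_SEQUENCE = [
  // row 0 (xIndex === 0): the column is the bytecode byte
  [[0, 10], [1, 11]],
  // row 1 (xIndex === 1)
  [[0, 12], [1, 13]],
];

const vmContext = { xIndex: 0, yIndex: 0 };
Object.defineProperty(vmContext, 'next', {
  value: function () {
    // Row comes from xIndex, column from the next byte in the HEAP.
    const entry = OPS_SEQUENCE[this.xIndex][HEAP[this.yIndex++]];
    this.xIndex = entry[0]; // row to use for the following op
    return entry[1];        // index of the op to run now
  },
});

const trace = [];
while (vmContext.yIndex < HEAP.length) {
  trace.push(vmContext.next());
}
console.log(trace); // [11, 13, 12]
```

Note how the same byte (1) resolves to op 11 the first time and op 13 the second time, because xIndex changed in between. The byte alone does not identify the op.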
4. HEAP
In the previous part I kept calling the HEAP the bytecode. It took me a while to learn that what I was calling the HEAP truly represents the bytecode in a VM.
The HEAP is decoded into an array of bytes (using base64Decoder()), and in previous versions of ShapeSecurity's VM the HEAP remained immutable. Later on, they introduced a new action into their function ops that allows them to change any byte in the HEAP.
A lot of the configuration values that are used for setting a memory key, reading a memory key, getting the index of an item in OBFUSCATED, etc. come from the values in the HEAP. These values are first read and stored in a temporary variable (for example, _$A) for later use.
5. explicitReturn and 6. returnValue
As mentioned in the previous part, ShapeSecurity's VM runs until returnValue changes. In other words, while returnValue is set to explicitReturn, the function vmRunner() keeps running until returnValue is set to a value other than explicitReturn.
Inside the ops, returnValue is changed to 3 different types of values:
- returnEmptyObject, which is basically an empty object ({})
- The last value in the stack array
- void 0, also known as undefined
The returnValue acts as a sort of ReturnStatement from a regular function.
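A minimal sketch of that loop, under a few assumptions of mine: explicitReturn is treated as a unique sentinel object, the two ops are invented, and the program is a plain list of op indexes instead of going through next() and OPS_SEQUENCE. The real vmRunner() is more involved.

```javascript
// Assumption: explicitReturn is a unique sentinel that no real result can equal.
const explicitReturn = {};

// Hypothetical ops, keyed by index in the spirit of OPS_FUNCTIONS.
const OPS_FUNCTIONS = {
  // Keeps the thread running: increments the value on top of the stack.
  0: (ctx) => { ctx.stack.push(ctx.stack.pop() + 1); },
  // Ends the thread: "returns" the last value in the stack array.
  1: (ctx) => { ctx.returnValue = ctx.stack[ctx.stack.length - 1]; },
};

// Hypothetical runner: loops while returnValue is still the sentinel.
function vmRunner(ctx, program) {
  let position = 0;
  while (ctx.returnValue === explicitReturn) {
    OPS_FUNCTIONS[program[position++]](ctx);
  }
  return ctx.returnValue;
}

const ctx = { stack: [40], returnValue: explicitReturn };
const result = vmRunner(ctx, [0, 0, 1]);
console.log(result); // 42
```

Comparing against a sentinel instead of undefined is what lets an op legitimately "return" void 0 and still stop the loop.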
7. ARRAY_OF_NUMBERS and 8. OBFUSCATED
ARRAY_OF_NUMBERS, as the name suggests, is simply an array that holds nothing but numbers in different formats. They are used for things such as xoring bytes when encoding values, encryption numbers, and numbers used for some of their signals.
OBFUSCATED should really have been called ARRAY_OF_STRINGS, since it complements ARRAY_OF_NUMBERS and contains nothing but strings. However, when I first started labeling ShapeSecurity's VM internals I came across the getXorValue() function and noticed that this array was always used for decoding strings inside that function, and the name stuck.
Nothing is deleted or added to these arrays as they both remain immutable the whole time.
9. NATIVE_FUNCTIONS
NATIVE_FUNCTIONS is an array of multiple prototypes and functions that ShapeSecurity's VM uses to reference functions implemented in native code. ShapeSecurity's VM never adds any new items to this array and it remains immutable.
Inside one of the function ops, NATIVE_FUNCTIONS is pushed onto the stack array, and this remains the only way NATIVE_FUNCTIONS is accessed.
10. ARRAY_OF_MAP_FILTER_FOR_EACH
This array contains 3 prototypes:
- The prototype of Array.prototype.map
- The prototype of Array.prototype.filter
- The prototype of Array.prototype.forEach
The prototypes are later used to check whether the current object has those methods defined. If it doesn't, the compiled code inside ShapeSecurity's VM implements an alternative function. This is not evident until later, when going thru the disassembled code.
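I have not shown the disassembled check yet, so the following is only one plausible shape for it, written by hand. mapWithFallback() is a hypothetical name, and I am assuming the array stores references to the three native methods and that the fallback is a plain loop.

```javascript
// Assumption: the array holds references to the three native methods.
const ARRAY_OF_MAP_FILTER_FOR_EACH = [
  Array.prototype.map,
  Array.prototype.filter,
  Array.prototype.forEach,
];

// Hypothetical helper: use the native map when the object has it defined,
// otherwise fall back to a hand-written loop.
function mapWithFallback(list, fn) {
  if (list.map === ARRAY_OF_MAP_FILTER_FOR_EACH[0]) {
    return list.map(fn);
  }
  const out = [];
  for (let i = 0; i < list.length; ++i) {
    out.push(fn(list[i], i, list));
  }
  return out;
}

// A real array takes the native path.
const doubled = mapWithFallback([1, 2, 3], (n) => n * 2);
// An array-like object without map takes the fallback path.
const upper = mapWithFallback({ length: 2, 0: 'a', 1: 'b' }, (s) => s.toUpperCase());
console.log(doubled, upper);
```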
11. THREAD_CONFIG
Every time a new thread is created using createThread() inside one of the ops, the first parameter always corresponds to an index into THREAD_CONFIG. Therefore, the array THREAD_CONFIG holds all the thread configurations used to set the keys for each newly created thread.
Each item in that array is an object with at least 3 keys:
- keysFromArgs: The keys, in order, that will be set for each item in arguments
- initializeKeys: All the keys that will be set to void 0, including the keys from keysFromArgs
- transferredKeys: The keys transferred from the parent's vmMemory() instance
Additionally, two other keys are sometimes set:
- arksKey: The key that arguments is set to
- workResultKey: The key that the thread itself, aka the this value, is set to
With the exception of the first thread created, all future threads are created using the createThread() function with a value from the THREAD_CONFIG array as a configuring object.
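To show how such a configuration object could seed a new thread's memory, here is a hedged sketch. VmMemory and createThreadMemory() are hypothetical stand-ins (only getKey() and setKey() are named in this series), the numeric keys are invented, and I am assuming that transferred keys copy the parent's current values, which may not match the real behavior.

```javascript
// Hypothetical stand-in for a vmMemory instance with getKey()/setKey().
class VmMemory {
  constructor() { this.keys = new Map(); }
  getKey(key) { return this.keys.get(key); }
  setKey(key, value) { this.keys.set(key, value); }
}

// Toy configuration, invented for this sketch.
const THREAD_CONFIG = [
  {
    keysFromArgs: [1, 2],
    initializeKeys: [1, 2, 3],
    transferredKeys: [9],
    arksKey: 4,
    workResultKey: 5,
  },
];

// Hypothetical sketch of how a config entry could seed a thread's memory.
function createThreadMemory(configIndex, parentMemory, thisValue, args) {
  const config = THREAD_CONFIG[configIndex];
  const memory = new VmMemory();
  // Assumption: transferred keys copy the parent's current values.
  for (const key of config.transferredKeys) {
    memory.setKey(key, parentMemory.getKey(key));
  }
  // Every declared key starts out as void 0 ...
  for (const key of config.initializeKeys) {
    memory.setKey(key, void 0);
  }
  // ... and the argument keys are then filled in order from `arguments`.
  config.keysFromArgs.forEach((key, i) => memory.setKey(key, args[i]));
  // The two optional keys are only set when the config defines them.
  if ('arksKey' in config) memory.setKey(config.arksKey, args);
  if ('workResultKey' in config) memory.setKey(config.workResultKey, thisValue);
  return memory;
}

const parent = new VmMemory();
parent.setKey(9, 'shared');
const child = createThreadMemory(0, parent, 'self', ['a', 'b']);
console.log(child.getKey(1), child.getKey(2), child.getKey(9)); // a b shared
```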
Conclusion
We have barely scratched the surface of the inner details of ShapeSecurity's VM. In the next part we will dive into all the actions that make up the individual ops from OPS_FUNCTIONS and what kind of structures they produce. Make sure you tune in, as there will be many more parts to come after Part 3 is done.