Pharo v. Cloudflare ==> CloudflareUn

In my pursuit to connect Pharo to the realtime order book feed of the Bittrex cryptocurrency exchange, there are two main challenges:

  1. It uses Microsoft’s SignalR protocol.
  2. The site is guarded by Cloudflare, which requires a JavaScript puzzle to be solved.

Here I deal with passing Pharo through a Cloudflare guard. So let’s get started…

On the wire

  1. First we should review how Bittrex libraries for other languages do it. Follow the installation instructions for python-bittrex-websocket and then also clone the repo to get its examples.
    $ git clone
    $ cd python-bittrex-websocket/bittrex_websocket/example
    $ python
  2. By default this library runs over HTTPS, which impedes our ability to peek at it, so we need to hack it to use HTTP instead. To discover which file to modify, change the bottom of the example script as follows…
     $ vi
            import inspect
            if __name__ == "__main__":

    Now when you run it, the first line displayed is the file to modify. Edit it to change every “https” to “http”…

    $ python
    <module 'bittrex_websocket.websocket_client' from 
        '/home/ben/.local/lib/python2.7/site-packages/bittrex_websocket/websocket_client.pyc' >
    $ vi /home/ben/.local/lib/python2.7/site-packages/bittrex_websocket/
                   urls = ['',
  3. To get a clean view of what’s happening on the wire, it helps to filter for the Bittrex IP address…
     $ ping
    ==> PING ( 

    I’ve observed the Bittrex IP addresses bounce around within a subnet so we’ll use that for our filter.

  4. Install Wireshark and go to “Capture > Capture filters…” to pre-define a capture filter. Click the plus and enter the bottom line shown here…
  5. Activate that capture filter by clicking the circular icon (fourth from left) to select your network interface (here wlp2s0) and click on the “…using this filter” tag (yellow or green) to choose your pre-defined “bittrex” filter from the list. Then click the Start icon. Note, nothing appears until the next step.
  6. In the “Apply a display filter…” box, enter “http || websocket”, then at the shell do… 
    $ python

    and you should see something like…

    [Figure: Wireshark capture]

The initial request is shown in packet #4, with its response in packet #14 setting a cookie __cfduid and supplying the puzzle to solve. The GET query string (below), once decoded, looks like connectionData=[{"name": "coreHub"}]&clientProtocol=1.5.

GET /signalr/negotiate?connectionData=%5B%7B%22name%22%3A+%22coreHub%22%7D%5D&clientProtocol=1.5 HTTP/1.1\r\n 
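To check that decoding without doing it by eye, here is a minimal nodejs sketch. It only assumes the query string exactly as captured in the request line above; the variable names are mine.

```javascript
// Query string copied from the negotiate request captured above.
const query = "connectionData=%5B%7B%22name%22%3A+%22coreHub%22%7D%5D&clientProtocol=1.5";

// URLSearchParams applies form decoding: %XX escapes are resolved
// and "+" is turned back into a space.
const params = new URLSearchParams(query);
const connectionData = params.get("connectionData");
const clientProtocol = params.get("clientProtocol");

console.log(connectionData);                      // [{"name": "coreHub"}]
console.log(clientProtocol);                      // 1.5

// The decoded connectionData is itself JSON: a list of hubs to subscribe to.
console.log(JSON.parse(connectionData)[0].name);  // coreHub
```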

HTTP/1.1 503 Service Temporarily Unavailable
Set-Cookie: __cfduid=df17ba99a887664411404c9f88347504f1517897211; expires=Wed, 06-Feb-19 06:06:51 GMT; path=/;; HttpOnly
Server: cloudflare
CF-RAY: 3e8bed0546114d2e-PER

<form id="challenge-form" action="/cdn-cgi/l/chk_jschl" method="get">
<input name="jschl_vc" type="hidden" value="2227c799ed495508dbc259c1fb59bc97" />
<input name="pass" type="hidden" value="1517897215.726-F3N87MTQOm" />
<input id="jschl-answer" name="jschl_answer" type="hidden" />

<script type="text/javascript">
  var a = function() {try{return !!window.addEventListener} catch(e) {return !1} },
  b = function(b, c) {a() ? document.addEventListener("DOMContentLoaded", b, c) : document.attachEvent("onreadystatechange", b)};
    var a = document.getElementById('cf-content'); a.style.display = 'block';
      var s,t,o,p,b,r,e,a,k,i,n,g,f, rxJOrIr={"jdOijG":+((!+[]+!![]+!![]+[])+(!+[]+!![]+!![]+!![]+!![]+!![]+!![]+!![]))};
      t = document.createElement('div');
      t.innerHTML="<a href='/'>x</a>";
      t = t.firstChild.href;r = t.match(/https?:\/\//)[0];
      t = t.substr(r.length); t = t.substr(0,t.length-1);
      a = document.getElementById('jschl-answer');
      f = document.getElementById('challenge-form');
     [truncated]        ;rxJOrIr.jdOijG*=+((+!![]+[])+(+[]));rxJOrIr.jdOijG-=+((+!![]+[])+(+!![]));rxJOrIr.jdOijG-=+((!+[]+!![]+!![]+[])+(!+[]+!![]+!![]));rxJOrIr.jdOijG*=+!![];rxJOrIr.jdOijG+=+((!+[]+!![]+[])+(!+[]+!![]+!![]+!![]+!![]+!![]));
      f.action += location.hash;
      }, 4000);
    }, false);
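The odd-looking arithmetic in that script is plain JavaScript built only from [], ! and +, leaning on type coercion. As a rough sketch, the fragments that survive in the capture can be evaluated directly in nodejs. The names seed and visibleOperands are mine, and because the capture line is marked “[truncated]”, this does not reproduce the full calculation of the final answer.

```javascript
// !+[] and !![] both evaluate to true, true + true is 2, and adding []
// turns the running number into a string so that digits concatenate.
const three = !+[] + !![] + !![];                 // 3
const digits = (!+[] + !![] + !![] + []) +
  (!+[] + !![] + !![] + !![] + !![] + !![] + !![] + !![]);  // "3" + 8 gives "38"

// The seed value assigned to rxJOrIr.jdOijG in the captured script.
const seed = +((!+[]+!![]+!![]+[])+(!+[]+!![]+!![]+!![]+!![]+!![]+!![]+!![]));

// Operands of the update statements visible after the "[truncated]" marker.
const visibleOperands = [
  +((+!![]+[])+(+[])),
  +((+!![]+[])+(+!![])),
  +((!+[]+!![]+!![]+[])+(!+[]+!![]+!![])),
  +!![],
  +((!+[]+!![]+[])+(!+[]+!![]+!![]+!![]+!![]+!![])),
];

console.log(three, digits, seed);   // 3 '38' 38
console.log(visibleOperands);       // [ 10, 11, 33, 1, 26 ]
```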

The first two fields of the HTML form are pre-seeded with values that (5 seconds later) are carried through to the next request in packet #22, with the third field “jschl_answer” calculated by the JavaScript. The response packet #24 sets a second cookie, cf_clearance, and redirects back to the original URL…

GET /cdn-cgi/l/chk_jschl?jschl_answer=381&jschl_vc=2227c799ed495508dbc259c1fb59bc97&pass=1517897215.726-F3N87MTQOm HTTP/1.1\r\n

Set-Cookie: cf_clearance=097eaac668c0eb0db0a3cceb265e6ef7ea0a384c-1517897216-10800; path=/; expires=Tue, 06-Feb-18 10:06:56 GMT;; HttpOnly
Server: cloudflare-nginx
CF-RAY: 3e8bed2597f14d40-PER
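To make the link between the form and that follow-up request concrete, here is a small nodejs sketch that rebuilds the packet #22 request path from the captured form. The helper hiddenValue is hypothetical, the regex only suits this fixed markup (it is not a general HTML parser), and the answer 381 is simply the value seen on the wire rather than something computed here.

```javascript
// Excerpt of the challenge form from the 503 response captured earlier.
const html = `
<form id="challenge-form" action="/cdn-cgi/l/chk_jschl" method="get">
<input name="jschl_vc" type="hidden" value="2227c799ed495508dbc259c1fb59bc97" />
<input name="pass" type="hidden" value="1517897215.726-F3N87MTQOm" />
<input id="jschl-answer" name="jschl_answer" type="hidden" />
`;

// Hypothetical helper: read the value attribute of a hidden input by name.
function hiddenValue(page, name) {
  const match = page.match(new RegExp('name="' + name + '"[^>]*value="([^"]*)"'));
  return match ? match[1] : null;
}

// The form's action attribute is the path of the follow-up request.
const action = html.match(/<form[^>]*action="([^"]*)"/)[1];

const params = new URLSearchParams();
params.set("jschl_answer", "381");                    // value observed in packet #22
params.set("jschl_vc", hiddenValue(html, "jschl_vc"));
params.set("pass", hiddenValue(html, "pass"));

const followUp = action + "?" + params.toString();
console.log(followUp);
// /cdn-cgi/l/chk_jschl?jschl_answer=381&jschl_vc=2227c799ed495508dbc259c1fb59bc97&pass=1517897215.726-F3N87MTQOm
```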

As redirected, packet #26 requests the original URI again, but now including the two cookies. The response packet #29 is the SignalR protocol indicating that it’s ready to try connecting with websockets…

GET /signalr/negotiate?connectionData=%5B%7B%22name%22%3A+%22coreHub%22%7D%5D&clientProtocol=1.5 HTTP/1.1

HTTP/1.1 200 OK  (application/json)
{  "Url":"/signalr",

So if Pharo can somehow obtain those two cookies to enable it to receive a response like packet #29, then we’ll have successfully navigated through Cloudflare.
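For reference, this is roughly what “including two cookies” amounts to. The sketch below takes the two Set-Cookie values from the capture and reduces them to the single Cookie request header a client sends back. The function name cookieHeader is mine, and real cookie handling (expiry, domain and path matching) is deliberately ignored.

```javascript
// The two Set-Cookie values captured above, attributes included.
const setCookies = [
  "__cfduid=df17ba99a887664411404c9f88347504f1517897211; expires=Wed, 06-Feb-19 06:06:51 GMT; path=/;; HttpOnly",
  "cf_clearance=097eaac668c0eb0db0a3cceb265e6ef7ea0a384c-1517897216-10800; path=/; expires=Tue, 06-Feb-18 10:06:56 GMT;; HttpOnly",
];

// Only the leading name=value pair is sent back to the server; attributes
// such as expires and path are instructions for the client.
function cookieHeader(lines) {
  return lines.map((line) => line.split(";")[0].trim()).join("; ");
}

const cookie = cookieHeader(setCookies);
console.log("Cookie: " + cookie);
```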


Cloudflare provides DDoS protection against massive bot attacks, but obviously Bittrex doesn’t mind reasonably behaved programs connecting to it. After all, they provide an API for this. But it’s a hurdle we need to get over.

The dependencies of python-bittrex-websocket include Cloudflare-scrape, which in turn depends on nodejs to evaluate the JavaScript puzzle. But Pharo calling a Python library calling a JavaScript library seems a bit fragile. Also, now that we understand what is happening on the wire, there is no need to muck around in Pharo parsing the web page to extract the JavaScript challenge and pass it to nodejs. Instead we should just use a nodejs library that does the whole thing and returns the keys we need. For this, cloudscraper looks like a reasonable candidate. Let’s start by trialling it from the shell. As a first-time nodejs user, I needed to start with something super simple to check that nodejs was installed…

nodejs -e "console.log(17+25)" 

==> 42

Cool, it’s ready to go. After some playing around I found the following provides a concise list of the headers we need…

$ npm install cloudscraper
$ nodejs -e \
   '  var cloudscraper = require("cloudscraper"); 
      cloudscraper.get("", 
         function(error, response, body) 
           {console.log(response.connection._httpMessage._header);}); '

_header: 'GET / HTTP/1.1
User-Agent: Ubuntu Chromium/34.0.1847.116 Chrome/34.0.1847.116 Safari/537.36
cookie: __cfduid=d9d16a4714d2db938df32a7e50d1f24001517999469;
Connection: close

Note that the User-Agent is important: “You must use the same user-agent string for obtaining tokens and for making requests with those tokens, otherwise Cloudflare will flag you as a bot.”

To invoke that from Pharo we’ll use OSProcess (since it seems to have better cross-platform support than OSSubProcess). You can load it from the Pharo Catalog.

I haven’t used OSProcess before, so let’s try the simplest thing first…

(PipeableOSProcess command: 'echo hi there') output inspect

==> hi there

Yep! That works fine. Let’s try some simple nodejs…

(PipeableOSProcess command: 'nodejs -e "console.log(17+25)" ') output inspect

==> 42

Cool! Now let’s shoot for what we really need…

headers := (PipeableOSProcess waitForCommand: 
    'nodejs -e ''var cloudscraper = require("cloudscraper"); 
        cloudscraper.get("", function(error, response, body) 
        {console.log(body, response); }); '' | grep "_header:" '  ) output.
headers inspect.

_header: 'GET / HTTP/1.1\r\n
User-Agent: Ubuntu Chromium/34.0.1847.116 Chrome/34.0.1847.116 Safari/537.36\r\n
cookie: __cfduid=dacadf9197092e503974603ad61e934401518009298; 
Connection: close\r\n\r\n',

Woo hoo! Now, continuing in the Playground, let’s extract our magic pass…

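"In the Pharo Regex package, subexpression: 1 is the whole match, so index 3 is the second parenthesised group (the value)."
re := '.*(__cfduid=)([^;]*).*' asRegex.
re matchesPrefix: headers.
cfduid := re subexpression: 3. 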

re := '.*(cf_clearance=)([^\\]*).*' asRegex.
re matchesPrefix: headers.
cf_clearance := re subexpression: 3. 

re := '.*(User-Agent\: )([^\\]*).*' asRegex.
re matchesPrefix: headers.
userAgent := re subexpression: 3. 

{cfduid . cf_clearance . userAgent} inspect.
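An alternative I can imagine is doing the same extraction on the nodejs side, so that only the three values ever reach Pharo. The following is just a sketch under assumptions: extractTokens is a hypothetical helper, the sample header mimics the shape printed above, and the cf_clearance value is borrowed from the earlier Wireshark capture purely for illustration. Note that inside nodejs the header holds real CR/LF characters, not the escaped \r\n text that the inspected output shows.

```javascript
// Sample header text in the shape printed above (cf_clearance value borrowed
// from the earlier capture, for illustration only).
const header =
  "GET / HTTP/1.1\r\n" +
  "User-Agent: Ubuntu Chromium/34.0.1847.116 Chrome/34.0.1847.116 Safari/537.36\r\n" +
  "cookie: __cfduid=dacadf9197092e503974603ad61e934401518009298; " +
  "cf_clearance=097eaac668c0eb0db0a3cceb265e6ef7ea0a384c-1517897216-10800\r\n" +
  "Connection: close\r\n\r\n";

// Hypothetical helper: pull out the three values the Pharo client needs.
function extractTokens(text) {
  const pick = (re) => {
    const m = text.match(re);
    return m ? m[1] : null;
  };
  return {
    cfduid: pick(/__cfduid=([^;\r\n]*)/),
    cfClearance: pick(/cf_clearance=([^;\r\n]*)/),
    userAgent: pick(/^User-Agent: ([^\r\n]*)/m),
  };
}

const tokens = extractTokens(header);
console.log(JSON.stringify(tokens, null, 2));
```

Printing the result as JSON would leave Pharo with a single small document to parse instead of three regex passes over raw header text.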

And let’s put that to use…

client := ZnClient new url: ''.
jar := client session cookieJar.
jar add: ((ZnCookie name: '__cfduid' value: cfduid) domain: '').
jar add: ((ZnCookie name: 'cf_clearance' value: cf_clearance) domain: '').
client headerAt: 'User-Agent' put: userAgent.
(response := client get) inspect.

    <title> - Bittrex, The Next Generation Digital Currency Exchange...

YES!!!! (*arms punch the sky*)


Well, I feel that was a worthwhile journey to properly understand how Cloudflare works. In the end it’s almost too simple to warrant a separate package, but to pull it all together I’ve uploaded a minimal package, CloudflareUn, that can be used like this…

cloudflareun := (CloudflareUn knockUrl: '').
response := (cloudflareun client url: '') get. 
response inspect. 

    <title> - Bittrex, The Next Generation Digital Currency Exchange...

Your feedback and enhancements will be appreciated.
