
Java 11 HTTP Client API

One of the more noteworthy features of the new Java 11 release is the new HttpClient API within the standard library, which had been in incubator status since JDK 9. Previously, the process of sending and retrieving data over HTTP was very cumbersome in Java. Either you went through the hassle of using HttpURLConnection, or you brought in a third-party library to abstract over it, such as HttpClient from Apache. Notably, both solutions would also block threads whilst doing so.

As dealing with HTTP connections is so common these days, JDK 11 finally has a modern API which can handle these scenarios - including support for HTTP/2 (server push etc.) and WebSockets.

Create a Client

The first step to make use of the new API is to create an instance of the HttpClient class. The library itself makes heavy use of the builder pattern to specify configuration options, as is the case in most new Java libraries.

You can configure things like HTTP version support (the default is HTTP/2), whether or not to follow redirects, authentication and a proxy for all requests that pass through the client.

HttpClient client = HttpClient.newBuilder()
      .version(Version.HTTP_2)
      .followRedirects(Redirect.SAME_PROTOCOL)
      .authenticator(Authenticator.getDefault())
      .build();

The HttpClient instance is the main entry point to send and receive requests (both synchronously and asynchronously). Once created, a client is immutable and can be reused for many requests.

Create Requests

An HTTP request is represented by an instance of HttpRequest, which holds the URL, request method, headers, timeout and payload (if applicable). GET is used by default if no other method is specified.

HttpRequest request = HttpRequest.newBuilder()
               .uri(URI.create("https://something.com"))
               .build(); // GET request

The following snippet creates a POST request with a custom timeout. A BodyPublisher must be used to attach a payload to a request - in this case taking a JSON String and converting it into bytes.

HttpRequest request = HttpRequest.newBuilder()
      .uri(URI.create("https://something.com/api"))
      .timeout(Duration.ofMinutes(1))
      .header("Content-Type", "application/json")
      .POST(BodyPublishers.ofString(json))
      .build();
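
A BodyPublisher also exists for other payload types. For instance, to send the contents of a file as the request body (a sketch - the path is illustrative):

HttpRequest fileRequest = HttpRequest.newBuilder()
      .uri(URI.create("https://something.com/api"))
      .header("Content-Type", "application/json")
      .POST(BodyPublishers.ofFile(Paths.get("payload.json")))
      .build();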

Send Requests

Once an HttpRequest is created, you can send it via the HttpClient previously constructed. Both synchronous and asynchronous operations are supported:

Sync

The HttpClient.send method will perform the HTTP request synchronously - meaning that the current thread will be blocked until a response is obtained. The HttpResponse class encapsulates the response itself including status code, body and headers.

HttpResponse<String> response = client.send(request, BodyHandlers.ofString());
System.out.println(response.statusCode());
System.out.println(response.body());

When receiving responses, a BodyHandler is provided to instruct the client on how to process the response body. The BodyHandlers class includes default handlers for the most common scenarios: ofString will return the body as a String (decoded as UTF-8 by default), ofFile accepts a Path to write the response to a file, and ofByteArray will give you the raw bytes.
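
For example, to stream the response body straight to disk instead of buffering it in memory (a sketch - the target path is illustrative):

HttpResponse<Path> saved = client.send(request,
        BodyHandlers.ofFile(Paths.get("response.json")));
System.out.println("Response written to " + saved.body());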

Async

One of the best features of the API is the ability to perform completely asynchronous requests - meaning that no thread is blocked during the process. Under the hood, the implementation uses NIO and non-blocking channels to ensure no blocking IO operations are performed.

The HttpClient.sendAsync method takes the same parameters as the synchronous version, but returns a CompletableFuture<HttpResponse<T>> instead of just the raw HttpResponse<T>. Just as with any other CompletableFuture, you can chain together callbacks to be executed when the response is available. In this case, the body of the response is extracted and printed out. See the CompletableFuture docs for more details on how to work with it.

CompletableFuture<HttpResponse<String>> future = client.sendAsync(request,
        BodyHandlers.ofString());

future.thenApply(HttpResponse::body) // retrieve body of response
      .thenAccept(System.out::println); // use body as String

Each HttpClient has a single implementation-specific thread that polls all of its connections. Received data is then passed off to the executor for processing. You can override the Executor when building the HttpClient; by default a cached thread pool executor is used.
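
For example, a fixed-size pool could be supplied instead (a sketch - the pool size is arbitrary):

ExecutorService executor = Executors.newFixedThreadPool(4);

HttpClient customClient = HttpClient.newBuilder()
        .executor(executor)
        .build();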

Kotlin

As the new API sits in the standard JDK like any other, you can easily make use of it in your Kotlin projects as well! You can paste the above examples into IntelliJ to perform the automagical Java-Kotlin conversion, or the below example covers the basics:

val client = HttpClient.newBuilder().build()

val request = HttpRequest.newBuilder()
               .uri(URI.create("https://something.com"))
               .build()

val response = client.send(request, BodyHandlers.ofString())
println(response.statusCode())
println(response.body())

Coroutines (Async)

Things get much more interesting when taking into account the new asynchronous capabilities and Kotlin coroutines. It would be great if we could launch a coroutine which sends the request and suspends until the response is available:

suspend fun getData(): String {
    // above code to construct client + request
    val response = client.sendAsync(request, BodyHandlers.ofString())
    return response.await().body() // suspend and return String not a Future
}

// in some other coroutine (suspend block)
val someData = getData()
process(someData) // just as if you wrote it synchronously

No need to chain callbacks onto a CompletableFuture; you get the same procedural code flow even though the implementation suspends and is completely non-blocking.

The magic comes from the CompletionStage.await() extension function provided by the coroutines JDK integration library:

return suspendCancellableCoroutine { cont: CancellableContinuation<T> ->
    val consumer = ContinuationConsumer(cont)
    whenComplete(consumer) // attach continuation to CompletionStage
}

See the kotlinx-coroutines-jdk8 docs for the await function.

More Information

https://download.java.net/java/early_access/jdk11/docs/api/java.net.http/java/net/http/package-summary.html

http://openjdk.java.net/groups/net/httpclient/intro.html

https://www.youtube.com/watch?list=PLX8CzqL3ArzXyA_lJzaNmrFqpLOL4aCEz&v=sZSdWq490Vw


Ktor - File Upload and Download

The ability to perform file uploads and downloads is a staple part of any good web server framework. Ktor has support for both operations in just a few lines of code.

File Upload

File uploads are handled through multipart POST requests in standard HTTP - normally from form submissions where the file selector field would be just one item (another could be the title for example).
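
For reference, a typical form producing such a request might look like the following (field names are illustrative):

<form action="/upload" method="post" enctype="multipart/form-data">
    <input type="text" name="title">
    <input type="file" name="file">
    <input type="submit" value="Upload">
</form>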

To handle this in Ktor, you can get hold of the multipart data through receiveMultipart and then loop over each part as required. In the below example, we are just interested in files (PartData.FileItem), although you could also look at the individual PartData.FormItem as well (which would be the other form fields in the submission).

A Ktor FileItem exposes an InputStream via streamProvider which can be used to access the raw bytes of the file which has been uploaded. As in the below example, you can then simply create the appropriate file and copy the bytes from one stream (input) to the other (output).

post("/upload") { _ ->
    // retrieve all multipart data (suspending)
    val multipart = call.receiveMultipart()
    multipart.forEachPart { part ->
        // if part is a file (could be form item)
        if(part is PartData.FileItem) {
            // retrieve file name of upload
            val name = part.originalFileName!!
            val file = File("/uploads/$name")

            // use InputStream from part to save file
            part.streamProvider().use { its ->
                // copy the stream to the file with buffering
                file.outputStream().buffered().use {
                    // note that this is blocking
                    its.copyTo(it)
                }
            }
        }
        // make sure to dispose of the part after use to prevent leaks
        part.dispose()
    }
}

The documentation also includes a suspending copyTo method which can be used to save the upload to a file in a non-blocking way.
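
If you'd rather not pull that helper in, a minimal alternative (a sketch, assuming kotlinx.coroutines is available) is to move the blocking copy onto the dedicated IO dispatcher:

suspend fun InputStream.copyToSuspend(out: OutputStream): Long =
    withContext(Dispatchers.IO) {
        copyTo(out) // still blocking, but confined to the IO thread pool
    }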

File Download

File downloads are very straightforward in Ktor. You just have to create a handle to the specified File and use the respondFile method:

get("/{name}") {
    // get filename from request url
    val filename = call.parameters["name"]!!
    // construct reference to file
    // ideally this would use a different filename
    val file = File("/uploads/$filename")
    if(file.exists()) {
        call.respondFile(file)
    }
    else call.respond(HttpStatusCode.NotFound)
}

By default, if called from the browser, this will cause the file to be viewed inline. If you instead want to prompt the browser to download the file, you can include the Content-Disposition header:

call.response.header("Content-Disposition", "attachment; filename=\"${file.name}\"")

This is also helpful if you save the uploaded file with a different name (which is advisable), as you can override the filename with the original when it gets downloaded by users.

More information about the Content-Disposition header


Ubuntu Server Setup Part 6 - HTTPS With Let's Encrypt

HTTPS is a must have nowadays, with sites served over plain HTTP being downgraded in search results by Google and marked as insecure by browsers. The process of obtaining an SSL certificate used to be cumbersome and expensive, but now, thanks to Let’s Encrypt, it is completely free and you can automate the process with just a few commands.

This part assumes that you already have an active Nginx server running (as described in Part 4) and so will go over how to use Let’s Encrypt with Nginx. Certbot (the client software) has a number of plugins that make the process just as easy if you are running another web server such as Apache.

Prepare the Nginx Server

Make sure you have a server block where server_name is set to your domain name (in this case example.com).

This is so Certbot knows which config file to modify in order to enable HTTPS (it adds a line pointing to the generated SSL certificates).

server {
    listen 80;
    listen [::]:80;

    server_name example.com www.example.com;
    root /var/www/example;

    index index.html;
    location / {
        try_files $uri $uri/ =404;
    }
}

That’s all the preparation needed on the Nginx side. Certbot will handle everything else for us.

Install and Run Certbot

Certbot is the client software (written in Python) that is supported by Let’s Encrypt themselves to automate the whole process. There is a wide range of alternatives in various languages if you have different needs.

You should install Certbot through the dedicated PPA to make sure you always get the latest updates. In this example we install the Nginx version (which includes the Nginx plugin):

sudo add-apt-repository ppa:certbot/certbot
sudo apt-get update
sudo apt-get install -y python-certbot-nginx

Once installed, you can run Certbot. Here the --nginx flag is used to enable the Nginx plugin. Without this, Certbot would just generate a certificate and your web server wouldn’t know about it. The plugin is required to modify the Nginx configuration in order to see the certificate and enable HTTPS.

sudo certbot --nginx

It will ask for:

  • an email address (you will be emailed if there are any issues or your certs are about to expire)
  • agreeing to the Terms of Service
  • which domains to use HTTPS for (it detects the list using server_name lines in your Nginx config)
  • whether to redirect HTTP to HTTPS (recommended)

Once you have selected these options, Certbot will perform a ‘challenge’ to check that the server it is running on is in control of the domain name. There are a number of different challenge types, as described in the ACME protocol which underlies Let’s Encrypt. In this case an HTTP or TLS based challenge was probably performed, although a DNS challenge is required for wildcard certificates.

If the process completed without errors, a new certificate should have been generated and saved on the server. You can find it under /etc/letsencrypt/live/YOUR-DOMAIN.
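
For example (assuming the domain is example.com), the live directory typically contains:

sudo ls /etc/letsencrypt/live/example.com
# cert.pem  chain.pem  fullchain.pem  privkey.pem  README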

The Nginx server block should have also been modified to include a number of extra ssl related directives. You will notice that these point to the generated certificate alongside the Let’s Encrypt chain cert. If you chose the option to redirect all HTTP traffic to HTTPS, you should also see another server block which merely captures all HTTP traffic and redirects it to the main HTTPS enabled server block.
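
The additions look roughly like the following (illustrative - the exact lines depend on your Certbot version):

listen 443 ssl;
ssl_certificate /etc/letsencrypt/live/example.com/fullchain.pem;
ssl_certificate_key /etc/letsencrypt/live/example.com/privkey.pem;
include /etc/letsencrypt/options-ssl-nginx.conf;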

You could stop here if all you want is HTTPS as this already gives you an A rating and maintains itself.

Test your site with SSL Labs using https://www.ssllabs.com/ssltest/analyze.html?d=www.YOUR-DOMAIN.com

Renewal

There is nothing to do here - Certbot installs a cron task to automatically renew certificates that are about to expire.

You can check renewal works using:

sudo certbot renew --dry-run

You can also check what certificates exist using:

sudo certbot certificates

A+ Test

If you did the SSL check in the previous section, you might be wondering why you didn’t get an A+ instead of just an A. It turns out that the default policy uses some slightly outdated protocols and cipher types to maintain backwards compatibility with older devices. If you want the A+ rating, add the below config to your Nginx server block (the same one that was updated by Certbot). In particular, only TLS 1.2 is enabled (not 1.0 or 1.1) and the available ciphers are restricted to only the latest and most secure. Be aware though that this might mean your site is unusable on some older devices which do not support these modern ciphers.

ssl_trusted_certificate /etc/letsencrypt/live/YOUR-DOMAIN/chain.pem;

ssl_session_cache shared:le_nginx_SSL:1m;
ssl_session_timeout 1d;
ssl_session_tickets off;

ssl_protocols TLSv1.2;
ssl_prefer_server_ciphers on;
ssl_ciphers "EECDH+AESGCM:EDH+AESGCM:AES256+EECDH:AES256+EDH";
ssl_ecdh_curve secp384r1;

ssl_stapling on;
ssl_stapling_verify on;

add_header Strict-Transport-Security "max-age=15768000; includeSubDomains; preload";
add_header Content-Security-Policy "default-src 'none'; frame-ancestors 'none'; script-src 'self'; img-src 'self'; style-src 'self'; base-uri 'self'; form-action 'self';";
add_header Referrer-Policy "no-referrer, strict-origin-when-cross-origin";
add_header X-Frame-Options SAMEORIGIN;
add_header X-Content-Type-Options nosniff;
add_header X-XSS-Protection "1; mode=block";
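
After adding these directives, check the configuration and reload Nginx for the changes to take effect:

sudo nginx -t
sudo systemctl reload nginx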

Validate GitHub Webhook Signatures

I’ve mentioned using GitHub webhooks in a previous post, where they were used to kick off a new Jekyll build every time a new commit is pushed. This usually involves having some kind of web server (in my case Flask) running that listens for requests on some endpoint. The hope of course is that these requests only come from GitHub, but really there is nothing stopping any malicious actor from causing a denial of service attack by hitting that endpoint constantly and using up server resources.

To get around this, we need to perform some kind of validation to make sure that only requests from GitHub cause a rebuild.

Signature Header

The folks at GitHub have thought about this and so have included an extra header in all their webhook requests. This takes the form of X-Hub-Signature which, from the docs, contains:

The HMAC hex digest of the response body. This header will be sent if the webhook is configured with a secret. The HMAC hex digest is generated using the sha1 hash function and the secret as the HMAC key.

Therefore, as described in their example, a request coming from GitHub looks like:

POST /payload HTTP/1.1
Host: localhost:4567
X-Github-Delivery: 72d3162e-cc78-11e3-81ab-4c9367dc0958
X-Hub-Signature: sha1=7d38cdd689735b008b3c702edd92eea23791c5f6
User-Agent: GitHub-Hookshot/044aadd
Content-Type: application/json
Content-Length: 6615
X-GitHub-Event: issues
{
  "action": "opened",
  "issue": {
  ...

This mentions a secret which gets used to construct the signature. You can set this value in the main settings page for the webhook. Of course make sure that this is strong, not repeated anywhere else and is kept private.


Signature Validation

As mentioned in the docs, the X-Hub-Signature header value will be the HMAC SHA-1 hex digest of the request payload using the secret we defined above as the key.

Sounds complicated, but it’s really quite simple to construct this value in most popular languages. In this case I’m using Python with Flask to access the payload and headers. The below snippet defines a Flask endpoint which the webhook will hit and accesses the signature and payload of the request:

import hashlib
import hmac
import os
from flask import Flask, abort, request

app = Flask(__name__)

@app.route('/refresh', methods=['POST'])
def refresh():
    if "X-Hub-Signature" not in request.headers:
        abort(400) # bad request if no header present

    signature = request.headers['X-Hub-Signature']
    payload = request.data

Next, you need to get hold of the secret which is used as the HMAC key. You could hardcode this, but that’s generally a bad idea. Instead, store the value in a permissioned file and read the contents in the script:

with open(os.path.expanduser('~/github_secret'), 'r') as secret_file:
    webhook_secret = secret_file.read().replace("\n", "")
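
As the secret grants the ability to forge valid-looking requests, it's also worth restricting access to the file (assuming the path used above):

chmod 600 ~/github_secret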

We now have the header value, the payload and our secret. All that’s left to do is construct the same HMAC digest from the payload we received and compare it with the one from the request headers. As only you and GitHub know the secret, if the two signatures match, the request must have originated from GitHub.

secret = webhook_secret.encode() # must be encoded to a byte array

# construct hmac generator with our secret as key, and SHA-1 as the hashing function
hmac_gen = hmac.new(secret, payload, hashlib.sha1)

# create the hex digest and prepend the prefix to match the GitHub request format
digest = "sha1=" + hmac_gen.hexdigest()

# compare in constant time to avoid leaking information through timing
if not hmac.compare_digest(signature, digest):
    abort(400) # if the signatures don't match, bad request not from GitHub

# do real work after
...

Automate Jekyll with GitHub Webhooks

https://docs.python.org/3.7/library/hmac.html

https://developer.github.com/webhooks


Ubuntu Server Setup Part 5 - Install Git, Ruby and Jekyll

This part will take care of installing everything necessary to allow the new server to host your personal blog (or other Jekyll site). As a prerequisite, you will also need some kind of web server installed (such as Nginx or Apache) to take care of serving your HTML files over the web. Part 4 covers the steps for my favourite - Nginx.

Install Git

As I store my blog as a public repo on GitHub, Git first needs to be installed to allow the repo to be cloned and new changes to be pulled. Git is available in the Ubuntu repositories so can be installed simply via apt:

sudo apt install git

You might also want to modify some Git config values. This is only really necessary if you plan on committing changes from your server (so that your commit is linked to your account). As I only tend to pull changes, this isn’t strictly required.

git config --global color.ui true
git config --global user.name "me"
git config --global user.email "me@example.com"

Helpful Git Aliases

Here are a few useful Git aliases from my .bashrc. You can also add aliases through Git directly via the alias command.

alias gs='git status'
alias ga='git add'
alias gaa='git add .'
alias gp='git push'
alias gpom='git push origin master'
alias gpu='git pull'
alias gcm='git commit -m'
alias gcam='git commit -am'
alias gl='git log'
alias gd='git diff'
alias gdc='git diff --cached'
alias gb='git branch'
alias gc='git checkout'
alias gra='git remote add'
alias grr='git remote rm'
alias gcl='git clone'
alias glo='git log --pretty=format:"%C(yellow)%h\\ %ad%Cred%d\\ %Creset%s%Cblue\\ [%cn]" --decorate --date=short'
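
Alternatively, as mentioned above, aliases can be registered through Git itself, for example:

git config --global alias.st status
git config --global alias.co checkout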


Install Ruby

Ruby is also available in the Ubuntu repositories. You will also need build-essential to allow you to compile gems.

sudo apt install ruby ruby-dev build-essential

It’s a good idea to also tell Ruby where to install gems - in this case your home directory via the GEM_HOME environment variable. Two lines are added to .bashrc to ensure this change is kept for new shell sessions:

echo '# Install Ruby Gems to ~/gems' >> ~/.bashrc
echo 'export GEM_HOME=$HOME/gems' >> ~/.bashrc
echo 'export PATH=$HOME/gems/bin:$PATH' >> ~/.bashrc
source ~/.bashrc

You should now be able to run ruby -v to ensure everything is working.

To get more control over the Ruby installation (install new versions or change versions on the fly), check out rbenv or rvm.

Install Jekyll

Once Ruby is installed, the Jekyll gem can be installed via gem:

gem install jekyll bundler

I also use some extra Jekyll plugins which can also be installed as gems:

gem install jekyll-paginate
gem install jekyll-sitemap

As the path to the Ruby gems directory has been added to the PATH (in the previous section), the jekyll command should now be available:

jekyll -v
jekyll build

Automated Build

Here is a simple bash script which pulls the latest changes from Git, builds the Jekyll site and copies the output to the directory served by your web server (by default /var/www/html).

#!/bin/bash

echo "Pulling latest from Git";
cd ~/blog/ && git pull origin master;

echo "Building Jekyll Site";
jekyll build --source ~/blog/ --destination ~/blog/_site/;
echo "Jekyll Site Built";

echo "Copying Site to /var/www/html/";
cp -rf ~/blog/_site/* /var/www/html/;
echo "Site Copied Successfully";