Ryan Harrison My blog, portfolio and technology related ramblings

Ubuntu Server Setup Part 4 - Setup Nginx Web Server

Serving web pages is one of the most common and useful use cases of a cloud server. Nginx is popular and handles some of the largest sites on the web. Its configuration is simple but very powerful, and Nginx can often use fewer resources than an equivalent Apache server.

Install Nginx

Nginx is available in the default Ubuntu repositories, so installation is simple through apt:

$ sudo apt update
$ sudo apt install nginx

That’s all you need to do for the base install of Nginx. By default, the service is started and Nginx includes a simple default landing page (located in /var/www/html) which you should now be able to access via the web.

Access through the Web

First, make sure that Nginx is running on your system. If using a modern Ubuntu server installation, you can do this via systemd:

$ sudo systemctl status nginx
● nginx.service - A high performance web server and a reverse proxy server
   Loaded: loaded (/lib/systemd/system/nginx.service; enabled; vendor preset: enabled)
...

If Nginx is not already running, use the following to start the service:

$ sudo systemctl start nginx

# other useful commands
$ sudo systemctl stop nginx
$ sudo systemctl restart nginx 
$ sudo systemctl reload nginx # reload config without dropping connections
$ sudo systemctl disable nginx # don't start nginx on boot
$ sudo systemctl enable nginx # do start nginx on boot

Also check that your firewall (if any) is set up to allow connections on port 80 (for HTTP). Refer to the previous part in this series for instructions using ufw.

Now you can check that everything is working correctly by accessing your web server through the internet. If you don’t already know the external IP for your server, run the following command:

$ dig +short myip.opendns.com @resolver1.opendns.com

When you have your server’s IP address, enter it into your browser’s address bar. You should see the default Nginx landing page.

http://your_server_ip
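If the page does not load, it can help to separate firewall problems from Nginx problems by checking whether port 80 accepts TCP connections at all. The following is a minimal sketch using only the Python standard library; the function name is_port_open is made up for illustration and is not part of any tool mentioned here.

```python
import socket


def is_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        # create_connection performs the TCP handshake; no data is sent afterwards
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts
        return False
```

Calling is_port_open("your_server_ip", 80) from another machine should return True once both Nginx and the firewall are set up. A False result with Nginx running points towards the firewall.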

Customise Nginx Config

All of the Nginx configuration files are stored within /etc/nginx/, which is laid out similarly to an Apache installation.

To create a new configuration (a server block in Nginx, a virtual host in Apache), first create a file within /etc/nginx/sites-available. It is a good convention to use the domain name as the filename:

$ sudo nano /etc/nginx/sites-available/yourdomain.com

Within this file, create a new server block structure:

server {
        listen 80;
        listen [::]:80;

        root /var/www/html;
        index index.html index.htm index.nginx-debian.html;

        server_name yourdomain.com www.yourdomain.com;

        location / {
                try_files $uri $uri/ =404;
        }
}

This server block will listen for requests on port 80 (HTTP) and serve resources from the default /var/www/html directory. The root can be changed as necessary; ideally each server block gets its own dedicated root directory. The server_name directive is set to the domain name(s) you wish to serve, which is also useful if you want to add HTTPS via Let’s Encrypt later on.
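To make the try_files line in the server block more concrete, here is a rough Python model of how a request path is resolved. It is a simplified sketch, not how Nginx is implemented: the function name resolve_try_files is invented for illustration, and details such as the redirect Nginx issues for a directory requested without a trailing slash are ignored.

```python
from pathlib import Path

INDEX_NAMES = ("index.html", "index.htm", "index.nginx-debian.html")


def resolve_try_files(root, uri, index_names=INDEX_NAMES):
    """Simplified model of `try_files $uri $uri/ =404` combined with `index`."""
    target = Path(root) / uri.lstrip("/")
    if target.is_file():
        # $uri matched an existing file
        return 200, target
    if target.is_dir():
        # $uri/ matched a directory, so the index directive takes over
        for name in index_names:
            candidate = target / name
            if candidate.is_file():
                return 200, candidate
        # Directory without an index file (and no autoindex) is forbidden
        return 403, None
    # Nothing matched, fall through to =404
    return 404, None
```

With the default root, a request for / would resolve to /var/www/html/index.html if that file exists, while a request for a path that is neither a file nor a directory ends in a 404.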

Next, this server needs to be enabled by creating a symlink within the /etc/nginx/sites-enabled directory:

$ sudo ln -s /etc/nginx/sites-available/yourdomain.com /etc/nginx/sites-enabled/

You may also wish to delete the default configuration file unless you want to fall back to the defaults:

$ sudo rm /etc/nginx/sites-enabled/default

As we have added additional server names (our domains), it is good to correct the hash bucket size for server names to avoid potential conflicts later on:

$ sudo nano /etc/nginx/nginx.conf

Find the server_names_hash_bucket_size directive and remove the # symbol to uncomment the line:

...
http {
    ...
    server_names_hash_bucket_size 64;
    ...
}
...

Finally, it’s time to restart Nginx in order to reload our config. But first, you can see if there are any syntax errors in your files:

$ sudo nginx -t

If there aren’t any problems, restart Nginx to enable the changes:

$ sudo systemctl restart nginx

Nginx will now serve requests for yourdomain.com (assuming you have set up a DNS A record pointing to your server). Navigate to http://yourdomain.com and you should see the same landing page as before. Any new files added to /var/www/html will also be served by Nginx under your domain.

Enable HTTPS

If you already have SSL certificates for your domain names, you can easily set up Nginx to handle HTTPS requests. Make sure that your firewall is set up to allow connections on port 443 first, then add a server block such as the following:

server {
        listen 443 ssl;
        listen [::]:443 ssl;

        root /var/www/html;
        index index.html index.htm index.nginx-debian.html;

        server_name yourdomain.com www.yourdomain.com;

        location / {
                try_files $uri $uri/ =404;
        }
        
        ssl_certificate /etc/ssl/certs/example-cert.pem;
        ssl_certificate_key /etc/ssl/private/example.key;
       
        ssl_session_cache shared:le_nginx_SSL:1m;
        ssl_session_timeout 1440m;

        ssl_protocols TLSv1 TLSv1.1 TLSv1.2;
        ssl_prefer_server_ciphers on;
        ssl_ciphers "ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:DHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-AES128-SHA256:ECDHE-RSA-AES128-SHA256:ECDHE-ECDSA-AES128-SHA:ECDHE-RSA-AES256-SHA384:ECDHE-RSA-AES128-SHA:ECDHE-ECDSA-AES256-SHA384:ECDHE-ECDSA-AES256-SHA:ECDHE-RSA-AES256-SHA:DHE-RSA-AES128-SHA256:DHE-RSA-AES128-SHA:DHE-RSA-AES256-SHA256:DHE-RSA-AES256-SHA:ECDHE-ECDSA-DES-CBC3-SHA:ECDHE-RSA-DES-CBC3-SHA:EDH-RSA-DES-CBC3-SHA:AES128-GCM-SHA256:AES256-GCM-SHA384:AES128-SHA256:AES256-SHA256:AES128-SHA:AES256-SHA:DES-CBC3-SHA:!DSS";
}

The above uses the same configuration as Let’s Encrypt to set strong ciphers and disable old versions of SSL, which should get you an A in SSLTest. Note that TLS 1.0 and 1.1 have since been deprecated, so you may want to remove TLSv1 and TLSv1.1 from ssl_protocols. I will also add a post on setting up Let’s Encrypt with Nginx to automate the process of using free SSL certificates for your site.

Custom Error Pages

By default, Nginx will display its own error pages in the event of a 404 or 50x error. If you have your own versions, you can use the error_page directive to specify a new path. Open up your server block config and add the following:

server {
    ...
    error_page 404 /custom_404.html;
    error_page 500 502 503 504 /custom_50x.html;
    ...
}

If required, you can also specify a completely new location (not in the main root directory of the server block) for your error pages by providing a location block which resolves the specified error page path:

server {
    ...
    error_page 404 /custom_404.html;
    location = /custom_404.html {
        root /var/html/custom;
        internal;
    }
    ...
}

Log File Locations

  • /var/log/nginx/access.log: Every request to your web server is recorded in this log file unless Nginx is configured to do otherwise.
  • /var/log/nginx/error.log: Any Nginx errors will be recorded in this log file.
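As a quick way to get something useful out of the access log, the sketch below counts responses per status code. It assumes the log uses Nginx's predefined combined format (the default when no format is configured); the function name count_statuses is made up for illustration, and any sample line you feed it is your own data.

```python
import re
from collections import Counter

# Pattern for Nginx's predefined "combined" access log format
COMBINED = re.compile(
    r'(?P<addr>\S+) - (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<bytes>\d+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def count_statuses(lines):
    """Count how often each HTTP status code appears in the given log lines."""
    counts = Counter()
    for line in lines:
        match = COMBINED.match(line)
        if match:  # lines that do not fit the format are skipped
            counts[match.group("status")] += 1
    return counts
```

Reading the log usually requires elevated permissions, for example by running the script with sudo and passing open("/var/log/nginx/access.log") as the lines argument.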

Ubuntu Server Setup Part 3 - Installing a Firewall

By default, your server may not come with a firewall enabled - meaning that external users will have direct access to any applications listening on any open port. This is of course a massive security risk and you should generally seek to minimise the surface area exposed to the public internet. This can be done using some kind of firewall - which will deny any traffic to ports that you haven’t explicitly allowed.

I personally only allow a few ports through the firewall and make use of reverse proxies through Nginx to route traffic to internal apps. That way you can have many applications running on your server, but all traffic is run through port 443 (with HTTPS for free) first.

UFW Installation

The simplest firewall is ufw (Uncomplicated Firewall), which may already come preinstalled on your server. If it doesn’t, you can get it by running:

$ sudo apt install ufw

Once installed, check that the ufw service is running:

$ sudo service ufw status

Configure Firewall Rules

The first thing you want to do is ensure that the port ssh is running on (by default 22) is allowed through the firewall. If you don’t, then you won’t be able to log in to your server anymore!

$ sudo ufw allow 22
or
$ sudo ufw allow ssh 

Then start the firewall by running:

$ sudo ufw enable

Command may disrupt existing ssh connections. Proceed with operation (y|n)? y
Firewall is active and enabled on system startup

If you have a web server running, you will notice that any http or https requests no longer work. That’s because we need to allow ports 80 and 443 through the firewall:

$ sudo ufw allow http
$ sudo ufw allow https

Your web server will now be properly accessible again. You can list the currently enabled rules in ufw by running:

$ sudo ufw status

Status: active

To                         Action      From
--                         ------      ----
22                         ALLOW       Anywhere
80/tcp                     ALLOW       Anywhere
443/tcp                    ALLOW       Anywhere
22 (v6)                    ALLOW       Anywhere (v6)
80/tcp (v6)                ALLOW       Anywhere (v6)
443/tcp (v6)               ALLOW       Anywhere (v6)

ufw also comes with some default app profiles:

$ sudo ufw app list

Available applications:
  Nginx Full
  Nginx HTTP
  Nginx HTTPS
  OpenSSH
  Postfix
  Postfix SMTPS
  Postfix Submission

You can then pass in the app name to the allow/deny commands:

$ sudo ufw allow OpenSSH

Refer to my post on Common Port Mappings to find out which ports you might need to allow through your firewall.

List and Remove Rules

To delete a rule, you first need to get the index:

$ sudo ufw status numbered

[ 1] 22                         ALLOW IN    Anywhere
[ 2] 80/tcp                     ALLOW IN    Anywhere
[ 3] 443/tcp                    ALLOW IN    Anywhere
...

If you wanted to delete the 443 (https) rule, pass the index 3 into the delete command:

$ sudo ufw delete 3

Deleting:
 allow 443/tcp
Rule deleted
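Looking up the index by eye works fine for a handful of rules, but it can also be scripted. The sketch below parses the text printed by ufw status numbered and returns the indexes of rules for a given target. The function name find_rule_numbers is invented for illustration, and the pattern only assumes the column layout shown in the output above.

```python
import re

# Matches lines such as "[ 3] 443/tcp    ALLOW IN    Anywhere"
RULE = re.compile(r"\[\s*(\d+)\]\s+(.+?)\s{2,}(\w+(?: \w+)?)\s{2,}(.+)")


def find_rule_numbers(status_output, target):
    """Return the rule indexes whose 'To' column equals target."""
    numbers = []
    for line in status_output.splitlines():
        match = RULE.match(line.strip())
        if match and match.group(2) == target:
            numbers.append(int(match.group(1)))
    return numbers
```

Keep in mind that ufw renumbers the remaining rules after each deletion, so the indexes should be looked up again before deleting a second rule.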

Finally you can disable the firewall by running:

$ sudo ufw disable

Allow or Deny Specific IPs

You can also allow or deny access from specific IP addresses. For example, to allow all connections from 151.80.44.180:

$ sudo ufw allow from 151.80.44.180

Or to allow that IP access to port 22 only:

$ sudo ufw allow from 151.80.44.180 to any port 22

Similarly, if you want to deny all connections from a specific ip use:

$ sudo ufw deny from 151.80.44.180

Kotlin - Add Integration Test Module

The default package structure for a new Kotlin project generated through IntelliJ looks like the following: a main source folder containing source sets (modules) for your main files and for your test files.

[Image: default IntelliJ project structure with main and test source sets]

Typically, you would place your unit tests within the auto-generated test module, and then run them all at once (within one JVM). IntelliJ is generally set up to support this use case and if that’s all you need, requires minimal setup and effort.

However, if you also need to add integration tests (or end-to-end etc), then this project structure can start to cause issues. For example, consider a typical project setup for a server-side app:

  • main - business logic and main app files
  • test - unit tests
    • typically with JUnit or similar
    • spin up in-memory H2 database for easy DAO testing
  • test-integration - integration/e2e tests
    • typically testing API endpoints with Rest Assured or similar
    • start up full version of the server and any dependencies

You can’t merge all the tests into one module and run them all at once because you would need to start up multiple database instances etc. Conflicts arise and it’s apparent that you need to run them separately in their own dedicated JVM.

To add the above-mentioned test-integration module, you can make some edits to your build.gradle file to define a new source set (IntelliJ module):

sourceSets {
    testIntegration {
        java.srcDir 'src/testIntegration/java'
        kotlin.srcDir 'src/testIntegration/kotlin'
        resources.srcDir 'src/testIntegration/resources'
        compileClasspath += main.output
        runtimeClasspath += main.output
    }
}

Here, a new source set for integration tests is created. Gradle is told where the Java and Kotlin source files live and we specify that the classpath inherits from the main source set. This allows you to reference classes of your main module within the integration tests (you might not need this).

Then, we provide a configuration and task for the new source set. The configuration ensures that the new module gets the same dependencies as the main test module (those declared with testImplementation in your dependencies, or with testCompile on older Gradle versions, where it has since been removed). Finally, define a new task to run the integration tests, pointing it to the classes and classpath of the testIntegration source set instead of the inherited defaults from test:

configurations {
    testIntegrationImplementation.extendsFrom testImplementation
    testIntegrationRuntimeOnly.extendsFrom testRuntimeOnly
}

task testIntegration(type: Test) {
    testClassesDirs = sourceSets.testIntegration.output.classesDirs
    classpath = sourceSets.testIntegration.runtimeClasspath
}

Similarly to how you might have previously set the target bytecode version for the main and test modules, you need to do the same for the new module:

compileTestIntegrationKotlin {
    kotlinOptions.jvmTarget = "1.8"
}

If you run Gradle with the option to ‘Create directories for empty content roots automatically’, you should see a new module get created. You might notice one issue though: the new module is not marked as a test module within IntelliJ. You could do this manually, but it would get reset every time Gradle runs. To override this, you can apply the idea plugin and add the source directories of the new source set:

idea {
    module {
        testSourceDirs += project.sourceSets.testIntegration.java.srcDirs
        testSourceDirs += project.sourceSets.testIntegration.kotlin.srcDirs
        testSourceDirs += project.sourceSets.testIntegration.resources.srcDirs
    }
}

Now you will see the desired output after Gradle runs:

[Image: project structure with testIntegration marked as a test module]

WARNING - This approach is not without problems. If you look at the Test Output Path of the new module, it is defined as \kotlin-scratchpad\out\test\classes which is the same as the main test module. Therefore, all the compiled test classes will end up in the same directory - which causes issues if you try to Run All for example. To fix this, you have to manually update the path to \kotlin-scratchpad\out\testIntegration\classes. Alternatively, you might not apply the idea plugin and just mark the module for tests each time Gradle runs. Hopefully I will find a fix for this at some point.


Even More Favourite Kotlin Features

This post is a continuation of a previous post on some of my favourite language features in Kotlin.

Coroutines

Not technically a language feature, as most of it is implemented as a library (which is cool in itself when you see the syntax), but still very useful. Dealing with threads in Java has long been a bit of a pain point. The situation has got significantly better with the introduction of APIs such as ExecutorService and now CompletableFuture, but the syntax isn’t all that readable and you end up with RxJava-like messes of chained method calls and lambdas everywhere.

Coroutines can be thought of as very lightweight threads - you can spawn thousands of coroutines without crashing your program because, unlike threads, a coroutine does not need an OS thread of its own. In Kotlin, coroutines revolve around the concept of functions which can suspend. That is, at some point a function can be suspended and the thread that was running it can go off and run something else. Later on, the coroutine can resume (maybe on a different thread) with new values, as if nothing ever happened. Under the hood, similarly to C# async/await, the compiler generates a simple state machine which handles the suspension and resumption of these functions.

The great benefit of this is that your asynchronous code looks just like you would write it synchronously - no dealing with the threading model directly or chaining callbacks together. It also makes error handling much easier to deal with.

import kotlinx.coroutines.*

suspend fun someWork(): String {
    // call a web service, do some calculation etc
    return "4"
}

suspend fun someOtherWork(): String = "2" // stand-in for a second slow call

// coroutineScope provides the scope async needs and waits for both children
suspend fun worker() = coroutineScope {
    val first = async { someWork() }
    val second = async { someOtherWork() }
    println("The answer is ${first.await() + second.await()}") // suspension point
}

In the example above, worker() will launch two coroutines to perform the two async blocks. In current versions of the coroutines library, async has to be called within a CoroutineScope, which coroutineScope provides here. The blocks run on the dispatcher of the caller; the default dispatcher is backed by a shared thread pool, which is where we get the real parallelisation from. In the println statement, the results are ‘awaited’. At this point, if the result is not immediately available, execution suspends and the thread is free to do something else. When the async block returns, worker() is resumed and the println can continue (maybe suspending again on the second await()).
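For readers more familiar with Python, the same shape can be sketched with asyncio. This is only an analogy and makes a few assumptions: the function names mirror the Kotlin example, the sleeps stand in for real work, and asyncio runs everything on a single thread, so it shows the suspension behaviour rather than thread-pool parallelism.

```python
import asyncio


async def some_work():
    await asyncio.sleep(0.1)  # stand-in for a web service call
    return 20


async def some_other_work():
    await asyncio.sleep(0.1)  # stand-in for a calculation
    return 22


async def worker():
    # Both tasks are scheduled immediately and run concurrently
    first = asyncio.create_task(some_work())
    second = asyncio.create_task(some_other_work())
    # Each await is a suspension point, just like await() in the Kotlin version
    return await first + await second
```

Running asyncio.run(worker()) returns the combined result once both tasks have finished, with the two sleeps overlapping rather than running back to back.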

You can also do simpler things like just launching a background coroutine to do some work if you are not interested in the result:

launch {
    // do some work
    delay(100) // not Thread.sleep()
    // coroutines use a special version which is non-blocking
}

The key factor is that you want to avoid blocking threads. In Java multithreading scenarios, this is much more common - maybe you block waiting for a queue to fill or until a web request is finished. With Kotlin coroutines you don’t have to block the entire thread - you suspend the coroutine instead and the thread is free to continue doing other work. This is especially handy in web frameworks such as Ktor which is built around coroutines.

This is just scratching the surface of what the coroutines library has to offer:

  • channels (think pub/sub)

  • support for custom dispatchers and contexts (restrain execution to your thread pool etc)

  • full support for cancellation and error handling

  • actors, producers and select expressions

The introductory guide is a great read if you want to learn more. In effect, coroutines are similar to goroutines in Go and a lot more powerful than C# async/await. If you read the guide you can see the multitude of use cases which can be easily parallelised or made non-blocking.

Top Level Functions and Multi-Class Files

When you’ve got so many language constructs to remove boilerplate code and verbosity, it doesn’t make sense to force developers into having only one public class per file. Thankfully, unlike Java, no such restriction exists in Kotlin - you can have as many public classes, functions and definitions in one file as you want. When you have a lot of data classes or sealed classes, it makes life a lot simpler to bundle up similar classes into one file. When you go back to Java, it’s one thing that starts to get on your nerves that you may have never thought about before:

// All in one file
data class Foo(val bar: String)
data class Baz(val foo: Int)

class Printer {
    fun print(s: String) = println("Printer is printing $s")
}

sealed class Operation {
    class Add(val value: Int) : Operation()
    class Subtract(val value: Int) : Operation()
}

You may have noticed that I said ‘top-level functions’ before. Yes, that’s right, in Kotlin a function doesn’t have to belong to a class - you can have it sitting with other related functions in their own file, or anywhere you want.

There is no need to have Utils classes anymore. You can (and should) name your source files as such, but the utilities themselves can sit in them without a class definition. And they can be used without referring to such a class:

// StringUtils.kt
fun replaceCommas(s: String) = s.replace(",", ".")
// and more helpers

// Work.kt
// Top-level functions are imported by package name, not by file name;
// within the same package no import is needed at all
fun work() {
    println(replaceCommas("Some text, goes here"))
}

Even though Kotlin does have a language feature that could support a similar utils pattern you would see in Java - the object keyword - top-level helper functions are much neater.

It might not seem like much, but it really can dramatically reduce the overall number of source files you have to deal with. A common case is where you define a short interface and a couple of implementations. In Java they would have to be split into 3 separate .java files, regardless of the fact that it might all be 50 lines long. In Kotlin, you can put everything into one file.

Destructuring Declarations

Not quite Python-like multi-value returns from functions, but still some nice syntax to retrieve variables from data classes returned from your functions.

data class Student(val name: String, val age: Int)

fun getStudent(): Student {
    return Student("John", 22)
}

fun main() {
    val (name, age) = getStudent()
    println(name) // John
    println(age)  // 22
}
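For comparison, this is roughly what the Python-style multi-value return mentioned above looks like. It is a small sketch that mirrors the Kotlin example; using NamedTuple is one option among several and is chosen here because it keeps the field names, much like a data class.

```python
from typing import NamedTuple


class Student(NamedTuple):
    name: str
    age: int


def get_student():
    return Student("John", 22)


# Unpacking is positional, like Kotlin's destructuring declarations
name, age = get_student()
print(name)
print(age)
```

In both languages the values are assigned by position rather than by name, so the order of the variables on the left has to match the order of the fields.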

Singleton (Object)

Another Kotlin feature which I like is the simplicity of defining a “singleton”. Consider the following example of how a singleton could be created in Java and then in Kotlin.

public class SingletonInJava {

    private static SingletonInJava INSTANCE;

    // Private constructor so that no other instances can be created
    private SingletonInJava() {
    }

    public static synchronized SingletonInJava getInstance() {
        if (INSTANCE == null) {
            INSTANCE = new SingletonInJava();
        }
        return INSTANCE;
    }
}

object SingletonInKotlin {
    fun doSomething() = println("Doing something")
}

// And we can call
SingletonInKotlin.doSomething()

Kotlin has a feature to define a singleton in a very clever way. You can use the keyword object, which allows you to define an object which exists only as a single instance across your app. No need to worry about initialising your instance or thread safety; everything is handled by the compiler.


Automate Jekyll with GitHub Webhooks

In the ongoing task of trying to automate as much of the blogging process as possible, actually rebuilding the static files is the most painful part and could definitely use improvement. If you’re not using something like GitHub Pages (which does it automatically), but instead host everything on your own server, rebuilding the site after you add a new post could involve ssh’ing into the machine and running the Jekyll command manually.

That gets old fast. Ideally you want a similar experience to GitHub Pages in that everything will happen as soon as you push your changes to the repo. If you have control over the VPS which is hosting everything, and can install a simple endpoint in Python/whatever, combined with webhooks you can get the exact same behaviour.

Create a GitHub Webhook

If your site is hosted on GitHub (presumably other providers have something similar), go into your Repo -> Settings -> Webhooks and click on Add Webhook.

[Image: the Add webhook form in the GitHub repository settings]

Here you can choose a URL which will be called by GitHub when certain actions happen (if you choose the ‘let me select’ option you can see the comprehensive list). In this case we are going to install a simple endpoint on our blog server, which when called will rebuild the Jekyll blog and redeploy.

For this use case we are just interested in the push event (i.e. when you push a new post), but you could also perform actions on all sorts of other events if needed. I have left the content type as form-urlencoded as I’m not really interested in the payload, but you could also choose JSON if needed. The payload will include extra details of the event - in this case the files which have changed as a result of the git push. This could be helpful if you only wanted to regenerate certain portions of your site if they have been modified.

You can also provide a secret. GitHub uses it as the key to compute an HMAC hex digest (SHA-1) of the request body, which is added into the X-Hub-Signature header of each request. More details in the docs. Once finished click Add webhook.
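If you do set a secret, the receiving end can use it to check that a request really came from GitHub. The following is a minimal sketch using only the standard library; the function name is_valid_signature is made up for illustration, and it assumes the header value has the form sha1= followed by the hex digest.

```python
import hashlib
import hmac


def is_valid_signature(secret, body, header_value):
    """Check an X-Hub-Signature header value against the raw request body.

    secret and body are bytes, header_value is the header string (or None).
    """
    expected = "sha1=" + hmac.new(secret, body, hashlib.sha1).hexdigest()
    # compare_digest takes constant time, so it does not leak where the values differ
    return hmac.compare_digest(expected.encode(), (header_value or "").encode())
```

In a Flask handler, the raw body is available through request.get_data() and the header through request.headers.get("X-Hub-Signature"). GitHub also sends an X-Hub-Signature-256 header based on SHA-256, which can be checked in the same way by swapping the hash function and prefix.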

Note - GitHub webhooks have a timeout of 10 seconds so you might see them fail if your blog takes longer to rebuild. It doesn’t really matter though as the server will have still been notified.

Add a Flask Endpoint

Now that a webhook is set up to ping our server every time something is pushed to the repo, we have to install something to handle that request and trigger a Jekyll rebuild. There is a sea of lightweight web frameworks available these days; in this case I’m going to use Python and Flask (mainly because Python is already installed on Debian-based servers). On your server run the following to install Flask:

$ pip install Flask

Now we can create a simple endpoint mapped to the request we specified in the webhook config:

from flask import Flask
import os
import subprocess

app = Flask(__name__)

@app.route('/api/blogrefresh', methods=['POST'])
def blogrefresh():
    # Expand ~ to the home directory of the user running the app
    script_path = "~/bin/jekyll_rebuild.sh"
    subprocess.call([os.path.expanduser(script_path)])
    return "Success"

if __name__ == "__main__":
    app.run(host='0.0.0.0')

In the code above a new Flask app is created, with one endpoint which handles POST requests to our webhook URL. In the handler a simple shell script is called (living in the home dir of the running user) which is what actually runs jekyll build.
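Because subprocess.call waits for the script to finish, a slow rebuild can run into the webhook timeout mentioned earlier. One possible variation, sketched here as an assumption rather than what this setup uses, is to start the script in the background and return straight away. The helper name start_rebuild is invented for illustration.

```python
import subprocess


def start_rebuild(command):
    """Start the rebuild command without waiting for it to finish."""
    # Popen returns as soon as the process is spawned; output is discarded here
    return subprocess.Popen(
        command,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
```

Inside the handler this would replace the subprocess.call line, passing the same expanded script path in a list. The trade-off is that the response no longer tells you whether the rebuild succeeded, so you may want to redirect the output to a log file instead of discarding it.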

The jekyll_rebuild.sh script simply pulls the latest changes from the Git repo, rebuilds the site and copies the static files to the folder served by the web server - /var/www/html in this case:

#!/bin/bash

echo "Pulling latest from Git"
cd ~/blog/ && git pull origin master

echo "Building Jekyll Site"
jekyll build --source ~/blog/ --destination ~/blog/_site/
echo "Jekyll Site Built"

echo "Copying Site to /var/www/html/"
cp -rf ~/blog/_site/* /var/www/html/
echo "Site Rebuilt Successfully"

And that’s pretty much it. Just run the Flask web server via python3 refresh.py. By default Flask runs on port 5000, so you might need to either open up a port on your server (and update the webhook url) or proxy it through Apache/Nginx.

You could also integrate Elastic Jekyll into this process to give your site full-text search via ElasticSearch that gets automatically updated as you add new content!
