Ryan Harrison My blog, portfolio and technology related ramblings

Inline code in WordPress

When posting recently it’s become quite annoying that there is no default way of presenting inline code in posts. I’ve resorted to making the text italic, which does provide a small amount of differentiation, yet does not offer the quality provided in most programming forums where the text inside inline code is in a monospaced font and sometimes has a border.

Luckily though customising WordPress is very easy, and it requires a very small amount of code to achieve the desired effect for inline code.

First of all a custom CSS class needed to be added to the styles.css file of the WordPress theme. This is the class I used:

  
.inlinecode {  
    padding: 2px;  
    background: #FFF;  
    border: 1px dotted #000;  
    color: #000;  
    font-family: monospace!important;  
}  

Which simply forces the text into a black monospace font and adds a small border around the outside. The only thing to do now is hook up WordPress to use this new style. WordPress uses shortcodes to accomplish this. Essentially you provide a code inside square brackets and then put the content in between an ending brace.

For example migrated over to Jekyll

For my site I use the il shortcode. We then need to tell WordPress what to do when it sees this new shortcode. In this case all that needs to happen is a span tag with the new CSS style is added to encapsulate the inline code. This is done through a simple PHP function:

  
function inlinecode( $atts, $content = null ) {  
    return '<span class="inlinecode">'.$content.'</span>';
}  

Finally one more line is needed to hook this new function into wordpress with the chosen shortcode:

  
add_shortcode("il", "inlinecode");  

And that’s it. This is one of the benefits of using WordPress - it’s so easy to customise and there are a load of guides and tutorials online to help you do it. Here is the resulting inline code (although it has been used throughout this post already):

Some example inline code (in Jekyll)

I will also hopefully be gradually adding the use of this new inline code into previous posts in order to improve the overall presentation and readability.

Read More

Java Regression Library - Linear Regression Model

Part 2 - Linear Regression Model

Welcome to part 2 of this tutorial series where we will be creating a Regression Analysis library in Java. In the last tutorial we covered a lot of theory about the foundations and applications of regression analysis. We finished off by coding up the RegressionModel abstract class, which will become the base of all our models in this library.

Prerequisites -

Make sure you have read and understand Part 1 of this tutorial series where I explained a lot of theory about regression analysis and regression models. I won’t be repeating much of the content so it’s a good idea to have a good understanding of it all before you read on with this tutorial.

Regression Library - Regression Models

In this tutorial we will be covering and implementing our first regression model - the simple linear regression model.

The Linear Regression Model

To start off with lets consider the first the Wikipedia article definition for the Simple Linear Regression Model:

Simple linear regression is the least squares estimator of a linear regression model with a single explanatory variable. In other words, simple linear regression fits a straight line through the set of n points in such a way that makes the sum of squared residuals of the model (that is, vertical distances between the points of the data set and the fitted line) as small as possible.

This is perhaps one of the easier to understand definitions. So in this model we have a single explanatory variable (X in our case) and we want to match a straight line through our points that that somehow ‘best fits’ all the points. in our data set.

This model uses a least squares estimator to find the straight line that best fits our data. So what does this mean? The least squares approach aims to find a line that makes the sum of the residuals as small as possible. So what are these residuals? The residuals are the vertical distances between our data points and our fitted line. If the best fit line passes through each of our data points, the sum of the residuals would be zero - meaning that we would find an exact fit for our data.

Consider this numerical example:

We have the data:

X         Y
2         21.05
3         23.51
4         24.23
5         27.71
6         30.86
8         45.85
10        52.12
11        55.98

We want to find a straight line that makes the sum of the residuals as small as possible. As it turns out the least squares estimator for this data set produces the straight line:

y = 4.1939x + 9.4763

as the line of best fit - that is, there exists no other straight line for which the sum of the residuals (the sum of the differences between the actual data and the modelled line) is smaller.

This makes a lot of sense as a line of best fit. There essentially exists no other straight line that could better follow our data set as that would mean the sum of the residuals would have to be smaller. So remember:

Residuals = the differences in the Y axis between the points of data in our data set and the fitted line from our model.

Good Model

So in the linear regression model we want to use a least squares estimator that somehow finds a straight line that minimises the sum of the resulting residuals. The obvious question now is how can we find this straight line?

The Math

So we want to find a straight line that best fits our data set. To find this line we have to somehow find the best values of our unknown parameters so that the resulting straight line becomes our best fit. Our basic equation for a straight line is:

Linear Equation

We want to find the values of a and ß that produce the final straight line that is the best fit for our data. So how do we find these variables?

Handily there is a nice formula for it (for those of us who don’t want to derive it):

Linear Regression Formula

Ok perhaps that formula isn’t that nice after all then at first glance. The good thing is it really isn’t all that bad when you get to know the symbols:

  • x with a line over the top is called xbar = the mean of the X values in our data set
  • y with a line over the top is called xbar = the mean of the Y values in our data set

The sigma (the symbol that looks like a capital E) is the sumnation operator. The good news for us programmers is that this symbol can be nicely converted into something we can all understand - the for loop - as we will see in a minute!

Now we could go off now and start trying to code up a solution to find ß, but it would be a lot easier if we could somehow modularise the formula a little more to make it easier to understand. Again handily the formula does that for us. Near the bottom is the formula we actually want:

Cov[x, y]
——————
Var[x]

Which stands for the covariance of x and y divided by the variance of x. Now we only have to find out those two calculations, divide them, and we have our result for ß.

We are only really worried about ß as finding a is easy once we have a value for ß. It’s just the mean of the y values minus the found value of ß multipled by the mean of the x values. Great!

Read More

Java - Calculate the Harmonic Mean

Here is a small snippet to calculate the harmonic mean of a data set.

The harmonic mean is defined as:


public static double harmonicMean(double[] data)  
{  
	double sum = 0.0;

	for (int i = 0; i < data.length; i++) { 
		sum += 1.0 / data[i]; 
	} 
	return data.length / sum; 
}
Read More

Java - Calculate the Geometric Mean

Here is a small snippet to calculate the geometric mean of a data set.

The geometric mean is defined as:

Formula


 
public static double geometricMean(double[] data)  
{
	double sum = data[0];

	for (int i = 1; i < data.length; i++) {
		sum *= data[i]; 
	}
	return Math.pow(sum, 1.0 / data.length); 
}
Read More

Java - Serialization Constructors

It is a common misconception that classes which implement the Serializable interface must also declare a constructor which takes no arguments.

When deserialization is taking place, the process does not actually use the object’s constructor itself. The object is instantiated without a constructor and is then initialised using the serialized instance data.

The only requirement on the constructor for a class that implements Serializable is that the first non-serializable superclass in its inheritance hierarchy must have a no-argument constructor. This is because when you serialize an object, the serialization process chains it’s way up the inheritance hierarchy of the class - saving the instance data of each Serializable type it finds along the way. When a class is found that does not implement Serializable, the serialization process halts.

Then when deserialization is taking place, the state of this first non-serializable superclass cannot be restored from the data stream, but is instead initialised by invoking that class’ no-argument constructor. The rest of the instance data of all the Serializable subclasses can then be restored from the stream.

For example this class which does not provide a no-arguments constructor:

public class Foo implements Serializable {  
	public Foo(Bar bar) {  
		...  
	}  
	...
	...  
}  

Although the class itself does not itself declare a no-arguments constructor, the class is still able to be serialized. This is because the first non-serializable superclass of this class, which in this case is Object, provides a no-arguments constructor which can be used to initialize the subclass during deserialization.

If however Foo extended from a Baz class which did not implement Serializable and did not declare a no-arguments constructor:

public class Baz {  
	public Baz(Bar bar) {  
	   ...
	}  
	...
}
public class Foo implements Serializable {  
	...
	...
}  

In this case a NotSerializableException would be thrown during the deserialization process as the state of the Baz class cannot be restored through the use of a no-arguments constructor. Because the instance data of the superclass Baz could not be restored, the subclass also cannot be properly initialised - so the deserialization process cannot complete.

Read More