Archive for category software development

WP-QREncoder Wordpress Plugin

I recently got an Android-powered phone and quickly discovered that QR codes, read with the Barcode Scanner app, are a common and seemingly preferred method of encoding and transmitting data in the Android community (and others as well, I’m sure).

In less than a week I’ve grown to love the Android platform and I’ve already got a few ideas for some Android apps to write. But first I wanted to make sure I could post links to my apps using handy-dandy QR-encoded images easily within Wordpress.

So, borrowing somewhat from the WP-Footnotes plugin I set about to create the first rendition of the WP-QREncoder plugin for Wordpress.

This plugin is capable of encoding any string of text (the specific use case is a URL) and is still in its infancy, so I would appreciate any feedback you might have.

Download WP-QREncoder plugin here.

For more information (including usage) see the plugin’s permanent page here.

Oh, and here’s an example of the plugin in action:

Simple JSON-RPC updated to 0.9.5

The simple JSON-RPC package has been updated to 0.9.5. It has undergone some extensive refactoring and now includes documentation and an example project. The source to this package is also available here.

For more information (and for future updates), visit the new project page here.

If you are interested in using, contributing to, or reporting bugs for this project, contact us!

Simple Java implementation of JSON-RPC

Preamble

Explanation of standard formats and protocols:

  • JSON (JavaScript Object Notation) is a lightweight[1] data-interchange format with language bindings for C, C++, C#, Java, JavaScript, Perl, TCL and many others.
  • JSON-RPC is a simple remote procedure call protocol similar to XML-RPC although it uses the lightweight JSON format instead of XML.

What is it?

I love the simplicity of JSON-RPC when it comes to rapidly developing cutting-edge web applications.

I recently decided to start writing more server-side code in Java servlets in order to take advantage of cloud-based infrastructures such as the Google App Engine. Until now I have written my web applications using PHP as the server-side language, specifically with frameworks such as Symfony or Kohana, which make writing simple JSON-RPC services relatively trivial.

So I started looking for a simple JSON-RPC system I could use in Java to abstract my business logic classes from the mundane tasks associated with handling web requests. I found several packages which all claimed to implement the JSON-RPC protocol via Java servlets; however, they seemed to require far more by way of setup than I wanted, and they all seemed to require the developer to write applications to their specific implementation’s standards.[2]

So I decided to write yet another Java implementation of the JSON-RPC specification myself with the following goals in mind:

  • Easy to implement. Setup for this package should be kept at a minimum. This includes both development as well as production setup.
  • Easy to code in. Application developers using this class should not need to know much about JSON-RPC beyond exposing methods in their code that can be called remotely from other applications via the web.
  • Non-invasive. Developers using this implementation should be able to reuse plain old Java object (POJO) classes as much as possible, making the transport layer of JSON-RPC as transparent as possible.

With these goals in mind I set off to develop the package com.werxltd.jsonrpc to be a simple wrapper designed to be used inside of a standard .war project.

Setting it up

You can either download the .jar file here to include in your project manually or (and this is my preferred method) you can import the com.werxltd.jsonrpc package as a dependency in your Maven-managed project by specifying the following in your project’s pom.xml configuration file:

<repositories>
    <repository>
        <id>werxltd</id>
        <url>http://maven.werxltd.com</url>
        <snapshots>
            <enabled>true</enabled>
        </snapshots>
        <releases>
           <enabled>true</enabled>
       </releases>
    </repository>
</repositories>

<dependencies>
    <dependency>
        <groupId>com.werxltd</groupId>
        <artifactId>jsonrpc</artifactId>
        <version>0.9</version>
    </dependency>
</dependencies>

Next, you’ll need to specify endpoints in your .war file’s web.xml configuration file. Here’s an example:

<web-app>
    <servlet>
        <servlet-name>example</servlet-name>
        <servlet-class>com.werxltd.jsonrpc.RPC</servlet-class>
        <init-param>
            <param-name>rpcclasses</param-name>
            <param-value>YourClass</param-value>
        </init-param>
    </servlet>

    <servlet-mapping>
        <servlet-name>example</servlet-name>
        <url-pattern>/example</url-pattern>
    </servlet-mapping>
</web-app>

That’s it! Now your project is configured to filter all requests sent to /example through the JSON-RPC class which examines the class name you passed in (in this case, YourClass) for public methods it can expose. An instance of your class is created internally if your class is not static and it will remain in memory throughout the life of the servlet. Any exceptions your class generates are gracefully wrapped inside a JSON-RPC error message for proper handling upstream.
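For illustration, here is a minimal sketch of the kind of plain class that could be listed in rpcclasses. The name YourClass comes from the web.xml example above; the add and echo methods are assumptions picked for this example and are not part of the package itself.

```java
// Hypothetical service class: a plain Java object with no JSON-RPC or servlet imports.
class YourClass {
    // A public method taking three numbers; it would be matched by a request
    // for "add" that carries three positional parameters.
    public int add(int a, int b, int c) {
        return a + b + c;
    }

    // A public method taking one string; it would be matched by a request
    // for "echo" that carries a single positional parameter.
    public String echo(String text) {
        return text;
    }
}
```

Nothing in the class knows about the transport, which is the point of the non-invasive goal listed earlier.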

Using it

While the setup is pretty much straightforward, the loose typing found in JavaScript (and, as a result, JSON) leads to some caveats in how methods are called, specifically in how arguments are passed to those methods.

Scanned methods are stored internally with a signature consisting of the method name and how many arguments that method accepts. When a JSON-RPC request is made, the servlet determines how many parameters were included and attempts to match the method requested with a corresponding internal method which has the same number of parameters/arguments.

Because parameters are passed in via the web, only Java primitive data types along with three others, JSONObject, JSONArray and java.lang.String are accepted as valid parameter data types.

If a match of method name and number of parameters is found, an attempt is made to parse the passed-in data into the method’s required type. Any method whose first parameter is of the JSONObject type automatically takes precedence over all other matching methods.
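The matching rule described above can be sketched with plain reflection. This is only an illustration of the idea, not the package's actual code: MethodIndex, Calculator, and the "name:argCount" key format are assumptions made for the example, and type conversion from JSON values is left out.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.HashMap;
import java.util.Map;

// Sketch only: indexes an object's public methods by "name:argCount".
class MethodIndex {
    private final Map<String, Method> methods = new HashMap<>();
    private final Object target;

    MethodIndex(Object target) {
        this.target = target;
        // getDeclaredMethods() skips inherited methods such as Object.toString().
        for (Method m : target.getClass().getDeclaredMethods()) {
            if (Modifier.isPublic(m.getModifiers())) {
                methods.put(key(m.getName(), m.getParameterCount()), m);
            }
        }
    }

    private static String key(String name, int argCount) {
        return name + ":" + argCount;
    }

    // Finds the method by name and number of supplied parameters, then calls it.
    Object call(String name, Object... params) {
        Method m = methods.get(key(name, params.length));
        if (m == null) {
            throw new IllegalArgumentException("No method " + key(name, params.length));
        }
        try {
            // Reflection unboxes Integer arguments into int parameters automatically.
            return m.invoke(target, params);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }
}

// Hypothetical service class used only for this demonstration.
class Calculator {
    public int add(int a, int b) { return a + b; }
    public int add(int a, int b, int c) { return a + b + c; }
}

class MethodIndexDemo {
    public static void main(String[] args) {
        MethodIndex index = new MethodIndex(new Calculator());
        System.out.println(index.call("add", 1, 2));    // picks the two-argument overload
        System.out.println(index.call("add", 1, 2, 3)); // picks the three-argument overload
    }
}
```

One consequence of keying on name and count alone is that two overloads with the same name and the same number of arguments cannot be told apart.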

To use the JSON-RPC interface from another application, you must pass a valid JSON-RPC object to your final servlet as a parameter named “json” via either GET or POST. Here is an example of the valid JSON-RPC object you need to pass:

{
    "method":"add",
    "params":[1, 2, 3]
}

This JSON-RPC implementation accepts either named parameters or positional parameters like the ones shown above. Here is a named parameters example:

{
    "method":"echo",
    "params":{
        "text":"testing"
    }
}
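As a rough sketch of what a calling application has to do for a GET request, the JSON-RPC object is URL-encoded and attached as the json parameter. The class name JsonRpcUrl and the localhost address are placeholders for this example, not part of the package.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

class JsonRpcUrl {
    // Builds a GET URL that carries the JSON-RPC object in the "json" parameter.
    static String build(String endpoint, String jsonRequest) {
        try {
            return endpoint + "?json=" + URLEncoder.encode(jsonRequest, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 support is mandatory on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String request = "{\"method\":\"add\",\"params\":[1,2,3]}";
        // Placeholder endpoint; substitute the URL your servlet is mapped to.
        System.out.println(build("http://localhost:8080/example", request));
    }
}
```

For larger requests, sending the same json parameter in a POST body avoids URL length limits.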

The road ahead

Future development of this class will include more formalized access between the JSON-RPC layer and the underlying classes.

I’m also planning to post the Javadocs and the source to an example project that utilizes the JSON-RPC transport package.

Hope this helps someone else. I’m looking forward to using this class as a central component in many rich web projects I have planned.

  1. Lightweight in both size and resources required to process data encoded in JSON vs. XML.
  2. I did find this project after finishing the first revision of my class. It looks great and like it would do much of what I wanted; however, the code is proprietary. After I finish documenting my classes I plan on releasing them under an OSI-approved licence. If you are interested in helping me with this project, feel free to let me know!

Getting Maven and Eclipse to play together

I love using Maven for dependency management and code portability, and I love Eclipse as an environment to develop in. However, for the longest time I had trouble getting the two to play well together until I discovered the following commands that made combining the two much easier.

To add Maven repositories to your Eclipse workspace (for code completion and syntax verification) run the following command:

mvn -Declipse.workspace=/path/to/workspace eclipse:add-maven-repo

To add an Eclipse .project file to your project run the following command:

mvn eclipse:eclipse

That’s it! You should now be able to import the project into Eclipse. I haven’t figured out how to build Maven projects in Eclipse yet,[1] so building and testing your code still requires you to use Maven via the command line.

You’ll also need to re-create the Eclipse project file if you add any dependencies in order for them to be picked up properly in Eclipse.

Need more? Check out this site for more on Maven integration in Eclipse

  1. I’ve seen the plugins but haven’t gotten any to work well enough to rely on.

Hacking your router for effective internet monitoring

The Why: Preamble

Working in the information technology sector, one of the most common questions I get asked by parents is about monitoring internet access of their children.[1]

Most parents want to know what their children are doing online but also recognize that most off-the-shelf products are just as easy to disable or circumvent (or are far more restrictive/bloated than they want) as they are to install or operate. And sadly, enterprise solutions that capture and control network traffic at the most basic level (making circumvention next to impossible) are still very expensive and therefore out of reach for the average family.

What I needed was a cheap and hackable router that I could modify to send captured URLs to a central source for storage and processing.

The What: WRT54G

After studying my options I remembered reading a lot about the Linksys WRT54G-series routers, how they were originally based on a heavily modified version of Linux, and how Linksys made headlines when it lost a court case regarding the GPLed code it used in its routers’ firmware.

So I did a little digging.

What I found was a whole router-hacking subculture built around the WRT54G. While it seems that much of the initial fervor has subsided (many of the packages show a last update time of 2007 or so), the documentation is still valid for the most part. The most popular projects which provide custom firmware are OpenWRT and DD-WRT. While OpenWRT is the original, I found DD-WRT to be a lot more polished and (as we’ll see later) configurable without much headache.

It’s important to note here that the WRT54G has many variants, and it’s easy to fall into the trap of thinking that any old WRT54G will do. A little diligence and study of the differences between the hardware revisions will certainly save you time and money.

After buying a few different routers, bricking one (a Buffalo AirStation WHR-HP-54G[2]), and making a false start with a newer WRT54G v7 (anyone need a highly configurable, albeit not-very-hackable router?), I discovered that the best router for hacking is the WRT54GL (which was designed by Linksys to allow for user modifications).

The How: URLSnarf and custom shell scripts

Space on a router is very limited. On the WRT54GL model I eventually ended up using I had 4 MB of space to work with.

The first order of business was to find a package that could monitor all of the network connections (wired and wireless) on the router and capture requested URLs. For this task I discovered that URLSnarf, part of the dsniff OpenWrt package, worked quite well.

To install packages I used DD-WRT’s firmware modification kit which allowed me to simply add the scripts and packages I wanted without having to recompile everything.

Next I needed to transform the captured URL into a URLencoded string in order to send it to my monitoring service via a simple wget request. Initially I tried using several variations of user-generated Python and PHP packages but they both took up far more space than I could afford so, instead, I searched for a pure command-line based solution.

After some more digging I found a handy sed substitution script that worked like a charm. The script worked in two parts, the first one being the substitution script (/usr/bin/urlencode.sed):

s/%/%25/g
s/ /%20/g
s/ /%09/g
s/!/%21/g
s/"/%22/g
s/#/%23/g
s/\$/%24/g
s/\&/%26/g
s/'/%27/g
s/(/%28/g
s/)/%29/g
s/\*/%2a/g
s/+/%2b/g
s/,/%2c/g
s/-/%2d/g
s/\./%2e/g
s/\//%2f/g
s/:/%3a/g
s/;/%3b/g
s/</%3c/g
s/=/%3d/g
s/>/%3e/g
s/?/%3f/g
s/@/%40/g
s/\[/%5b/g
s/\\/%5c/g
s/\]/%5d/g
s/\^/%5e/g
s/_/%5f/g
s/`/%60/g
s/{/%7b/g
s/|/%7c/g
s/}/%7d/g
s/~/%7e/g
s/	/%09/g

and the command line to use it:

sed -f urlencode.sed

To tie it all together, we can pass captured URLs to it via pipes from the command-line with:

urlsnarf | sed -f urlencode.sed
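One detail worth calling out is that the rules run top to bottom, so the % rule has to come first; otherwise the % characters introduced by the other rules would themselves be re-encoded. Here is a small Java sketch of the same ordered-substitution idea, using only a handful of the rules. The class name OrderedEncoder and the sample input are made up for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class OrderedEncoder {
    // A LinkedHashMap keeps insertion order, like the lines of a sed script.
    private static final Map<String, String> RULES = new LinkedHashMap<>();
    static {
        RULES.put("%", "%25"); // must run first, before any "%xx" is introduced
        RULES.put(" ", "%20");
        RULES.put("&", "%26");
        RULES.put("/", "%2f");
        RULES.put(":", "%3a");
        RULES.put("?", "%3f");
    }

    static String encode(String line) {
        String out = line;
        for (Map.Entry<String, String> rule : RULES.entrySet()) {
            // String.replace substitutes every literal occurrence (no regex involved).
            out = out.replace(rule.getKey(), rule.getValue());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(encode("http://example.com/a b?rate=100%&x=1"));
    }
}
```

Moving the % rule to the end would turn an encoded space such as %20 into %2520, which is why its position at the top of the script matters.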

At this point, the only missing link of the capture chain is a script to continually read from the command line and send the urlencoded capture data to our storage application (described in the next part). For this task I used the following script (/usr/bin/urlmon.sh):

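#!/bin/sh
# Read URL-encoded log lines from stdin and forward each one to the logging service.
HOSTNAME=`hostname`
while read -r url; do
    # Seconds since the epoch, sent along as the capture timestamp.
    DATE=`date +%s`
    # Print the service's reply (OK or FAIL) for each submitted line.
    echo $(wget -q -O- "http://myapp.appspot.com/log?l=$url&h=$HOSTNAME&t=$DATE")
done

exit 0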

Finally, we need to have the router start listening for URLs as soon as it is booted. In a Linux environment this is generally done by init scripts. Since our router has limited capabilities, we don’t need to write a full init script. Here is the slimmed down init script I used (/etc/init.d/S50urlmon):

#!/bin/sh

/usr/sbin/urlsnarf -v "/(192.168.1.1|https\://myapp\.appspot\.com)/" | sed -f /usr/bin/urlencode.sed | /usr/bin/urlmon.sh

The Where: Google App Engine

I’ve been itching to try out Google’s App Engine for a while now and this project seemed to be a great fit since I didn’t know how much data to expect and I needed my receiving/processing/display application to be highly available and scalable, especially if this works well enough that others might want to use it.

Since my initial phase is to merely capture the URLs requested from devices behind the router, and since the capture process should be as efficient and lean as possible (I don’t want the router to take very long logging a URL when its primary job is to retrieve that URL for the initial requester), I decided to make a simple Java servlet which takes the URL-encoded log line generated by URLSnarf and stores it.

Google App Engine uses Java Data Objects enhanced by DataNucleus to store data in Google’s massive cluster. Here is the annotated JDO (LogLine.java) I used to store the captured URL:

import javax.jdo.annotations.IdGeneratorStrategy;
import javax.jdo.annotations.IdentityType;
import javax.jdo.annotations.PersistenceCapable;
import javax.jdo.annotations.Persistent;
import javax.jdo.annotations.PrimaryKey;

@PersistenceCapable(identityType = IdentityType.APPLICATION)
public class LogLine {
	@PrimaryKey
	@Persistent(valueStrategy = IdGeneratorStrategy.IDENTITY)
	private Long id;

	@Persistent
	private String host;

	@Persistent
	private Long time;

	@Persistent
	private String line;

	public void setId(Long id) {
		this.id = id;
	}

	public Long getId() {
		return id;
	}

	public void setLine(String line) {
		this.line = line;
	}

	public String getLine() {
		return line;
	}

	public String getHost() {
		return host;
	}

	public void setHost(String host) {
		this.host = host;
	}

	public Long getTime() {
		return time;
	}

	public void setTime(Long time) {
		this.time = time;
	}
}

And here is the servlet that processes the GET request (containing the captured URL in Apache Common Log format):

import java.io.IOException;
import java.net.URLDecoder;

import javax.jdo.PersistenceManager;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

import com.werxltd.webmon.data.LogLine;

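public class Log extends HttpServlet {
	private static final long serialVersionUID = 3L;

	@Override
	public void doGet(HttpServletRequest req, HttpServletResponse resp)
			throws IOException {
		resp.setContentType("text/plain");

		PersistenceManager pm = null;
		try {
			// Copy the request parameters into a new LogLine entity.
			LogLine logline = new LogLine();
			// The one-argument URLDecoder.decode is deprecated, so name the charset.
			String logStr = URLDecoder.decode(req.getParameter("l"), "UTF-8");
			logline.setLine(logStr);
			logline.setHost(req.getParameter("h"));
			logline.setTime(Long.parseLong(req.getParameter("t")));

			// PMF is the application's PersistenceManagerFactory holder (not shown in this post).
			pm = PMF.get().getPersistenceManager();
			pm.makePersistent(logline);

			resp.getWriter().println("OK");
		} catch (Exception e) {
			e.printStackTrace();
			resp.getWriter().println("FAIL");
		} finally {
			// Release the persistence manager even if storing the line failed.
			if (pm != null) {
				pm.close();
			}
		}
	}
}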
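Since each stored line is in Common Log Format, pulling the client address and URL back out later comes down to one regular expression. The sketch below is not part of the original application: the class name LogLineParser and the sample line are made up, and the exact field layout URLSnarf emits should be checked against real output before relying on the pattern.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LogLineParser {
    // Fields: client, ident, user, [timestamp], "METHOD URL ...", then anything else.
    private static final Pattern CLF = Pattern.compile(
        "^(\\S+) \\S+ \\S+ \\[([^\\]]+)\\] \"([A-Z]+) ([^\\s\"]+)[^\"]*\".*$");

    // Returns {client, timestamp, method, url}, or null if the line does not match.
    static String[] parse(String line) {
        Matcher m = CLF.matcher(line);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2), m.group(3), m.group(4) };
    }

    public static void main(String[] args) {
        // Made-up sample line in Common Log Format.
        String sample = "192.168.1.23 - - [21/Nov/2009:10:15:32 -0500] "
            + "\"GET http://example.com/index.html\" - - \"-\" \"Mozilla/5.0\"";
        String[] fields = parse(sample);
        System.out.println(fields[0] + " requested " + fields[3]);
    }
}
```

Returning null for lines that do not match keeps a single malformed entry from aborting a larger report.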

The future

This project is still in its early stages. There is no real way to view the captured data just yet, though I plan on incorporating Polliwog, and the router software hasn’t been tested as much as I would like. I’m also leery of any security holes I may have introduced.

So if you have any suggestions or would like to know more, feel free to leave a comment below!

  1. Most actually ask about “controlling what their kids see online” but I generally argue for an observe-only approach, as it helps open lines of communication with your child, whereas silently blocking “bad” sites will only start a silent war which will only frustrate you once they do find a suitable workaround, such as a proxy.
  2. I might have had better luck had I seen this helpful guide. Oh well, this gives me a future project in figuring out how to de-brick my WHR-HP-54G.

Beginner’s guide to load testing

Recently I got tasked with load testing an internal system and producing statistics for the team to show how well it will scale once it is put into production.

After some intense research I decided to go with “The Grinder” which allows multiple tests to be run by multiple machines which can all funnel their collected statistics back up to a central “console”. Tests are written in Python which, in turn, gets fed through Jython and converted into native Java bytecode to be run by participating Grinder agent instances. Grinder works on a single, user-definable port, for both pushing scripts to listening agents as well as gathering statistics from tests.

Initially I decided to try to capture the results of each individual test in a MySQL database but abandoned that idea when the tests ended up overloading the MySQL database server before the web app we were primarily testing. Logging results also proved to be an interesting feat, since it swamped the agents’ filesystems after less than an hour (we were running multiple processes and threads).

We eventually settled on simply capturing the combined statistical data at the root console level (the way Grinder is designed) and displaying it via a plugin in Hudson.

Overall Grinder worked great for testing the load of our web app (which passed with flying colors). And since Grinder works natively in Java we are also planning on testing specific Java classes directly in the future as well as their overall performance through a web based front-end such as a servlet.

Tweet Later Mobile

I’ve recently become a big fan of SocialOomph (formerly Tweet Later), which is a Twitter client that allows users to, among other things, schedule timed tweets. Until I discovered SocialOomph I had been using Ping.fm, and I still do from time to time if I want to update a broader range of social media services. However, one thing I found lacking in SocialOomph was support for access from a mobile device.

Enter Tweet Later Mobile.

One of the best trends recently has been for sites to open up APIs to allow other developers like me to easily develop applications that interact with their service. So even though SocialOomph didn’t have a native way to access their service via a mobile device (specifically a mobile browser) they made building such an interface possible.

So check it out, let me know what you think. I built this utility for myself but I am interested in seeing if anyone else finds it useful. You’ll need to sign up with SocialOomph first and copy your API key to use when you register at Tweet Later Mobile. The rest should be self-explanatory but if you have any questions or comments, feel free to let us know!

SOAPjr Demo

A demo of SOAPjr using PHP/Symfony and ExtJS is now available at http://dev.communitybookshelf.org/

This demo showcases these custom components:

Questions/comments? We’d love to hear from you!

Running PHP in Java

Many might consider even the thought of running PHP inside of a Java Virtual Machine to be anathema. Others will wonder why bother (apart from the novelty). However, running PHP in Java has one crucial benefit: it future-proofs your code.

Quercus is a nifty utility that will allow you to run PHP code in clouds such as Google App Engine.[1] This means your Drupal and Wordpress sites can now be distributed across a highly available and scalable cloud infrastructure.

Now if we can only get an MVC framework like Kohana or Symfony to work on top of this system...

  1. Other great articles on running PHP in Google’s App Engine can be found here and here. IBM has also highlighted this utility.

Using Python with Hadoop

First, some review

Hadoop is a very powerful MapReduce framework based on a white paper released by Google documenting how they have successfully tackled the issue of processing large amounts of data (on the scale of petabytes in many cases) using their proprietary distributed filesystem, GFS. Hadoop is the open source version of this distributed file system,[1] heavily supported by companies like Yahoo, Google, Amazon, Adobe, Facebook, Hulu, IBM, RackSpace, etc., and has a growing number of related projects hosted by the Apache Foundation.

Why we need to learn “yet another language”

Yet, even with all of the buzz and hoopla, many people find it difficult to set up and start writing applications capable of leveraging the awesome power of a Hadoop cluster, and many find the learning curve of Java and the Hadoop APIs very steep.

Fortunately, one of the features available in Hadoop is HadoopStreaming, which allows programmers to specify any program (or script) as a mapper and/or reducer. Consequently, one of the most popular scripting languages to use alongside Hadoop is Python.[2]

One of the reasons Python is well suited to this type of work is its ability to be functional, provided you are careful how you write it. This makes chopping well-written Python map/reduce scripts up into distributable units much easier.

There’s a framework for that

While it is possible to write plain Python scripts, the folks at last.fm have helped create an excellent Python framework for Hadoop called Dumbo to help streamline the process of writing MapReduce jobs in Python. Dumbo seems to be a fairly simple framework with plenty of examples you can adapt to your particular needs.

There’s a framework for that too

Hadoop has many sub-projects, and one that is fairly popular is called HBase which allows a more structured, database-like, approach to storing and retrieving data. An excellent Python framework for quickly parsing data into HBase tables is Zohmg. This framework allows programmers to define tables in a YAML configuration file and corresponding mappers as simple Python scripts.

Bringing it back home

One of the biggest drawbacks to using HadoopStreaming is that it is inherently less optimal than writing MapReduce jobs in Java since the target script or application has to be initialized, the data then has to be serialized, sent to the target application/script, processed, and then sent back (if there are any reducers). All this context switching adds overhead that wouldn’t exist if the MapReduce job were kept in the JVM where Hadoop runs.

Jython is a viable answer for converting existing Python applications into Java bytecode to prevent incurring as much of a performance penalty. This utility can come in handy if you decide that your “quick and dirty” Python script needs to be moved into a production environment.

  1. Technically Hadoop is an umbrella name whereas HDFS is the technical name for the GFS alternative.
  2. If you aren’t familiar with Python and want to learn, here is an excellent site for diving into the language and here is an excellent video series walking you through the basics.