Saturday, August 4, 2012

Dealing with International Characters in Web Services [Java]

Recently, we had to develop an API in which a resource (in RESTful terms) is created with an emailId as its primary key and is later looked up by that emailId.

There is an RFC that allows email ids to have international characters:

We develop our web services as Java Servlets using the Spring Framework and deploy them on Tomcat. The Create (resource) API takes the emailId in the request body, and the lookup API is an HTTP GET call with the emailId in the URL. To support international characters in our API, we had to do the following.

To get the Create (resource) API working, the client needs to send the request with the charset parameter in the Content-Type header:
Content-Type: application/json; charset=utf-8

This instructs the Servlet container to treat the request body as a UTF-8 encoded string. Without the charset parameter, the encoding is assumed to be ISO-8859-1 (the default for Java Servlets). If you ever have to store the string in a DB or encrypt it, remember to convert it to bytes using UTF-8 explicitly, for example with String.getBytes("UTF-8"), rather than the no-argument String.getBytes(), which uses the platform default charset.
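To make the difference concrete, here is a minimal sketch (not our actual service code) of what happens when the same UTF-8 bytes are decoded with and without the right charset. The class name, helper methods, and the sample emailId are made up for illustration, and the decode step is only a simplification of what the container does with the request body.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetDemo {

    // Encode with an explicit charset instead of the platform default.
    static byte[] toUtf8Bytes(String value) {
        return value.getBytes(StandardCharsets.UTF_8);
    }

    // Decode raw body bytes using the given charset (a simplified stand-in
    // for what the container does when it reads the request body).
    static String decode(byte[] body, Charset charset) {
        return new String(body, charset);
    }

    public static void main(String[] args) {
        // Hypothetical emailId; \u00e9 is the character "é".
        String emailId = "jos\u00e9@example.com";
        byte[] body = toUtf8Bytes(emailId);

        // 16 characters, but 17 bytes: "é" takes two bytes in UTF-8.
        System.out.println(body.length);

        // Decoding with the charset the client declared round-trips cleanly.
        System.out.println(decode(body, StandardCharsets.UTF_8).equals(emailId)); // true

        // Decoding with the ISO-8859-1 default turns "é" into two characters.
        System.out.println(decode(body, StandardCharsets.ISO_8859_1).equals(emailId)); // false
    }
}
```

StandardCharsets.UTF_8 is used here instead of the "UTF-8" string name only because it avoids the checked UnsupportedEncodingException; both select the same encoding.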

Now for the lookup API, where the emailId is in the URI, you need to keep two things in mind:
  1. Special characters in URIs are percent-encoded. A non-ASCII character is typically converted to its byte sequence in UTF-8, and then each byte value is percent-encoded.
  2. Tell the Servlet container to decode URIs as UTF-8. In Tomcat, this can be done by setting URIEncoding on the HTTP connector in server.xml:
    <Connector connectionTimeout="20000" port="8080" 
    protocol="HTTP/1.1" redirectPort="8443" URIEncoding="UTF-8"/>
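The two points above can be sketched in plain Java. This is an illustration under my own assumptions, not Tomcat's implementation: encodePathSegment and decodePathSegment are hypothetical helpers, the emailId is invented, and the encoder simply escapes every byte outside the RFC 3986 "unreserved" set. The charset passed to the decoder plays the role of the URIEncoding setting.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class PercentEncodingDemo {

    // Client side: convert to UTF-8 bytes, then percent-encode each byte
    // that is not an RFC 3986 unreserved character.
    static String encodePathSegment(String value) {
        StringBuilder out = new StringBuilder();
        for (byte b : value.getBytes(StandardCharsets.UTF_8)) {
            int v = b & 0xFF;
            boolean unreserved = (v >= 'A' && v <= 'Z') || (v >= 'a' && v <= 'z')
                    || (v >= '0' && v <= '9')
                    || v == '-' || v == '.' || v == '_' || v == '~';
            if (unreserved) {
                out.append((char) v);
            } else {
                out.append('%').append(String.format("%02X", v));
            }
        }
        return out.toString();
    }

    // Server side: undo the percent-encoding to get raw bytes, then
    // interpret those bytes using the configured charset.
    static String decodePathSegment(String encoded, Charset charset) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        for (int i = 0; i < encoded.length(); i++) {
            char c = encoded.charAt(i);
            if (c == '%') {
                bytes.write(Integer.parseInt(encoded.substring(i + 1, i + 3), 16));
                i += 2; // skip the two hex digits just consumed
            } else {
                bytes.write(c);
            }
        }
        return new String(bytes.toByteArray(), charset);
    }

    public static void main(String[] args) {
        // Hypothetical emailId; \u00e9 is the character "é".
        String emailId = "jos\u00e9@example.com";

        String encoded = encodePathSegment(emailId);
        System.out.println(encoded); // jos%C3%A9%40example.com

        // With UTF-8 the original emailId comes back.
        System.out.println(decodePathSegment(encoded, StandardCharsets.UTF_8).equals(emailId)); // true

        // With ISO-8859-1 the lookup key no longer matches what was stored.
        System.out.println(decodePathSegment(encoded, StandardCharsets.ISO_8859_1).equals(emailId)); // false
    }
}
```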
    

For additional insight, the StackOverflow answer below has some great nuggets of knowledge:
http://stackoverflow.com/questions/138948/how-to-get-utf-8-working-in-java-webapps/138950#138950

I did try the CharacterEncodingFilter approach but couldn't get it to work. I set it up as the first filter and also enabled the forceEncoding option, but it didn't seem to have any effect.
