On-Web Hashtable … What I’m Up To

I tweeted a bit about a need for a “hashtable” on the Web. My needs are so constrained that it was hard to give me good answers. (Unless you were all just trying not to give me good answers. (No, that’s not possible. You’re nice people. Must be my fault.)) So here’s a bit more about what I’m up to.

The Constraints

I’m trying to build a little script in Linden Scripting Language, LSL, the scripting language for Second Life. LSL brings new meaning to the word “rudimentary”.  There is a very limited built-in library of calls, mostly aimed at making things in Second Life move or talk or such. As for web connections, you can send and receive short emails, and you can make an http request. This request has a URL, a few parameters, and a body. Parameters include HTTP_METHOD (GET, POST, PUT, DELETE), HTTP_MIME_TYPE (text/* and apparently application/x-www-form-urlencoded), HTTP_BODY_MAXLENGTH (not supported, locked at 2048), and HTTP_VERIFY_CERT.

The x-www… MIME type lets you URL-encode a body as var=value&var2=value2 kinds of strings.
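
For instance, a request might look something like this; the endpoint URL is made up, but the calls and constants are standard LSL:

// minimal sketch: POST a form-encoded body to a hypothetical endpoint
key gRequest;

default {
    touch_start(integer total_number) {
        gRequest = llHTTPRequest(
            "http://example.com/store",   // hypothetical URL
            [HTTP_METHOD, "POST",
             HTTP_MIME_TYPE, "application/x-www-form-urlencoded"],
            "var=" + llEscapeURL("value") + "&var2=" + llEscapeURL("value2"));
    }

    http_response(key id, integer status, list meta, string body) {
        if (id == gRequest)
            llOwnerSay((string)status + ": " + body);
    }
}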

I want to use an existing on-line service, such as Amazon S3 or one of the hosted CouchDB or MongoDB services. Free usage at low volumes is a plus. Low cost usage at low volumes is a must.  Access using this rudimentary capability is, of course, a sine qua non.

I’m just messing around, so I do not have the time or inclination to build something on my own site to host it and respond to a language of my own invention. Something out of the box is what I’m looking for.

The Problem

It should go without saying that a scripting language this limited also has limited storage. What I’d like to do is store a moderate number (around a thousand) of strings, by key. The key will be some unique string, the value another string. Keys run around 20 or 30 characters, values around 200 or 300.

The script needs to be able to get a value string from the Web system, using the key, to put one back giving the key and value, and to delete a key. Simple hashtable kind of function.

The LSL language has an http_response “event” that will receive any response headers and the response body. My guess is that the body has to come back with some kind of text content type. I know for sure that LSL will not receive XML, JSON, Atom, RSS, or PLS content types.

Once the string is back, one can do reasonable parsing of it. Imagine what you could do with simple string functions: pretty much anything if you have the time.
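
For instance, digging one value out of a var=value&var2=value2 body takes only stock list and string calls; here is a sketch (the function is my own invention):

// sketch: find the value for "name" in a form-encoded body
string valueFor(string body, string name) {
    list pairs = llParseString2List(body, ["&"], []);
    integer i;
    integer count = llGetListLength(pairs);
    for (i = 0; i < count; ++i) {
        list kv = llParseString2List(llList2String(pairs, i), ["="], []);
        if (llList2String(kv, 0) == name)
            return llUnescapeURL(llList2String(kv, 1));
    }
    return "";
}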

So I’m looking for something on the Web, free or cheap, working with simple get, put, delete by key, storing a fairly small number of fairly short string objects, that can be accessed via a very simple text-focused URL request, returning a text-like response.

But I’m Very Dumb …

It does look to me as if S3 or MongoDB or CouchDB can probably do what I want. But since I have zero tools available in LSL, and little time to play on my PC to try to figure out what the URLs have to look like for get, put, and delete, I’m looking for a system whose docs show what the requests and responses actually look like, or for some examples that do the same.

Very lazy, I know. This is a tiny little hobby, and I can’t be investing hours and hours in it.

Got advice or ideas? Comments are open. Thanks!


Written by: Ron Jeffries

Categorization: Articles

11 Responses to “On-Web Hashtable … What I’m Up To”

Ron Jeffries

May 10, 2011

10:42 am


Thanks, Raul, I’ll have a good look at these. Simple examples will be way high on my list. I feel like I just need one random example of a get/put/delete to build on. I can make a string that looks like anything … what I don’t know yet is what the string has to look like! Thanks again!

Sam Leitch

May 10, 2011

11:12 am


Unfortunately, this one has piqued my interest. You’re stuck with me :)

It would help to have a little more context around the data.

How often does the data need to be updated?
Does it need to be globally shared, or is it more of a backup/save?
Is there any way to organize it into chunks?

1000s of 200-300 byte objects are a little ugly for something like S3, but 100s of 2000-3000 byte strings would be fine. The free tier gives 2000 updates per month and 20000 downloads (though another 1000 updates costs only $0.01).

So, if there’s a way to break up the data into chunks, you could store each chunk in S3 (using a group-key to organize them). You could use a list of key/value pairs to store the data in the script and push it to the cloud whenever it needs to be stored.
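
A sketch of flattening such a list into one chunk body might look like this (the helper name is made up):

// sketch: flatten a strided [key, value, ...] list into a
// var=value&var2=value2 body for one stored chunk
string encodeChunk(list kv) {
    string body = "";
    integer i;
    integer count = llGetListLength(kv);
    for (i = 0; i < count; i += 2) {
        if (body != "") body += "&";
        body += llEscapeURL(llList2String(kv, i)) + "=" + llEscapeURL(llList2String(kv, i + 1));
    }
    return body;
}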

In your case, you would need to use the Query String Request Authentication Alternative described at the bottom of http://docs.amazonwebservices.com/AmazonS3/latest/dev/index.html?RESTAuthentication.html

This method only uses the Method, URL, and Body. It does require an SHA-1 hash, but a brief hit of the LSL wiki shows an llSHA1String, so you’re good.

The basic structure would look something like this.

For PUT:
Create an HTTP_REQUEST
The URL starts with /group-key
The URL includes the Authentication details
The Method is PUT
The Content-Type is application/x-www-form-urlencoded
The body is your key/value pairs, which will magically be encoded as an HTML form

For GET:
The same reversed :)
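
In LSL, the PUT might be sketched like so; the bucket name and {authentication_details} are placeholders you’d fill in:

// sketch of the PUT described above
putChunk(string groupKey, string body) {
    llHTTPRequest(
        "http://s3.amazonaws.com/your-bucket-name/" + groupKey + "?{authentication_details}",
        [HTTP_METHOD, "PUT",
         HTTP_MIME_TYPE, "application/x-www-form-urlencoded"],
        body);
}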

That only works if the data is treated as a long-term save. However, if you want shared data, the process would be very similar.

David Hoppe

May 10, 2011

11:57 am


Interesting problem, and maybe I am thinking too simplistically, but this is what I might do: sign up for a web hosting site and publish all my keys to the web host. A key would look like http://user:pass@xyzwebhost.com/keys/jkf298djk4, where http://user:pass@xyzwebhost.com/keys/ is a base URL and the rest is the key. You may have to keep several directories for the keys, depending on how many you have and how many files per directory the host can deal with. Then you can use llHTTPRequest to GET and PUT a file, whose contents are the value, at the URL (the key).
When you are done with a value, use DELETE to get rid of it, or PUT a special empty file into it.
Key management may be a special problem: coming up with a way to make a new key name, and getting a list of current keys. If there is a static list of keys you will be using, this is a non-issue; otherwise, let’s think about it more.

In short, the file name is the key and the contents of the file are the value.
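
A rough LSL sketch of that scheme (the host and function names are invented):

// sketch: the URL path names the key; the request body carries the value
string base = "http://user:pass@xyzwebhost.com/keys/";

putValue(string k, string v) { llHTTPRequest(base + k, [HTTP_METHOD, "PUT"], v); }
key getValue(string k) { return llHTTPRequest(base + k, [HTTP_METHOD, "GET"], ""); }
deleteValue(string k) { llHTTPRequest(base + k, [HTTP_METHOD, "DELETE"], ""); }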

You probably have this already, but it looks helpful: http://www.lslwiki.net/lslwiki/wakka.php?wakka=llHTTPRequest. My user:pass idea may be too simplistic; they show a call to an escape function, so check it out.

While it would be fun and interesting to use an S3 or MongoDB, maybe that hammer is too big.

Ron Jeffries

May 10, 2011

2:27 pm


Sam: If it solves my problem, I’m happy to be stuck with you. Updates, after initial loading, will be a few per day. Unfortunately, I need direct access by my base keys, I think, though I might be able to work up a key partitioning that would let me read a batch. Why does S3 not like my basic setup plan?

Keys are discovered and initially stored essentially randomly: I will need to create the initial stored objects directly from LSL. Monthly updates will likely be well below 2000.

My keys will come from detection in-world, so I do not have to save them. I’m basically retrieving previously discovered information about a key that has turned up again.

I’m not sure what you mean by “treated as a long-term save”. What happens if I put some new data to that key? Does Amazon start charging me shipping again or something? :)

Thanks … I’m getting it, sort of …

Ron Jeffries

May 10, 2011

2:30 pm


David: Thanks … I suspect I could just about do what you suggest on my existing web site, perhaps with a little help from the hosting guys setting up suitable folders. I will “only” have a couple of thousand keys max, initially more like 500. Partitioning them into folders may be tricky but I suppose I could checksum them or something to decide where to put one.

I’m not quite clear on just what the format is of the http request to put, get, or delete a file. Can you give me a bit more of a push on that?

I agree that S3/Mongo/Couch is a big hammer, but they are also kind of fun, and could be good to know for some more godly purpose.

Thanks!

Sam Leitch

May 10, 2011

3:04 pm


Haha, nope, Amazon doesn’t do anything crazy. In fact, what I’m describing is very similar to what David describes. The big difference is that Amazon requires some extra authentication on every request, whereas your own hosting doesn’t have that restriction.

By long-term save, I was thinking of something equivalent to user preferences: something accessed when the user logs in and saved on demand or when the user logs out. The data would then have a natural group-key, which is the user id. That would be different from a shared hashtable, where multiple users access the same shared data set with no natural grouping of data. It sounds like your problem is more the shared hashtable type.

Like David said, you can just GET or PUT data to the URL http://s3.amazonaws.com/your-bucket-name/key?{authentication_details} where you choose the bucket name and key. The body of the message is the string you want to store.

To store data:
In the llHTTPRequest
url: http://s3.amazonaws.com/your-bucket-name/key?{authentication_details}
body: The string you want to save (Any string will do, no special formatting needed)
HTTP_METHOD: PUT
HTTP_MIME_TYPE: text/plain;charset=utf-8

In http_response
ensure that status is not 400+

To retrieve data:
In the llHTTPRequest
url: http://s3.amazonaws.com/your-bucket-name/key?{authentication_details}
HTTP_METHOD: GET

In http_response
ensure that status is not 400+
body is your data (The same string you PUT without any additions)

That’s it. The hard part is really figuring out the authentication portion. It appears to be a combination of MD5 and SHA-1 hashes, along with llGetUnixTime, your body, and your key.
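
Putting the steps above into LSL shape, with the authentication query string left as a placeholder you’d still have to build:

// sketch of the store/retrieve calls; {authentication_details}
// stands for the signed query string you must construct yourself
string s3base = "http://s3.amazonaws.com/your-bucket-name/";

key s3Put(string k, string v) {
    return llHTTPRequest(s3base + k + "?{authentication_details}",
        [HTTP_METHOD, "PUT",
         HTTP_MIME_TYPE, "text/plain;charset=utf-8"],
        v);
}

key s3Get(string k) {
    return llHTTPRequest(s3base + k + "?{authentication_details}",
        [HTTP_METHOD, "GET"], "");
}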

David Hoppe

May 10, 2011

8:48 pm


Found a little more interesting stuff in the link referenced: there is a service that looks like it will do exactly what you want, and they also have sample LSL code. See http://w-hat.com/httpdb. I will paraphrase here:

string DBNAME = "http://user:pass@somefakeserver.domain/database/";
key requestId;

// save a value to the database:
saveToHash(string name, string value) {
    llHTTPRequest(DBNAME + name, [HTTP_METHOD, "PUT"], value);
}

// use this to request data from the web hash
getFromHash(string name) {
    requestId = llHTTPRequest(DBNAME + name, [HTTP_METHOD, "GET"], "");
}

// this is an event loop; when the data has been received from the server,
// http_response will be called.
default {
    http_response(key reqid, integer status, list meta, string body) {
        // this would be a good place to check "status" (200 is good)
        if (reqid == requestId) {
            // body contains the value for the requested key.
        }
    }
}

// end of the code.

This code does have a little issue: it appears that there can only be one read request “in flight” at a time. Overcome this by associating a request ID with the key name being loaded and resolving that association in the event loop.
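
A sketch of that association, reusing DBNAME from the code above (the bookkeeping names are my invention):

// sketch: track in-flight requests in a strided list of
// [request-id, key-name] pairs, resolved in http_response
list pending;

fetchValue(string name) {
    key id = llHTTPRequest(DBNAME + name, [HTTP_METHOD, "GET"], "");
    pending += [id, name];
}

default {
    http_response(key reqid, integer status, list meta, string body) {
        integer i = llListFindList(pending, [reqid]);
        if (i == -1) return; // not one of our requests
        string name = llList2String(pending, i + 1);
        pending = llDeleteSubList(pending, i, i + 1);
        // body now holds the value for "name"
    }
}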

I think this would work fine with the S3 service; you just have to build the URL a little differently, maybe something like this:

// globalish things
string DBNAMEPrefix = "http://s3.amazonaws.com/your-bucket-name/";
string DBNAMESuffix = "?{auth_details}";
string DBKey = "hashKeyName"; // set this however you need to
string myURL = "nothing?";

// call this, then you can use myURL for the first
// argument of saveToHash or getFromHash
buildURL(string name) {
    myURL = DBNAMEPrefix + name + DBNAMESuffix;
}

// end code

That part involves a LOT of guessing about LSL code; the warranty period has already expired ;-)

some reference material:
http://www.lslwiki.net/lslwiki/wakka.php?wakka=llHTTPRequest
http://lslwiki.net/lslwiki/wakka.php?wakka=http_response
and the one I used a lot of:
http://w-hat.com/httpdb
(these people may be able to hold your entire database if 250k is enough, and it may not be, given your estimates)

Before you get too deep in LSL code, you can test GETs and PUTs from a command line with “curl”:

GET with curl:
curl http://somefakewebsite.domain/dbname/keyname

PUT with curl:
curl -T file_with_sample_data http://somefakewebsite.domain/dbname/keyname
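
DELETE can presumably be tested the same way, if the host supports it (curl’s -X flag sets the method):

DELETE with curl:
curl -X DELETE http://somefakewebsite.domain/dbname/keyname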

I did find out that PUT may not be as common a feature as I thought for a hosted website; be sure to check with a provider to see if they support PUT.

The Amazon stuff looks pretty cool; I will be checking that out myself.

Make sure to have a lot of fun with this.

David

David Hoppe

May 10, 2011

8:51 pm


Oh yeah, if you are on a Linux-based host, you’re limited to 31998 files (assuming an ext2/ext3 file system), so don’t worry about separating into separate directories. I always worry about that because I burned myself in the past on a system that was unhappy to have more than a very small number of files in each directory.

Have a great day,

David

Marty Nelson

May 11, 2011

4:21 pm


I recommend using node.js and any of the numerous free hosting sites. For quick and dirty in 5 minutes, go to jsapp.us and try the following code (then hit ctrl+b):

var http = require('http'),
    url = require('url');
var store = {};

http.createServer(function (req, res) {
    var query = url.parse(req.url, true).query;
    res.writeHead(200, {'Content-Type': 'text/plain'});
    if (!(query.key)) {
        res.end('No key provided');
    } else if (query.value) {
        store[query.key] = query.value;
        res.end('Added Key ' + query.key + ' with value ' + query.value);
    } else {
        res.end(store[query.key] || 'Invalid key');
    }
}).listen();

Now you have a working endpoint that you can test against from your script.

I used GET for everything, using ?key=foo to get and ?key=foo&value=bar to set. You can test req.method if you want to use other http verbs:

switch (req.method) {
    case 'GET':
        res.end(store[query.key] || 'Invalid Key');
        break;
    case 'PUT':
        // ...

I just used an in-memory object. If this sounds good, there are really easy stores you can use instead.
