My URL Shortener

One of the projects that I've been working on recently involved coming up with a multi-purpose URL shortener, like bit.ly's or tinyurl's. I had put a good bit of thought into it, but I haven't gotten around to actually implementing it for the project yet. I know the concept is sound and lots of other people use it, so it ultimately comes down to how short the code can be while still being long enough to cover (and then some) what you plan on using it for.

So I got kind of bored and decided to implement a command-line URL shortener that ties into my website (tesuji.org) and give it a whirl. I opted to be ghetto-extreme and use not 3 but only 2 characters in my shortened URLs. I can shorten anything that I want, like http://tesuji.org/qy or http://tesuji.org/LN. I'm using the unreserved and sub-delimiter characters from RFC 3986 (well, most of them), since in a shortener context, at least in mine, most of those sub-delimiters are useless as delimiters. So I've gone ahead and made shortened URLs for all of my archived news posts:

http://tesuji.org/lP => http://www.tesuji.org/old_news/android_wallpaper.html
http://tesuji.org/XW => http://www.tesuji.org/old_news/brett_keisel.html
http://tesuji.org/ax => http://www.tesuji.org/old_news/cinco_de_mayo_2005.html
http://tesuji.org/fM => http://www.tesuji.org/old_news/cinco_de_mayo_2006.html
http://tesuji.org/HT => http://www.tesuji.org/old_news/cinco_de_mayo_2007.html
http://tesuji.org/wu => http://www.tesuji.org/old_news/end_of_wudan.html
http://tesuji.org/Bt => http://www.tesuji.org/old_news/evageeks.html
http://tesuji.org/=k => http://www.tesuji.org/old_news/everquest_screenshots.html
http://tesuji.org/oH => http://www.tesuji.org/old_news/filing_technique.html
http://tesuji.org/tq => http://www.tesuji.org/old_news/firefox_and_ctrl_q.html
http://tesuji.org/Ph => http://www.tesuji.org/old_news/fractals.html
http://tesuji.org/2E => http://www.tesuji.org/old_news/fractals_and_electropaint.html
http://tesuji.org/7n => http://www.tesuji.org/old_news/gt4.html
http://tesuji.org/j_ => http://www.tesuji.org/old_news/hair_ball.html
http://tesuji.org/Uf => http://www.tesuji.org/old_news/honeypot.html
http://tesuji.org/Z~ => http://www.tesuji.org/old_news/index.html
http://tesuji.org/d5 => http://www.tesuji.org/old_news/jerminal.html
http://tesuji.org/Ec => http://www.tesuji.org/old_news/kerafyrm_the_sleeper_re-revisited.html
http://tesuji.org/J, => http://www.tesuji.org/old_news/lazy.html
http://tesuji.org/z2 => http://www.tesuji.org/old_news/libmatroska_and_mplayer.html
http://tesuji.org/;9 => http://www.tesuji.org/old_news/linux_certified_rant.html
http://tesuji.org/*! => http://www.tesuji.org/old_news/linux_edid_nvidia.html
http://tesuji.org/qR => http://www.tesuji.org/old_news/lupin_the_third.html
http://tesuji.org/MY => http://www.tesuji.org/old_news/mail_reader.html
http://tesuji.org/Rz => http://www.tesuji.org/old_news/mail_reader_update.html
http://tesuji.org/4O => http://www.tesuji.org/old_news/mail_reader_update2.html
http://tesuji.org/gV => http://www.tesuji.org/old_news/mail_reader_update3.html
http://tesuji.org/lw => http://www.tesuji.org/old_news/mandelbrot_generator.html
http://tesuji.org/WL => http://www.tesuji.org/old_news/metacity_is_lame.html
http://tesuji.org/aS => http://www.tesuji.org/old_news/mobile_device.html
http://tesuji.org/fJ => http://www.tesuji.org/old_news/new_laptop.html
http://tesuji.org/Gs => http://www.tesuji.org/old_news/new_laptop_2.html
http://tesuji.org/wj => http://www.tesuji.org/old_news/no_more_ctwm.html
http://tesuji.org/BG => http://www.tesuji.org/old_news/no_more_honda.html
http://tesuji.org/.p => http://www.tesuji.org/old_news/norio_wakamoto.html
http://tesuji.org/og => http://www.tesuji.org/old_news/old_recordings.html
http://tesuji.org/tD => http://www.tesuji.org/old_news/petr_sykora_and_the_pens.html
http://tesuji.org/Om => http://www.tesuji.org/old_news/random_plus_superbowl.html
http://tesuji.org/27 => http://www.tesuji.org/old_news/setting_fires.html
http://tesuji.org/7e => http://www.tesuji.org/old_news/ssh_brute_force_attacks.html
http://tesuji.org/i= => http://www.tesuji.org/old_news/summercon_2003.html
http://tesuji.org/U4 => http://www.tesuji.org/old_news/summercon_2004.html
http://tesuji.org/Zb => http://www.tesuji.org/old_news/summercon_2007.html
http://tesuji.org/c; => http://www.tesuji.org/old_news/summercon_2008.html
http://tesuji.org/E1 => http://www.tesuji.org/old_news/summercon_2011.html
http://tesuji.org/J8 => http://www.tesuji.org/old_news/the_sleeper.html
http://tesuji.org/yB => http://www.tesuji.org/old_news/the_sleeper_revisted.html
http://tesuji.org/$Q => http://www.tesuji.org/old_news/toshokan_downloader.html
http://tesuji.org/*X => http://www.tesuji.org/old_news/vintage_gp.html

The whole thing is really ghetto. It's just a bash script that I run on the command line that calls a small Java program that does some simple math and base conversion, then tickles a few files on the file system. The script will even return an old shortened URL if I ask for the same URL to be shortened again, so asking for http://myanimelist.net/animelist/Ornette&show=0&order=4 to be shortened a second time will still result in the same http://tesuji.org/RU. I guess that's not bad for 20 minutes of work.
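
For anyone curious what "simple math and base conversion" could look like, here's a rough sketch. It is not the actual program. The class name ShortCodes, the exact alphabet (the 66 unreserved characters plus six sub-delimiters, 72 total), the sequential counter, and the in-memory map standing in for the files on disk are all assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative two-character short-code generator; every name here is hypothetical. */
final class ShortCodes {
    // Assumed alphabet: the 66 RFC 3986 unreserved characters plus six
    // sub-delimiters. The real script's exact character set is not given.
    static final String ALPHABET =
        "abcdefghijklmnopqrstuvwxyz"
        + "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
        + "0123456789"
        + "-._~"
        + "!$*,;=";

    static final int BASE = ALPHABET.length();   // 72 symbols
    static final int CAPACITY = BASE * BASE;     // 72 * 72 = 5184 possible codes

    /** Base conversion: maps an index in [0, CAPACITY) to a two-character code. */
    static String encode(int index) {
        if (index < 0 || index >= CAPACITY) {
            throw new IllegalArgumentException("index out of range: " + index);
        }
        return "" + ALPHABET.charAt(index / BASE) + ALPHABET.charAt(index % BASE);
    }

    /** Inverse of encode: maps a two-character code back to its index. */
    static int decode(String code) {
        if (code.length() != 2) {
            throw new IllegalArgumentException("expected 2 characters: " + code);
        }
        int hi = ALPHABET.indexOf(code.charAt(0));
        int lo = ALPHABET.indexOf(code.charAt(1));
        if (hi < 0 || lo < 0) {
            throw new IllegalArgumentException("character outside alphabet: " + code);
        }
        return hi * BASE + lo;
    }

    // In-memory stand-in for whatever the real script keeps on the file system.
    private final Map<String, String> codeByUrl = new LinkedHashMap<>();

    /** Returns the existing code for a URL, or assigns the next unused one. */
    String shorten(String url) {
        String existing = codeByUrl.get(url);
        if (existing != null) {
            return existing;                       // same URL, same short code
        }
        String code = encode(codeByUrl.size());    // next sequential index
        codeByUrl.put(url, code);
        return code;
    }

    public static void main(String[] args) {
        ShortCodes shortener = new ShortCodes();
        String first = shortener.shorten("http://www.tesuji.org/anime.html");
        String again = shortener.shorten("http://www.tesuji.org/anime.html");
        System.out.println(first.equals(again));   // repeated requests reuse the code
        System.out.println("capacity: " + CAPACITY);
    }
}
```

With a 72-character alphabet, two characters give 72 * 72 = 5184 codes, which is the "is it long enough" question in concrete terms. This sketch hands out codes in order, so it would not reproduce the scattered-looking codes listed above; getting that effect would need an extra scrambling step that I haven't shown here.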

Unfortunately, since this URL shortening solution is customized for me only, it's pretty much 99.9% useless to everyone else who isn't running a custom CMS (e.g. emacs). If anyone wants to actually see the Java code, let me know. It's simple enough that it's not worth porting to a more commonly used content generation language like PHP; you could write it from scratch yourself just as easily. The more tedious part is integrating it into a site.

Here are a couple of other shortened pages:
http://tesuji.org/4v => http://www.tesuji.org/anime.html
http://tesuji.org/-K => http://www.tesuji.org/autotrain.html
http://tesuji.org/ll => http://www.tesuji.org/cockboat.html
http://tesuji.org/WI => http://www.tesuji.org/doublechicken.html
http://tesuji.org/9r => http://www.tesuji.org/everquest.html
http://tesuji.org/fi => http://www.tesuji.org/france_2005.html
http://tesuji.org/GF => http://www.tesuji.org/gatsby.html
http://tesuji.org/vo => http://www.tesuji.org/lord_stanley.html
http://tesuji.org/B- => http://www.tesuji.org/monopoly.html
http://tesuji.org/.C => http://www.tesuji.org/ohshit_its_rei.html
http://tesuji.org/n* => http://www.tesuji.org/omlette.html
http://tesuji.org/t6 => http://www.tesuji.org/picard_facepalm.html
http://tesuji.org/Od => http://www.tesuji.org/skate2.html
http://tesuji.org/1. => http://www.tesuji.org/thinkpad_t60p.html
http://tesuji.org/73 => http://www.tesuji.org/thinkpad_w700.html
http://tesuji.org/ia => http://www.tesuji.org/top_ten_best_words_in_the_english_language.html
http://tesuji.org/T$ => http://www.tesuji.org/shanghai_2006/shanghai_2006.html
http://tesuji.org/Z0 => http://www.tesuji.org/japan_2007/japan_2007.html
http://tesuji.org/cZ => http://www.tesuji.org/thailand_2009/thailand_2009.html
http://tesuji.org/yW => https://market.android.com/details?id=org.jonlin.wallpaper

Probably the best thing of all is that, at least for a little while, I can send people shortened URLs and they'll think, "oh, it's one of Jon's links", click on it and BLAMO, gay porn.

Filed under: Computers
8/16/2011

