There are a number of URL shortening services available these days, from the good old tinyurl to something really short like u.nu. But when you get a URL shortened by any of these services, you don't know where your browser is taking you! So if you want to figure out the original URL hiding behind a short URL, you need to know a little about how these services actually work. When you visit one of these short URLs, the service answers with an HTTP 30X redirect status (some add an "Object has moved" message, some don't) and tells your browser to go to the original URL through the "Location" HTTP header. So all you have to do is fetch the HTTP headers first (PHP and cURL are pretty good at this, heh heh) and then parse the "Location" field out of them.
Let's see how that works in code:
<?php
$url = "http://tinyurl.com/2dfmty";

// Ask for the response headers and do not follow the redirect ourselves
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_HEADER, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, false);
$data = curl_exec($ch);

$pdata = http_parse_headers($data);
echo "Short URL: {$url}<br/>";
echo "Original URL: {$pdata['Location']}";

// Parse a raw header block into an array keyed by normalized header name
function http_parse_headers($header)
{
    $retVal = array();
    // Unfold continuation lines, then split into individual header fields
    $fields = explode("\r\n", preg_replace('/\x0D\x0A[\x09\x20]+/', ' ', $header));
    foreach ($fields as $field) {
        if (preg_match('/([^:]+): (.+)/m', $field, $match)) {
            // Normalize the name, e.g. "content-type" becomes "Content-Type"
            $match[1] = ucwords(strtolower(trim($match[1])), " \t-");
            if (isset($retVal[$match[1]])) {
                $retVal[$match[1]] = array($retVal[$match[1]], $match[2]);
            } else {
                $retVal[$match[1]] = trim($match[2]);
            }
        }
    }
    return $retVal;
}
?>
Now you see that the output of this code is:
Short URL: http://tinyurl.com/2dfmty
Original URL: http://ghill.customer.netspace.net.au/embiggen/
Pretty interesting, huh? If you analyze the full headers for each of these services, you will find that most of them run PHP on the backend with Apache. Only http://u.nu uses mod_rails (hence RoR), and bit.ly uses nginx :)
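If you want to check the server software yourself, the "Server" response header is usually where it shows up. Here is a minimal sketch, assuming you already have the raw header text (for example the $data string from the cURL call). The function name server_software() and the sample headers are made up for illustration.

```php
<?php
// Hypothetical helper (not from the original post): pull the Server header
// out of a raw response header block.
function server_software(string $rawHeaders): ?string
{
    // Header names are case-insensitive, hence the /i flag;
    // /m lets ^ and $ anchor on each header line.
    if (preg_match('/^Server:[ \t]*(.*?)[ \t]*\r?$/im', $rawHeaders, $m)) {
        return $m[1];
    }
    return null; // the server did not send a Server header
}

// Made-up sample response, just to show the call
$sample = "HTTP/1.1 301 Moved Permanently\r\n"
        . "Server: nginx\r\n"
        . "Location: http://example.com/\r\n\r\n";
echo server_software($sample), "\n"; // prints "nginx"
```

Keep in mind that a server can omit or rewrite this header, so treat the value as a hint only.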
Have fun expanding!
Another great post :)
Yah.. another one…
Hasin bhai Rocks~ :)
awesome tips
That is too good, Hasin bhai :). I didn't think about it like this before.
Very cool using cURL! Alternatively, you can use:
$url = "http://tinyurl.com/2dfmty";
$realLocation = get_headers($url, 1);
echo $realLocation['Location'];
if you just want the URL. But there definitely is a lot that can be done with cURL. Thanks! =)
@iDayDream – that's also cool, and even easier when you don't have cURL available in your hosting account! I didn't notice the get_headers() function before!! Thanks :)
@iDayDream really, that's cool :)
What a trix!
Thanks guru for another interesting post. I knew that they use something like PHP's header location function to do this type of job; now it's clearer to all of us. :)
@iDayDream thanks for the short way, I didn’t know the get_headers() function before. This function really impressed me a lot.
Hasin bhai,
I was getting the following error while I was just running your code.
Warning: preg_replace() [function.preg-replace]: Compilation failed: unrecognized character after (?< at offset 3 in C:\xampp\htdocs\lifeundersun.php on line 18
totally new things for me, even get_headers()! thanks hasin vai and idaydream
Too bad this doesn’t always work… Like on digg.com shorteners!!! Stupid framed crap. Also if you do it a lot on tr.im they will ip block you. At least that’s my experience.
Thanks, Hasin Vaiya.
http://bit.ly has a rich set of APIs, including the "/expand" API. It lets you expand bit.ly URLs using either short URLs or a bit.ly hash.
Another way is to send only a HEAD request and parse the headers. I don't know how to send a HEAD request with cURL, but it can be done using the stream_* functions.
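For what it's worth, cURL can send a HEAD request too: setting CURLOPT_NOBODY makes it skip the body. Below is a rough sketch, assuming the cURL extension is loaded. The function names head_request_options() and expand_with_head() are invented for this example; only the CURLOPT_* and CURLINFO_* constants are cURL's own.

```php
<?php
// Hypothetical helper: the option set for a header-only request.
function head_request_options(int $timeoutSeconds = 5): array
{
    return [
        CURLOPT_NOBODY         => true,  // send HEAD instead of GET
        CURLOPT_HEADER         => true,  // keep the headers in the returned string
        CURLOPT_RETURNTRANSFER => true,  // return the response rather than printing it
        CURLOPT_FOLLOWLOCATION => false, // stop at the first redirect
        CURLOPT_TIMEOUT        => $timeoutSeconds, // give up on slow servers
    ];
}

// Hypothetical helper: returns the redirect target, or null if there is none.
function expand_with_head(string $shortUrl): ?string
{
    $ch = curl_init($shortUrl);
    curl_setopt_array($ch, head_request_options());
    $ok = curl_exec($ch);
    // With FOLLOWLOCATION off, CURLINFO_REDIRECT_URL holds the URL
    // the server asked us to request next.
    $target = ($ok !== false) ? curl_getinfo($ch, CURLINFO_REDIRECT_URL) : false;
    return $target ?: null;
}
```

Usage would be something like echo expand_with_head("http://tinyurl.com/2dfmty");. Some services may not answer HEAD requests the same way they answer GET, so a fallback to a normal request might be needed.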
Here’s the python version:
http://masnun.com/2009/05/11/python-tool-to-expand-short-urls/
Hasin Bhai …..rockz……..!!!
This code shows error like:
Warning: preg_replace() [function.preg-replace]: Compilation failed: unrecognized character after (?< at offset 3 in C:\xampp\htdocs\test\short.php on line 18
can any one help me quickly
Of course this will help me to understand curl and it’s app..thanks vaia.
I have to do it… thanks for tips!
This is just what I was looking for…
Thanks
Thanks, that's exactly what I needed and it worked at the first try. Thanks again
Great, is there any way to use this script in a wordpress blog?
I get Warning: preg_replace() [function.preg-replace]: Compilation failed: unrecognized character after (?< at offset 3 in /var/www/html/dev01/test2.php on line 21
yup that’s great but is there any other way for wordpress blog?
Anything similar in java?
i am getting this error
Warning: curl_setopt() [function.curl-setopt]: CURLOPT_FOLLOWLOCATION cannot be activated when in safe_mode or an open_basedir is set in /home/designer/public_html/curl.php on line 11
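When the host won't let cURL follow redirects for you, one workaround is to follow them by hand, one hop at a time, with a hop limit so a redirect loop can't run forever. The sketch below is only an illustration: resolve_manually() is a made-up name, and $fetchLocation stands for any callable you supply that returns the "Location" target for a URL (or null when there is no redirect), for instance a wrapper around the cURL code in the post. The demo uses a fake lookup table instead of the network.

```php
<?php
// Hypothetical sketch: follow redirects manually, up to $maxHops hops.
function resolve_manually(string $url, callable $fetchLocation, int $maxHops = 5): string
{
    for ($hop = 0; $hop < $maxHops; $hop++) {
        $next = $fetchLocation($url);
        if ($next === null || $next === $url) {
            break; // no further redirect (or a self-redirect): $url is final
        }
        $url = $next;
    }
    return $url;
}

// Offline demo with made-up URLs standing in for real redirect responses
$fake = [
    'http://a.example/1' => 'http://b.example/2',
    'http://b.example/2' => 'http://example.com/final',
];
$lookup = function (string $u) use ($fake): ?string {
    return $fake[$u] ?? null;
};
echo resolve_manually('http://a.example/1', $lookup), "\n"; // prints "http://example.com/final"
```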
HTTP error: 301. What to do?
thanks for the post.
In Python you can simply do this:
# Python 3 spelling; in Python 2 this was urllib.urlopen(...)
import urllib.request
urllib.request.urlopen("http://bit.ly/VihMr").geturl()
I agree with Khashayar in python you can simply do this by using the above given syntax. But it’s good to come up with different ideas and techniques for a single change.
Thanks for this post it’s very useful for me thank you very much.
Awesome post keep it up and thanks for sharing your informative post.
great to share this article awesome post.
Great trick, thanks a lot (even if I prefer the IDaydream solution)
Interesting post. Thanks for sharing this information.
Simpler:
function expand_shortlink($url) {
    $headers = get_headers($url, 1);
    if (!empty($headers['Location'])) {
        // With several redirects, Location is an array; the last entry is the final URL
        $headers['Location'] = (array) $headers['Location'];
        $url = array_pop($headers['Location']);
    }
    return $url;
}
This code handles multiple levels of redirection as well.
Thanks you very very much you know i was always thinking how they create short URL of our website i wish to know the script of this work thanks to provide me the scripts of short URL
Some awesome tricks to make shorturls.
Great tips to make shorturl
Okay, three people have mentioned the bug about an invalid character in preg_replace, and no response to that?
I guess I'll have to figure it out myself :(
Great post you posted here
Wonderful post! I like your blog, and am a regular follower. I will be back
The information mentioned in the article are some of the best available.
Resources like the one you mentioned here will be very useful to me!
hello!! Very interesting discussion glad that I came across such informative post. Keep up the good work friend. Glad to be part of your net community.
Note that using get_headers() is a lot slower than using cURL, nearly twice as slow. get_headers() sends a GET request, while cURL sends a HEAD request when you pass the option for it. You may also want to consider following redirects (another cURL option), because some short URLs get shortened again by other services, and who knows what other redirects are in the chain. You may also wish to set a timeout option in cURL for safety.
If anyone is considering using this in a batch capacity… Also note that in my tests, resolving about 4,000 links took about an hour and a half. So keep in mind how intensive this process is because of all the dns resolving. Note that there are caching options with cURL that I’m not sure you benefit from with get_headers().
I would always use cURL, though if all you’re looking for is the URL you may wish to simplify the regex to something like:
if (preg_match_all('/Location:\s(.+?\s)/i', $headers, $matches)) {
    $url = trim($matches[1][count($matches[1]) - 1]);
}
This accounts for possible redirects…Note if you follow them, the cURL is going to output all the headers and you’ll be after that last Location: xxxxx value.
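For the batch case mentioned above, one cheap thing to try is remembering URLs you have already resolved, so a repeated short link costs no second network round trip. This is plain in-memory memoization, not the cURL-level caching the comment refers to. It is a sketch under assumptions: cached_resolver() is a made-up name, and $resolve stands for whatever expander function you already use.

```php
<?php
// Hypothetical wrapper: caches the results of any resolver callable in memory.
function cached_resolver(callable $resolve): callable
{
    $cache = [];
    return function (string $shortUrl) use (&$cache, $resolve): string {
        if (!array_key_exists($shortUrl, $cache)) {
            $cache[$shortUrl] = $resolve($shortUrl); // resolve each URL only once
        }
        return $cache[$shortUrl];
    };
}

// Offline demo: a stand-in resolver that counts how often it is called
$calls = 0;
$slow = function (string $u) use (&$calls): string {
    $calls++;
    return "expanded:" . $u;
};
$fast = cached_resolver($slow);
$fast('http://tinyurl.com/2dfmty');
$fast('http://tinyurl.com/2dfmty');
echo $calls, "\n"; // prints "1": the second call came from the cache
```

This only helps when the same short URLs repeat in the batch; it does nothing for DNS lookups of distinct links.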
I really like this site, its such a nice site.
Just wanted to say that I read your blog quite often and am always amazed at some of the stuff people post here. But keep up the good work, it is always interesting.
curl is very useful. Thanks for sharing a wonderful hack
function get_expand_url_api($url){
    // Build the API request without overwriting the original short URL
    $apiUrl = "http://api.longurl.org/v2/expand?url=".urlencode($url)."&response-code=1&format=php";
    // Get the contents from the site with file_get_contents
    $response = @file_get_contents($apiUrl);
    if ($response === false) {
        return '';
    }
    $data = unserialize($response);
    if (is_array($data) && isset($data['response-code']) && $data['response-code'] == 200) {
        // Fall back to the short URL if the API gave no long URL
        return empty($data['long-url']) ? $url : $data['long-url'];
    }
    return '';
}