Regular Expression (RegEx) for Matching URLs as Defined in the RFC

When you search for a regular expression that matches URLs, you end up with 20 different expressions, each of which has some problem or other.
Recently I found a very good web page that translates the RFC definition directly into a regular expression.
Look here.
My modifications to the original regex are:
  • keeping only ftp and http, since no other protocols are required here
  • replacing / with \/ to make PHP's preg_replace() happy
  • adding an anchor part to the end of the http pattern to allow URLs that carry an anchor (this one surely needs improvement, but I don't know where the valid anchor characters are listed)

And finally I came up with this:
If you don't need ftp or http, the two parts are easy to separate, so stripping one of them out is no problem.

It can be used to turn links in normal text (e.g. a forum post) into the <a href="..."> scheme.
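As a rough illustration of that linking step, here is a minimal sketch in Python rather than PHP. The pattern URL_PATTERN is a deliberately simplified stand-in and not the RFC-derived regex from this post, and the function name linkify is made up for the example. The text around each match is HTML-escaped so the result can be placed in a page.

```python
import html
import re

# Simplified stand-in pattern (an assumption, NOT the RFC-derived regex):
# "http://" or "https://" followed by characters that are not whitespace,
# angle brackets or quotes.
URL_PATTERN = re.compile(r"https?://[^\s<>\"']+")


def linkify(text: str) -> str:
    """Wrap every URL found in plain text in an <a href="..."> tag."""
    parts = []
    last = 0
    for match in URL_PATTERN.finditer(text):
        # Escape the plain text between the previous match and this one.
        parts.append(html.escape(text[last:match.start()]))
        url = html.escape(match.group(0), quote=True)
        parts.append(f'<a href="{url}">{url}</a>')
        last = match.end()
    # Escape whatever follows the final match.
    parts.append(html.escape(text[last:]))
    return "".join(parts)


print(linkify("see http://localhost/ now"))
# see <a href="http://localhost/">http://localhost/</a> now
```

In PHP the same idea would be a single preg_replace() call with a backreference in the replacement string. The sketch walks the matches by hand only to make the escaping explicit.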
The following tests show that the regex works as it should:
http://12345.0.0.1 http://12345.0.0.1
http://localhost/ http://localhost/
muh asd
http://b.d/?a=%20@/ http://b.d/?a=%20@
http://dreilä http://dreil
http://blub.deVeryLongEnding/asd http://blub.deVeryLongEnding/asd

As you can see, there is an error in that regex: I think the trailing "/" should be matched too. German umlauts are also valid in URLs now.
I also removed ftp and all uppercase variants inside the regex to simplify it.
The strange ?: groups are removed as well.
Next, I fixed the invalid IPs.

Version 2

http://localhost/ http://localhost/
muh asd
http://b.d/?a=%20@/ http://b.d/?a=%20@/
http://dreilä http://dreilä
http://blub.deVeryLongEnding/asd http://blub.deVeryLongEnding/asd

Remaining Errors

At the bottom I added the URLs that are not valid but still match.
The problem is that I would have to keep a list of all valid top-level domains.
The IP containing 256 shouldn't match, since each part of an IP only goes up to 255, but checking this would look too complex inside the regex.
The same goes for port 65540: the highest valid port is 65535.
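One way to keep the regex simple is to run these checks after matching instead of inside the pattern. The following is only a sketch in Python under several assumptions: KNOWN_TLDS is a tiny hypothetical allow-list, HOST_PORT is a simplified host/port splitter rather than the regex from this post, and passes_range_checks is a made-up name.

```python
import re

# Hypothetical allow-list; a real one would contain every valid TLD.
KNOWN_TLDS = {"com", "org", "net", "de"}

# Simplified split of an http URL into host and optional port
# (an assumption, not the RFC-derived regex).
HOST_PORT = re.compile(r"http://([^/:?#\s]+)(?::([0-9]+))?(?:[/?#]\S*)?")
DOTTED_QUAD = re.compile(r"[0-9]+(?:\.[0-9]+){3}")


def passes_range_checks(url: str) -> bool:
    """Reject URLs whose IP parts, port or TLD are out of range."""
    match = HOST_PORT.fullmatch(url)
    if match is None:
        return False
    host, port = match.group(1), match.group(2)
    # Ports only go up to 65535.
    if port is not None and int(port) > 65535:
        return False
    # For a numeric host, every part must be at most 255.
    if DOTTED_QUAD.fullmatch(host):
        return all(int(part) <= 255 for part in host.split("."))
    # Single-label hosts such as localhost have no TLD to check.
    if "." not in host:
        return True
    # Otherwise the last label must be a known top-level domain.
    return host.rsplit(".", 1)[1].lower() in KNOWN_TLDS


print(passes_range_checks("http://256.0.0.1/"))                 # False
print(passes_range_checks("http://127.0.0.1:65540/"))           # False
print(passes_range_checks("http://blub.deVeryLongEnding/asd"))  # False
print(passes_range_checks("http://localhost/"))                 # True
```

If the check has to live inside the regex after all, a single IP part in the range 0 to 255 can be written as the alternation 25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9], which shows why the pure-regex version grows quickly.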


Collect more test URLs (maybe from user suggestions)
Simplify further (since it was auto-generated, there might be some redundancy)
