Unix Programming - Regex for anchor Tag


meendar@gmail.com

2007-12-30, 1:27 pm

Hi,

Does anyone know a regular expression for finding the anchor tags in an HTML
page?


Data : xxxxxxxxxxxx<a href ="xxxx.com" ></a>

I just need <a href ="xxxx.com


Thanks,
Meendar

Barry Margolin

2007-12-30, 1:27 pm

In article
<4eadab70-70ed-4da7-9867-1839fa4d5c6e@l6g2000prm.googlegroups.com>,
meendar@gmail.com wrote:

> Hi,
>
> Does anyone know a regular expression for finding the anchor tags in an HTML
> page?
>
>
> Data : xxxxxxxxxxxx<a href ="xxxx.com" ></a>
>
> I just need <a href ="xxxx.com


You don't want the '"' at the end of the URL? And what about the
closing '>'?

< *[aA] [^>]
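
As a quick check of what this pattern selects, here is a minimal sketch that counts matching lines with grep. The sample lines and the variable name `count` are made up for illustration; the pattern is the one above, read as a POSIX basic regular expression.

```shell
# Hypothetical sample input: one line without an anchor tag, two with one.
# The pattern is a basic regular expression: optional spaces after '<',
# then 'a' or 'A', one space, and one character that is not '>'.
count=$(printf '%s\n' \
    'plain text, no tags' \
    '<A HREF="yyyy.com">link</A>' \
    'xxxxxxxxxxxx<a href ="xxxx.com" ></a>' |
    grep -c '< *[aA] [^>]')

# Print the number of lines that contain a match.
echo "$count"
```

Note that grep used this way selects whole lines that contain an anchor tag; it does not cut out the tag itself.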

--
Barry Margolin, barmar@alum.mit.edu
Arlington, MA
*** PLEASE post questions in newsgroups, not directly to me ***
*** PLEASE don't copy me on replies, I'll read them in the group ***

meendar@gmail.com

2008-01-01, 1:36 am

On Dec 30 2007, 10:50 pm, Barry Margolin <bar...@alum.mit.edu> wrote:
> In article
> <4eadab70-70ed-4da7-9867-1839fa4d5...@l6g2000prm.googlegroups.com>,
>
> meen...@gmail.com wrote:
>
>
>
>
> You don't want the '"' at the end of the URL? And what about the
> closing '>'?
>
> < *[aA] [^>]
>
> --
> Barry Margolin, bar...@alum.mit.edu
> Arlington, MA
> *** PLEASE post questions in newsgroups, not directly to me ***
> *** PLEASE don't copy me on replies, I'll read them in the group ***


> < *[aA] [^>]


There may be additional text after the end of the href, e.g.:

<a href = "xxxx.com" title ="new"></a>

I am looking for only <a href = "xxxx.com

Barry Margolin

2008-01-01, 1:36 am

In article
<7e658650-0442-492a-aae4-405e7721072c@t1g2000pra.googlegroups.com>,
meendar@gmail.com wrote:

> On Dec 30 2007, 10:50 pm, Barry Margolin <bar...@alum.mit.edu> wrote:
>
>
> There may be additional text after the end of the href, e.g.:
>
> <a href = "xxxx.com" title ="new"></a>
>
> I am looking for only <a href = "xxxx.com


< *a +href *= *"[^"]*
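
One way to try this pattern from a shell is sketched below, assuming a grep that supports `-o` (print only the matched text; a GNU/BSD extension, not POSIX). The `-E` flag is needed because `+` is only a repetition operator in extended regular expressions. The variable names `html` and `result` are illustrative, and the sample line is the one from the question.

```shell
# Sample input from the thread: an anchor with an extra attribute after href.
html='xxxxxxxxxxxx<a href = "xxxx.com" title ="new"></a>'

# -E: extended regular expressions, so ' +' means "one or more spaces".
# -o: print only the part of the line that matched (not in POSIX grep).
# [^"]* stops just before the closing quote of the href value.
result=$(printf '%s\n' "$html" | grep -oE '< *a +href *= *"[^"]*')

printf '%s\n' "$result"
```

With the sample line this prints `<a href = "xxxx.com`. The pattern only matches a lowercase `a` and `href`, a double-quoted value, and `href` as the first attribute; adding `-i` would relax the case restriction.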

--
Barry Margolin, barmar@alum.mit.edu
Arlington, MA
*** PLEASE post questions in newsgroups, not directly to me ***
*** PLEASE don't copy me on replies, I'll read them in the group ***

Scott Lurndal

2008-01-01, 7:23 pm

meendar@gmail.com writes:
>On Dec 30 2007, 10:50 pm, Barry Margolin <bar...@alum.mit.edu> wrote:
>
>
>There may be additional text after the end of the href, e.g.:
>
><a href = "xxxx.com" title ="new"></a>
>
>I am looking for only <a href = "xxxx.com


Use an XSL stylesheet processed by xsltproc.

e.g. something like:
<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>
<xsl:template match="//a">
&lt;a href="<xsl:value-of select="attribute::href"/>
</xsl:template>
</xsl:stylesheet>


Run this through xsltproc:

$ cat /tmp/a.html
<html>
<head>
</head>
<body>
<a href="test1" fred="joe">test</a>
<a href="test2" fred="billbob">frod</a>
</body>
</html>
$ cat /tmp/a.xsl
<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>
<xsl:template match="//a">
&lt;a href="<xsl:value-of select="attribute::href"/>
</xsl:template>
</xsl:stylesheet>
$ cat /tmp/a.html | xsltproc /tmp/a.xsl -
<?xml version="1.0"?>





&lt;a href="test1

&lt;a href="test2

