C# - Regular expression to remove link from image in HTML


What C# / regex syntax would remove the link from the first image in a body of text like:

text <a href="..." class="..."><img src="..." class="..." width="..." /></a> more text <a href="..." class="..."><img src="..." class="..." width="..." /></a> more text 

so that the final result would be:

text <img src="..." class="..." width="..." /> more text <a href="..." class="..."><img src="..." class="..." width="..." /></a> more text 

Any advice appreciated! Thanks in advance.

Using the HTML Agility Pack (project page, NuGet), this should do the trick:

    using System.Linq;
    using HtmlAgilityPack;

    HtmlDocument doc = new HtmlDocument();
    doc.LoadHtml("text <a href=\"...\" class=\"...\"><img src=\"...\" class=\"...\" width=\"...\" /></a> more text"
        + " <a href=\"...\" class=\"...\"><img src=\"...\" class=\"...\" width=\"...\" /></a> more text");

    // Find the first <img> whose direct parent is an <a> element.
    var firstImage = doc.DocumentNode.Descendants("img")
        .FirstOrDefault(node => node.ParentNode.Name == "a");

    if (firstImage != null)
    {
        var aNode = firstImage.ParentNode;
        // Detach the <img> from the <a>, then put it where the <a> was.
        aNode.RemoveChild(firstImage);
        aNode.ParentNode.ReplaceChild(firstImage, aNode);
    }

    var fixedText = doc.DocumentNode.OuterHtml;
    //doc.Save(/* stream */);

I find this a lot easier on the eyes, and it clearly states what you are trying to accomplish:

  1. Find the first img inside an a tag.
  2. Store the img temporarily.
  3. Swap the img and the a tag, so the img takes the a tag's place.
  4. Save the results.
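
Since the question asked for regex syntax, here is a rough regex-only sketch of the same steps for comparison. It assumes the markup is as regular as in the example (an a tag that wraps exactly one img, and no ">" inside attribute values); anything messier is better handled by the parser approach above. The names FirstImageUnlinker and UnlinkFirstImage are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class FirstImageUnlinker
{
    // Matches <a ...> wrapping a single <img ...> (optional whitespace between
    // the tags) and captures the <img> tag in the named group "img".
    // Assumption: attribute values never contain '>'.
    private static readonly Regex LinkedImage = new Regex(
        @"<a\b[^>]*>\s*(?<img><img\b[^>]*>)\s*</a>",
        RegexOptions.IgnoreCase);

    public static string UnlinkFirstImage(string html)
    {
        // The count argument (1) limits the replacement to the first match,
        // so later linked images keep their links.
        return LinkedImage.Replace(html, "${img}", 1);
    }
}
```

Calling FirstImageUnlinker.UnlinkFirstImage(text) on the sample text from the question should strip the link from the first image only and leave the second one linked.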
