Full Version: remove specific string
derek.sullivan
I use this code to receive the UTF-8 text I want to view:

CODE
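function strip_html_tags($string) {

    // Remove the whole <head> block before stripping the remaining tags.
    // One pattern, one replacement (the original listed two replacements).
    $string = preg_replace(
        array(
            '@<head[^>]*?>.*?</head>@siu',
        ),
        array(
            '',
        ),
        $string);

    return strip_tags($string);

}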

$url = "http://thechristianchat.com/echat45/public/rmessages.html";
$raw_file = file_get_contents($url);
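// Look for the charset in the page's Content-Type meta tag.
preg_match( '@<meta\s+http-equiv="Content-Type"\s+content="([\w/]+)(;\s+charset=([^\s"]+))?@i',
    $raw_file, $matches );
// Fall back to utf-8 when no charset is declared (the fallback value is an assumption).
$encoding = isset($matches[3]) ? $matches[3] : "utf-8";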
$utf8_text = iconv( $encoding, "utf-8", $raw_file );
$utf8_text = strip_html_tags( $utf8_text );
$utf8_text = htmlentities($utf8_text);
$utf8_text = html_entity_decode( $utf8_text, ENT_QUOTES, "UTF-8" );


and here is the output:

Occupants: [Reverse Message Order]
if (parent.frames[2].ignore.indexOf("|derek|") == -1) {document.write('derek - logged off - using Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1.3) Gecko/20090824 Firefox/3.5.3 on 10/14 at 2:43pm CST)'); }
if (parent.frames[2].ignore.indexOf("|derek|") == -1) {document.write('derek - logged on - using Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1.3) Gecko/20090824 Firefox/3.5.3 on 10/14 at 2:19pm CST)'); } gator has timed out.
if (parent.frames[2].ignore.indexOf("|gator|") == -1) {document.write('gator - blessings to everyone'); }
if (parent.frames[2].ignore.indexOf("|gator|") == -1) {document.write('gator - in facxt some areas in Iowa had some snow flurries down around interstate 80 moving east'); }
if (parent.frames[2].ignore.indexOf("|gator|") == -1) {document.write('gator - almot midnight and the weather here in iowa feels like thanksfiving week'); }

What I want to do is get rid of this wrapper:

CODE
if (parent.frames[2].ignore.indexOf("|whatever|") == -1) {document.write('whatever - says');}

What I don't want to lose is the text between the single quotes in document.write. For example, from document.write('whatever - says'); I want to keep whatever - says, just not all the crud around it. Any suggestions?
Demonslay
So you want to get rid of any JavaScript so that it isn't executed by the browser? Simply add a pattern for anything in <script> tags to the match array you pass to preg_replace(). There may be other ways to make sure you aren't exposed to XSS attacks, but that should get you started. I would also look at how other CMSs filter user content; I know some of their patterns can get rather complicated.
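A minimal sketch of that suggestion, assuming the page wraps its JavaScript in ordinary <script> tags. The function name strip_html_tags_and_scripts is made up for this example and is not part of the original code. Note that if the chat messages themselves live inside those <script> blocks, as the output above suggests, this removes the messages along with the JavaScript.

```php
<?php
// Hypothetical variant of strip_html_tags(): drops <head>, <script> and
// <noscript> blocks (tags and contents) before stripping the remaining tags.
function strip_html_tags_and_scripts(string $html): string
{
    $html = preg_replace(
        array(
            '@<head\b[^>]*>.*?</head>@si',
            '@<script\b[^>]*>.*?</script>@si',
            '@<noscript\b[^>]*>.*?</noscript>@si',
        ),
        '',   // a single string replacement is applied to every pattern
        $html
    );

    return strip_tags($html);
}
```

The lazy .*? keeps each match inside one block, so text sitting between two separate <script> blocks is not swallowed.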
Hayden
Here's what I found out. I couldn't figure out a way to do it without multiple ereg_replace calls, but... oh well.

CODE
$curl = new CURL;

$html = $curl->get('http://thechristianchat.com/echat45/public/rmessages.html');

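// Note: the ereg_* functions were deprecated in PHP 5.3 and removed in PHP 7.0; preg_replace() is the replacement.
$html = ereg_replace('<script[^>]+>.*</script>','',$html);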
$html = ereg_replace('<script[^>]+>.*</script>','',$html);
$html = eregi_replace('<NOSCRIPT>.*</NOSCRIPT>','',$html);


cURL Class Source
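For the original question (keeping only the text inside document.write('...')), one possible approach is to pull the quoted strings out with preg_match_all() instead of deleting everything around them. This is only a sketch: the function name extract_document_write_text is invented for this example, and it assumes every message is written with single quotes exactly as in the output shown earlier in the thread.

```php
<?php
// Hypothetical helper: collect the single-quoted argument of every
// document.write('...') call found in the input.
function extract_document_write_text(string $html): array
{
    preg_match_all(
        // The capture group accepts any character except ' and \,
        // or a backslash followed by any character (e.g. \').
        "@document\\.write\\(\\s*'((?:[^'\\\\]|\\\\.)*)'\\s*\\)@s",
        $html,
        $matches
    );

    // stripslashes() removes the escaping backslashes, so \' becomes '.
    return array_map('stripslashes', $matches[1]);
}

// Usage: print one message per line.
$sample = "if (parent.frames[2].ignore.indexOf(\"|gator|\") == -1) "
        . "{document.write('gator - blessings to everyone'); }";
echo implode("\n", extract_document_write_text($sample)), "\n";
```

Text that is not wrapped in document.write, such as "gator has timed out.", is not captured by this pattern and would need separate handling.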