Code valid in both PHP and SSI
In a project that I’m working on, web pages are generated and sent by FTP to a remote server. Some of those pages are plain HTML, others are SHTML (i.e. they use server-side includes, or SSI, to pull in blocks of information), and others are PHP pages.
I ran into a case where a given block of code had to be used by both SHTML and PHP pages. I can generate any code I like, but the same code must be used for the PHP and SHTML blocks. In addition, it would be nice if this block displayed correctly when served directly as HTML (with neither PHP nor SSI parsing).
Note: WordPress, the software that I use to maintain this blog, doesn’t let me output the markers of an HTML comment cleanly. In the text below, please delete any @ that you see. Sorry for the inconvenience. The attached file has all the code, with no stray ‘@’ marks.
A standard SSI include looks like
<!@--#include virtual="included" --@>
and a standard PHP include is
<?php include("included"); ?>
Clean SHTML solution
Here is a first solution:
<!@--#if expr="1=0"--@>
<?php include("included"); ?>
<!@--#endif--@>
<!@--#include virtual="included" --@>
It gives clean source code when processed as SHTML, but some junk remains when processed as PHP, including the name of the included page. This is information that you don’t want to leak to the outside world.
When served as plain HTML, the display is fine, but the name of the included page appears in the source code.
Clean PHP solution
This solution also gives the correct result when processed as either SHTML or PHP. However, the source of the page served as SHTML reveals the name of the included page.
<?php if (false) {?>
<!@--#include virtual="included" --@>
<?php } ?>
<?php include("included"); ?>
‘no leak’ solution
The next solution is a bit more complex, but doesn’t leak the included page name whether the page is interpreted as SHTML or PHP. However, the page name appears (twice) in the source code, so if the page is served as plain HTML (no parsing done), it will be visible. This is unavoidable, so we won’t mention this point again below. We do make sure, however, that if the code is interpreted as HTML, it won’t display anything (even though the included page name is in the source code).
<!@-- <?php if (false) {?> --@>
<!@--#include virtual="included" --@>
<!@--#if expr="1=0"--@>
<!@--
<?php } ?>
--@>
<?php include("included"); ?>
<!@--
<?php if (false) { ?>
--@>
<!@--#endif--@>
<!@-- <?php } ?> --@>
Nearly last solution
The above solution is quite satisfying, but we can do better. I’m not happy about repeating the name of the included page twice in my code. Could we avoid that? Yes, as the following code shows:
<!@-- <?php /* --@><!@--#if expr="1=0"--@><!@--'; */
$x='<!@--#endif--@><!@--#include virtual="included" --@><!@--#if expr="1=0"--@><!@--';
echo '-'.'->'; include(substr($x,35,-30)); echo '<!@-- '; /* <!@--#endif--@><!@--'; */ ?> --@>
Remember to delete all the @ marks, which are here only to work around a bug in my blog software and shouldn’t really be there.
When interpreted as SHTML code, this becomes:
<!@-- <?php /* --@>Content of included file<!@--'; */ ?> --@>
Nothing will be displayed, but some junk remains in two HTML comments. No information is leaked, however.
When interpreted as PHP code, this becomes:
<!@-- --@>Content of included file<!@-- --@>
which is correct HTML code, with just two empty HTML comments. Again, no information is leaked.
When served and displayed as plain HTML, nothing is displayed at all (everything is inside HTML comments).
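The start offset of 35 and the length of -30 that the substr() call relies on can be checked separately. The sketch below is only an illustration: the variable names $open, $close, $prefix and $suffix are mine, not part of the solution, and the comment delimiters are assembled from pieces so that no @ placeholder is needed here.

```php
<?php
// Comment delimiters built from pieces (illustrative names, not from the post).
$open  = '<!' . '--';
$close = '--' . '>';

// Everything that precedes the file name inside $x ...
$prefix = $open . '#endif' . $close . $open . '#include virtual="';
// ... and everything that follows it.
$suffix = '" ' . $close . $open . '#if expr="1=0"' . $close . $open;

// Same content as the $x string of the solution, once the @ marks are removed.
$x = $prefix . 'included' . $suffix;

echo strlen($prefix), "\n";       // 35: where the file name starts
echo strlen($suffix), "\n";       // 30: characters to drop at the end
echo substr($x, 35, -30), "\n";   // included
```

If the included file name changes, the two numbers stay the same, since they only depend on the text around the name.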
Last criticism
The above solution is not perfect yet: if interpreted as pure HTML, the page is not XML compliant. XML forbids ‘--’ inside comments, but the above solution doesn’t respect that rule.
In my case, this isn’t a problem, since the code will always be interpreted as either SHTML or PHP. In a more general case, though, it might be troublesome.
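To see which comments break the XML rule, a small check can be run over the generated markup. This is a rough sketch under my own assumptions: findBadXmlComments() is a hypothetical helper, not part of the solution, and it only looks at the text between each comment opener and the first closer that follows it, which is how an HTML parser would delimit the comment.

```php
<?php
// Hypothetical helper: returns the bodies of the comments in $markup that
// contain a double hyphen or end with a hyphen, both of which XML forbids.
function findBadXmlComments(string $markup): array
{
    // Delimiters built from pieces so that no @ placeholder is needed.
    $open  = '<!' . '--';
    $close = '--' . '>';
    $pattern = '/' . preg_quote($open, '/') . '(.*?)' . preg_quote($close, '/') . '/s';
    preg_match_all($pattern, $markup, $matches);

    $bad = [];
    foreach ($matches[1] as $body) {
        if (strpos($body, '--') !== false || substr($body, -1) === '-') {
            $bad[] = $body;
        }
    }
    return $bad;
}

// Sample input: a harmless comment, then a comment whose body contains a
// second comment opener, as happens in the last solution.
$o = '<!' . '--';
$c = '--' . '>';
$sample = $o . ' fine ' . $c . $o . "'; */ \$x='" . $o . '#endif' . $c;

print_r(findBadXmlComments($sample));
```

On this sample, only the second comment is reported, because its body contains a second comment opener and therefore a double hyphen.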