Fix design flaw in the new child-process tag:
* Need to double-fork and close sockets and filehandles to disconnect from the parent process so that the parent doesn't wait for the child to finish.
* Add a new routine &Vend::Server::cleanup_for_exec to handle clearing the private socket variables, usable from other child code as well.
* Can no longer return a useful PID because of the double-fork, so return nothing from the tag.
Some other tweaks as well:
* Change process description and allow it to be customized, especially useful for long-running processes.
* Don't bother retrying if fork attempts fail, as that gets prohibitively complicated on double-forks and probably indicates a suffocating system we shouldn't add load to anyway.
* For now, at least, don't bother with optional single-fork implementation which is mostly useful for a tiny bit of improved performance that's not worth the hassle for a background process.
* Ignore not just an empty body, but also one with only whitespace.
* Remove documentation for unimplemented umask option.
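The double-fork described above can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: run_detached is a hypothetical name, and the sketch only reopens the standard handles, whereas the real change also closes Interchange's sockets through &Vend::Server::cleanup_for_exec.

```perl
use strict;
use warnings;
use POSIX qw(setsid _exit);

# Hypothetical helper: run $code in a detached grandchild process.
# Returns true if the first fork succeeded, false otherwise (no retry).
sub run_detached {
    my ($code, $name) = @_;

    my $pid = fork;
    return 0 unless defined $pid;    # fork failed; give up rather than add load

    if ($pid) {
        waitpid $pid, 0;             # reap the short-lived intermediate child
        return 1;                    # there is no useful PID to hand back
    }

    # Intermediate child: start a new session, fork again, and exit at once
    # so the grandchild is no longer a child of the original process.
    setsid();
    my $grandchild = fork;
    _exit(0) if !defined $grandchild || $grandchild;

    # Grandchild: drop the inherited standard handles, then do the work.
    open STDIN,  '<', '/dev/null';
    open STDOUT, '>', '/dev/null';
    open STDERR, '>', '/dev/null';
    $0 = $name if defined $name;     # customizable process description
    eval { $code->() };
    _exit(0);
}
```

Because the parent only waits for the intermediate process, which exits immediately, a long-running body does not hold the parent up.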
Fix indenting.
Allow XML posts by e.g. Google Checkout, which broke in Interchange 5.6.0 (RT #219). Thanks to Andy <ic@tvcables.co.uk> for the patch.
Support "secure cookies", which are sent only over SSL connections. Use [set-cookie ... secure=1] to enable. This is from a patch by Frederic Steinfels <fst@highdefinition.ch> from 2006-05-19, which fell between the cracks. Thanks, Frederic!
Change UserTrack behavior to better match expectations.
* "UserTrack no" formerly also disabled TrackFile, because the whole Vend::Track module was disabled. This was not expected behavior. People are apparently using TrackFile fairly commonly, so this would make "UserTrack no" pretty unattractive.
* Make "UserTrack no" only disable sending the X-Track HTTP response header.
* As before, leaving TrackFile undefined will stop logging to a track file.
* Make UserTrack default to false now, which is an incompatible change, but one that I don't expect to adversely affect anyone, as the X-Track response header doesn't seem to get used. Adding "UserTrack yes" to catalog.cfg brings it back.
In short, most people upgrading will stop having an X-Track response header sent, and otherwise will notice no difference.
Fix overly-tight MIME type matching. Match on end of MIME type boundary with \b instead of end of string, since a boundary or other things can follow. Match case-insensitively in several places since MIME types are not case-sensitive. Match on word boundaries in a few places.
Various minor UTF-8 changes.
Correct attribution of &Vend::CharSet::display_chars (which is from
perluniintro manpage).
Enable localization of an error string.
Match content type more tightly in 2 spots ("text" is only trustworthy
in the MIME major type, not minor, and even that may be a stretch).
Simplify request method matching in a few places for readability and a
(trivial) performance benefit.
Use conventional $c lexical instead of $g for catalog hashref.
Fix tab/space differences to match context.
Update copyrights of files changed in 2008.
* Committing Sonny Cook's UTF-8 patches, along with a fix for the
PreFork issue caused by the patches. Thanks, Sonny!
* From Sonny's original article on interchange-core:
There are two variables that will need to be added to your
catalog.cfg: MV_HTTP_CHARSET and MV_UTF8. They should be set
like so:
Variable MV_HTTP_CHARSET UTF-8
Variable MV_UTF8 1
The MV_UTF8 variable tells the system that we are using UTF-8
for stuff internally when that needs to be specified. Perl mostly
does the right thing wrt UTF-8, but when we need to explicitly
specify for one of a handful of reasons, this variable lets us
configure that.
The MV_HTTP_CHARSET specifies which character set that the web
pages are going to be encoded with. UTF-8 is the only value that
has been tested at the moment, although it probably generalises
to whatever you would like to use.
Communication with the database introduces three database
directives. These are required to ensure that data is properly
communicated with the database:
PG_ENABLE_UTF8
MYSQL_ENABLE_UTF8
GDBM_ENABLE_UTF8
These can be set on a table by table basis or with DatabaseDefault.
You will probably want to set the one for the sql database you are
using and one for GDBM, like so:
DatabaseDefault PG_ENABLE_UTF8 1
DatabaseDefault GDBM_ENABLE_UTF8 1
You will need to make sure that your database is encoded in UTF-8
and that all of your data is encoded that way as well.
Enabling UTF-8 should not cause any problems if your data is all in
US-ASCII, but might cause problems if other encodings are involved.
* Note: This commit is missing the latest safeuntrap/reval/safetrap
code, which should be added ASAP. In the meantime, the following
works in the interchange.cfg file (with Perl 5.8.8):
SafeUntrap rand require caller dofile print
* New SocketReadTimeout global configuration directive to control the
amount of time we will wait for data on a socket. The default is
one second, which is the same as the previously hard-coded timeout
value.
* Changes in the _read() subroutine:
-- select() returns -1 upon error, whereas sysread() returns undef,
so we need to allow for both.
-- Catch EAGAIN as well as EINTR as soft errors to retry on.
-- Read the entire available amount of data in one hit instead of
forcing the data to be read in 512-byte chunks.
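The three _read() changes above can be illustrated with a small sketch. read_available is a hypothetical helper written for this example, not the actual _read() routine; the timeout argument plays the role of SocketReadTimeout, and the 65536-byte default is an arbitrary choice.

```perl
use strict;
use warnings;
use Errno qw(EINTR EAGAIN);

# Hypothetical helper: wait up to $timeout seconds for data on $fh, then
# append whatever is available (up to $max bytes) to $$bufref.
# Returns bytes read, 0 on EOF or timeout, undef on a hard error.
sub read_available {
    my ($fh, $bufref, $timeout, $max) = @_;
    $max ||= 65536;
    $$bufref = '' unless defined $$bufref;

    my $rin = '';
    vec($rin, fileno($fh), 1) = 1;

    while (1) {
        my $rout  = $rin;
        my $ready = select($rout, undef, undef, $timeout);
        if ($ready == -1) {                        # select() reports errors as -1
            next if $! == EINTR || $! == EAGAIN;   # soft error: retry
            return;
        }
        return 0 if $ready == 0;                   # timed out

        my $n = sysread($fh, $$bufref, $max, length $$bufref);
        if (!defined $n) {                         # sysread() reports errors as undef
            next if $! == EINTR || $! == EAGAIN;   # soft error: retry
            return;
        }
        return $n;                                 # one large read, no 512-byte chunks
    }
}
```

The two error checks differ on purpose: select() and sysread() signal failure in different ways, which is the point of the first change listed above.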
Avoid multiple identical cookies (#150).
* Making explicit the various implicit dependencies between PreFork,
PreForkSingleFork, and StartServers.
+ PreForkSingleFork should only ever affect behavior in conjunction with
PreFork true, ensuring the prefork code path is entirely controllable by
the value of PreFork.
+ Fixed condition on StartServers where a positive value for that parameter
when not in PreFork mode spawned a StartServers number of superfluous
daemons that were never used. Now, StartServers is effectively ignored
unless PreFork is also true.
Conditions were discovered by Brian Miller and Jon Jensen while combining
atypical combinations of the settings involved.
Clean up some things in Vend::Server:
* In &connection, get function arguments first, for clarity.
* Fix some indenting.
* Remove extra cycle count increment.
* Comment out $pretty_vector setup, only used for commented-out logDebug calls.
Make PreFork handle idle children more intelligently. If a child is idle, then don't needlessly respawn it at PIDcheck seconds. To make sure we still catch child processes that have spun out of control, track idle vs. active state. Developed by Mark Johnson <mark@endpoint.com>.
Removed MV_DOLLAR_ZERO workaround for a bug fixed 5 years ago.
New Vend::Server::set_process_name sub which is used to change the status of the process name indicator. This respects the MV_DOLLAR_ZERO settings.
Comment out noisy debug statements in new PreFork code.
Applied patch from Mark Johnson <mark@endpoint.com> to fix problem with RPC mode preforking too many children on server startup due to race condition.
Delay server started message as long as possible. Disable SOAP and print a warning on stdout if Vend::SOAP fails to load (#46).
* Remove catalog status files when removing catalog. Also call remove_catalog at server stop -- would be nice for cleanup anyway.
Allow parameters passed to jobs, acknowledges --email commandline option now (#103).
New Free Software Foundation Address in headers of various files
Update copyright year.
Correct Interchange's handling of incoming requests where a form element has a space in the name. Before the fix, the '+' that encodes the space was still present when the name reached the values space, while true '+' characters (sent as %2B) had already been decoded, so the two could not be distinguished. This change switches pluses to spaces before %2B gets switched to '+'. Fixed by Brian Miller <brian@endpoint.com>. (Merged from development branch.)
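The ordering issue is easy to see in a small sketch. decode_form_component is a hypothetical name used only for illustration; it is not the routine changed in the commit.

```perl
use strict;
use warnings;

# Hypothetical helper showing the decoding order only.
sub decode_form_component {
    my ($s) = @_;
    $s =~ tr/+/ /;                                # 1. '+' encodes a space
    $s =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;    # 2. then %XX, so %2B yields a real '+'
    return $s;
}

print decode_form_component('first+name%2Bsuffix'), "\n";   # prints "first name+suffix"
```

Doing the two steps in the opposite order would turn %2B into '+' first and then into a space, losing the literal plus.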
* The NotRobotUA and RobotUA checks were being blocked when both
HostnameLookups and RobotHost were configured.
call autoflush only if we need to, further examination of the problem reported by Ron Phipps notwithstanding
map_inet_socket is used by the SOAP server as well, so we need to include the actual mode in the error message. Also add the IP address we try to bind on, which might be causing the failure.
added job flagging to allow better job control, especially from within the catalog, main purpose is to safely continue jobs caught by timeout
* Patch for a DoS exploit, pointed out by Donald Alexander. Thanks
Donald.
A carefully crafted HTTP POST request could cause an Interchange
page processor to hang until it's killed by Interchange's periodic
housekeeping routine.
If several of these requests are received in quick succession
then it could be possible to disable all of the page processors,
rendering Interchange unresponsive for a while.
Big copyright and version number update to prepare for 5.3.2 release.
* Add ability to run Jobs from cron. The idea of a delay is removed since you can schedule exactly when it will run. There is no queue action, as fitting it into the current queue setup is a bit difficult. Jobs are specified with:
0 0 * * * * =standard hourly
0 1 2 * * * =standard daily
0 2 4 * * 7 =standard weekly
0 0 3 1 * * =standard monthly
* Add separator of ";" for specifying multiple cron tasks on the same line.
* Think about the idea of a catalog-based cron. We should make that minute-based instead of seconds-based, for one thing. One possibility is to allow a :catalog_cron job which looks at the state of any Cron settings in catalogs and runs those as Jobs when appropriate.
* Actually it makes sense to use the Jobs facility for a lot of cron tasks, and I will think of the possibility of modifying run_jobs() to not have a catalog base.
* Add new cron-style facility for determining HouseKeeping jobs.
* Default is no change, i.e. no cron.
* The recommended method to add the file is:
HouseKeepingCron <crontab
That will use the file etc/lib/crontab by default in the tarball,
or /etc/interchange/crontab in an LSB configuration.
* Requires the Set::Crontab module, which has been added to
Bundle::Interchange.
* Structure of the crontab file is just like crontab(5) in UNIX
except that a seconds column is added.
The targets are GlobalSub or anything which you can make run
with Vend::Dispatch::run_macro. Bear in mind there is no
catalog context.
Two special targets exist, :reconfig and :jobs. They allow calling
of the catalog reconfig routines and jobs routines, respectively.
The etc/reconfig and etc/jobsqueue files will be ignored if these
targets are not present -- a warning will be issued at startup
(and crontab change) if they are not there.
A target prepended with > runs *after* the reconfig/restart/jobs/pid
mgmt cycle. Normal specifications run before.
The basic entry to implement "HouseKeeping 5" would be:
HouseKeeping 1
HouseKeepingCron <<EOC
*/5 * * * * * :restart
*/5 * * * * * :jobs
EOC
(Note that would normally be in etc/lib/crontab or /etc/interchange/crontab.)
To only check the jobs queue every five minutes (on the minute), you
do:
*/5 * * * * * :restart
0 */5 * * * * :jobs
If you want to run the GlobalSub "checkit" once a day at 4am, you would
do:
0 0 4 * * * checkit
* If you set HouseKeeping to a granularity besides 1 (or if for some
reason Interchange skips a second), it does the cron check for
every intervening second. This ensures a job will not be skipped.
The :restart and :jobs entries will only run once, but if you have
a frequent GlobalSub job that pushes the granularity of HouseKeeping
it can be run twice in succession.
* WARNING: You should not put long-running jobs in a GlobalSub! You have
been warned. Use the Jobs facility for that.
* Probably should implement the ability to call out jobs, but not quite
sure how to specify and do. Can we just call run_jobs() directly?
If so, then maybe an = sign introduces a job:
0 0 * * * * =standard_cat hourly
0 0 4 * * * =standard_cat daily
0 0 2 * * 7 =standard_cat weekly
* Include bin/crontab script to edit the crontab and submit to the
running IC daemon. BUG: Cannot run as root.
* Bring warnings change to Server.pm
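The seconds-first crontab format and the check over every intervening second, as described above, can be sketched like this. It is an illustration only: the real facility uses Set::Crontab, while the helpers here (field_matches, entry_matches, due_targets) are hypothetical, support only "*", "*/N", plain numbers and comma lists, use UTC to stay deterministic, and do not de-duplicate special targets such as :jobs.

```perl
use strict;
use warnings;

# Does one crontab field ("*", "*/5", "4", or "1,15") match $value?
sub field_matches {
    my ($spec, $value) = @_;
    for my $part (split /,/, $spec) {
        return 1 if $part eq '*';
        if ($part =~ m{^\*/(\d+)$}) {
            return 1 if $1 > 0 && $value % $1 == 0;
        }
        elsif ($part =~ /^\d+$/) {
            return 1 if $part == $value;
        }
    }
    return 0;
}

# Does a six-field entry (sec min hour mday mon wday target) fire at $epoch?
sub entry_matches {
    my ($entry, $epoch) = @_;
    my @spec = split ' ', $entry;
    my ($sec, $min, $hour, $mday, $mon, $wday) = (gmtime($epoch))[0 .. 4, 6];
    my @now = ($sec, $min, $hour, $mday, $mon + 1, $wday);
    for my $i (0 .. 4) {
        return 0 unless field_matches($spec[$i], $now[$i]);
    }
    # Day of week: accept 7 as well as 0 for Sunday.
    return field_matches($spec[5], $wday)
        || ($wday == 0 && field_matches($spec[5], 7)) ? 1 : 0;
}

# Targets due in the interval ($last, $now], checking each second so that
# a coarse HouseKeeping granularity does not skip an entry.
sub due_targets {
    my ($entries, $last, $now) = @_;
    my @due;
    for my $t ($last + 1 .. $now) {
        for my $entry (@$entries) {
            next unless entry_matches($entry, $t);
            push @due, (split ' ', $entry, 7)[6];
        }
    }
    return @due;
}
```

With this field order, "0 */5 * * * * :jobs" fires once per five minutes at second 0, while "*/5 * * * * * :restart" fires every five seconds.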
use status 404 if no corresponding catalog has been found for SOAP access
* Implement new AccumulateCode and TagRepository directives. The rationale
is:
-- There is a huge base of Interchange code, much of which is not
needed in even the standard catalog with full UI. This causes a
larger memory profile than necessary.
-- It is difficult to determine from the page code what code is
needed, especially when a [tag] can call a $Tag can call
a filter can call some sort of Action.
-- A feature is needed to allow building catalogs with a more
nearly optimal set of code than just "everything".
If AccumulateCode is no, operation is exactly as before. There have
been some code initialization changes and routine calling changes,
but the data structures are identical and no difference in operation
should be seen.
If you set AccumulateCode to "Yes" and specify a TagRepository that
contains all known UserTag, ActionMap, Filter, Widget, etc. etc.
code, Interchange starts accumulating and compiling these as
needed.
The code is sent to the master process for compilation and
incorporation, so that the next iteration of a page after HouseKeeping
seconds will find the code already compiled and ready to go.
It also copies the code file to the "code" (actually $Global::TagDir)
directory in the "Accumulated" subdirectory tree. When you restart
Interchange, these tags/filters/widgets/checks are read normally
and need not be recompiled on the fly.
Over time, as you access pages and routines, a full set of tags
will be developed and you can turn AccumulateCode to "No".
* There can be failures due to calling a $Tag from within embedded
Perl for the first time, particularly when it uses a MapRoutine or
calls another $Tag within. This is due to Safe, and there is probably
not much to be done about it. The good news is that the error should
go away after HouseKeeping seconds when the tag gets compiled by the
master.
This could be avoided in the case of an AllowGlobal catalog, and it
might be possible to make a directive that turns on AllowGlobal only
when in AccumulateCode mode.
The area, tmp, tmpn, and image tags are known to fail in this
way in the standard catalog. Tags that are frequently called
in this fashion should probably be placed in a "code/Vital"
directory and not be accumulated.
* This is only recommended for development -- it might
be possible to remove a tag/filter/etc. from the master
and recompile these on the fly, but I haven't looked at that
yet.
Another nice feature is that you can easily add a tag simply
by adding its code to the TagRepository and having it
compiled.
* WARNING: Nice features are often dangerous! Don't run this in
production -- you have been warned!
* WARNING: OrderCheck is not yet implemented, and a full audit has
not been done on all compiled code directives.
* WARNING: Not fully tested in Prefork mode, and really not intended for
that mode.
* WARNING: Including multiple tags in a file may have unpredictable
behavior. You should try to keep related Alias and tag things in
the same file.
* This feature only applies to Global code -- Catalog-based code
shows no change.
* Passes the regression tests 100% when called with an empty "code"
directory, compiling every tested tag and executing without error.
* Avoid warnings.
* More warning removal.
* Map convenience/performance variables $::Variable and $::Pragma when using SOAP.
* Fix ISINDEX query detection. This code has probably been in here 5 years without a bug report until now. 8-)
* Discovered reason we had so many "page server NNNNN would not die" errors in prefork mode. We weren't removing dead page servers from the %Page_pids hash, so we were trying to terminate/kill non-existent servers.
* When we say kill, mean it. We were not actually doing a KILL; whether we should is questionable, I guess; but all servers should accept a TERM unless totally hung.
* Completely remove all DBI cache entries when in PreFork mode, so that we won't have "MySQL server gone away" errors. It doesn't make too much sense to cache connections anymore, anyway, as DBI does that for you.
Correct minor inaccuracy in comment.
* Initial instances of pre-forked servers could start out with the same rand value, causing their first session number to be identical (along with possible other side-effects). This probably didn't cause many problems in practice, but it is a bug.
* Add NotRobotUA directive, which allows setting of UserAgent strings that are not to be treated as a robot.
* Change scan order to do IP first, to be able to block robots by address range regardless of their UA.
* Fix X-Track headers so that they will always be canonical.
* Occasionally in PreFork mode you will find a server that gets "starved",
in other words never seems to win the battle and receive a page request.
It just sits there forever, not killable or anything.
Add ChildLife directive which times out a page server after a period
of time.
ChildLife 30 minutes
This is the usual Interchange time_to_seconds value.
If ChildLife is not set, the default, the server will act just
like it does now, stuck in that internal loop forever until
kill -9 happens.
All it does is set the start_time of the server, and then when
HouseKeeping seconds goes by it checks the current time and lasts
the server (just like MaxRequestsPerChild, basically) if it
has expired.
Should clear up the problem people have with a growing number of
servers over time.
* Also removed setting of $C->{Source}->{$var} for time_to_seconds
types, as it triggered a bug within Perl and caused my system
to barf.
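The ChildLife check described above amounts to comparing the server's start time with the current time on each HouseKeeping pass. A minimal sketch follows; both helpers are hypothetical, and interval_to_seconds is a deliberately simplified stand-in for Interchange's time_to_seconds, whose exact grammar is not reproduced here.

```perl
use strict;
use warnings;

my %UNIT = (s => 1, m => 60, h => 3600, d => 86400, w => 604800);

# Hypothetical, simplified stand-in for time_to_seconds: a number followed
# by an optional unit word, of which only the first letter is examined.
sub interval_to_seconds {
    my ($str) = @_;
    return 0 unless defined $str && $str =~ /^\s*(\d+)\s*([smhdw]?)/i;
    return $1 * ($2 ? $UNIT{lc $2} : 1);
}

# True when a page server started at $start_time has outlived $child_life
# seconds. A false $child_life (directive unset) never expires the server.
sub child_expired {
    my ($start_time, $child_life, $now) = @_;
    return 0 unless $child_life;
    return $now - $start_time >= $child_life ? 1 : 0;
}
```

For "ChildLife 30 minutes" the first helper yields 1800, so a server started at time T would leave its loop at the first housekeeping pass at or after T + 1800.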
* Add Status: and Content-Type: headers if we are the recipient of an internal redirect.
* Remove references and tests on $Vend::InternalHTTP and $Vend::OnlyInternalHTTP, which are no longer wanted with the removal of the internal HTTP server.
* When no PATH_INFO is specified, normally we go to find_special_page('catalog').
This change checks the REQUEST_URI when that condition occurs, and if
the REQUEST_URI doesn't begin with SCRIPT_PATH we assume the web server
has used the Interchange SCRIPT_PATH as the index entry in DirectoryIndex.
This allows in (at least Apache's) httpd.conf:
DirectoryIndex index.html /cgi-bin/foundation
When the index.html page is not found, /cgi-bin/foundation is called.
If the URI is a subdirectory as in the request /foo/, then the REQUEST_URI
will be /foo/. We then use /foo/ as the Interchange path, allowing
transparent flowthrough of non-existent entries to Interchange.
In other words, you can create an empty directory /var/www/html/foo,
and when /foo/ comes in as a request it will automatically go to
/cgi-bin/foundation/foo/ while still appearing to be /foo/ on the
browser.
If you combine this with the following in catalog.cfg:
DirectoryIndex index.html
DeliverImage yes
And the following in interchange.cfg:
AcceptRedirect Yes
And finally:
<LocationMatch "^/(.*)/.*\.html">
ErrorDocument 404 /cgi-bin/foundation
</LocationMatch>
you can run a complete set of
no DNS lookups for SOAP calls unless HostnameLookups is set
* Reinstate http_log_msg for logging SOAP accesses, since there may be no other place it is logged. Will consider where this should be -- we probably need a "LogDir" directive in the global. What would make more sense is a complete upgrade to the logging capability of Interchange. I have been considering working on this for some time; it would make sense for all logs to have multiple ways of being entered (DBI, syslog, files in globally controlled tree, standard local file).
fix problem with invalid cookie if FullUrl is enabled and there is no path, e.g. Catalog linuxia /home/racke/linuxia www.linuxia.de www.linuxia.de:443
Remove Vend::Server::http_log_msg which is only called for SOAP accesses.
* Change wrong logic in server_start_message() -- it always needs to be $$ when in PreFork mode.
Tolerate empty CGI key/value pairs to avoid unfriendly 500 errors. We already tolerated one at the beginning and one or more at the end of the query parameters:
http://www.icdevgroup.org/i/dev/index?&id=abcd
http://www.icdevgroup.org/i/dev/index?id=abcd&&&&&&
Now these are handled gracefully too:
http://www.icdevgroup.org/i/dev/index?&&id=abcd
http://www.icdevgroup.org/i/dev/index?id=abcd&&&xyz=123
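One way to tolerate empty pairs anywhere in the query string is simply to skip them while splitting. The sketch below shows that idea; parse_query is a hypothetical helper, not the parser in Vend::Server, and URL-decoding of keys and values is left out for brevity.

```perl
use strict;
use warnings;

# Hypothetical helper: split a query string into a hash, ignoring empty
# segments such as those produced by "&&" or a stray leading "&".
sub parse_query {
    my ($query) = @_;
    my %param;
    for my $pair (split /&/, $query) {
        next unless length $pair;                  # empty key/value pair
        my ($key, $value) = split /=/, $pair, 2;
        next unless defined $key && length $key;   # "=value" with no key
        $value = '' unless defined $value;         # "key" with no "="
        $param{$key} = $value;
    }
    return \%param;
}
```

Applied to "id=abcd&&&xyz=123", this yields the two parameters id and xyz and nothing else.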
* Fix faulty patch associated with previous Server.pm and Config.pm commit. Why did patch let me do this without warning?
* Add RedirectCache directive which allows redirected page requests to be set to mv_tmp_session then written to the target from which it was redirected. This allows a complete web site to be mirrored to static HTML as it is requested, accompanied with the proper setting of AcceptRedirect in Interchange and ErrorDocument in the Apache server.
To use:
-- Set ErrorDocument 404 to the Interchange URL in Apache.
-- Set "AcceptRedirect Yes" in interchange.cfg.
-- Set "RedirectCache /var/www/html" in interchange.cfg (use your document root in place of /var/www/html).
When a page http://yourdomain.tld/subdir/page.html is not found, Interchange gets a redirect which causes it to set mv_tmp_session=1. If Interchange doesn't find the page, then it returns "missing" and no writing is done. If IC does find the page, it is written to /var/www/html/subdir/page.html and the page will be found on next access.
Exclude on HTTP server side can be done with permissions -- don't set it writable by IC daemon if you don't want it written.
TODO: Improve permissions mask setting options, allow excludes from IC side, add NoClobber option.
* Fix problem where defining blank GlobalSub would kill *all* globalsubs.
changed name of jobs queue from jobs to jobsqueue to avoid conflict with existing jobs directory
* Jobs are now listed and queued in $Global::RunDir/jobs instead of $Global::RunDir/reconfig
* Add global Jobs configuration directive:
Jobs MaxLifetime 3600
Jobs MaxServers 3
After a job has been running for MaxLifetime seconds, it will be removed by the next housekeeping run. The default MaxLifetime is 10 minutes. This setting is only available if PIDcheck has been set.
MaxServers is the number of jobs allowed to be run simultaneously, which should be significantly smaller than the value in the MaxServers directive to avoid inaccessibility of Interchange for users. The default for MaxServers is 1.
TODO:
-- don't check jobs on any housekeeping run
-- introduce delay for jobs which serves as method to expire jobs (delay = 4 minutes for jobs called from cron every 5 minutes)
* Never set a cookie when mv_tmp_session in effect.
* Fix nasty context problem in calling getppid from subroutine reference.
* Fix for broken getppid() on Linux systems with threads enabled.
To implement, use
Variable MV_GETPPID_BROKEN 1
in interchange.cfg. It substitutes a syscall(64) for the getppid
call.
This should only be necessary on systems with threads enabled,
which is NOT recommended for IC.
Call tracking functions only if a Vend::Track object exists. For example, tracking isn't enabled in jobs. Disabling tracking with a configuration option might follow as well.
The great copyright, email address, URL, and version update.
continued removal of static page building related code
* Don't honor $Global::Mall if CookieDomain set.
reverted last modification on http_soap which wasn't necessary
complete revision of SOAP error handling
ensure that SOAP server adds a blank line between HTTP header and HTTP body
* Allow systems with broken locks to not destroy the pidfile lock by reading the file. Alleviates inability to use "interchange -stop". Requires setting Variable MV_BAD_LOCK 1 in interchange.cfg. Thanks to Daniel Hutchinson for finding the problem!
Renamed Cron configuration directive to Jobs because that is more appropriate. Commandline options --job and --cron renamed as well to --jobgroup and --runjobs.
updated LINUXIA branch to 4.9 sources in order to use it as testbed again
Update copyright dates.
Merge from trunk: * Fix misspelling.
Merge from trunk: Apache 2.0.x compatibility fix by Mike Heins. Fixes the problem described here: http://www.icdevgroup.org/pipermail/interchange-users/2002-August/024212.html
Merge from trunk: * Fix whitespace transform, tolerate leading whitespace on header lines.
Merge from trunk: * Output proper header so missing script will be seen as 404.
Merge from trunk: * Fix mistake in inactive debug statement.
* Fix whitespace transform, tolerate leading whitespace on header lines.
Add new global directive TrustProxy, which allows the administrator to designate certain IP addresses or hostnames as trusted proxies, whose claims (via the HTTP_X_FORWARDED_FOR environment variable) about the original requesting host will be believed.
When using a front-end proxy for Interchange, all requests appear to come from that proxy, say 127.0.0.1 if on the same machine, which is effectively the same as running WideOpen because sessions can be easily hijacked. This offers a way to bring back a little discernment about what host we're really dealing with.
Usage is identical to the RobotIP directive's, for example:
TrustProxy 127.0.0.1, 10.0.0.*
I'm not sure why anyone would want to do this, but it could also be used with external HTTP proxies in general (which hopefully aren't lying), with a simple 'TrustProxy *'.
Sweeping update of Akopia/Red Hat references, to prepare for 4.8 release with current Interchange URLs and contact information.
* Output proper header so missing script will be seen as 404.
* Never want to send a cookie if temporary session.
Fix mistake in inactive debug statement.
Fix misspelling.
* Fix bug where multiple pre-forked servers could generate same "random" session ID. Thanks to Jeff Dafoe for reporting.
* Added the new HostnameLookups directive which allows Interchange to look up the hostname from a supplied IP address. This is a 'yesno' directive and the default is 'No'.
If not enabled then Interchange will expect the web server to have already performed the DNS lookup. If the web server is also configured to not perform DNS lookups then the following features will not work: (1) RobotHost checks and (2) maintenance of sessions from users who connect via AOL-style proxies.
No DNS lookups will be performed for temporary sessions unless the RobotHost list needs to be checked. This lookup will only happen if (1) HostnameLookups is enabled and (2) the web server has not already performed the lookup and found the hostname.
The security checks performed when connecting via SOAP and INET-mode links will make use of a DNS lookup, regardless of the HostnameLookups setting and other considerations.
* If an entry in the RobotHost list contains the wildcard "*.domain.com" then the base "domain.com" will also be checked.
Apache 2.0.x compatibility fix by Mike Heins. Fixes the problem described here: http://www.icdevgroup.org/pipermail/interchange-users/2002-August/024212.html
* Only call gethostbyaddr() to look up the remote hostname if
$Global::RobotHost is set.
* Added a RobotHost directive to identify robots by hostname.
This is in addition to the existing RobotUA and RobotIP
identification lists.
* RobotHost and RobotIP are handled by the new list_wildcard_full
routine. This routine does not perform a substring match, so
either the full string must be specified, or something along
the lines of *.domain.com must be used instead.
* The RobotUA handler still performs a substring match using
the list_wildcard routine.
* Both list_wildcard and list_wildcard_full now generate a
case-insensitive regex.
* A new 'spider' key has been added to the session, which may
be accessed using [data session spider], [if session spider]
and $Session->{spider} etc. Please treat this facility with
care, as some search engines take a dim view of so-called
"search engineering."
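The difference between the substring match used for RobotUA and the full-string match used for RobotHost and RobotIP can be sketched as below. wildcard_list_to_regex is a hypothetical helper that mimics the behavior described above; it is not the list_wildcard or list_wildcard_full code itself.

```perl
use strict;
use warnings;

# Hypothetical helper: turn a comma-separated wildcard list into one
# case-insensitive regex. With $full true the whole string must match;
# otherwise a substring match is enough. Returns nothing for an empty list.
sub wildcard_list_to_regex {
    my ($list, $full) = @_;
    my @alts;
    for my $item (split /\s*,\s*/, $list) {
        $item =~ s/^\s+|\s+$//g;
        next unless length $item;
        my $re = quotemeta $item;    # escape everything, including '*'
        $re =~ s/\\\*/.*/g;          # then turn the escaped '*' into '.*'
        push @alts, $re;
    }
    return unless @alts;
    my $body = join '|', @alts;
    return $full ? qr/^(?:$body)$/i : qr/$body/i;
}

my $robot_ua   = wildcard_list_to_regex('Inktomi, Scooter, Site*Sucker', 0);
my $robot_host = wildcard_list_to_regex('*.domain.com', 1);
```

With these two patterns, a user agent merely containing "scooter" matches $robot_ua, while $robot_host matches "crawler.domain.com" but not "crawler.domain.com.example.org".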
* Move most all code out of bin/interchange. The only routines that remain are:
dontwarn
version
usage
catch_warnings
parse_options
main_loop
Once the initial startup for Interchange is done, this code is completely out of the picture.
* Create new Vend::Dispatch module which contains the bulk of the code removed from bin/interchange.
* Move the important update_data() subroutine to Vend::Data.
* Move the session-related routines to Vend::Session.
* Move the order-related routines do_order() and update_quantity() to Vend::Order.
* Change many ::uneval() calls to plain uneval() or Vend::Util::uneval().
* Remove various unused tags and routines....
* Add robot tolerance facility, where mv_tmp_session is set when either a RobotUA or RobotIP wildcard matches. In interchange.cfg:
RobotUA Inktomi, Scooter, Site*Sucker
RobotIP 209.135.65, 64.172.5
After that, it is all automatic. mv_tmp_session gets set to one, the Scratch values mv_no_session_id and mv_no_count are set to one, and normal pages don't get IDs put out by area.
What this will do for the user:
1. Allow those UAs to follow a URL.
2. Prevent useless session files from cluttering disk.
3. Prevent session writes from inhibiting disk performance.
We should probably allow a Profile to be run based on a Robot match too. I will think about that.
* Add new content management features. This allows Interchange to:
-- Accept Apache error redirects, i.e. handle 404 errors
-- Initially process page, process page after variables, and
process page before image substitution with configurable subroutines
-- Take puts for DAV-style publishing
* New "AcceptRedirect" directive. If "Yes", will look for REDIRECT_URL,
REDIRECT_QUERY_STRING, etc. and use those to provide the request.
This allows:
ErrorDocument 404 /cgi-bin/foundation
At that point, a request for /index.html that is not found will
be equivalent to /cgi-bin/foundation/index.html and will be
indistinguishable from the real page by the client.
* New Pragmas init_page, pre_page, post_page
init_page Run before Variable substitution
pre_page Run after Variable substitution, before interpolation
post_page Run before Image substitution
Example -- you want your users to be able to edit pages and just put
in <A href="someotherpage.html">. You can use post_page to handle
this. To do it, you want an entry in catalog.cfg:
Pragma post_page=relative_urls
(Can also be in the page).
### Take hrefs like <A HREF="about.url" and make relative to current
Sub <<EOR
sub relative_urls {
    my $page = shift;
    my @dirs = split "/", $Tag->var('MV_PAGE', 1);
    pop @dirs;
    my $basedir = join "/", @dirs;
    $basedir ||= '';
    $basedir .= '/' if $basedir;
    my $sub = sub {
        my ($entire, $pre, $url) = @_;
        return $entire if $url =~ /^\w+:/;
        my($page, $form) = split /\?/, $url, 2;
        my $u = $Tag->area( { href => "$basedir$page", form => $form } );
        return qq{$pre"$u"};
    };
    $$page =~ s{
        (
            (
                <a \s+ (?:[^>]+?\s+)?
                href \s*=\s*
            )
            (["']) ([^\s"'>]+) \3
        )}
        {
            $sub->($1,$2,$4)
        }gsiex;
    return;
}
EOR
You can do multiple ones if you set it in catalog.cfg, by
making the value post_page=routine1,routine2. (Currently, no
commas are accepted in [pragma name value], but that should
change.)
* Allow PUT operations. Add
[value-extended test=isput] Check for a PUT
[value-extended put_contents=1] Return PUT string
[value-extended put_ref=1] Return ref to PUT string (scalar)
Some more DAV-type features can be done, I think, but they are not yet
scoped.
* Add new [deliver ....] tag that allows you to deliver some content
without worrying about [tag op=header] and page spacing issues.
Adds new global variable $Vend::Sent which is authoritative notification
that all content is sent and that all further parsing of ITL
should stop.
Allows this:
[perl]
    if($CGI->{foo}) {
        # Oh, we need to send foo as text
        $Tag->deliver( { type => 'text/plain', body => $Scratch->{foo} });
        return;
    }
    else {
        # Go about parsing ITL
    }
[/perl]
Also will work with
[deliver type=text/plain][scratch foo][/deliver]