Feature Proposal: Explicitly control the storage location of temporary files used by Foswiki
Motivation
Tasks.Item10408 exposed an issue that arises when multiple Foswiki installations are hosted on the same server: collisions in /tmp between files with different owners were causing failures.
It is not possible to resolve this using the
$ENV{TMPDIR}
setting because the parameter is ignored by
File::Spec
and
File::Temp
when taint checking is enabled and the value is tainted. The solution is to ensure that the temporary-directory environment variables used on the various platforms are all untainted.
See also
Tasks.Item9233 for Windows temporary file issues.
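The untainting step described above could look roughly like the following. This is only a sketch: untaint_tmp_env is a hypothetical helper, not an existing Foswiki function, and the set of characters accepted in a path is an assumption that a real checker would need to tune per platform.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper (not part of Foswiki): untaint the environment
# variables that File::Spec->tmpdir() consults, so that they are not
# skipped when running under taint mode (-T).
sub untaint_tmp_env {
    foreach my $name (qw(TMPDIR TEMP TMP)) {
        next unless defined $ENV{$name};

        # Accept only a conservative set of path characters (an assumption
        # of this sketch); any other value is left as it is.
        if ( $ENV{$name} =~ m{^([\w:./\\ -]+)$} ) {
            $ENV{$name} = $1;    # a regex capture is untainted
        }
    }
    return File::Spec->tmpdir();
}

print untaint_tmp_env(), "\n";
```

Values that do not match the pattern stay tainted, so File::Spec would continue to skip them and fall back to its platform defaults.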
Description and Documentation
Foswiki has a deprecated / hidden temporary file location -
$Foswiki::cfg{TempfileDir}
that is documented as retained for possible use by plugins.
This proposal is to:
- Define
{TempfileDir}
in Foswiki.spec
as an expert parameter. Default: $Foswiki::cfg{WorkingDir}/tmp
(no change from the current default)
- Add a
TempfileDir.pm
checker that ensures a sane default. Suggest alternatives from the environment variables if available.
- Suggest
$Foswiki::cfg{WorkingDir}/requestTmp
as another alternative.
-
Update any modules using File::Temp
and File::Spec
to use a configurable directory.
-
Use a separate temporary root for each use of temporary files. These would be implemented as constants in their respective modules rather than as additional config variables. For example, {TempfileDir}/sessions for CGI session files created by LoginManager, {TempfileDir}/meta for files created during attachment handling, and so on.
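As a rough illustration of the last point, a module could keep its sub-directory name in a constant and create its files beneath {TempfileDir}. The names below (%cfg as a stand-in for %Foswiki::cfg, TMP_SUBDIR, tmp_root, and the file name template) are hypothetical and only serve the sketch.

```perl
use strict;
use warnings;
use File::Path qw(make_path);
use File::Spec;
use File::Temp qw(tempfile tempdir);

# Stand-in for the real configuration hash; in Foswiki this would be
# $Foswiki::cfg{TempfileDir}. A scratch directory is used here so the
# sketch can run on its own.
my %cfg = ( TempfileDir => tempdir( CLEANUP => 1 ) );

# Hypothetical per-module constant naming the module's temporary root.
use constant TMP_SUBDIR => 'sessions';

# Return (creating it if needed) this module's directory below {TempfileDir}.
sub tmp_root {
    my $dir = File::Spec->catdir( $cfg{TempfileDir}, TMP_SUBDIR );
    make_path($dir) unless -d $dir;
    return $dir;
}

# Create a uniquely named file inside the module's temporary root.
my ( $fh, $filename ) = tempfile( 'cgisess_XXXXXXXXXX', DIR => tmp_root() );
print {$fh} "session data\n";
close $fh;
print "$filename\n";
```

Keeping the sub-directory name in the module means only {TempfileDir} itself needs to appear in the configuration.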
Searching for references to
File::Spec
and
File::Temp
- the following modules appear to use temporary files.
-
Foswiki::Store::VC::Handler::mkTmpFilename
(This does not appear to be called from core code; is it a dead function? The only references are in the RCS unit tests and in the GitPlugin store.)
-
Foswiki::Sandbox::sysCommand()
Cache to capture STDERR. Uses {WorkingDir}/tmp
explicitly. Update to use File::Spec->tmpdir()
-
Foswiki::Plugins::EmptyPlugin
Contains examples of using File::Temp
-
Foswiki::Meta::attach()
Temporary storage during attachment processing. Already uses File::Temp
-
Foswiki::Configure::Package
Stores extension files fetched from the repository. Already uses File::Temp
-
Foswiki::Configure::Util
Storage for expanded archive files. Already uses File::Temp
-
Foswiki::Cache
(temporary file storage is already explicitly set in the configuration. $Foswiki::cfg{Cache}{RootDir} = '$Foswiki::cfg{WorkingDir}/cache';
)
-
rcs
uses the environment variable TMPDIR
provided that it is not tainted.
Searching for explicit use of
cfg{WorkingDir}
, all of the current usage in core + default extensions appears to be consistent with the intended use.
-
{WorkingDir}/cache
used by page cache DBI for file storage
-
{WorkingDir}/htpasswd.lock
used by Users::HtPasswdUser
as the lock file.
-
{WorkingDir}/sqlite.db
also used by page cache.
-
{WorkingDir}/registration_approvals
used by UI::Register
-
{WorkingDir}/tmp
used by:
-
Foswiki::LoginManager
for session files and session/IP map files.
-
{WorkingDir}/work_areas
used by Foswiki::Store
for plugin persistent file storage.
-
{WorkingDir}/languages.cache
used by Foswiki::I18N
Other temporary file usage
The CGI code has its own temporary file implementation, used primarily to hold files during upload. This change currently does not apply to CGI or the Foswiki::Request::Upload code.
See
CPAN:CGI
Examples
Impact
Implementation
--
Contributors: GeorgeClark - 25 Feb 2011
Discussion
I'm curious about when to use
File::Spec
or just
blah/blah
to maintain portability... for example,
$Foswiki::cfg{Cache}{RootDir} = '$Foswiki::cfg{WorkingDir}/tmp/cache';
could be re-written as
File::Spec->catdir($Foswiki::cfg{WorkingDir}, 'tmp', 'cache')
--
PaulHarvey - 26 Feb 2011
we should always use
File::Spec
- that way, the code is more likely to work with fewer modifications on platforms we're not using.. like, mmm, someone had tmwiki running on VMS once, and another on some s/360 etc - and who knows what will happen in future...
OK, additionally, if you can figure out how to get rcs to use the set dir, rather than /tmp, quite a few admins will thank you - as that has made them cry a few times.
--
SvenDowideit - 26 Feb 2011
I'd forgotten about rcs, but according to some man pages I've found searching around:
Temporary files are created in the directory containing the working file, and also in the temporary directory (see TMPDIR under ENVIRONMENT ). ... TMPDIR
Name of the temporary directory. If not set, the environment variables TMP and TEMP are inspected instead and the first value found is taken; if none of them are set, a host-dependent default is used, typically /tmp.
Does this not work? I pulled down rcs source to take a look: Could we set TMPDIR prior to invoking RCS commands and make sure sandbox passes it through into the environment?
if (!s
&& !(s = cgetenv("TMPDIR")) /* Unix tradition */
&& !(s = cgetenv("TMP")) /* DOS tradition */
&& !(s = cgetenv("TEMP")) /* another DOS tradition */
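A minimal sketch of that idea in Perl follows. run_with_tmpdir is a hypothetical helper, not an existing Foswiki::Sandbox function, and the child command is the Perl interpreter itself rather than an RCS binary, so that the example runs without rcs installed. Under taint mode the directory value (and PATH) would also need to be untainted first.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper: run a command with TMPDIR pointing at $tmpdir.
# 'local' restores the previous value of $ENV{TMPDIR} when the sub returns.
sub run_with_tmpdir {
    my ( $tmpdir, @command ) = @_;
    local $ENV{TMPDIR} = $tmpdir;
    return system(@command);    # the child process inherits the modified %ENV
}

# Scratch directory standing in for a configured temporary directory.
my $dir = tempdir( CLEANUP => 1 );

# The child exits with 0 only if it sees TMPDIR and it is a directory.
my $status =
  run_with_tmpdir( $dir, $^X, '-e', 'exit( -d $ENV{TMPDIR} ? 0 : 1 )' );
print $status == 0 ? "child saw TMPDIR\n" : "child did not see TMPDIR\n";
```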
--
GeorgeClark - 26 Feb 2011
/tmp
(and by implication
File::Spec->tmpdir())
is
usually defined as being "a scratch area which you can use to hold files and directories for short periods of time" and "cleared whenever the system is "booted up" and by the system administrator when the directory gets full". Most of us regard
/tmp
as a relatively small, server-specific, transitional partition that can be cleared as and when we feel like it. Because
/tmp
is local, we tend to regard it as "fast".
Does the way we use
working/tmp
(ignoring
configure
) correspond to this view?
- Originally
working/tmp
started out as a home for session files (which is how it got the name tmp
, because these files were moved there from /tmp
). Session files are not true temp files, because they (can) persist well beyond the end of process activation / request handling.
- Killing session files arbitrarily would (1) force users to log in again and (2) cause loss of session variables.
- It also serves as the home of the
ip2sid
map - used on very, very few installs, I suspect, but a persistent file and definitely not /tmp material.
- Killing
ip2sid
arbitrarily would require connected users to log in again.
- Next came passthru files, which were closer to "true" tmp files, in that they have a strictly defined life cycle.
- Killing
passthru
files could break requests, especially authentication.
So what we have is close to
/tmp
but not quite the same; it's not really a scratch area, it's a managed storage area.
So, what about other uses of
/tmp
? George captured them:
-
Foswiki::Store::VC::Handler::mkTmpFilename
is used on Windows only, IIRC, for very short-lived files created during checkin
-
Foswiki::Sandbox::sysCommand()
Cache to capture STDERR
-
Foswiki::Plugins::EmptyPlugin
Contains examples of using File::Temp
-
Foswiki::Meta::attach()
Temporary storage during attachment processing
-
Foswiki::Cache
(temporary file storage is already explicitly set in the configuration. $Foswiki::cfg{Cache}{RootDir} = '$Foswiki::cfg{WorkingDir}/tmp/cache';
)
1, 2 and 4 are temp files held only for the duration of a single request - true temp files that can be purged almost as soon as they are closed. Arbitrary deletion isn't going to do them any favours. But they are all server-local and need to be fast (which is why they were left in
/tmp
). 3 and 5 I'm not so sure about, but in general:
- Foswiki uses
/tmp
as fast, local, request-specific store. Files created there are not expected to live beyond the end of a request, and are specific to a single request.
-
working/tmp
on the other hand is for longer-lived, Foswiki-managed files that are expected to persist over many requests.
Can these two file types coexist in a single directory? I'm not so sure. If we need to provide a cushion for
/tmp
then I'd prefer (subject to someone persuading me otherwise) to add
working/request_tmp
.
--
CrawfordCurrie - 26 Feb 2011
working/request_tmp
sounds fine. I wonder whether renaming working/tmp to session_tmp, at least for new installations, might make sense. I guess I'm guilty of not reading the README in
working/tmp
but I had just "assumed" that anything
/tmp
would be for any temporary file use.
Okay, how about separating temporary files into explicit "Life of session"
working/session_tmp
and "Life of request"
working/request_tmp
directories, and defining them with two expert configuration parameters - {sessionTmp} and {requestTmp}? This way the two classes of transient storage are documented in the configuration, and can be modified to accommodate requirements on shared hosting or other unique installations.
- In Foswiki.pm,
- if
sessionTmp
is undefined, default to working/tmp
. Upgraded sites then would not have any loss of session data.
- if
requestTmp
is undefined, guess per the current rules, such as using File::Spec.
- In configure
- If
sessionTmp
undefined, checker determines if working/tmp
exists and contains other than the README. If yes, set to working/tmp
otherwise create the working/session_tmp
directory and use that for session files.
- if
requestTmp
is undefined, checker can guess using the File::Spec tmpdir setting, or as appropriate for the platform. This way there is no significant change for simple installations. Sites with multiple Foswiki installations under different users, or with other unique requirements, can set the expert parameter.
The modules identified above would then use the configured
sessionTmp
or
requestTmp
explicitly in all temp file requests (and set it into the environment for rcs). Document the use of requestTmp in
EmptyPlugin.pm.
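The fallback rules for Foswiki.pm could be sketched as below. resolve_tmp_dirs is a hypothetical function, the hash passed in stands in for %Foswiki::cfg, and {sessionTmp} / {requestTmp} are only the names proposed here, not existing settings.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical resolver for the two proposed settings. $cfg stands in for
# \%Foswiki::cfg; {sessionTmp} and {requestTmp} are proposed names only.
sub resolve_tmp_dirs {
    my ($cfg) = @_;

    # Upgraded sites keep their session data: fall back to working/tmp.
    my $session =
      defined $cfg->{sessionTmp}
      ? $cfg->{sessionTmp}
      : File::Spec->catdir( $cfg->{WorkingDir}, 'tmp' );

    # Request-lifetime files: fall back to the platform's temp directory.
    my $request =
      defined $cfg->{requestTmp}
      ? $cfg->{requestTmp}
      : File::Spec->tmpdir();

    return ( $session, $request );
}

my ( $session, $request ) = resolve_tmp_dirs( { WorkingDir => 'working' } );
print "session: $session\nrequest: $request\n";
```

The configure checker would apply the same defaults, with the extra step of looking inside working/tmp before choosing between it and working/session_tmp.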
--
GeorgeClark - 27 Feb 2011
The title of this topic implies that explicit control of the temp directory is the best solution. Is there anything inherently wrong with using /tmp if the file name is sufficiently unique to avoid conflicts between installs? One solution is to add something like the current time to the name, as I suggested in
Tasks.Item10408. Is there some other benefit that would justify a more substantial fix?
--
RaymondLutz - 10 Mar 2011
The cleanup scripts might well be simplified by putting the various types of temp files in separate directories.
Temporary files are a tricky business - intuitive approaches such as using PIDs or adding times have multiple dangerous failure modes. Use
File::Temp; use file handles (and never temp file names). See
http://perldoc.perl.org/File/Temp.html (read the whole thing, especially the warnings), the
Security::Temporary Files section of the camel book, and
open( .., '+>', undef) for some basic information.
Please don't re-invent solutions for uniqueness - you'll have a painful experience, open security holes, rediscover portability issues - and consume time that can be applied to more productive uses.
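For readers unfamiliar with the two approaches, here is a small sketch using only core Perl and File::Temp; the variable names are illustrative.

```perl
use strict;
use warnings;
use File::Temp ();

# Object interface: the object is a file handle and, when stringified,
# the file name. The file is removed when $tmp is destroyed.
my $tmp = File::Temp->new();
print {$tmp} "scratch data\n";
$tmp->flush();

# Anonymous temporary file: passing undef as the file name with mode '+>'
# gives a read/write handle with no name to leak or race on.
open( my $anon, '+>', undef ) or die "Cannot create temp file: $!";
print {$anon} "more scratch data\n";
seek( $anon, 0, 0 );
my $line = <$anon>;
print $line;
```

In both cases the code works with the handle; the name is only needed when another process has to open the file.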
--
TimotheLitt - 20 Apr 2011
This advice is good; however, there are applications where there is no way to pass through a file handle, and I don't see that we have any choice but to pass through a filename. One example is capturing the STDERR / STDOUT from a Sandbox script: the file name is opened in a different thread. It's also not good to leave the file handle open with multiple writers, so we have to close it. If you have suggestions for a portable solution here, they would be appreciated.
use File::Temp qw( tempfile );

# Note: Use of the file handle $fh returned here would be safer than
# using the file name. But it is less portable, so filename will have to do.
# UNLINK => 0 leaves the file in place; the caller must remove it.
my ( $fh, $stderrCache ) = tempfile(
    "STDERR.$$.XXXXXXXXXX",
    DIR    => "$Foswiki::cfg{WorkingDir}/tmp",
    UNLINK => 0
);
close $fh;
This is the use case that triggered this work. Also, regarding another comment above,
/tmp
is a good location, except on Windows, and possibly some other platforms. There have been cases where the temporary files all end up in the
C:/
root location or, when that location is not writable, Foswiki crashes.
--
GeorgeClark - 20 Apr 2011
I realize that this is already accepted, but I've learned a bit more along the way. It won't be as far reaching.
-
File::Spec
should work fine provided that the $ENV variables for the temporary directory are untainted. The primary effect of the change will be to make sure that the environment variables are set correctly and untainted.
- the
working/requestTmp
directory will be suggested, but not used by default, except on Windows.
- I won't bother renaming
working/tmp
to working/sessionTmp
. Not sure of the value.
--
GeorgeClark - 24 Oct 2012
I'm glad that this is slated for a correction, as I decided not to chance an upgrade until this was fixed. I think having a separate temporary file area for each install should fix it, and I think reducing the scope of the change is a smart move. I wonder if it is not also prudent to check whether the proposed filename already exists before attempting to use it, rather than enduring a hard-stop failure.
I was trying to determine if this was already corrected in 1.1.3. Why do I see
Tasks.Item10408 as closed in
http://foswiki.org/Tasks/TasksByRelease?release=1.1.3 ??
--
RaymondLutz - 26 Oct 2012
The bug was "fixed" or at least minimized in 1.1.3 by adding the pid to the temporary file name.
--
GeorgeClark - 27 Oct 2012