Design for fchdir in DJGPP
~~~~~~~~~~~~~~~~~~~~~~~~~~

Introduction
============

The new POSIX standard requires that an implementation provides
the fchdir function. DJGPP CVS (to be 2.04) could support an fchdir
function.

fchdir is like chdir, but it changes to the directory specified
by a file descriptor. This file descriptor refers to a directory.
So supporting fchdir would require the normal Unixy I/O functions -
open, read, write, etc. - to work with file descriptors referring
to directories.

The fd_props mechanism, which associates some flags and a filename
with a file descriptor, can be used to mark the file descriptor
as a directory. This will allow the Unixy I/O functions to handle
directories specially. A new flag FILE_DESC_DIRECTORY would be added.

Only the C stream, POSIX and other Unixy functions will be modified
to support file descriptors for directories.

The file descriptor will be created so that:

* it is dup'd off nul;
* it is in binary mode;
* it is non-inheritable.

The first two steps can be accomplished using __FSEXT_alloc_fd.
Note that we use __FSEXT_alloc_fd for convenience only - this is *not*
a proposal to use the FSEXT mechanism to support directory file
descriptors.

There are two ways to make the file descriptor non-inheritable.
Both methods will be used:

* ensure that the file descriptor is > 19 - file handles above 19
  are not inheritable, due to a misfeature in the DOS exec call;

* set the no-inherit (close-on-exec) bit using fcntl(..., F_GETFD)
  and fcntl(..., F_SETFD, FD_CLOEXEC|...).

(See move_fd in src/debug/common/dbgredir.c on how to do
the first part.)

POSIX allows us to restrict how directories are handled by the Unixy
I/O functions. So let's make the following restrictions:

* Directories can only be opened read-only.
* Directories can only be read using readdir().

The directory-specific code would run after File System Extensions
(FSEXTs) have had their chance to handle I/O operations.
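As a rough illustration of the two non-inheritance steps listed above,
here is a minimal sketch in portable POSIX C. The helper name
sketch_make_noninheritable is hypothetical, and using F_DUPFD to move
the descriptor above 19 is an assumption made for this sketch; move_fd
in dbgredir.c may well do it differently.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper, not part of DJGPP.  Makes a descriptor
   non-inheritable in the two ways the design lists:
     1. move it to a descriptor number above 19;
     2. set the close-on-exec bit.
   Returns the new descriptor.  On failure returns -1 and leaves
   the original descriptor open and untouched. */
int sketch_make_noninheritable(int fd)
{
  int new_fd, flags;

  /* F_DUPFD returns the lowest free descriptor >= 20 (assumption:
     this is an acceptable way to get above handle 19). */
  new_fd = fcntl(fd, F_DUPFD, 20);
  if (new_fd < 0)
    return -1;

  /* Read the current descriptor flags, then add FD_CLOEXEC. */
  flags = fcntl(new_fd, F_GETFD);
  if (flags < 0 || fcntl(new_fd, F_SETFD, flags | FD_CLOEXEC) < 0)
  {
    close(new_fd);
    return -1;
  }

  /* Only drop the original once the replacement is fully set up. */
  close(fd);
  return new_fd;
}
```

A caller would pass in the descriptor that was dup'd off nul and then
use the returned descriptor from that point on.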
The restrictions on the POSIX functions should also be adhered to
by the C stream functions.

Now on to how specific functions would be updated or implemented.

POSIX functions
===============

open
----

src/libc/posix/fcntl/open.c

I suggest that directories are handled first as a special case in open.
Once the normal open call has failed, if the filename refers
to a directory (check with access(..., D_OK)), open would call
a new function, say __opendir_as_fd, and then return its return code.

__opendir_as_fd would:

* check that the directory was being opened as read-only - if not,
  fail with errno == EISDIR;

* allocate a file descriptor for the directory;

* set the FILE_DESC_DIRECTORY flag in fd_props for the file descriptor;

* return this file descriptor.

We should probably also check that O_APPEND was not specified.
If it was, fail with errno == EINVAL.

LATER: Don't check this. Since the call will fail, if write flags
are given, is there any point checking for O_APPEND?

Since _open is a DOS-specific function described as "a direct connection
to the MS-DOS open function call", it requires no attention. It fails
for directories.

ISSUE: Do we want to support the O_DIRECTORY option that Linux does?
Here's a description, taken from open(2):

    "O_DIRECTORY
        If pathname is not a directory, cause the open to fail.
        This flag is Linux-specific, and was added in kernel
        version 2.1.126, to avoid denial-of-service problems
        if opendir(3) is called on a FIFO or tape device, but
        should not be used outside of the implementation of
        opendir."

I think not, but I mention it here for completeness. If we ever want
to use open in libc to create a directory, then we should support
O_DIRECTORY. But I can't think of any other reason.

read
----

src/libc/posix/unistd/read.c

We can handle read by adding a check after the termios hook
has been called. It would try to find the flags for the file descriptor
by using __get_fd_flags.
If it can and the file descriptor refers to a directory, it would fail
with errno == EISDIR. (EISDIR indicates that readdir should be used
instead.)

NB: There's no need to wait for the FSEXT handler to be called,
since we can't hook directories with FSEXTs.

write
-----

src/libc/posix/unistd/write.c

We can handle write by adding a check after the termios hook
has been called and the check for count == 0. The check would try
to find the flags for the file descriptor by using __get_fd_flags.
If it can and the file descriptor refers to a directory, it would fail
with errno == EACCES (LATER: EBADF).

NOTE: POSIX seems a bit unclear here to me. It says that errno == EBADF
should be used when "The fildes argument is not a valid file descriptor
open for writing." This covers two error conditions: invalid file
descriptor; file descriptor not open for writing. So I'm not sure
that errno == EACCES is correct above.

LATER: EACCES tends to mean a lot of things in DJGPP (because of DOS's
error codes), so we should use EBADF.

IMPLEMENTATION NOTE: It'll be important here that the file descriptor
is in binary mode, so that no ASCII-conversion code is executed
in write for the directory's file descriptor. If the file descriptor
is in binary mode, it will just call _write.

NB: There's no need to wait for the FSEXT handler to be called,
since we can't hook directories with FSEXTs.

close
-----

No changes would be required.

lseek, llseek, tell
-------------------

src/libc/posix/unistd/lseek.c
src/libc/compat/unistd/llseek.c
src/libc/dos/io/tell.c

These all go through llseek. It would try to find the flags
for the file descriptor by using __get_fd_flags. If it can and
the file descriptor refers to a directory, it would always return 0.

ftruncate
---------

src/libc/compat/unistd/ftruncat.c

It would try to find the flags for the file descriptor by using
__get_fd_flags. If it can and the file descriptor refers to a directory,
it would fail with errno == EINVAL.
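The per-function directory checks described for read, write, llseek
and ftruncate all follow one pattern, which might be sketched roughly
as follows. Everything prefixed with sketch_ or SKETCH_ is a
hypothetical stand-in invented for this example: the table stands in
for fd_props, sketch_get_fd_flags stands in for __get_fd_flags (whose
real signature is not reproduced here), and the flag value is
arbitrary. The write check uses EBADF, following the LATER note.

```c
#include <errno.h>

#define SKETCH_FILE_DESC_DIRECTORY 0x01UL  /* arbitrary value */
#define SKETCH_MAX_FD 32

/* Stand-in for fd_props: one flags word per file descriptor. */
unsigned long sketch_fd_flags[SKETCH_MAX_FD];

/* Stand-in for __get_fd_flags; returns 0 if nothing is recorded. */
unsigned long sketch_get_fd_flags(int fd)
{
  if (fd < 0 || fd >= SKETCH_MAX_FD)
    return 0;
  return sketch_fd_flags[fd];
}

/* Test helper: mark a descriptor as referring to a directory. */
void sketch_mark_directory(int fd)
{
  if (fd >= 0 && fd < SKETCH_MAX_FD)
    sketch_fd_flags[fd] |= SKETCH_FILE_DESC_DIRECTORY;
}

int sketch_is_directory(int fd)
{
  return (sketch_get_fd_flags(fd) & SKETCH_FILE_DESC_DIRECTORY) != 0;
}

/* Shared pattern: fail with the given errno for directories,
   return 0 to let the normal code path continue otherwise. */
int sketch_reject_directory(int fd, int err)
{
  if (sketch_is_directory(fd))
  {
    errno = err;
    return -1;
  }
  return 0;
}

int sketch_read_dir_check(int fd)
{
  return sketch_reject_directory(fd, EISDIR);
}

int sketch_write_dir_check(int fd)
{
  return sketch_reject_directory(fd, EBADF);
}

int sketch_ftruncate_dir_check(int fd)
{
  return sketch_reject_directory(fd, EINVAL);
}

/* llseek: for a directory, store offset 0 and return 1 (handled);
   otherwise return 0 and leave *offset alone. */
int sketch_lseek_dir_check(int fd, long long *offset)
{
  if (!sketch_is_directory(fd))
    return 0;
  *offset = 0;
  return 1;
}
```

In the real functions each check would sit at the point named in the
relevant section, for example after the termios hook in read.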
fchmod
------

src/libc/posix/sys/stat/fchmod.c
src/libc/posix/sys/stat/chmod.c

No changes would be required, since this uses chmod() on the file name
obtained from fd_props or the long filename (LFN) API.

fchown
------

src/libc/compat/unistd/fchown.c

As fchown.c says:

    /* MS-DOS couldn't care less about file ownerships, so we
       at least check if given handle is valid. */

Since the directory will be dup'd off nul, it will have a valid handle
(aka file descriptor). No changes would be required.

ioctl
-----

src/libc/compat/ioctl/ioctl.c

DOS ioctls: Assume the programmer knows what he/she is doing.

Unix ioctls: There aren't any interesting ones.

No changes would be required.

fcntl
-----

src/libc/posix/fcntl/fcntl.c

The locking code would be updated to fail with -1 and errno == EINVAL
for directories. Other than that, no changes would be required.

lockf, llockf
-------------

src/libc/compat/unistd/lockf.c
src/libc/compat/unistd/llockf.c

These use the fcntl interface. No changes would be required.

fstat
-----

src/libc/posix/sys/stat/fstat.c

fstat_assist currently assumes that it will not have to handle
directories, because directories cannot be opened. This will need
re-examining.

LATER: We should handle directories specially in fstat, just after
the fstat FSEXT hook has been called.

select
------

src/libc/compat/time/time.c

select() assumes that the file descriptor refers to a file or device.
Disk files are always considered to be ready for reading and writing.
I suggest that directories should always be considered to be ready
for reading and never for writing (since they're supposed to be
read-only).

fd_input_ready and fd_output_ready could be modified to check
for directories. Or perhaps it would be better to add another check
after the __FSEXT_ready check, so that we don't pollute
fd_(in|out)put_ready with directory-specific code.

fsync
-----

src/libc/compat/unistd/fsync.c

This would fail with errno == EINVAL, since we can't sync directories.
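The select and fsync rules above could be expressed as small checks
along the following lines. This is only a sketch: the function names
and the flag value are made up for the example, and the functions take
the flags word directly rather than looking it up in fd_props. The
"-1 means not a directory" convention is also an assumption, chosen
so the caller can fall through to fd_input_ready or fd_output_ready.

```c
#include <errno.h>

#define SKETCH_FILE_DESC_DIRECTORY 0x01UL  /* arbitrary value */

/* Returns 1 if ready for reading, or -1 if the descriptor is not
   a directory and the normal readiness test should be used. */
int sketch_dir_input_ready(unsigned long fd_flags)
{
  if (!(fd_flags & SKETCH_FILE_DESC_DIRECTORY))
    return -1;
  return 1;   /* directories are always ready for reading */
}

/* Returns 0 (never ready for writing) for directories, or -1 if
   the descriptor is not a directory. */
int sketch_dir_output_ready(unsigned long fd_flags)
{
  if (!(fd_flags & SKETCH_FILE_DESC_DIRECTORY))
    return -1;
  return 0;   /* directories are read-only */
}

/* fsync: fail with EINVAL for directories, 0 to carry on. */
int sketch_fsync_dir_check(unsigned long fd_flags)
{
  if (fd_flags & SKETCH_FILE_DESC_DIRECTORY)
  {
    errno = EINVAL;
    return -1;
  }
  return 0;
}
```

Keeping these as separate checks matches the second option in the
select section: fd_input_ready and fd_output_ready stay free of
directory-specific code.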
fchdir
------

Phew, finally. 8)

Probably: src/libc/posix/unistd/fchdir.c

This would get the directory's filename from the fd_props and
use chdir to change to it. If fchdir was called on a normal file,
it would fail with errno == ENOTDIR.

fdopen
------

See the section below. I've included it in the C stream section,
because it makes more sense there.

C stream functions
==================

fopen
-----

src/libc/ansi/stdio/fopen.c

This requires some changes. To behave similarly to the POSIX function
open, we allow the directory to be opened read-only. Since fopen calls
open, nothing special is needed here.

But we've decided to disallow reads and writes with directories.
So none of the read/write flags should be set for the stream.
If these flags are clear, then _filbuf and _flsbuf (which are used
by fread, fwrite and other I/O calls) will always return EOF.

Since there's no data to be read, the EOF flag should also be set.

fdopen
------

src/libc/compat/stdio/fdopen.c

This will require similar changes to fopen with regard to ensuring
that the flags are set up correctly.

Text mode should be ignored for directories. They should always be
in binary mode.

freopen
-------

src/libc/ansi/stdio/freopen.c

This will require similar changes to fopen with regard to ensuring
that the flags are set up correctly.

Text mode should be ignored for directories. They should always be
in binary mode.

fread
-----

src/libc/ansi/stdio/fread.c
src/libc/ansi/stdio/filbuf.c

Reads on directories should fail with EOF. fread eventually calls
_filbuf. The flags we set in fopen (not readable) will cause _filbuf
to return with EOF. So, no changes would be required.

getc, fgetc
-----------

include/libc/file.h
src/libc/ansi/stdio/getc.c
src/libc/ansi/stdio/fgetc.c

Reads on directories should fail with EOF. These eventually call
_filbuf. The flags we set in fopen (readable clear) will cause _filbuf
to return with EOF. So, no changes would be required.
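To illustrate why clearing the read/write flags in fopen is enough
for fread, getc and friends, here is a toy model of a stream. It does
not use DJGPP's real FILE layout or flag names: struct sketch_stream,
the SKETCH_IO* bits and the sketch_ functions are all invented for
this example, and the 'x' returned on a readable stream is just
a placeholder for a real buffer refill.

```c
#include <stdio.h>   /* for EOF */

#define SKETCH_IOREAD 0x01   /* stream is readable */
#define SKETCH_IOWRT  0x02   /* stream is writeable */
#define SKETCH_IOEOF  0x04   /* stream is at end of file */

struct sketch_stream
{
  int flags;
  int fd;
};

/* What fopen/fdopen/freopen would do for a directory:
   no read or write bits, EOF bit set. */
void sketch_init_dir_stream(struct sketch_stream *s, int fd)
{
  s->fd = fd;
  s->flags = SKETCH_IOEOF;
}

/* Simplified stand-in for _filbuf: refuses unless readable. */
int sketch_filbuf(struct sketch_stream *s)
{
  if (!(s->flags & SKETCH_IOREAD))
    return EOF;
  return 'x';   /* placeholder for refilling the buffer */
}

/* Simplified stand-in for _flsbuf: refuses unless writeable. */
int sketch_flsbuf(int c, struct sketch_stream *s)
{
  if (!(s->flags & SKETCH_IOWRT))
    return EOF;
  return (unsigned char) c;
}

/* Simplified stand-in for feof: non-zero when the EOF bit is set. */
int sketch_feof(const struct sketch_stream *s)
{
  return (s->flags & SKETCH_IOEOF) != 0;
}
```

With a stream set up this way, every read or write that bottoms out
in the buffer fill/flush routines fails without any per-function
directory check.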
fgets
-----

src/libc/ansi/stdio/fgets.c

Reads on directories should fail with NULL. fgets calls getc.
getc will fail with EOF, which will cause fgets to fail.
So, no changes would be required.

ungetc
------

src/libc/ansi/stdio/ungetc.c

This is like a write to a directory. So it should fail with EOF.
As long as the file is at EOF, ungetc will fail with EOF.
So, no changes would be required.

fwrite
------

src/libc/ansi/stdio/fwrite.c
src/libc/ansi/stdio/flsbuf.c

Writes on directories should fail with EOF. These eventually call
_flsbuf. The flags we set in fopen (writeable clear) will cause _flsbuf
to return with EOF. So, no changes would be required.

putc, fputc
-----------

include/libc/file.h
src/libc/ansi/stdio/putc.c
src/libc/ansi/stdio/fputc.c

Writes on directories should fail with EOF. These eventually call
_flsbuf. The flags we set in fopen (writeable clear) will cause _flsbuf
to return with EOF. So, no changes would be required.

fputs
-----

src/libc/ansi/stdio/fputs.c

Writes on directories should fail with EOF. fputs calls putc.
putc will fail with EOF, which will cause fputs to fail.
So, no changes would be required.

fflush
------

src/libc/ansi/stdio/fflush.c

Writes on directories should fail. The writeable flag should be clear,
so nothing will be flushed. So, no changes would be required.

fclose
------

src/libc/ansi/stdio/fclose.c

We need to tidy up directories too. The tidy up code currently only
gets executed for streams with the read or write bits set. It should
also be executed for directories.

NB: This routine can call _write. We should make sure that _write
is not called for directories.

fgetpos, ftell
--------------

src/libc/ansi/stdio/fgetpos.c
src/libc/ansi/stdio/ftell.c

Neither of these functions should fail. They should behave like
lseek(fd, 0, SEEK_CUR) and return the current offset, which will
always be 0.

fgetpos calls ftell. ftell will fail, if the stream has neither
the readable nor the writeable flags set.
If the stream has neither the readable nor the writeable flags set,
ftell should check whether the stream's file descriptor is
for a directory. If it is, the function should just return 0.

fsetpos, fseek
--------------

src/libc/ansi/stdio/fsetpos.c
src/libc/ansi/stdio/fseek.c

Neither of these functions should fail. They should behave like lseek
does and always succeed without changing the file offset of 0.

fsetpos calls fseek. fseek will fail, if the stream has neither
the readable nor the writeable flags set. If the stream has neither
the readable nor the writeable flags set, fseek should check whether
the stream's file descriptor is for a directory. If it is, the function
should just return 0.

rewind
------

src/libc/ansi/stdio/rewind.c

rewind resets the EOF flag on the stream. We want the stream
for a directory to always have the EOF bit set, to make sure certain
functions behave the way we want. So rewind should make sure that
the EOF flag is not cleared for directories.

feof
----

src/libc/ansi/stdio/feof.c

Directories cannot be read, so we pretend that they're at EOF.
Since the stream for a directory will have the EOF flag set,
feof will always return non-zero for directories.
So, no changes would be required.

ferror
------

src/libc/ansi/stdio/ferror.c

No changes would be required.

setbuf, setbuffer, setvbuf, setlinebuffer
-----------------------------------------

src/libc/ansi/stdio/setbuf.c
src/libc/ansi/stdio/setbuffe.c
src/libc/ansi/stdio/setvbuf.c
src/libc/ansi/stdio/setlineb.c

setbuf, setbuffer and setlinebuffer call setvbuf.
No changes would be required.

fpurge
------

src/libc/compat/v1/fpurge.c

No changes would be required.

Finally
=======

Thanks to Eli Zaretskii for reviewing and commenting on the design.

Richard Dawe

$Id: fchdir.txt,v 1.1 2003/11/22 12:41:41 rich Exp $