Section 3 File I/O

starting: 2017/12/12

3.1 introduction

  This section describes the functions available for file I/O. Most file I/O on a UNIX system can be performed using only five functions: open, read, write, lseek, and close. We then examine the effect of various buffer sizes on the read and write functions. Atomic operations become important when we describe the sharing of resources among multiple processes.

3.2 File Descriptors

  One: to the kernel, all open files are referred to by file descriptors. Two: a file descriptor is a non-negative integer. Three: when we open an existing file or create a new file, the kernel returns a file descriptor to the process. Four: when we want to read or write the file, we identify it with the file descriptor that was returned by open or creat, passing it as an argument to either read or write.

      By convention, UNIX System shells associate file descriptor 0 with the standard input of a process, 1 with the standard output, and 2 with the standard error. To improve readability, the magic numbers 0, 1, and 2 should be replaced with the symbolic constants STDIN_FILENO, STDOUT_FILENO, and STDERR_FILENO defined in <unistd.h>.

  File descriptors range from 0 through OPEN_MAX - 1.
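  As a small illustration of the symbolic constants, the sketch below writes a message to one of the standard descriptors. The helper name put_msg is made up for these notes and does not come from the book.

```c
#include <string.h>
#include <unistd.h>

/* Write a NUL-terminated message to the given descriptor.
 * Returns the number of bytes written, or -1 on error.
 * (put_msg is a hypothetical helper used only for illustration.) */
ssize_t put_msg(int fd, const char *msg)
{
    return write(fd, msg, strlen(msg));
}
```

  A call such as put_msg(STDERR_FILENO, "something failed\n") reads more clearly than the same call with a bare 2.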

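3.3 open and openat functions    Tested: the same file can be opened by two open calls at the same time and then read and written through both descriptors.

  A file is opened or created by calling either the open or openat function.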

  #include<fcntl.h>

  int open(const char* path, int flag, ... /** mode_t mode **/);

  int openat(int fd, const char* path, int flag, ... /** mode_t mode **/);

  These functions have a multitude of options, which are specified by the flag argument. This argument is formed by ORing together one or more constants from the <fcntl.h> header.

  CONSTANTS: one and only one of: O_RDONLY, O_WRONLY, O_RDWR, O_EXEC (open for execute only), O_SEARCH (open for search only; applies to directories).

          The following constants are optional: O_APPEND, O_CLOEXEC, O_CREAT (requires a third argument to the open function, the mode, which specifies the access permission bits of the new file), O_EXCL (generates an error if O_CREAT is also specified and the file already exists; the test for existence and the creation are performed as one atomic operation).

         Both functions return the lowest-numbered unused descriptor.

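#include<fcntl.h>
#include<stdio.h>
#include<errno.h>

int main()
{
  /* O_CREAT|O_EXCL: fails with EEXIST if "1.c" already exists */
  int fd = open("1.c", O_WRONLY|O_CREAT|O_EXCL, 0666);
  if (fd < 0)
    printf("errno = %d\n", errno);     // O_EXCL guarantees that test-and-create is atomic
  return 0;
}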

 OPENAT: 1. If path is an absolute pathname, fd is ignored and openat behaves like the open function. 2. If path is a relative pathname, fd (obtained by opening a directory) specifies the starting location in the file system where the relative pathname is to be evaluated. 3. If fd is AT_FDCWD, the relative pathname is evaluated starting in the current working directory and openat behaves like the open function.
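  The second case can be sketched as follows: open the directory first, then resolve a relative name against that descriptor. The function name open_in_dir is an assumption made for this example, and the feature-test macro is only there because some compilers hide openat without it.

```c
#define _POSIX_C_SOURCE 200809L  /* ask for the POSIX.1-2008 declarations (openat) */
#include <fcntl.h>
#include <unistd.h>

/* Open `name` read-only, resolved relative to the directory `dirpath`.
 * Returns a file descriptor, or -1 on error.
 * (open_in_dir is a hypothetical helper, not part of any library.) */
int open_in_dir(const char *dirpath, const char *name)
{
    int dirfd = open(dirpath, O_RDONLY);     /* descriptor for the directory */
    if (dirfd < 0)
        return -1;

    int fd = openat(dirfd, name, O_RDONLY);  /* `name` is evaluated from dirfd */
    close(dirfd);                            /* fd stays valid after this */
    return fd;
}
```

  Passing AT_FDCWD instead of dirfd would make the openat call behave like a plain open of the same relative pathname.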

3.3 filename and pathname truncation

  Question: what happens if NAME_MAX is 14 and we try to create a new file in the current directory with a filename containing 15 characters? Either the filename is silently truncated beyond the 14th character, or an error is returned. With POSIX.1, the constant _POSIX_NO_TRUNC determines whether long filenames and long components of pathnames are truncated or an error is returned. We use fpathconf or pathconf to query a directory to see which behavior is supported.

  #include<stdio.h>
  #include<unistd.h>
  int main(int argc, char**argv)
  {
      if(argc != 2)
      {
          printf("usage: a.out <dirname>\n");
          return 0;
      }
      else
      {
          printf("filename : %s\n",argv[1]);
      }
  #ifdef _POSIX_NO_TRUNC
      printf("_POSIX_NO_TRUNC value : %d\n", (int)_POSIX_NO_TRUNC);
  #else
      printf("not supported!!!\n");
  #endif

  #ifdef _PC_PATH_MAX
      /* pathconf returns a long; -1 means an error or no limit */
      long max_pathname_num = pathconf(argv[1], _PC_PATH_MAX);
      printf("max_num: %ld\n", max_pathname_num);
  #else
      printf("not supported too!!!\n");
  #endif
      return 0;
  }

 3.4 creat function

  #include<fcntl.h>

  int creat(const char* path, mode_t mode);   returns: file descriptor opened for write-only if OK, -1 on error.

  Note that this function is equivalent to open(path, O_WRONLY|O_CREAT|O_TRUNC, mode). One deficiency with creat is that the file is opened only for writing. A better way is to use open, as in:

  open(path, O_RDWR|O_CREAT|O_TRUNC, mode);

3.5 close function

  #include<unistd.h>

  int close(int fd); returns 0 if OK, -1 on error.

  When a process terminates, all of its open files are closed automatically by the kernel.

3.6 lseek function

  #include<unistd.h>

  off_t lseek(int fd, off_t offset, int whence); return: new file offset if OK, -1 on error.

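  whence: SEEK_SET (the offset is measured from the beginning of the file), SEEK_CUR (the current value plus the offset, which can be positive or negative), SEEK_END (the size of the file plus the offset, which can be positive or negative).

  Because lseek returns the new file offset, we can seek zero bytes from the current position to determine the current offset. lseek only records the offset in the kernel and does not cause any I/O to take place; the offset is then used by the next read or write operation.

  When the file's offset is greater than the file's current size, the next write to the file extends the file. This creates a hole in the file. A hole is not required to have disk blocks allocated for it, and any bytes in the hole read back as 0.

  An environment in which int, long, pointers, and off_t are all 32 bits is indicated at compile time by the option _POSIX_V7_ILP32_OFF32, and at run time by sysconf(_SC_V7_ILP32_OFF32);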
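  A minimal sketch of both ideas, seeking past the end to create a hole and seeking zero bytes to ask for the current offset. The function name make_hole and the ten-byte strings are choices made for this example.

```c
#include <fcntl.h>
#include <unistd.h>

/* Write 10 bytes, seek to hole_end, then write 10 more bytes.
 * Returns the final file offset (also the file size), or -1 on error.
 * (make_hole is a hypothetical helper used only for illustration.) */
off_t make_hole(const char *path, off_t hole_end)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    off_t end = -1;
    if (write(fd, "abcdefghij", 10) == 10 &&      /* offset is now 10 */
        lseek(fd, hole_end, SEEK_SET) != -1 &&    /* no I/O happens here */
        write(fd, "ABCDEFGHIJ", 10) == 10)        /* this write extends the file */
        end = lseek(fd, 0, SEEK_CUR);             /* query the current offset */

    close(fd);
    return end;
}
```

  With hole_end set to 16384 the call returns 16394. Running ls -ls on the result typically shows fewer disk blocks than a file of that size filled with real data, although the exact numbers depend on the file system.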

3.7 read function  read an opened file

  #include<unistd.h>

  ssize_t read(int fd, void* buffer /** generic pointer **/, size_t nbytes); returns the number of bytes read, 0 if end of file, -1 on error.

3.8 write function  write an opened file

  #include<unistd.h>

  ssize_t write(int fd, const void* buffer, size_t nbytes); returns the number of bytes written if OK, -1 on error. A common cause for a write error is either filling up a disk or exceeding the file size limit for a given process.

3.9 I/O efficiency

  #include<unistd.h>
  #include<stdio.h>
  #define BUFFSIZE 4096
  int main()
  {
      ssize_t n;   /* read returns ssize_t */
      char buffer[BUFFSIZE];
      /* copy standard input to standard output, BUFFSIZE bytes at a time */
      while((n = read(STDIN_FILENO, buffer, BUFFSIZE)) > 0)
          if(write(STDOUT_FILENO, buffer, n) != n)
              fprintf(stderr, "write error!!!\n");
      if(n < 0)
          fprintf(stderr, "read error!!!\n");
      return 0;
  }

   Some caveats apply to this program:

  One: it reads from standard input and writes to standard output, and the user can redirect them. Two: the program does not close its files; when the process terminates, the kernel closes all open file descriptors in the process. Three: to the UNIX kernel there is no difference between text files and binary files.

  Let's run the program using different values for BUFFSIZE. In the book's measurements, the system time stops improving noticeably once BUFFSIZE reaches 4096; increasing the buffer size beyond that has little positive effect. Most file systems support some kind of read-ahead to improve performance: the system tries to read in more data than the application requests.

3.10 file sharing

  Why: the UNIX System supports the sharing of open files among different processes. Solution: the kernel uses three data structures to represent an open file.

  1. Process: every process has an entry in the process table, which holds a table of open file descriptors. Each descriptor has (a) the file descriptor flags (b) a pointer to a file table entry

  2. The kernel: file table entry (a) file status flags (b) current file offset (c) v-node pointer

  3. V-node structure: (a) contains the type of file and pointers to functions that operate on the file. (b) contains the i-node for the file.

            i-node structure: contains the owner of the file, the size of the file, pointers to where the actual data blocks for the file are located on disk, and so on.

  Case: if two independent processes have the same file open,

      each process has its own file table entry (so each process has its own current offset for the file), but only a single v-node table entry is required for a given file.

  Case: more than one file descriptor entry can point to the same file table entry. This happens with dup, and also after fork, when the parent and child share the same file table entry for each open descriptor.

  Case: note the difference between the file descriptor flags and the file status flags. The former apply only to a single descriptor in a single process, whereas the latter apply to all descriptors in any process that point to the given file table entry.
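  The difference between sharing a file table entry and sharing only the v-node can be observed through the current file offset. The sketch below is one way to check it inside a single process; shares_offset and demo_sharing are names invented for this example, and the check only makes sense for descriptors that support lseek.

```c
#include <fcntl.h>
#include <unistd.h>

/* Returns 1 if fd_a and fd_b share one current file offset (that is,
 * they point to the same file table entry), 0 if not, -1 on error. */
int shares_offset(int fd_a, int fd_b)
{
    off_t old_a = lseek(fd_a, 0, SEEK_CUR);
    off_t old_b = lseek(fd_b, 0, SEEK_CUR);
    if (old_a == -1 || old_b == -1)
        return -1;

    off_t probe = old_b + 1;                 /* an offset fd_b is not at */
    if (lseek(fd_a, probe, SEEK_SET) == -1)
        return -1;
    int shared = (lseek(fd_b, 0, SEEK_CUR) == probe);

    lseek(fd_a, old_a, SEEK_SET);            /* put the offset back */
    return shared;
}

/* Hypothetical demo: returns 1 when a dup'ed descriptor shares the offset
 * and a second open of the same file does not. */
int demo_sharing(const char *path)
{
    int fd1 = open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (fd1 < 0)
        return -1;
    int fd2 = dup(fd1);              /* same file table entry as fd1 */
    int fd3 = open(path, O_RDONLY);  /* new file table entry, same v-node */

    int ok = -1;
    if (fd2 >= 0 && fd3 >= 0)
        ok = shares_offset(fd1, fd2) == 1 && shares_offset(fd1, fd3) == 0;

    if (fd2 >= 0) close(fd2);
    if (fd3 >= 0) close(fd3);
    close(fd1);
    return ok;
}
```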

3.11 atomic operations

  When a seek and a read or write are issued as two separate function calls, there is always the possibility that the kernel might temporarily suspend the process between the two calls.

  The Single UNIX Specification includes two functions that allow applications to seek and perform I/O atomically: pread and pwrite.

  case 1: lseek followed by read or write, as one atomic operation.

  #include<unistd.h>

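  ssize_t pread(int fd, void* buf, size_t nbytes, off_t offset);   returns the number of bytes read, 0 if end of file, -1 on error

  ssize_t pwrite(int fd, const void* buf, size_t nbytes, off_t offset);   returns the number of bytes written if OK, -1 on error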
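  A short sketch of how these two calls behave: each one positions and transfers in a single step, and neither one changes the current file offset. The helper names swap_byte_at and demo_swap are assumptions made for this example. Note that the pread and pwrite pair inside swap_byte_at is still two calls, so only each call on its own is atomic.

```c
#define _POSIX_C_SOURCE 200809L  /* ask for the pread/pwrite declarations */
#include <fcntl.h>
#include <unistd.h>

/* Replace the byte at absolute offset `off` with `c`.
 * Returns the byte that was there before (0..255), or -1 on error.
 * The current file offset of fd is left untouched. */
int swap_byte_at(int fd, off_t off, unsigned char c)
{
    unsigned char old;
    if (pread(fd, &old, 1, off) != 1)    /* seek + read in one call */
        return -1;
    if (pwrite(fd, &c, 1, off) != 1)     /* seek + write in one call */
        return -1;
    return old;
}

/* Hypothetical demo: returns 1 if the byte was swapped and the file
 * offset is still where the earlier write left it. */
int demo_swap(const char *path)
{
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;
    int ok = write(fd, "hello", 5) == 5
          && swap_byte_at(fd, 0, 'j') == 'h'
          && lseek(fd, 0, SEEK_CUR) == 5;   /* offset not moved by pread/pwrite */
    close(fd);
    return ok;
}
```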

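  case 2: creating a file

  O_CREAT, O_EXCL /** test and create **/

  Without these flags, the test and the creation are two separate calls. Another process can create the file between the open and the creat, and creat would then truncate whatever that process had written:

if( (fd = open(path, O_WRONLY) ) < 0 )
{
    if( ENOENT == errno )   /* file does not exist: try to create it */
    {
        if( (fd = creat(path, mode)) < 0 )
            printf(" creat fail \n");
    }
    else
        printf(" open fail \n");
}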

  If an operation is performed atomically, either all the steps are performed (on success) or none are performed (on failure).
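  For file creation, the atomic alternative is a single open with O_CREAT and O_EXCL. A minimal sketch, assuming the caller only wants to know who created the file; create_if_absent is a name invented for this example.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Returns 1 if this call created the file, 0 if it already existed,
 * and -1 on any other error. The existence test and the creation
 * happen inside one open call, so no other process can slip in between. */
int create_if_absent(const char *path, mode_t mode)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_EXCL, mode);
    if (fd >= 0) {
        close(fd);
        return 1;
    }
    return errno == EEXIST ? 0 : -1;
}
```

  If several processes call this with the same path, at most one of them should see the return value 1.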

3.12 dup and dup2 functions  duplication

  #include<unistd.h>

  int dup(int fd);

  int dup2(int fd, int fd2); both return the new file descriptor if OK, -1 on error

  The close-on-exec file descriptor flag for the new descriptor is always cleared by the dup functions.
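  A common use of the two functions is redirecting a standard descriptor for a while and then restoring it. The following is a sketch under that assumption; write_stdout_to_file is a hypothetical name, and it uses write rather than printf so that no stdio buffering is involved.

```c
#include <fcntl.h>
#include <unistd.h>

/* Send `len` bytes of `msg` through descriptor 1 while descriptor 1
 * temporarily refers to the file `path`. Returns 0 if OK, -1 on error. */
int write_stdout_to_file(const char *path, const char *msg, size_t len)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    int saved = dup(STDOUT_FILENO);          /* remember the real stdout */
    if (saved < 0) {
        close(fd);
        return -1;
    }
    if (dup2(fd, STDOUT_FILENO) < 0) {       /* descriptor 1 now refers to the file */
        close(fd);
        close(saved);
        return -1;
    }
    close(fd);                               /* descriptor 1 keeps the file open */

    ssize_t n = write(STDOUT_FILENO, msg, len);

    dup2(saved, STDOUT_FILENO);              /* restore the original stdout */
    close(saved);
    return n == (ssize_t)len ? 0 : -1;
}
```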

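3.13 sync, fsync, and fdatasync functions

  Traditional implementations of the UNIX System have a buffer cache or page cache in the kernel. Written data is first copied into one of these buffers and queued for writing to disk at some later time: DELAYED WRITE.

  #include<unistd.h>     // synchronize / consistency

  int fsync(int fd);  // waits until the data has been written to disk

  int fdatasync(int fd); /** only the data portions of a file **/   both return 0 if OK, -1 on error

  void sync(void);   // queues all modified block buffers for writing and returns without waiting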
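  As an illustration, here is a sketch of a write that does not report success until fsync has returned. The name write_durably is an assumption for this example, and how strong the guarantee really is depends on the file system and the hardware.

```c
#include <fcntl.h>
#include <unistd.h>

/* Write `n` bytes to a fresh file and wait for them to reach the disk.
 * Returns 0 if OK, -1 on error.
 * (write_durably is a hypothetical helper used only for illustration.) */
int write_durably(const char *path, const void *buf, size_t n)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    if (write(fd, buf, n) != (ssize_t)n ||   /* data lands in the kernel cache */
        fsync(fd) < 0) {                     /* wait for the delayed write */
        close(fd);
        return -1;
    }
    return close(fd);
}
```

  Replacing fsync with fdatasync would skip the update of file attributes that are not needed to read the data back.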

3.14 fcntl function

  The fcntl function can change the properties of a file that is already open.

  #include<fcntl.h>   // file control

  int fcntl(int fd, int cmd, ... /* int arg */); // returns -1 on error; on success the return value depends on cmd

  The fcntl function is used for five different purposes.

  case 1: Duplicate an existing descriptor (cmd = F_DUPFD or cmd = F_DUPFD_CLOEXEC) // the FD_CLOEXEC file descriptor flag of the new descriptor is cleared by F_DUPFD and set by F_DUPFD_CLOEXEC

  case 2: Get/set file descriptor flags (cmd = F_GETFD or cmd = F_SETFD) // currently only one file descriptor flag is defined: the FD_CLOEXEC flag

  case 3: Get/set file status flags (cmd = F_GETFL or cmd = F_SETFL) // file status flags: O_RDONLY ... The five access-mode flags cannot be changed; the file status flags that can be changed include O_APPEND, O_SYNC, O_DSYNC, and O_RSYNC

      // O_SYNC makes writes synchronous to disk; the exact behavior differs from system to system

  case 4: Get/set asynchronous I/O ownership (cmd = F_GETOWN or cmd = F_SETOWN)

  case 5: Get/set record locks (cmd = F_GETLK, F_SETLK, or F_SETLKW)

  

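#include<fcntl.h>
#include<stdio.h>

void set_fl(int fd, int flags)   /* flags are the file status flags to turn on */
{
    int val;
    if((val = fcntl(fd, F_GETFL, 0)) < 0) {
        printf("fcntl F_GETFL error\n");
        return;
    }
    val |= flags;   /* turn on flags; use val &= ~flags to turn them off */
    if(fcntl(fd, F_SETFL, val) < 0)
        printf("fcntl F_SETFL error\n");
}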
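  Reading the flags back deserves one extra note: the access-mode values are not separate bits (O_RDONLY is commonly 0), so the value from F_GETFL has to be masked with O_ACCMODE before it is compared. The sketch below assumes that constant from <fcntl.h>; the function names access_mode and has_status_flag are made up for this example.

```c
#include <fcntl.h>
#include <unistd.h>

/* Returns O_RDONLY, O_WRONLY, or O_RDWR for an open descriptor,
 * or -1 on error. */
int access_mode(int fd)
{
    int val = fcntl(fd, F_GETFL, 0);
    if (val < 0)
        return -1;
    return val & O_ACCMODE;      /* mask first, then compare */
}

/* Returns 1 if every bit of the file status flag `flag` is set,
 * 0 if not, -1 on error. Not meant for the access-mode values. */
int has_status_flag(int fd, int flag)
{
    int val = fcntl(fd, F_GETFL, 0);
    if (val < 0)
        return -1;
    return (val & flag) == flag;
}
```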

3.15 ioctl function

  #include<unistd.h> /** System V **/

  #include<sys/ioctl.h> /** Linux and BSD **/

  int ioctl(int fd, int request, ...);   returns -1 on error, something else if OK. ioctl is the catchall for I/O operations beyond the basic ones. There is usually only one more argument, and it is usually a pointer to a variable or a structure.
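  One request that shows the typical shape of a call is TIOCGWINSZ, which asks a terminal for its window size. This request is not mentioned in the notes above and is used here only as an example; it is available on Linux and the BSDs, and term_size is a hypothetical helper name.

```c
#include <sys/ioctl.h>
#include <termios.h>
#include <unistd.h>

/* Fill *rows and *cols with the window size of the terminal behind fd.
 * Returns 0 if OK, -1 if fd is not a terminal or on any other error. */
int term_size(int fd, int *rows, int *cols)
{
    struct winsize ws;                   /* the "one more argument": a struct pointer */
    if (ioctl(fd, TIOCGWINSZ, &ws) < 0)
        return -1;
    *rows = ws.ws_row;
    *cols = ws.ws_col;
    return 0;
}
```

  Calling term_size(STDOUT_FILENO, &r, &c) works when standard output is a terminal and fails when it has been redirected to a file or a pipe.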

3.16 /dev/fd

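  Opening the file /dev/fd/n is equivalent to duplicating descriptor n, assuming that descriptor n is open. The mode given to open can only be a subset of the mode with which the descriptor was originally opened.

  Calling creat on /dev/fd/0 does return a file descriptor, but it cannot be used for reading or writing; the mode can only be set to a subset of the original mode.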

Question:

  1. It is only the user side that has no buffer; the kernel still buffers the data.

  

Original article: https://www.cnblogs.com/yetanghanCpp/p/8027375.html