PHP monolog write log 引发的一些思考

2020-01-21 PHP C Atomic Word Count: 2260 words Read Time: 5 minutes

PHP Monolog 并发写的原子性

关于 PHP项目的 log 记录的一些思考。为什么并发这么高的项目但是里面的log都没有乱掉。多个进程对同一个文件进行操作。

#Monolog
Logger::notice('log', 'logging', [$uid, $sn, $playContentType]); // 这种

看Monolog 的源码发现了他们有对于文件锁的获取

if (!is_resource($this->stream)){...}
if ($this->useLocking) {
    // ignoring errors here, there's not much we can do about them
    flock($this->stream, LOCK_EX);
}

$this->streamWrite($this->stream, $record);

if ($this->useLocking) {
    flock($this->stream, LOCK_UN);
}

/**
 * Write to stream
 * @param resource $stream
 * @param array    $record
 */
protected function streamWrite($stream, array $record): void
{
    fwrite($stream, (string) $record['formatted']);
}

Monolog用的就是fwrite, 然后有一个useLocking 字段作为锁

flock() # 加锁操作。可以给资源加几种锁，
LOCK_SH #to acquire a shared lock (reader). 共享锁
LOCK_EX #to acquire an exclusive lock (writer). 排它锁
LOCK_UN #to release a lock (shared or exclusive). 释放锁
LOCK_NB # not block lock 不会堵塞的锁 (windows not support)
# 锁是bit operate 几个锁之间的 
 #impotent！ php 5.3.2在文件资源句柄关闭时不再自动解锁。现在要解锁必须手动进行。
 4.0.1	增加了常量 LOCK_XXX。 之前你必须使用 1 代表 LOCK_SH，2 代表 LOCK_EX，3 代表LOCK_UN，4 代表 LOCK_NB。

但是发现 useLocking 默认为false 就是默认不加锁

  /**
    * @param resource|string $stream
    * @param string|int      $level          The minimum logging level at which this handler will be triggered
    * @param bool            $bubble         Whether the messages that are handled can bubble up the stack or not
    * @param int|null        $filePermission Optional file permissions (default (0644) are only for owner read/write)
    * @param bool            $useLocking     Try to lock log file before doing any writes
    *
    * @throws \Exception                If a missing directory is not buildable
    * @throws \InvalidArgumentException If stream is not a resource or string
    */
   public function __construct($stream, $level = Logger::DEBUG, bool $bubble = true, ?int $filePermission = null, bool $useLocking = false)
   {
       parent::__construct($level, $bubble);
       if (is_resource($stream)) {
           $this->stream = $stream;
       } elseif (is_string($stream)) {
           $this->url = $stream;
       } else {
           throw new \InvalidArgumentException('A stream must either be a resource or a string.');
       }

       $this->filePermission = $filePermission;
       $this->useLocking = $useLocking;
   }

很奇怪才发现php 官方文档fwrite里有个note很重要的一句话:

If handle was fopen()ed in append mode, fwrite()s are atomic (unless the size of string exceeds the filesystem’s block size, on some platforms, and as long as the file is on a local filesystem).

fwrite 在 append mode 下是默认的原子操作。所以就保证了多进程的写操作。

$this->stream = fopen($this->url, 'a');

既然都看到这了,想好好整理下，多进程下低层的fwrite写操作,，就顺便看看PHP源码,看看他到底怎么用锁还是什么东西保证原子性的，（看自己能看懂到哪儿一步。

先下载PHP的源码，然后第一步查看PHPfwrite函数定义


#ext/standard/file.c
/* {{{ proto int|false fwrite(resource fp, string str [, int length])
   Binary-safe file write */
PHPAPI PHP_FUNCTION(fwrite)
{
	...
    # 
    # 写入的代码
	ret = php_stream_write(stream, input, num_bytes);
	if (ret < 0) {
		RETURN_FALSE;
	}

	RETURN_LONG(ret);
}

然后 php_stream_write 在头文件中一顿搜索

/* Writes a buffer directly to a stream, using multiple of the chunk size */
static ssize_t _php_stream_write_buffer(php_stream *stream, const char *buf, size_t count)
{
	ssize_t didwrite = 0;
    ...
	while (count > 0) {
	    # 使用的 ops->write 方法， 
		ssize_t justwrote = stream->ops->write(stream, buf, count);
		if (justwrote <= 0) {
			/* If we already successfully wrote some bytes and a write error occurred
			 * later, report the successfully written bytes. */
			if (didwrite == 0) {
				return justwrote;
			}
			return didwrite;
		}

		buf += justwrote;
		count -= justwrote;
		didwrite += justwrote;

		/* Only screw with the buffer if we can seek, otherwise we lose data
		 * buffered from fifos and sockets */
		if (stream->ops->seek && (stream->flags & PHP_STREAM_FLAG_NO_SEEK) == 0) {
			stream->position += justwrote;
		}
	}

	return didwrite;
}

一开始在stream->ops->write 这儿卡住了

// php_stream.h

const php_stream_ops php_stream_output_ops = {
	php_stream_output_write,// stream->ops->write 的函数。
	php_stream_output_read,
	php_stream_output_close,
	NULL, /* flush */
	"Output",
	NULL, /* seek */
	NULL, /* cast */
	NULL, /* stat */
	NULL  /* set_option */
};

// php_output.h
/* convenience macros */
#define PHPWRITE(str, str_len)		php_output_write((str), (str_len))
#define PHPWRITE_H(str, str_len)	php_output_write_unbuffered((str), (str_len))

//然后再 main/output.c
/* {{{ void php_output_startup(void)
 * Set up module globals and initialize the conflict and reverse conflict hash tables */
PHPAPI void php_output_startup(void)
{
	ZEND_INIT_MODULE_GLOBALS(output, php_output_init_globals, NULL);
	zend_hash_init(&php_output_handler_aliases, 8, NULL, NULL, 1);
	zend_hash_init(&php_output_handler_conflicts, 8, NULL, NULL, 1);
	zend_hash_init(&php_output_handler_reverse_conflicts, 8, NULL, reverse_conflict_dtor, 1);
	// 将php_output_stdout 赋值给php_output_direct 
	php_output_direct = php_output_stdout;
}

// 到达终点
/* {{{ stderr/stdout writer if not PHP_OUTPUT_ACTIVATED */
static size_t php_output_stdout(const char *str, size_t str_len)
{
    // 底层还是用的fwrite，fwrite是ANSIC标准的C语言库函数，
	fwrite(str, 1, str_len, stdout);
	return str_len;
}

以前写C的时候有俩套文件操作，write和fwrite

write是Linux操作系统的系统调用，
fwrite是ANSIC标准的C语言库函数，fwrite在用户态是有缓冲区的。因此需要锁机制来保证并发环境下的安全访问。

上网查了一篇帖子fwrite也不是安全的这儿，

用模拟发现C 的fwrite 这玩意真的不是 atomic 的

fwrite => write 这么一个过程。

fwrite —–> c库缓冲—–> fflush———〉内核缓冲——–fsync—–〉磁盘

read/write/fsync：
linux底层操作；

内核调用， 涉及到进程上下文的切换，即用户态到核心态的转换，这是个比较消耗性能的操作。

fread/fwrite/fflush：
c语言标准规定的io流操作，建立在read/write/fsync之上

在用户层， 又增加了一层缓冲机制，用于减少内核调用次数，但是增加了一次内存拷贝。

php 测试

# log 每秒 100 并没有乱 
ab -c 100 -n 1000 http://test.api

C 测试fwrite 和 write function

#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main() {
  FILE* h = fopen("file.txt", "a");
  if (h == NULL) { perror("Could not open file for appending"); exit(1); }
  if (fork() == 0) {
    char* line = "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa\n";
    for (int i = 0; i < 10000; i++) {
      if (fwrite(line, strlen(line), 1, h) != 1) { perror("Could not append line to file"); exit(1); }
    }
    if (fclose(h) != 0) { perror("Could not close file"); exit(1); }
  } else {
    char* line = "bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb\n";
    for (int i = 0; i < 10000; i++) {
      if (fwrite(line, strlen(line), 1, h) != 1) { perror("Could not append line to file"); exit(1); }
    }
    if (fclose(h) != 0) { perror("Could not close file"); exit(1); }
  }
  return 0;
}

总结

拢拢总总看了俩天的C 的fwrite 和write 说下结论吧：

fwrite (POSIX系统实现): 由于是有一个写缓冲buffer，所以在并发写文件是会有可能出现多行换行的情况。
write 是系统级别的写指令。只要是不超过 PIPE_BUF(A write that’s under the size of ‘PIPE_BUF’ is supposed to be atomic. That should be at least 512 bytes, though it could easily be larger (linux seems to have it set to 4096).)

也验证了上面的C代码，确实。

然后回到一开始的 PHP monolog 写日志会不会乱，首先来说Of Course 从他们的issue 里，也有人提了，只是开发者以一个理由，这个应用到了很多高并发的企业级的应用里也没出现这个问题，不想用默认 useLocking 来增加锁的获取，影响性能。（虽然issue 里有人贴出了翔实的数据测试并没有多大的影响）

logging.apache.org/log4php 也是会对fwrite 进行锁

[logging-log4php.git]/src/main/php/appenders/LoggerAppenderFile.php

class LoggerAppenderFile extends LoggerAppender {
   
           /**
            * If set to true, the file is locked before appending. This allows 
            * concurrent access. However, appending without locking is faster so
            * it should be used where appropriate.
            * 
            * TODO: make this a configurable parameter
            * 
            * @var boolean
            */
         protected $locking = true;
         ...
          /**
           * Writes a string to the target file. Opens file if not already open.
           * @param string $string Data to write.
           */
          protected function write($string) {
                  // Lazy file open
                  if(!isset($this->fp)) {
                          if ($this->openFile() === false) {
                                  return; // Do not write if file open failed.
                          }
                  }
                  
                  if ($this->locking) {
                          $this->writeWithLocking($string);
                  } else {
                          $this->writeWithoutLocking($string);
                  }
          }
          
          protected function writeWithLocking($string) {
                  if(flock($this->fp, LOCK_EX)) {
                          if(fwrite($this->fp, $string) === false) {
                                  $this->warn("Failed writing to file. Closing appender.");
                                  $this->closed = true;                           
                          }
                          flock($this->fp, LOCK_UN);
                  } else {
                          $this->warn("Failed locking file for writing. Closing appender.");
                          $this->closed = true;
                  }
          }

所以 PHP底层应该是用的write 写入指定的一个块大小的。系统内核级别的保证原子性。

但是自己试验始终没有测试出来，还是挺遗憾的。不过看这个还是找到几个很好玩的链接

Linus 跟人讨论关于所有信号量的设计(linus 02的邮件，好暴躁的。)

Permalink: https://z8n24.top/2020/01/php_fwrite/

License:

z8n24

A Computer Engineer

PHP monolog write log 引发的一些思考

PHP Monolog 并发写的原子性

总结

z8n24A Computer Engineer