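A bug caused by overlooking operator precedence (missing parentheses)

A project that went live yesterday switched to a new way of handling $_SESSION. The code contained this SQL statement: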

$expiration = 1440; // <---- defined elsewhere in the project
$sql = 'DELETE FROM session WHERE last_activity < ' . time() - $expiration;

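In production the code kept emitting warnings that an SQL statement had failed, and the failing statement was reported as "-1440". It took a long time to track the bug down, but I finally found the statement responsible: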

// Before PHP 8, this line prints -1440 rather than the SQL statement we want
echo 'DELETE FROM session WHERE last_activity < ' . time() - 1440;

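Why does this happen? In the PHP versions of the time (before PHP 8), the string concatenation operator . has the same precedence as the arithmetic operator - and both are left-associative, so the expression is evaluated from left to right. The string on the left is first concatenated with the number returned by time(), and only then is 1440 subtracted from the result. For the subtraction, the concatenated string is coerced to the number 0, so 0 - 1440 gives -1440, and that is what gets sent to the database. PHP 8 gives arithmetic operators higher precedence than ., so the unparenthesized form behaves as intended there, but explicit parentheses are still clearer. Once the cause is known the fix is easy: add parentheses.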

$expiration = 1440; // <---- defined elsewhere in the project
$sql = 'DELETE FROM session WHERE last_activity < ' . (time() - $expiration);
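One way to make this kind of mistake harder to repeat is to do the arithmetic in its own statement before any concatenation. The sketch below shows the idea; the helper name buildSessionCleanupSql and its parameters are made up for illustration and are not part of the original project.

```php
<?php
// Hypothetical helper: the name and signature are assumptions for illustration.
function buildSessionCleanupSql($expiration, $now)
{
    // Compute the cutoff first, in its own statement, so that the
    // relative precedence of "." and "-" cannot affect the result.
    $cutoff = (int) $now - (int) $expiration;

    return 'DELETE FROM session WHERE last_activity < ' . $cutoff;
}

echo buildSessionCleanupSql(1440, time()), PHP_EOL;
```

Passing the current time in as a parameter also makes the function easy to check with a fixed value.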

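Errors like this are hard to spot: the project runs SQL queries in so many places that the warning alone does not tell you which one failed.

It looks worthwhile to record the function call stack in the log, so that the failing call site is known and the bug can be located quickly.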
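A minimal sketch of that idea follows. It assumes a hypothetical helper, formatQueryError(), which is not an existing API and not the project's code; it creates an Exception only to obtain a printable call stack.

```php
<?php
// Hypothetical helper: formatQueryError() is an assumption for illustration.
function formatQueryError($sql, $error)
{
    // An Exception captures the call stack at the point where it is created;
    // it is used here only as a convenient way to get a printable trace.
    $e = new Exception($error);

    return "SQL failed: " . $error . "\n"
         . "Statement:  " . $sql . "\n"
         . "Call stack:\n" . $e->getTraceAsString();
}

function cleanupSessions()
{
    // Simulates the failing query from the post.
    return formatQueryError('-1440', 'syntax error near "-1440"');
}

// In CLI mode error_log() writes to stderr by default.
error_log(cleanupSessions());
```

The trace lists each calling function with its file and line, which is the piece of information the original warning was missing.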

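Printing colored strings in PHP CLI mode

I have recently been writing back-office scripts at work, and it occurred to me that different kinds of messages could be printed in different colors, for example errors in red, so that when watching the output I can focus on the red text. I took someone else's code as a starting point and made a few small changes. Help yourself if you need it. (Note that in theory the code only produces colors in *nix terminals; it does not work on Windows.)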

ColorEcho: https://github.com/upliu/ColorEcho
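For readers who only want the gist, here is a minimal sketch of the technique using ANSI escape sequences. It is not the ColorEcho implementation; the function name colorize and the color table are assumptions made for this example.

```php
<?php
// Minimal sketch, not the ColorEcho code; colorize() is a made-up name.
function colorize($text, $color)
{
    // Standard ANSI SGR foreground color codes.
    $codes = array(
        'red'    => '31',
        'green'  => '32',
        'yellow' => '33',
    );

    if (!isset($codes[$color])) {
        return $text; // unknown color: leave the text unchanged
    }

    // ESC[<code>m switches the color on, ESC[0m resets all attributes.
    return "\033[" . $codes[$color] . 'm' . $text . "\033[0m";
}

echo colorize('ERROR: disk is full', 'red'), PHP_EOL;
echo colorize('OK: backup finished', 'green'), PHP_EOL;
```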

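Fixing the "Unable to detect ICU prefix or no failed" error when compiling the PHP intl extension on a Mac

Change into the extension's source directory:

cd php-5.5.8/ext/intl

Run the following in order:

phpize
./configure

The configure step fails with:

configure: error: Unable to detect ICU prefix or no failed. Please verify ICU install prefix and make sure icu-config works.

The fix is as follows.

First install icu4c:

brew install icu4c

Then rerun configure with the ICU location (the path below is where Homebrew placed icu4c in this setup; brew --prefix icu4c prints the location on a given machine):

./configure --with-icu-dir=/usr/local/opt/icu4c

Next:

make && make install

Done!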
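After installation the extension still has to be enabled, typically with an extension=intl.so line in php.ini (the exact file and location depend on the installation, so treat that as an assumption). A quick way to check whether the running PHP actually loaded it:

```php
<?php
// Reports whether the intl extension is available to this PHP binary.
if (extension_loaded('intl')) {
    echo 'intl is loaded', PHP_EOL;
} else {
    echo 'intl is NOT loaded; check that php.ini enables it', PHP_EOL;
}
```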

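JavaScript: rounding to two (or more) decimal places

A quick search turns up many articles that do the calculation by hand, yet the built-in toFixed method already does the job. So why compute it manually? Is toFixed unsupported in IE6? I tested it, and IE6 does support it. Just call toFixed; why write a function at all?

var num = 123.456789;
alert(num.toFixed(2)); // alerts 123.46 (toFixed returns a string)
alert(num.toFixed(3)); // alerts 123.457

alert(3.1415926.toFixed(2)); // alerts 3.14

// The function below is the kind found by searching online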
function formatFloat(src, pos)
{
    return Math.round(src*Math.pow(10, pos))/Math.pow(10, pos);
}

alert(formatFloat("1212.2323", 2));

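This reminded me of a similar case in PHP: getting a timestamp with microsecond resolution. time() only has one-second resolution. PHP also has microtime(), which by default returns a string. How do you get a number instead? Plenty of blogs, including posts published in 2012 and 2013 (PHP 5 had been out for almost ten years by then!), still describe this outdated approach: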

function microtime_float(){ 
	list($usec, $sec) = explode(" ", microtime()); 
	return ((float)$usec + (float)$sec); 
}

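In fact, since PHP 5.0 microtime() accepts an optional argument; when it is true, the function returns a float.

I doubt there are many environments left that lack PHP 5, so what purpose do helpers like microtime_float() still serve?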
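For completeness, a short sketch of the built-in form used for timing; the usleep() call is only a stand-in for whatever work is being measured.

```php
<?php
$start = microtime(true);   // float: seconds since the Unix epoch, with a fractional part

usleep(1000);               // stand-in for the work being timed (about 1 ms)

$elapsed = microtime(true) - $start;
printf("elapsed: %.6f seconds\n", $elapsed);
```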

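The difference between print and echo in PHP

echo does not behave like a function, so it cannot always be used where a function call could. Also, if you pass more than one argument to echo, the arguments must not be wrapped in parentheses.

Many people say print is a function. Strictly speaking it is not, even though it has a return value. The PHP manual says so as well: print is not actually a function (it is a language construct), so you are not required to use parentheses around its argument list.

The biggest difference between print and echo is that print has a return value and echo does not.

In practice print behaves more like an operator.

Here are some code samples: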

echo 1; // valid
print 1; // valid
echo (1); // valid
print (1); // valid
echo 1, 2, 3; // valid
print 1, 2, 3; // invalid!
echo (1, 2, 3); // invalid!
print (1, 2, 3); // invalid!
5 + echo 1; // invalid!
5 + print 1; // valid, because print has a return value

var_dump(function_exists('print')); // prints bool(false), which also shows that print is not a function
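The list above mixes valid and invalid lines, so it cannot be run as a whole. The following sketch keeps only valid statements and shows what the return value of print makes possible; the variable names are arbitrary.

```php
<?php
// print always returns 1, so it can appear inside a larger expression.
$result = print "hello\n";
var_dump($result);             // int(1)

// echo accepts several comma-separated arguments and returns nothing.
echo 'a', 'b', 'c', PHP_EOL;

// Because print is an expression, it works with short-circuit operators.
$verbose = true;
$verbose and print "verbose mode is on\n";
```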

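The conclusion is that there is no real need for print. echo can output several comma-separated values, which is more convenient. In the compiled bytecode echo should also be marginally cheaper than print, since print has to produce a return value, although the difference is unlikely to matter in practice.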

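When to use stdClass and when to use array in PHP

In PHP, when a function has to return several values, it can return them as a stdClass object or as an array. So when should we use stdClass and when should we use array? Or should we just use array everywhere?

One developer puts it this way:

  • Use an object when returning data with a fixed structure:
$person
    -> name = "John"
    -> surname = "Miller"
    -> address = "123 Fake St"
  • Use an array when returning a list:
"John Miller"
"Peter Miller"
"Josh Swanson"
"Harry Miller"
  • Use an array of objects when returning a set of items that share a fixed structure: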
$person[0]
    -> name = "John"
    -> surname = "Miller"
    -> address = "123 Fake St"

$person[1]
    -> name = "Peter"
    -> surname = "Miller"
    -> address = "345 High St"

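Objects are a poor fit for holding a list of items, because every value has to be fetched through a property name. Arrays can hold a list and can also hold data with a fixed structure. Which one to use comes down to the developer's style and preference.

That developer offers a recommendation, or rather a common practice, but does not state a hard rule.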
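The three cases from the list can be written as runnable PHP roughly as follows. The function names findPerson, listNames and listPeople are invented for this illustration.

```php
<?php
// Hypothetical functions illustrating the guideline above.

// Fixed structure: return an object.
function findPerson()
{
    $person = new stdClass();
    $person->name    = 'John';
    $person->surname = 'Miller';
    $person->address = '123 Fake St';
    return $person;
}

// Plain list: return an array.
function listNames()
{
    return array('John Miller', 'Peter Miller', 'Josh Swanson', 'Harry Miller');
}

// List of fixed-structure items: return an array of objects.
function listPeople()
{
    $peter = new stdClass();
    $peter->name    = 'Peter';
    $peter->surname = 'Miller';
    $peter->address = '345 High St';
    return array(findPerson(), $peter);
}

$people = listPeople();
echo $people[1]->name, ' ', $people[1]->surname, PHP_EOL; // Peter Miller
echo count(listNames()), PHP_EOL;                          // 4
```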

Also note that array is more efficient than stdClass, at least in the following micro-benchmark:

<?php

$t = microtime(true);
for ($i = 0; $i < 1000; $i++) {
	$z = array();
	for ($j = 0; $j < 10000; $j++) {
		$z['a'] = 'a';
		$z['b'] = 'b';
		$z['c'] = $z['a'] . $z['b'];
	}
}
echo microtime(true) - $t, PHP_EOL;

$t = microtime(true);
for ($i = 0; $i < 1000; $i++) {
	$z = new stdClass();
	for ($j = 0; $j < 10000; $j++) {
		$z->a = 'a';
		$z->b = 'b';
		$z->c = $z->a . $z->b;
	}
}
echo microtime(true) - $t, PHP_EOL;

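The final output was:

[Screenshot QQ20140215-1: timing output of the two loops]

As the screenshot showed, array is indeed somewhat faster than stdClass. That said, the difference is small enough to ignore.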

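My conclusion? Use stdClass if you prefer stdClass and array if you prefer array, but stay consistent within a project: do not have some functions return objects while others return arrays.

[Repost] Optimizing the performance of PHP applications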

What I will say in this answer is not specific to Kohana, and can probably apply to lots of PHP projects.

Here are some points that come to my mind when talking about performance, scalability, PHP, …
I’ve used many of those ideas while working on several projects — and they helped; so they could probably help here too.
First of all, when it comes to performance, there are many aspects/questions to consider:

  • configuration of the server (both Apache, PHP, MySQL, other possible daemons, and system); you might get more help about that on ServerFault, I suppose,
  • PHP code,
  • Database queries,
  • Using or not a reverse proxy in front of your webserver?
  • Can you use any kind of caching mechanism? Or do you always need fully up-to-date data on the website?

 

Using a reverse proxy

The first thing that could be really useful is using a reverse proxy, like varnish, in front of your webserver: let it cache as many things as possible, so only requests that really need PHP/MySQL calculations (and, of course, some other requests, when they are not in the cache of the proxy) make it to Apache/PHP/MySQL.

  • First of all, your CSS/Javascript/Images — well, everything that is static — probably don’t need to be always served by Apache
    • So, you can have the reverse proxy cache all those.
    • Serving those static files is no big deal for Apache, but the less it has to work for those, the more it will be able to do with PHP.
    • Remember: Apache can only serve a finite, limited number of requests at a time.
  • Then, have the reverse proxy serve as many PHP-pages as possible from cache: there are probably some pages that don’t change that often, and could be served from cache. Instead of using some PHP-based cache, why not let another, lighter, server serve those (and fetch them from the PHP server from time to time, so they are always almost up to date)?
    • For instance, if you have some RSS feeds (We generally tend to forget those, when trying to optimize for performances) that are requested very often, having them in cache for a couple of minutes could save hundreds/thousands of request to Apache+PHP+MySQL!
    • Same for the most visited pages of your site, if they don’t change for at least a couple of minutes (example: homepage?), then, no need to waste CPU re-generating them each time a user requests them.
  • Maybe there is a difference between pages served for anonymous users (the same page for all anonymous users) and pages served for identified users (“Hello Mr X, you have new messages”, for instance)?
    • If so, you can probably configure the reverse proxy to cache the page that is served for anonymous users (based on a cookie, like the session cookie, typically)
    • It’ll mean that Apache+PHP has less to deal with: only identified users — which might be only a small part of your users.

About using a reverse-proxy as cache, for a PHP application, you can, for instance, take a look at Benchmark Results Show 400%-700% Increase In Server Capabilities with APC and Squid Cache.
(Yep, they are using Squid, and I was talking about varnish — that’s just another possibility ^^ Varnish being more recent, but more dedicated to caching)

If you do that well enough, and manage to stop re-generating too many pages again and again, maybe you won’t even have to optimize any of your code 😉
At least, maybe not in any kind of rush… And it’s always better to perform optimizations when you are not under too much pressure…
As a sidenote: you are saying in the OP:

A site I built with Kohana was slammed with an enormous amount of traffic yesterday,

This is the kind of sudden situation where a reverse-proxy can literally save the day, if your website can deal with not being up to date by the second:

  • install it, configure it, let it always — every normal day — run:
    • Configure it to not keep PHP pages in cache; or only for a short duration; this way, you always have up to date data displayed
  • And, the day you take a slashdot or digg effect:
    • Configure the reverse proxy to keep PHP pages in cache; or for a longer period of time; maybe your pages will not be up to date by the second, but it will allow your website to survive the digg-effect!

About that, How can I detect and survive being “Slashdotted”? might be an interesting read.

 

On the PHP side of things:

First of all: are you using a recent version of PHP? There are regularly improvements in speed, with new versions 😉
For instance, take a look at Benchmark of PHP Branches 3.0 through 5.3-CVS.

Note that performance is quite a good reason to use PHP 5.3 (I’ve made some benchmarks (in French), and results are great)
Another pretty good reason being, of course, that PHP 5.2 has reached its end of life, and is not maintained anymore!

Are you using any opcode cache?

  • I’m thinking about APC (Alternative PHP Cache), for instance (pecl, manual), which is the solution I’ve seen used the most, and which is used on all servers on which I’ve worked.
  • It can really lower the CPU-load of a server a lot, in some cases (I’ve seen CPU-load on some servers go from 80% to 40%, just by installing APC and activating its opcode-cache functionality!)
  • Basically, execution of a PHP script goes in two steps:
    • Compilation of the PHP source-code to opcodes (kind of an equivalent of Java’s bytecode)
    • Execution of those opcodes
    • APC keeps those in memory, so there is less work to be done each time a PHP script/file is executed: only fetch the opcodes from RAM, and execute them.
  • You might need to take a look at APC’s configuration options, btw
    • there are quite a few of those, and some can have a great impact on both speed / CPU-load / ease of use for you
    • For instance, disabling apc.stat (http://php.net/manual/en/apc.configuration.php#ini.apc.stat) can be good for system-load; but it means modifications made to PHP files won’t be taken into account unless you flush the whole opcode-cache; about that, for more details, see for instance To stat() Or Not To stat()?

 

Using cache for data

As much as possible, it is better to avoid doing the same thing over and over again.

The main thing I’m thinking about is, of course, SQL queries: many of your pages probably run the same queries, and the results of some of those are probably almost always the same… Which means lots of “useless” queries made to the database, which has to spend time serving the same data over and over again.
Of course, this is true for other stuff, like Web Services calls, fetching information from other websites, heavy calculations, …

It might be very interesting for you to identify:

  • Which queries are run lots of times, always returning the same data
  • Which other (heavy) calculations are done lots of time, always returning the same result

And store these data/results in some kind of cache, so they are easier to get — faster — and you don’t have to go to your SQL server for “nothing”.

Great caching mechanisms are, for instance:

  • APC: in addition to the opcode-cache I talked about earlier, it allows you to store data in memory,
  • And/or memcached, which is very useful if you literally have lots of data and/or are using multiple servers, as it is distributed.
  • of course, you can think about files; and probably many other ideas.
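To make the idea concrete, here is a minimal sketch of the "check the cache first, compute only on a miss" pattern. It is not code from the answer: the SimpleCache class and its remember() method are hypothetical, and the storage is a plain in-process array so the example runs without APC or memcached installed. A real setup would swap the array for one of the backends listed above.

```php
<?php
// Hypothetical in-process cache; a real setup would back this with APC or memcached.
class SimpleCache
{
    private $items = array();

    // Return the cached value for $key, or compute and store it for $ttl seconds.
    public function remember($key, $ttl, $compute)
    {
        $now = time();
        if (isset($this->items[$key]) && $this->items[$key]['expires'] > $now) {
            return $this->items[$key]['value'];   // cache hit
        }

        $value = call_user_func($compute);         // cache miss: do the expensive work
        $this->items[$key] = array('value' => $value, 'expires' => $now + $ttl);
        return $value;
    }
}

$cache = new SimpleCache();
$calls = 0;
$loadTopArticles = function () use (&$calls) {
    $calls++;                                      // stands in for a slow SQL query
    return array('first post', 'second post');
};

$cache->remember('top_articles', 60, $loadTopArticles);
$cache->remember('top_articles', 60, $loadTopArticles);
echo "expensive calls: $calls", PHP_EOL;           // expensive calls: 1
```

The second call is served from the cache, so the "query" runs only once within the 60-second lifetime.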

I’m pretty sure your framework comes with some cache-related stuff; you probably already know that, as you said “I will be using the Cache-library more in time to come” in the OP 😉

 

Profiling

Now, a nice thing to do would be to use the Xdebug extension to profile your application: it often makes it possible to find a couple of weak spots quite easily, at least if there is any function that takes lots of time.

Configured properly, it will generate profiling files that can be analysed with some graphic tools, such as:

  • KCachegrind: my favorite, but works only on Linux/KDE
  • Wincachegrind for windows; it does a bit less stuff than KCacheGrind, unfortunately — it doesn’t display callgraphs, typically.
  • Webgrind which runs on a PHP webserver, so works anywhere — but probably has less features.

For instance, here are a couple screenshots of KCacheGrind:

[Screenshots: KCacheGrind main screen; KCacheGrind callgraph exported as an image]

(BTW, the callgraph presented on the second screenshot is typically something neither WinCacheGrind nor Webgrind can do, if I remember correctly ^^ )
(Thanks @Mikushi for the comment) Another possibility that I haven’t used much is the xhprof extension: it also helps with profiling and can generate callgraphs, but it is lighter than Xdebug, which means you should be able to install it on a production server.

You should be able to use it alongside XHGui, which will help for the visualisation of data.

 

On the SQL side of things:

Now that we’ve spoken a bit about PHP, note that it is more than possible that your bottleneck isn’t the PHP-side of things, but the database one…

At least two or three things, here:

  • You should determine:
    • What are the most frequent queries your application is doing
    • Whether those are optimized (using the right indexes, mainly?), using the EXPLAIN instruction, if you are using MySQL
    • whether you could cache some of these queries (see what I said earlier)
  • Is your MySQL well configured? I don’t know much about that, but there are some configuration options that might have some impact.

Still, the two most important things are:

  • Don’t go to the DB if you don’t need to: cache as much as you can!
  • When you have to go to the DB, use efficient queries: use indexes; and profile!

 

And what now?

If you are still reading, what else could be optimized?

Well, there is still room for improvements… A couple of architecture-oriented ideas might be:

  • Switch to an n-tier architecture:
    • Put MySQL on another server (2-tier: one for PHP; the other for MySQL)
    • Use several PHP servers (and load-balance the users between those)
    • Use another machine for static files, with a lighter webserver, like:
      • lighttpd
      • or nginx — this one is becoming more and more popular, btw.
    • Use several servers for MySQL, several servers for PHP, and several reverse-proxies in front of those
    • Of course: install memcached daemons on whatever server has any amount of free RAM, and use them to cache as much as you can / makes sense.
  • Use something “more efficient” than Apache?

Well, maybe some of those ideas are a bit overkill in your situation ^^
But, still… Why not study them a bit, just in case ? 😉

 

And what about Kohana?

Your initial question was about optimizing an application that uses Kohana… Well, I’ve posted some ideas that are true for any PHP application… Which means they are true for Kohana too 😉
(Even if not specific to it ^^)

I said: use cache; Kohana seems to support some caching stuff (You talked about it yourself, so nothing new here…)
If there is anything that can be done quickly, try it 😉

I also said you shouldn’t do anything that’s not necessary; is there anything enabled by default in Kohana that you don’t need?
Browsing the net, it seems there is at least something about XSS filtering; do you need that?


 

Conclusion?

And, to conclude, a simple thought:

  • How much will it cost your company to pay you 5 days? — considering it is a reasonable amount of time to do some great optimizations
  • How much will it cost your company to buy (pay for?) a second server, and its maintenance?
  • What if you have to scale larger?
    • How much will it cost to spend 10 days? more? optimizing every possible bit of your application?
    • And how much for a couple more servers?

I’m not saying you shouldn’t optimize: you definitely should!
But go for “quick” optimizations that will get you big rewards first: using some opcode cache might help you get between 10 and 50 percent off your server’s CPU-load… And it takes only a couple of minutes to set up 😉 On the other side, spending 3 days for 2 percent…

Oh, and, btw: before doing anything: put some monitoring stuff in place, so you know what improvements have been made, and how!
Without monitoring, you will have no idea of the effect of what you did… Not even if it’s a real optimization or not!

For instance, you could use something like RRDtool + cacti.
And showing your boss some nice graphics with a 40% CPU-load drop is always great 😉
Anyway, and to really conclude: have fun!
(Yes, optimizing is fun!)
(Ergh, I didn’t think I would write that much… Hope at least some parts of this are useful… And I should remember this answer: might be useful some other times…)

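Original link: http://stackoverflow.com/questions/1260134/optimizing-kohana-based-websites-for-speed-and-scalability

The difference between define() and const when defining constants in PHP

Since PHP 5.3.0 there are two ways to define a constant: the const keyword or the define() function: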

const FOO = 'BAR';
define('FOO', 'BAR');

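The fundamental difference between the two is that const defines a constant at compile time, while define() defines it at run time.

1. const cannot be used inside conditional statements; a constant defined with the const keyword must be declared in the outermost scope:

if (...) {
    const FOO = 'BAR';    // error
}
// but
if (...) {
    define('FOO', 'BAR'); // OK
}

2. The value given to const must be a fixed value; it cannot be a variable, a class property, the result of a mathematical operation, or a function call (see the official manual), whereas define() accepts the value of any expression. This restriction describes the PHP 5.3 to 5.5 versions current when this was written; since PHP 5.6, const also accepts constant scalar expressions such as 1 << 5, though still not variables or function calls:

const BIT_5 = 1 << 5;    // error before PHP 5.6
define('BIT_5', 1 << 5); // OK

3. The name of a constant defined with const cannot be an expression, whereas with define() it can, so the following code is valid: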

for ($i = 0; $i < 32; ++$i) {
    define('BIT_' . $i, 1 << $i);
}
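Constants whose names are built at run time also have to be read back by name. The sketch below uses a made-up FLAG_ prefix (so it does not clash with the BIT_ constants above) together with the built-in defined() and constant() functions.

```php
<?php
// FLAG_ is a made-up prefix for this illustration.
for ($i = 0; $i < 8; ++$i) {
    $name = 'FLAG_' . $i;
    if (!defined($name)) {          // define() complains if the constant already exists
        define($name, 1 << $i);
    }
}

// Names built at run time are read back with constant().
$mask = 0;
foreach (array(1, 3) as $bit) {
    $mask |= constant('FLAG_' . $bit);
}

echo FLAG_3, PHP_EOL;   // 8
echo $mask, PHP_EOL;    // 10
```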

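4. Constant names defined with const are case-sensitive, whereas define() can create a case-insensitive constant when its third argument is true (this option was deprecated in PHP 7.3 and is no longer supported as of PHP 8.0):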

define('FOO', 'BAR', true);
echo FOO; // BAR
echo foo; // BAR

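Note: some say that before PHP 5.3 the const statement could only be used inside class definitions and not in the global scope. I have not verified this; who still runs PHP 5.2 these days? Also note that the PHP manual documents const under Classes and Objects, which suggests that const was originally designed for defining class constants.

This post is mostly a translated summary of the top-voted answer to this question: http://stackoverflow.com/questions/2447791/define-vs-const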


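An array element that cannot be accessed in PHP

This morning I saw a PHP question on SO. The asker had an array in which print_r showed that both the key and its value existed, yet accessing the element, whether as array[key] or array['key'], raised Undefined offset and returned nothing. To illustrate, suppose there is an array $a for which print_r($a) prints:

[Screenshot: print_r($a) output showing an array with key 1 and value foo]

The array clearly has a key 1 with the value foo, yet reading it with $a[1] or $a['1'] raises Undefined offset. That is puzzling. The rest of this post explains why it happens and how such an odd array comes into existence.

First, a PHP array key can be an integer or a string, but a string that contains a valid decimal integer is converted to an integer. For example, the key "1" is actually stored as 1. Consider the following code: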

$a = array(
    1      => 'foo',
    '1'    => 'bar',
    'name' => 'upliu',
);
print_r($a);
var_dump($a);

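This prints:

[Screenshot: print_r and var_dump output of $a]

The key 1 is stored as an integer key (in the var_dump output the key name is quoted, marking it as a string key, which is hardly surprising), and the value bar has overwritten the earlier foo. Both $a[1] and $a['1'] read bar without any error or warning, so both forms work. My guess is that $a['1'] is exactly equivalent to $a[1]: when reading an element, PHP converts an integer-like string key to an integer key.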
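Which strings count as "integer-like" is narrower than it may seem. The short sketch below (the keys are chosen only for illustration) shows that a leading zero or a decimal point is enough to keep a key as a string.

```php
<?php
$a = array(
    '1'   => 'a',   // valid decimal integer: stored under int 1
    '01'  => 'b',   // leading zero: stays the string "01"
    '1.5' => 'c',   // not an integer: stays the string "1.5"
);

var_dump(array_keys($a));

// Both forms read the same element, because '1' is converted on lookup too.
var_dump($a[1] === $a['1']);   // bool(true)
```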

Now let us reproduce the asker's problem and see how such an array can be produced. Consider the following code:

$json = '{"1":"foo"}';
$o = json_decode($json);
var_dump($o);

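This prints:

[Screenshot: var_dump($o) output showing a stdClass object with a property named 1 whose value is foo]

The result is as expected: $o is an object with a property named 1. Because that name is not a valid PHP identifier, the property cannot be accessed with the usual arrow syntax, so we cast the object to an array: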

$a = (array)$o;
print_r($a);

This prints:

[Screenshot: print_r($a) output showing key 1 with value foo]

Next, try to access the element of $a with key 1:

echo $a[1], PHP_EOL;
echo $a['1'], PHP_EOL;

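Both statements fail with Undefined offset (on the PHP 5 versions current when this was written; PHP 7.2 and later convert such keys during the cast, so the element is accessible there). At this point $a is exactly the array the asker on SO ran into. Reproducing a bug is very satisfying.

Now decode the same JSON string directly into an array: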

$a2 = json_decode($json, true);
print_r($a2);
echo $a2[1], PHP_EOL;
echo $a2['1'], PHP_EOL;

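This prints:

[Screenshot: print_r($a2) output followed by foo printed twice]

Everything works. So here is the question: print_r shows $a and $a2 as identical, so why can the element be accessed in one and not in the other? Let us look with the more detailed var_dump: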

var_dump($a);
var_dump($a2);

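This prints:

[Screenshot: var_dump output of $a and $a2; $a has the string key "1", $a2 has the integer key 1]

This output shows the difference between $a and $a2. The array $a, obtained by casting an object, has the string key '1' (var_dump(array_keys($a)) confirms this), whereas both $a[1] and $a['1'] look up the integer key 1 in $a. No such element exists in $a, hence the Undefined offset error.

Summary: PHP normally never stores an integer-like string as an array key; it converts it to an integer. Casting an object to an array can nevertheless introduce integer-like string keys. When an element is accessed with either an integer or an integer-like string, PHP always looks up the corresponding integer key.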
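If such an array has already been produced, one possible workaround is to rebuild it so that every key passes through PHP's normal key conversion. The helper below, normalizeKeys(), is a hypothetical name for this sketch and not a built-in function; decoding with json_decode($json, true) in the first place avoids the problem altogether.

```php
<?php
// Hypothetical helper: rebuilds the array so every key goes through
// PHP's normal key conversion.
function normalizeKeys(array $array)
{
    $result = array();
    foreach ($array as $key => $value) {
        $result[$key] = $value;   // a string key "1" is stored as int 1 here
    }
    return $result;
}

$json = '{"1":"foo"}';
$a = normalizeKeys((array) json_decode($json));

echo $a[1], PHP_EOL;     // foo
echo $a['1'], PHP_EOL;   // foo
```

foreach still reaches the element because it iterates over the stored entries instead of looking a key up.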

The complete code for this post's example:

<?php
$json = '{"1":"foo"}';
$o = json_decode($json);

$a = (array)$o;
print_r($a);
echo $a[1], PHP_EOL;
echo $a['1'], PHP_EOL;

$a2 = json_decode($json, true);
print_r($a2);
echo $a2[1], PHP_EOL;
echo $a2['1'], PHP_EOL;

var_dump($a);
var_dump($a2);

var_dump(array_keys($a));
var_dump(array_keys($a2));

foreach ($a2 as $k => $v) {
	var_dump($k);
	var_dump($v);
}
foreach ($a as $k => $v) {
	var_dump($k);
	var_dump($v);
}

 

A string reversal function for PHP that supports Chinese characters

PHP has a built-in function, strrev, that reverses a string. For example:

$str = 'abcdef';
echo strrev($str);

This prints:

fedcba

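However, strrev works byte by byte, so it does not support Chinese: if the string contains multibyte characters such as Chinese, they come out garbled.

So I wrote a reversal function that also handles Chinese characters: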

function mb_strrev($str) {
	$len = mb_strlen($str, 'UTF-8');
	$arr = array();
	for ($i = 0; $i < $len; $i++) {
		$arr[] = mb_substr($str, $i, 1, 'UTF-8');
	}
	return implode('', array_reverse($arr));
}

 

Usage example:

$str = '记者获some-letters-here悉嫦娥二号发射工作准备全部就绪';
echo mb_strrev($str);
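As an alternative that does not need the mbstring extension, the string can be split into UTF-8 characters with preg_split and the /u modifier. This is only a sketch: utf8_strrev is a made-up name, the input is assumed to be valid UTF-8, and like mb_strrev it reverses individual code points, so combining marks would end up detached from their base characters.

```php
<?php
// Alternative sketch; utf8_strrev is a made-up name, input assumed to be valid UTF-8.
function utf8_strrev($str)
{
    // With the /u modifier an empty pattern splits between UTF-8 characters.
    $chars = preg_split('//u', $str, -1, PREG_SPLIT_NO_EMPTY);
    if ($chars === false) {
        return false;   // the input was not valid UTF-8
    }
    return implode('', array_reverse($chars));
}

echo utf8_strrev('abc中文'), PHP_EOL;   // 文中cba
```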