myPoorTech
Upgraded to WordPress 3.0
by broom9 on Jun.18, 2010, under web
It’s a good reason for me to start blogging again, lol.
The admin look and feel changed a little bit; I haven’t seen any significant improvements yet.
QCon Notes
by broom9 on Apr.26, 2010, under myPoorTech
I attended the second day of QCon. The sessions that left the strongest impression were the Facebook and Twitter talks.
1. memcache@facebook
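The speaker was Marc, a senior architect at FB, who clearly has strong hands-on experience with memcache: he knew the details and logic thoroughly and responded quickly. The talk covered FB's extensive modifications and extensions to memcache, which let it scale effectively and carry extremely high traffic.
My impression is that the extension logic is by now more complex than the original memcache. memcache can be seen as a good foundation for building scalable, high-load applications that make full use of memory.
Facebook's scale: 400m active users, status updates on the order of a billion per day, and tens of thousands of servers.
The memcached servers handle 400m gets and 28m sets per second and cache more than 2T items, over 200T bytes.
A single memcached server handles 80k gets and 2k sets per second, receiving 9.7M/s and transmitting 19M/s. Facebook's architecture can be roughly divided into three tiers: DB tier, memcached tier, and Web tier.
For memcache, FB implemented a new serialization library that is faster and more efficient than PHP serialization.
mcproxy: at the top of the memcache tier sits a group of mcproxy servers that dispatch requests. The memcached servers are horizontally partitioned and replicated by region, and mcproxy routes requests according to that logic.
Hot keys (hot spots in the system, such as a celebrity's page) are replicated to multiple memcached servers.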
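To make the hot-key idea concrete, here is a minimal Python sketch. It is not Facebook's code: `FakeServer`, `HotKeyClient`, and the `replicas` parameter are names I made up for illustration, plain dicts stand in for memcached servers, and the list of hot keys is assumed to be known in advance.

```python
import hashlib
import random


class FakeServer:
    """In-memory stand-in for one memcached server (illustration only)."""

    def __init__(self, name):
        self.name = name
        self.data = {}


class HotKeyClient:
    """Hypothetical client: normal keys live on one server, hot keys on several."""

    def __init__(self, servers, hot_keys, replicas=3):
        self.servers = servers
        self.hot_keys = set(hot_keys)
        self.replicas = min(replicas, len(servers))

    def _home_index(self, key):
        # Stable hash so every client maps a key to the same home server.
        digest = hashlib.md5(key.encode("utf-8")).hexdigest()
        return int(digest, 16) % len(self.servers)

    def _servers_for(self, key):
        start = self._home_index(key)
        count = self.replicas if key in self.hot_keys else 1
        return [self.servers[(start + i) % len(self.servers)] for i in range(count)]

    def set(self, key, value):
        # Writes go to every copy so that any replica can answer a read.
        for server in self._servers_for(key):
            server.data[key] = value

    def get(self, key):
        # Reads pick one copy at random, spreading the load of a hot key.
        server = random.choice(self._servers_for(key))
        return server.data.get(key)
```

The trade-off is the usual one: reads of a hot key spread over several machines, while every write to that key costs one set per replica.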
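For a large number of concurrent gets from the same source, a method called Broad Shallow Multi gets groups them, which reduces the number of get requests and therefore the data traffic. Correspondingly, the memcached servers have to be replicated and grouped so that each group of gets only needs to go to one group of servers.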
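The general batching idea behind multi gets can be sketched as follows. This is only the generic "group keys by owning server, one request per server" technique, not FB's Broad Shallow Multi gets implementation, whose details I did not capture. `server_for`, `group_keys`, and `multi_get` are hypothetical names, and a list of dicts plays the role of the servers.

```python
import zlib
from collections import defaultdict


def server_for(key, num_servers):
    # crc32 is stable across processes, unlike the built-in hash() for strings.
    return zlib.crc32(key.encode("utf-8")) % num_servers


def group_keys(keys, num_servers):
    """Bucket keys by owning server so each server receives one multi-get."""
    groups = defaultdict(list)
    for key in keys:
        groups[server_for(key, num_servers)].append(key)
    return dict(groups)


def multi_get(keys, stores):
    """Fetch keys with one batched lookup per server; stores is a list of dicts."""
    found = {}
    requests = 0
    for index, batch in group_keys(keys, len(stores)).items():
        requests += 1  # one round trip covers the whole batch
        store = stores[index]
        for key in batch:
            if key in store:
                found[key] = store[key]
    return found, requests
```

With this grouping the number of round trips is bounded by the number of servers touched rather than by the number of keys.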
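Both key misses and deletes received a lot of special handling in order to scale.
He showed the state machine of a key after the extensions, which looked quite complicated.
The testing principle is "test fast and don’t break things". No test framework is used.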
Why memcache works: easy, robust primitives, allow hacking.
2. Big Data in Real-time at Twitter
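The speaker was Nick Kallen: Berkeley graduate, Twitter systems architect, full beard, a long stud in his right ear, soft-spoken and unhurried, with a rather artistic air...
The slides are at http://www.slideshare.net/nkallen/q-con-3770885
Twitter's talk focused on two points: large data volume and real time. Together with the afternoon session, it covered four problems and their solutions:
a) Tweets. Partitioned horizontally by time, exploiting the locality that queries concentrate on the most recent partition. There are still problems with MySQL deadlocks and with the time and effort it takes to create new partitions; planned solutions include partitioning by primary key, Cassandra, and memcached.
b) Timeline. Computed offline, with the results stored in advance. All timelines are pre-stored in memcache, and every tweet is fanned out offline to every timeline it should appear on. Stored timelines are truncated periodically to keep their size within bounds. In short, offline computation fits when the query pattern is fixed and the offline result can be kept within a bounded size; the cost of rebuilding, should the offline results be lost, also has to be taken into account.
c) Social graph. Information like who follows whom and who blocks whom. Put simply, the solution stores each edge in both directions and then scales through partitioning, replication, and indexing. The details are fairly complex.
d) Search Index. Partitioned along two dimensions, document and time. Lucene may replace MySQL.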
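Items b) and c) above can be sketched together in a few lines of Python. This is a toy under my own assumptions, not Twitter's system: `TimelineStore` and `max_len` are hypothetical names, in-process dicts replace memcache, the fanout runs inline rather than offline, and truncation happens on every insert instead of periodically.

```python
from collections import defaultdict, deque


class TimelineStore:
    """Toy fan-out-on-write timelines with a bounded size."""

    def __init__(self, max_len=3):
        # Each follow edge is stored in both directions, as in item c).
        self.followers = defaultdict(set)   # followee -> set of followers
        self.following = defaultdict(set)   # follower -> set of followees
        # deque(maxlen=...) drops the oldest entry once the cap is reached.
        self.timelines = defaultdict(lambda: deque(maxlen=max_len))

    def follow(self, follower, followee):
        self.followers[followee].add(follower)
        self.following[follower].add(followee)

    def post(self, author, tweet_id):
        # Fan the tweet out to every timeline it should appear on.
        for user in self.followers[author]:
            self.timelines[user].appendleft(tweet_id)

    def timeline(self, user):
        # Reading is a plain lookup; nothing is computed at query time.
        return list(self.timelines[user])
```

For example, if `alice` follows `bob` and `bob` posts tweets 0 through 4 with `max_len=3`, `timeline("alice")` holds only the three newest ids. If a stored timeline were lost, it would have to be rebuilt from the `following` edges and the tweet store, which is the rebuild cost mentioned in b).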
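Cassandra came up in the Twitter and FB talks and in many others. From what I gathered, FB uses Cassandra for applications such as Inbox, while Twitter considers Cassandra not yet ready for critical applications.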
foursquare
by broom9 on Apr.18, 2010, under web
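foursquare is a recently popular and fun app. Basically, wherever you go you check in on your phone (the phone needs GPS and a foursquare client; iPhone, Android, and Blackberry all work), so your friends know where you have been. You can then write tips for places you have visited, such as which dishes are good at a restaurant or what the wifi password is at a cafe. If you go somewhere often you may become its mayor, which lets you edit the place's name, address, location, and other details. There is also a fairly rich badge system: use the app a lot and you earn all kinds of badges.
The whole app is tied to GPS. On your phone you can look up at any time which places around you are on foursquare and what tips they have, and on the website you can see who has checked in at each place. To check in somewhere you have to actually be nearby; otherwise the check-in earns no points or badges. All places are added by users. Very few people in Beijing use it so far, and many places are marked at inaccurate positions. If the data ever becomes as rich as Dianping's (大众点评), it really could be called a killer app.
In short, I have only just started using it and it is basically pure fun for now; it does not really work for looking things up the way Dianping does. In a "wilderness" like Beijing I often have to add the place myself when I go somewhere, which feels a bit like being a pioneer, and it is very easy to become mayor, haha. The main purpose is to let you know where your friends went to eat or hang out so you have something to talk about, so foursquare is essentially an SNS at heart.
I hear Yahoo has already offered 125 million to buy it. I just hope they don't buy it and then run it into the ground...
As for the clones that have started to spring up in China: this kind of app is not like twitter, where one post can be synced to n sites. User stickiness is very high; when I go somewhere I am certainly not going to open n apps and check in on each of them. The check-in information for a place also cannot simply be synced elsewhere unless you plagiarize the whole place database outright. So winner takes all, and good luck to them. foursquare's mobile clients are quite professional, and there is a Blackberry version; I doubt any of the domestic clones will have a decent Blackberry client within a year. Ah, why do I look down on these clones so much!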
A Blogbus-to-WordPress Conversion Tool
by broom9 on Mar.03, 2010, under myPoorTech
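Over the Chinese New Year holiday I wrote a Blogbus-to-WordPress conversion tool for a friend. Blogbus offers an export in XML format, so converting it to the WordPress format was mostly grunt work.
It reuses some code from my earlier Live Space Mover, so it is still written in Python. The code is at http://code.google.com/p/blogbus-to-wordpress/
The app is hosted on Google App Engine, so it should be fairly easy to use. Visit
http://blogbus-to-wordpress.appspot.com/
and upload your Blogbus backup XML file; you get back a converted WordPress-format file, which you can then import from the WordPress admin.
WordPress supports a whole bunch of import types; make sure to select the WordPress type.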
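For readers curious what the "grunt work" looks like, here is a rough sketch of the general shape of such a conversion, not the actual code of the tool. The input element names (`post`, `title`, `body`, `date`) are placeholders and NOT the real Blogbus export schema; the output uses the `wp:` and `content:` namespaces of the WordPress export (WXR 1.0) format, but a file that WordPress will really import needs more fields than shown here.

```python
import xml.etree.ElementTree as ET

WP_NS = "http://wordpress.org/export/1.0/"
CONTENT_NS = "http://purl.org/rss/1.0/modules/content/"
ET.register_namespace("wp", WP_NS)
ET.register_namespace("content", CONTENT_NS)


def convert(source_xml):
    """Map placeholder <post> elements onto WordPress-style RSS <item>s."""
    source = ET.fromstring(source_xml)
    rss = ET.Element("rss", {"version": "2.0"})
    channel = ET.SubElement(rss, "channel")
    for post in source.iter("post"):  # "post" is an assumed element name
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = post.findtext("title", default="")
        body = post.findtext("body", default="")
        ET.SubElement(item, "{%s}encoded" % CONTENT_NS).text = body
        date = post.findtext("date", default="")
        ET.SubElement(item, "{%s}post_date" % WP_NS).text = date
        ET.SubElement(item, "{%s}status" % WP_NS).text = "publish"
        ET.SubElement(item, "{%s}post_type" % WP_NS).text = "post"
    return ET.tostring(rss, encoding="unicode")
```

Calling `convert()` on a string containing a few `<post>` elements returns an RSS document with one `<item>` per post. Comments, categories, and tags would be mapped the same way, element by element.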
Google is Taking Its Virtue Back
by broom9 on Jan.13, 2010, under web
Google’s “A new approach to China” announcement is the breaking news today. My first reaction was “will all the staff of Google China be relocated to the US?”, which is a little weird, probably because friends have often talked about “escaping from 贵国” (“your esteemed country”) these days.
Then I read the official blog post carefully and have some comments on it:
- The post starts by describing the cyber attack, in serious terms. With “at least twenty other companies” and “relevant US authorities”, it sounds like a genuinely threatening statement. I even wonder whether there will be a Third World War, online.
- Google mentioned their “investigation” several times. It seems they still have plenty of cards to play.
- From my point of view, it wasn’t necessary to point out that the hacked Gmail accounts belong to Chinese human rights activists. This sounds like a small trick to attract the attention of Western media and the public.
- Not only security and human rights, but also freedom of speech. In my mind this is the POINT. Whether Google’s retreat is more about business value or about morals, if it claims this and does pull out, I will applaud.
- Google said it will be “discussing with the Chinese government”. I really doubt it can find the actual person to talk to, who would be from the most famous, most powerful, but never-seen-by-anyone “relevant departments (有关部门)”.
- At the end of the post, Google claimed this decision was made “without the knowledge or involvement of our employees in China”. This is an interesting point. Is it trying to keep its employees from being bothered by the China gov? However, the China gov definitely won’t care about this line… So is it trying to make something clear to the US gov?…
I also read an article from TechCrunch that I found insightful. One of its points is that “Google is ready to burn bridges”: by publishing such a blog post, Google in fact shows no willingness to negotiate with the China gov. Google is not acting on impulse and must be aware of this. So the post is just winning sympathy from the rest of the world and giving the China gov a slap. However, I wonder whether Google China, and its staff, will really be “thrown under the bus” as the article predicted.
Google is not a charity, and it will follow the instinct of a company: to pursue maximum value. The question is how much different things are worth in Google’s eyes. Giving up business in China may cut around 600 million from Google’s revenue. How much is it worth to give up conscience and help the China gov tighten its control? How much is it worth to stain Google’s reputation and keep away good programmers who don’t want to be “evil”? How much is it worth to waste the best people’s talent on censorship shit and flattering god-damned gov suckers?
Anyway, today is memorable, 2010.01.13. Goodbye Google. I respect your efforts here and wish you the best outside China.






