Category Archives: linux

LinkedIn and Hadoop 0

Interesting post about LinkedIn’s Data Infrastructure. Much of LinkedIn’s important data is offline: it moves fairly slowly. So they use daily batch processing with Hadoop as an important part of their calculations. For example, they pre-compute data for their “People You May Know” product this way, scoring 120 billion relationships per day in a MapReduce [...]

FlashCache and ditching caches. 1

I have to admit that FlashCache for Linux looks pretty cool. It basically lets you use an SSD block device as a cache. Another hack is to mount the SSD as swap and tell InnoDB to use, say, 100GB of memory. I haven’t tested this but it might be a fun hack. We’ve actually migrated away [...]

Thoughts on InnoDB Page Compression 2

I spent the last couple of days playing with InnoDB page compression on the latest Percona build. I’m pretty happy so far with Percona and the latest InnoDB changes. Compression wasn’t living up to my expectations though. I think the biggest problem is that the compression can only use one core in replication and ALTER TABLE [...]

Spinn3r Hiring Senior Unix Operations Engineer 0

Spinn3r is growing fast. Time to hire another engineer. Actually, we’re hiring for like four people right now so I’ll probably be blogging more on this topic. My older post on this subject still applies for requirements. If you’re a Linux or MySQL geek we’d love to have your help. Did I mention we just [...]

Notes on Spinn3r’s Datacenter Migration to Softlayer 2

About two weeks ago we completed a pretty big project to migrate Spinn3r’s operations from ServerBeach over to Softlayer. The entire project, from start to finish, took just over one month. I’m also proud to note that not a single customer noticed any downtime or any issue with our migration. It cost a bit more [...]

Unreliable VPN Code to Detect Bugs by Fuzzing. 0

I had an interesting idea today to find bugs in networking code. Design a VPN that deliberately introduces network packet corruption. One could introduce a tunable to corrupt a certain percentage of packets. For example, you could bring up a MySQL master/slave on your Ethernet network and then launch the VPN to transfer the replication [...]
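
The corruption tunable could be sketched roughly like this. It is only an illustration of the idea, not an actual VPN: the function name `corrupt_packets` and the parameter `corrupt_pct` are made up for this example, and packets are modelled as plain byte strings rather than real network frames.

```python
import random


def corrupt_packets(packets, corrupt_pct, rng=None):
    """Return a copy of `packets` with roughly `corrupt_pct` percent corrupted.

    Hypothetical helper: each selected packet gets exactly one random bit
    flipped. Empty packets are passed through untouched.
    """
    if rng is None:
        rng = random.Random()
    result = []
    for packet in packets:
        if packet and rng.random() * 100 < corrupt_pct:
            buf = bytearray(packet)
            index = rng.randrange(len(buf))      # pick a byte
            buf[index] ^= 1 << rng.randrange(8)  # flip one bit in it
            result.append(bytes(buf))
        else:
            result.append(packet)
    return result


if __name__ == "__main__":
    sample = [b"BEGIN", b"INSERT INTO t VALUES (1)", b"COMMIT"]
    for before, after in zip(sample, corrupt_packets(sample, corrupt_pct=50)):
        print(before, "->", after)
```

Passing a seeded `random.Random` makes a run repeatable, which helps when a corrupted packet does trigger a bug and you want to reproduce it.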

Buffered Binary Logs… 1

One of the things that has always bothered me about replication is that the binary logs are written to disk and then read from disk. There are two threads which are, for the most part, unaware of each other. One thread reads the remote binary logs, and the other writes them to disk. While [...]

ext4, fallocate, and InnoDB autoincrement 1

This might be a bit cutting edge, but the new fallocate() call in Linux 2.6.23 and later might be able to improve InnoDB performance. When InnoDB needs more space it auto-extends the current data file by 8MB. If this is writing out zeros to the new data (to initialize it) then using fallocate() would certainly be [...]
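
Here is a rough sketch of what extending a file by 8MB through preallocation looks like, as opposed to writing zeros. It is an assumption-laden illustration, not InnoDB's code: Python exposes `os.posix_fallocate()` rather than the Linux-specific `fallocate()`, so that is what is used here, and the helper name `extend_file` and the constant `EXTEND_BYTES` are invented for the example.

```python
import os
import tempfile

# 8MB, matching the auto-extend increment mentioned above.
EXTEND_BYTES = 8 * 1024 * 1024


def extend_file(fd, extend_bytes=EXTEND_BYTES):
    """Grow the file behind `fd` by `extend_bytes` and return the new size.

    Hypothetical helper: asks the OS to reserve the space starting at the
    current end of file instead of writing zeros from user space.
    """
    old_size = os.fstat(fd).st_size
    os.posix_fallocate(fd, old_size, extend_bytes)
    return os.fstat(fd).st_size


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile() as data_file:
        data_file.write(b"existing data")
        data_file.flush()
        print("new size:", extend_file(data_file.fileno()))
```

Whether this is actually faster depends on the filesystem. Where preallocation is not supported natively, the C library may fall back to writing the blocks itself, which would give no benefit.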

The Middle Path and the Solution to Linux Swap 2

I’m enamored of the middle path. Basically, the idea is that extremism is an evil and that ideological perspectives are often non-optimal solutions. The Dalai Lama has pursued a middle path solution to the issue of Tibetan independence. The two opposing philosophies in this situation are total and complete control of Tibet by the Chinese or [...]

Apple Is Getting Lazy 0

For a while I was using Forget Me Not to sync up my laptop when I disconnected from my 30″ cinema display. Users of portable Macs, how many times have you encountered this scenario? You’ve connected your laptop to a nice big monitor, you have your windows arranged for optimal creativity and productivity, then you [...]