Archive for the 'Uncategorized' Category
By my math this is 15TB of storage, as the original Xbox had an internal 10GB disk.
Well I’m up in Seattle for another week. I’m going to be here until the Google Scalability Conference.
I’m trying to compile a list of decent places to work out of until I head back to Seattle. I’ve become spoiled by my 30″ Apple cinema display setup.
I’m going to head over to [...]
I will often use a TreeMap in place of a HashMap because while TreeMap is theoretically slower (O(logN) vs O(1)) it is often much more memory efficient, especially for larger maps.
The problem is that you can’t just swap your map for TreeMap without modifying code because you can’t store null keys in TreeMap.
This code:
map = [...]
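The original snippet is cut off above, so what follows is not that code. It is a minimal sketch of the null-key difference, assuming String keys and natural ordering. The class name NullKeyDemo and both helper methods are made up for illustration. It shows that HashMap accepts a null key, that a default TreeMap rejects it, and that one possible workaround is to give TreeMap a comparator that knows how to order null.

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical demo class; none of these names come from the original post.
final class NullKeyDemo {

    /** Returns true if the map accepts a null key, false if put() throws NullPointerException. */
    static boolean acceptsNullKey(Map<String, Integer> map) {
        try {
            map.put(null, 0);
            return true;
        } catch (NullPointerException e) {
            return false;
        }
    }

    /** A TreeMap whose comparator sorts null before every other key instead of rejecting it. */
    static TreeMap<String, Integer> nullTolerantTreeMap() {
        return new TreeMap<>(Comparator.nullsFirst(Comparator.<String>naturalOrder()));
    }

    public static void main(String[] args) {
        System.out.println(acceptsNullKey(new HashMap<>()));      // true
        System.out.println(acceptsNullKey(new TreeMap<>()));      // false: natural ordering cannot compare null
        System.out.println(acceptsNullKey(nullTolerantTreeMap())); // true
    }
}
```

With a comparator like this the swap needs no changes at the call sites, though whether null should sort first or last is a choice the application has to make.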
Check this out… Imation is OEMing Mtron drives for 20% cheaper than the normal/stock Mtron drive.
Same hardware but 20% cheaper.
We’ve open sourced the web thumbnail backend that we use within Tailrank.
It needs some work but if you’re ready to get your hands dirty then webthumb will get you 80% of the way to a scalable thumbnail backend.
The API is pretty simple. You just create a REST call to webthumb with a URL to [...]
If you look at the timeline for the release of a product (open source or not) it generally forms a power law distribution on a Euclidean plane similar to the following:
The y-axis is the number of pending critical bugs and the x-axis is time.
At some point the product managers realize that the number of reported [...]
A mainstream media outlet has FINALLY used the term ‘disapproval rating’ when talking about Bush.
To be fair, this should be used any time an opinion poll shows that only 49% of people approve of a President’s handling of a given situation.
Bush, Clinton, Obama, it doesn’t matter.
What bothers me is it takes nearly 70% disapproval for [...]
There’s been more activity in the distributed consensus space recently.
At the Hypertable talk yesterday Doug mentioned Hyperspace, their Chubby-style distributed lock manager. Though I think it’s missing the ‘distributed’ part for now.
To provide some level of high availability, Hypertable needs something akin to Chubby. We’ve decided to call this service Hyperspace. Initially [...]
A new version of Slurp is out the door apparently:
Over the past few weeks, we’ve been preparing for the latest version of the Yahoo! Search crawler with some infrastructure updates, which recently caused a variance in our crawl behavior.
With everything now in place, the rollout has officially begun. The new Yahoo! Slurp 3.0 recognizes the [...]
I’ll be at the MySQL users conference this week. Ping me if you want to chat.
Also, come see my talk on Thursday:
We present the backend architecture behind Spinn3r – our scalable web and blog crawler.
Most existing work in scaling MySQL has been around high read throughput environments similar to web applications. In [...]
It looks like YouTube now has MPEG4 support.
I wrote a youtube2ipod transcoder that takes an .flv and builds an mp4 which works with iPhones and iPods.
The problem is that it takes about an hour to convert a video.
This is a bit more handy.
Let’s see if it works with Feedburner’s automatic podcast support.
http://googlesystem.blogspot.com/2008/04/download-youtube-videos-as-mp4-files.html
link to [...]
I’m amazed at how swamps and zero oxygen environments can preserve objects across time.
Check out the pictures of this WWII-era Soviet T-34 tank:
The last few months have shown a number of internet memes on this subject, including the recently found Baby Mammoth:
… and then you have the preserved Antarctic cabins of Scott and [...]
A few weeks ago I blogged about my 10x space saving proposal for storing maps as space efficient primitives:
In my situation I’m nearly seeing 10x overhead. This 350MB memory image of my data structure can be represented as just 35MB on disk. This would be more than fine for my application.
Which is when I had [...]
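The rest of that post is cut off, so the actual proposal is not shown here. As a rough illustration of what "storing maps as primitives" can look like, here is a minimal sketch that assumes int keys and int values and a read-only map. The class PackedIntMap and its methods are hypothetical. Keys live in one sorted int[] and values in a parallel int[], so each entry costs 8 bytes with no per-entry objects, and lookups use binary search.

```java
import java.util.Arrays;

// Hypothetical read-only int -> int map backed by two parallel primitive arrays.
final class PackedIntMap {
    private final int[] keys;   // strictly ascending
    private final int[] values; // values[i] belongs to keys[i]

    PackedIntMap(int[] sortedKeys, int[] values) {
        if (sortedKeys.length != values.length) {
            throw new IllegalArgumentException("keys and values must have the same length");
        }
        for (int i = 1; i < sortedKeys.length; i++) {
            if (sortedKeys[i - 1] >= sortedKeys[i]) {
                throw new IllegalArgumentException("keys must be strictly ascending");
            }
        }
        // Defensive copies so later changes by the caller cannot break the ordering.
        this.keys = sortedKeys.clone();
        this.values = values.clone();
    }

    /** Returns the value stored for key, or defaultValue if the key is absent. O(log N). */
    int get(int key, int defaultValue) {
        int i = Arrays.binarySearch(keys, key);
        return i >= 0 ? values[i] : defaultValue;
    }

    int size() {
        return keys.length;
    }
}
```

The trade-off is the same one as TreeMap versus HashMap: lookups are O(log N) rather than O(1), in exchange for a much smaller footprint than a map of boxed Integer entries.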
ICWSM was a great conference. I was finally able to hang out with some of our customers (and potential customers).
It turns out that a bunch of the papers are online. There were some really good talks over the last few days.
Heading back to SF tonight.
Storage Mojo points out the following paper:
Which I have to admit, is pretty cool.
…is a distributed network of intelligent, disk-based, storage appliances that stores data reliably and energy-efficiently. While existing MAID systems keep disks idle to save energy, Pergamum adds NVRAM at each node to store data signatures, metadata, and other small items, allowing [...]
Check out this funny class namer. I like SingletonRecordConcatenator…
There are plenty of technical reasons to forgo using Rackspace in favor of another blog host.
Yet another reason is that they have a tendency to censor their customer base:
Last week, it all got weirder. Hosting service GoDaddy mysteriously terminated Sesto’s account, and pulled RateMyCop.com offline. GoDaddy has offered several explanations to Wired’s ThreatLevel blog, but [...]
Here’s an idea I had the other day while talking to Jonathan.
Engineers often know that, when designing a database, the filesystem should NOT update the ‘atime’ filesystem attribute.
This can be expensive. On DBs that use LOTs of small files this can be a few orders of magnitude slower.
The problem is that the designer [...]
Looks like I’m heading up to Seattle for about nine days. If anyone wants to meet for a geek lunch let me know. Seattle seems to have some smart people…
Current plans are to head to Mount Rainier on the weekend.
Any other suggestions while I’m there?
The Read Write Web has a good post about crgslst, a new search service for Craigslist:
Denver, Colorado based Superhero.es has built crgslst, a very slick multi-city search tool for Craigslist. Craigslist itself doesn’t offer a multi-search service. By combining the publicly available RSS feeds from Craigslist with AJAX, crgslst fills this need “so fast, we [...]











