Engineering @ Facebook's Notes

Facebook's Scribe technology now open source
Here at Facebook, we're constantly facing scaling challanges because of our enormous growth. One particular problem we encountered a couple of years ago was collection of data from our servers. We were collecting a few billion messages a day (which seemed like a lot at the time) for everything from access logs to performance statistics to actions that went to News Feed. We used a variety of different technologies for the different use cases, and all of them were bursting at the seams. We decided to build a unified system (called Scribe) to handle all of these cases, and do it in a way that would scale with Facebook's growth. The system we built turned out to be enormously useful, handling over 100 use cases and tens of billions of messages a day. It has also been battle tested by just about anything that can go wrong, so I encourage you to take a look at the newly opened Scribe source and see if it might be useful for you. To give the code some context, I'm going to go through the major design decisions we made to allow the system to scale.

The first decision we made was to not lock ourselves into a particular network topology. The Scribe servers are arranged in a directed graph, but each server only knows about the next server in the graph. This flexible topology allows for things like adding an extra layer of fan-in if the system grows too large, and batching messages before sending them between datacenters, but without having any code that explicitly needs to understand datacenter topology, only a simple configuration.

The second major design decision was about reliability. We chose was a middle ground here, reliable enough that we can expect to get all of the data almost all of the time, but not reliable enough to require heavyweight protocols and disk usage. More specifically, Scribe spools data to disk on any node to handle intermittent connectivity node failure, but it doesn't sync a log file for every message, so there's a possibility of a small amount of data loss in the event of a crash or catastrophic hardware failure. Basically, this is more reliability than you get with most logging systems, but not something you should use for database transactions. As it turned out, this is a reasonable level of reliability for a lot of use cases, and has made scaling much easier. It's also the source of a lot of the hard-learned lessons: getting the system to catch up seamlessly after a significant network problem is tricky, especially when there are tens or hundreds of gigabytes of data backed up.

The final design decision was about the data model. When you're building something that looks like a logging system there are a lot of things people expect: logging levels and rules about when they get sent, timestamping and ordering of messages, schemas for common messages, etc. We decided that this was a can of worms that shouldn't be mixed up with the asynchronous and mostly reliable delivery of data, so we made the data model very simple. A message is two strings: a category and the actual message. The category is the description of what the message is about, and the expectation is that messages of the same category end up in the same place. The message is the actual data to be logged. We also don't have any a priori list of categories that must be maintained. If you create a new category it shows up at a new file. This is following the Unix philosophy of doing exactly one thing and doing it well, and it has definitely paid off in ease of use and development. We started with four or five use cases in mind and now we have hundreds, but we didn't have to modify the Scribe source for any of them.

Another choice we made early on was to build Scribe using Thrift. This sped up development enormously because a lot of the hard parts were already taken care of, and it also made the resulting system much more flexible. We currently log messages to Scribe from PHP, python, C++, and Java code, and the list of possible languages is growing all the time from the contributions of developers around the world. So Scribe has already benefitted enormously from Thrift being open, and it will be even better having Scribe open too. I hope you find it as useful as we do.
Diane at 12:35pm October 24
Do you have basic information on how to utilize facebook ?? I'm old and don't know all the different things I can do on it. Do you have some sort of instruction or expanation of all the different functions and wat they are for and how they work ??
Thank you for your paitence and help,
Diane Cheek
crazyladydi57@aol.com
Jonathan at 4:40pm October 24
Dear Diane,
Scribe is software to help people make more software like facebook. My advice - learn how to use facebook by using facebook. By that I mean - ask your facebook friends, they can help!
Diane at 7:48pm October 24
Thank you for your advice. But the people I know are my daughters and they are new at this to. The terms used for the computer world are sometimes difficult to know what the function is or does.
Now give me a sewing machine and I can tell you all of it's functions and how it processes.
I wonder if there is one of the books called " How to operate facebook or my space for DUMMIES" ??
Michael at 9:42pm October 24
I had been hoping to see this for awhile, this is great news! It is kind of strange that different Facebook Engineering projects have chosen Google Code hosting, Facebook hosting, Apache hosting, and SourceForge hosting though.
Jordan at 11:15pm October 24
In fact Diane, there is such a book. Check it out!

http://www.facebook.com/pages/Facebook-for-Dummies/7087758108
Joe at 12:32pm October 25
Please fix your RSS feed -- The entire contents of your posts here merge into one large blob of text.
Kevin at 6:49pm October 25
GREAT!
@Michael:
I'm glad to try the FB hosting.
Andrew at 8:06am October 27
Very well said, Robert!

I'm always looking for good examples of appropriate design instead of acronym driven design, and this is a great example. Thanks for sharing!
Gergely at 9:54am October 27
Robert: can you share some of the use cases? TIA!
Larry at 5:14pm October 27
Hey Diane-
Here's a pretty good run down of all the different things you can do on Facebook and how to use them:

http://www.facebook.com/help.php

Good luck and have fun!
Ryan at 6:15am October 28
I tried getting a copy of Scribe from SourceForge

http://sourceforge.net/projects/scribeserver

to use on my enterprise applications I develop, however it appears there are not releases available? When can we expect releases (even alpha or beta) available?
Anthony at 5:56pm October 28
Ryan, There is a release of Scribe version 2.0 on sourceforge. It is under the 'code' tab. You can either use svn or download a tarball:

http://scribeserver.svn.sourceforge.net/viewvc/scribeserver/releases/
John at 11:42pm October 28
I read that Scribe logs are used as basis for redo logs to dynamically build searches indexes for some Facebook search functions. Is this true? If so, could you expand on this here or in a future post?
Dave at 4:22pm October 29
Scribe!
Hong at 9:15pm November 11
John, Facebook search uses redo logs to keep search indexes up-to-date. User updates are appended to the redo logs and the search backend periodically tails the redo logs to pick up new information and dynamically update the search indexes, so that, for example, when a new user joins Facebook she/he becomes searchable almost immediately (unless the user chooses to remain unsearchable and sets her/his search visibility accordingly). Our current redo log implementation is based on a Thrift service which is similar to Scribe but customized for search functions, but we are planning to migrate the implementation to use Scribe.
Chandranshu at 11:55pm November 20
I am trying to use Scribe with RoR but it turns out there are quite a few problems in getting the required dependency fb303 (FacebookService) to work with ruby. I have been able to work around most of them and will be publishing all the steps on my blog shortly. Where can I contact the developers?( IRC/ mailing list/dev page)
Anthony at 12:15am November 21
Chandranshu, we have a mailing list on Scribe's sourceforge page:
http://developers.facebook.com/scribe/

I'm interested in hearing about your experience with Scribe/fb303/Ruby.
Chandranshu at 4:20am November 21
Anthony, I have reported all the steps that I have tried till now to get Scribe working with Ruby. I'm stuck at an exception which I think I'll not be looking into again till Monday :)

Cheers
Chandranshu
Brett at 7:09am November 26
You say "everything from access logs to performance statistics to actions that went to News Feed" - can you give an example of how you're logging to scribe with say, Apache? (or whatever you guys happen to be using)

Did you disable the "real" access logs and instead setup logging at the application level? Or did you write a small Apache (or whatever) mod to do this? I'm trying to imagine how it's setup, because I assume this is your full replacement for what someone might do with remote syslogging - right?
Anthony at 7:01pm November 26
Brett, Facebook uses Scribe to log all Apache error logs to a central location. We do not disable Apache’s logging. We just use a simple script to tail the existing Apache logs on each machine and write the data to Scribe. Since every machine running Apache is also running Scribe locally, this solution scales well. Now that all logs from many servers have been aggregated into a single log, we can easily monitor this log for problems across our entire site.

But Scribe logs much much more data than just Apache logs. Any Facebook feature that needs to aggregate data from many machines can simply write the data to Scribe. And we have over a hundred use cases where Scribe has come in handy.
John at 12:20am December 18
very interesting project. I presumed that it is very similar to Yahoo's shmproxy as a data bus. Got a couple of questions:

1. w.r.t Anthony's answer of aggregating apache log, I am wondering why facebook is not using "ErrorLog syslog:localX" for apache error log aggregation and 'CustomLog "| /usr/bin/logger -p localX.notice -t something"' for access log aggregation over (r)syslog?

2. reading over the python lib. The client has a simple send_log, recv_log interface, nice. But if messaging order are important, how to do tcp like sliding window type batch sent, ack?

thanks,
-john
Anthony at 3:46pm December 19
John,
The primary reason we use Scribe to aggregate Apache logs is that Scribe enables us to easily aggregate logs from many,many machines into a central location. You can still use Apache’s custom logging to filter logs locally and then use Scribe to aggregate logs from multiple machines.

The send_log and recv_log functions you are referring to are not technically a part of Scribe’s interface, and you can ignore them. These functions are auto-generated by Thrift. Scribe is built on top of Thrift to support cross-language remote procedure calling. The only function in Scribe’s interface is Log(). See the scribe.thrift file to see how Scribe defines a Thrift service. And check out http://developers.facebook.com/thrift/ to learn more about Thrift.

In this note

No one.