How to get lots of data from the Facebook Graph API with just one request

Optimizing request queries to the Facebook Graph API

Oct 09, 2014

In a typical Facebook app we often find ourselves grabbing a trivial amount of data from the Facebook Graph API. Usually the data will include a user's ID, name and email for Facebook login.

But when we start creating Facebook apps that depend on larger amounts of data from Facebook, sometimes it's not obvious how to retrieve the data efficiently. Luckily there are several methods Facebook offers to efficiently retrieve lots of data from the Graph API.

Note: Use the Graph API Explorer to play with the various request methods we'll be discussing below.

Getting data the wrong way

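Say you wanted to get the following info from Graph: a user's ID, name and email, five of the photos they are tagged in, and three of the pages they like.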

Assuming you have obtained a user access token with the user_photos & user_likes permissions, you might be inclined to do something like this:

GET /me?fields=id,name,email
GET /me/photos?limit=5
GET /me/likes?limit=3

That's three requests to Graph that can actually be combined into one. And I'm not just talking about a batch request, which is actually not the best method to optimize this particular case.

The four methods we'll discuss for optimizing your Graph queries are (1) FQL, (2) nested requests, (3) multiple ID read requests and (4) batch requests.

Some Graph terminology

Before we dive too deep, it'll help to learn three basic Graph terms: nodes, fields & edges.

Nodes

A node is an object returned from Graph like a user, page or event. If a page has an id of 1337 it can be accessed by the following endpoint:

GET /1337

I'll be using "node" and "Graph object" interchangeably throughout this article.

Fields

Fields reference specific properties of a Graph object. So a page object will have an ID, name and description among other things. You can specify which fields you'd like to be returned for a particular node using the fields modifier.

GET /1337?fields=id,name,description

Edges

Data on Facebook is very relational. A page for example could have many timeline posts associated with it. This relation between two Graph objects is called an edge. We can grab the page timeline posts from a page with an ID of 1337 like so:

GET /1337/posts

In this example posts is not a node since it doesn't represent a Graph object. Instead it represents a relation to other Graph objects. It is an edge because it connects a page Graph object and its posts.

Getting data the optimized way

Now that we have a better understanding of Graph API terminology, let's explore the various methods for getting lots of data from Graph in a single request.

FQL (Facebook Query Language)

Let's start with FQL which is a really huge feature of the Facebook development platform. And it is... wait for it... officially deprecated.

The last version of Graph to support FQL is v2.0. Since Graph v2.1 was released on August 7th, 2014, FQL will be dead as of August 7th 2016. This is because Facebook supports deprecated versions of Graph for two years.

Facebook is pulling the plug on the "SQL for Facebook" in favor of the other request methods I'll discuss below. If you have a Facebook app that currently uses FQL, you had better start reading up on the other request methods unless you want a broken app in the very near future!

So since FQL is being replaced by other request methods, we'll skip it and move on to the other methods.

Nested requests (a.k.a. field expansion)

When you're retrieving data from a single node (using the same access token for each request), you'll want to use nested requests to optimize your requests to Graph.

The /me endpoint (node) returns data on the authenticated user. The endpoint that returns the photos that the authenticated user is tagged in is /me/photos (edge).

The /me/photos endpoint could be expressed as a nested request: /me?fields=photos. That doesn't seem too useful until we also want to pull in another edge, for example the pages that the user likes from the /me/likes endpoint (edge). Then it makes sense to pull in the user's photos & likes with the following nested request:

GET /me?fields=photos,likes

Nested requests syntax

The syntax for nested requests looks like this:

/<node-id>?fields=<first-level>{<second-level>}

Here <node-id> is the Graph object you want to retrieve and <first-level> is the name of a field or edge. If <first-level> is an edge, you can further specify that edge's fields or edges in <second-level>.

Using our example from above, let's get info on a user, the photos they were tagged in and their page likes. We can get all that data with one nested request.

GET /me?fields=id,name,email,photos{id,name,source},likes{id,name}

You'll notice that this will return lots of photos and likes, so to limit the results from each of those edges we use the .modifier(value) syntax.

GET /me?fields=id,name,email,
    photos.limit(5){id,name,source},
    likes.limit(3){id,name}

Edges can be nested arbitrarily deep. For example, you can get a user's events, those events' photos and those photos' likes.

GET /me?fields=events{name,photos{source,likes{name}}}

This can get hairy if you're going deep and have lots of fields and edges you're specifying. I wrote a package in PHP called the Facebook query builder that makes it a lot easier to work with complex nested requests.
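To give a feel for what such a builder does, here is a minimal sketch in Python. It is not the PHP package mentioned above; the build_fields function and its dict-based spec format are made up for this illustration.

```python
def build_fields(fields):
    """Render a list of field specs as a Graph `fields` value.

    Each spec is either a plain string (a field name) or a dict with:
      "name"   - the edge name
      "limit"  - optional int, rendered as .limit(n)
      "fields" - optional nested list of specs, rendered as {...}
    This spec format is hypothetical; it only exists for this sketch.
    """
    parts = []
    for spec in fields:
        if isinstance(spec, str):
            parts.append(spec)
            continue
        part = spec["name"]
        if "limit" in spec:
            part += ".limit({})".format(spec["limit"])
        if spec.get("fields"):
            # Recurse so edges can be nested as deep as needed.
            part += "{" + build_fields(spec["fields"]) + "}"
        parts.append(part)
    return ",".join(parts)


query = build_fields([
    "id", "name", "email",
    {"name": "photos", "limit": 5, "fields": ["id", "name", "source"]},
    {"name": "likes", "limit": 3, "fields": ["id", "name"]},
])
print("/me?fields=" + query)
```

The printed path is the same nested request as the limit example above, just written on one line.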

Multiple ID read requests

If you want to get data on multiple nodes in one request to Graph, you can make use of multiple ID read requests.

You simply use the ids param on the root / endpoint with a comma-separated list of Graph nodes.

GET /?ids=<node-id-a>,<node-id-b>

This is equivalent to sending two requests.

GET /<node-id-a>
GET /<node-id-b>

The nodes don't have to be of the same Graph object type so <node-id-a> could be a user and <node-id-b> could be an event.

You can also specify an edge that you would like to return for each Graph node.

GET /<edge>?ids=<node-id-a>,<node-id-b>

Say you wanted to get the profile picture from three different users in one request. We can specify that we want the results from the picture edge for each of the users.

GET /picture?ids={user-id-a},{user-id-b},{user-id-c}&redirect=false&type=large

That will return a JSON response like this:

{
  "{user-id-a}": {
    "data": {
      "url": "https://user-id-a.jpg", 
      "is_silhouette": false
    }
  },
  "{user-id-b}": {
    "data": {
      "url": "https://user-id-b.jpg", 
      "is_silhouette": false
    }
  },
  "{user-id-c}": {
    "data": {
      "url": "https://user-id-c.jpg", 
      "is_silhouette": false
    }
  }
}

This also works across different types of Graph objects. The user, page and event Graph objects all provide a photos edge. So you could get a list of photos for a user, a page and an event in the same request.

GET /photos?ids={user-id},{page-id},{event-id}
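If you're assembling these URLs in code, a small helper keeps the ID list and the extra parameters tidy. The sketch below is only an illustration in Python: multi_id_path and picture_urls are hypothetical helper names, the IDs are placeholders, and the response shape is assumed to match the picture example above.

```python
from urllib.parse import urlencode


def multi_id_path(ids, edge="", **params):
    """Build the relative path for a multiple ID read request."""
    query = {"ids": ",".join(str(i) for i in ids)}
    query.update(params)
    # safe="," keeps the comma-separated ID list readable in the URL.
    return "/" + edge + "?" + urlencode(query, safe=",")


def picture_urls(response):
    """Map each node ID to its picture URL.

    Assumes the response is keyed by node ID, as in the example above.
    """
    return {node_id: node["data"]["url"] for node_id, node in response.items()}


path = multi_id_path(["101", "102", "103"], edge="picture",
                     redirect="false", type="large")
print(path)

sample = {
    "101": {"data": {"url": "https://user-id-a.jpg", "is_silhouette": False}},
    "102": {"data": {"url": "https://user-id-b.jpg", "is_silhouette": False}},
}
print(picture_urls(sample))
```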

Batch requests

When you need to run completely unrelated actions in one request (e.g. GET, POST, PUT & DELETE actions for different users, pages and other Graph objects), you can send a batch request.

You can send up to 50 different requests in a single batch request to Graph.

A batch request is comprised of several requests encoded in JSON format. The basic format for each request looks like this:

{
    "method": "GET",
    "relative_url": "me"
}

To send a batch request, POST a batch key with the JSON-encoded requests as the value to the root Graph endpoint: https://graph.facebook.com

POST /
    batch=[...]&access_token=...

The following batch request would return info on <user-id> and delete <event-id>.

POST /
    batch=[
    {"method":"GET",
        "relative_url":"<user-id>"},
    {"method":"DELETE",
        "relative_url":"<event-id>"}
    ]&access_token=<token>

The response JSON looks like this:

[
    { "code": 200, 
      "headers":[
          { "name": "Content-Type", 
            "value": "text/javascript; charset=UTF-8" }
      ],
      "body": "{\"id\":\"123\"}"},
    { "code": 200,
      "headers":[
          { "name":"Content-Type", 
            "value":"text/javascript; charset=UTF-8"}
      ],
      "body":"{\"success\": true}"}
]
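Note that each body in the response is itself a JSON-encoded string, so it has to be decoded a second time. Here is a rough Python sketch of that step; decode_batch_response is a hypothetical helper, and the handling of null entries and empty bodies is a defensive assumption rather than documented behavior.

```python
import json


def decode_batch_response(raw):
    """Return a list of (status_code, decoded_body) tuples."""
    results = []
    for item in json.loads(raw):
        if item is None:
            # Defensive assumption: treat a null entry as "no response".
            results.append((None, None))
            continue
        body = item.get("body")
        # The body is a JSON string inside the JSON response.
        decoded = json.loads(body) if body else None
        results.append((item["code"], decoded))
    return results


# A raw string keeps the escaped quotes exactly as they appear on the wire.
raw = r'''[
  {"code": 200,
   "headers": [{"name": "Content-Type",
                "value": "text/javascript; charset=UTF-8"}],
   "body": "{\"id\":\"123\"}"},
  {"code": 200,
   "headers": [],
   "body": "{\"success\": true}"}
]'''

for code, body in decode_batch_response(raw):
    print(code, body)
```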

Here are all the options for a JSON-encoded request:

{
    "method": "<HTTP_VERB>",
    "relative_url": "<endpoint>",
    "headers": ["<header-key>: <header-value>"],
    "body": "<body>",
    "name": "<name-of-request>",
    "depends_on":"<name-of-dependency-request>",
    "omit_response_on_success": <true|false>,
    "attached_files":"<name-of-file>"
}

The only required values are method and relative_url.
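As a sketch of how the POST body could be put together in Python: the build_batch_body helper below is hypothetical, and the placeholder IDs and token are made up. It JSON-encodes the list of requests, checks the two required keys and the 50-request limit mentioned above, and form-encodes the result.

```python
import json
from urllib.parse import urlencode

MAX_BATCH_SIZE = 50  # the per-batch limit mentioned above


def build_batch_body(requests, access_token):
    """Return the form-encoded body for a batch POST to the root endpoint."""
    if len(requests) > MAX_BATCH_SIZE:
        raise ValueError("a batch can hold at most 50 requests")
    for request in requests:
        missing = {"method", "relative_url"} - set(request)
        if missing:
            raise ValueError("request is missing: " + ", ".join(sorted(missing)))
    return urlencode({
        "batch": json.dumps(requests),
        "access_token": access_token,
    })


body = build_batch_body(
    [
        {"method": "GET", "relative_url": "user-id"},
        {"method": "DELETE", "relative_url": "event-id"},
    ],
    access_token="placeholder-token",
)
print(body)
```

The resulting string is what you would send as the body of the POST to https://graph.facebook.com with whatever HTTP client you prefer.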

Access token fallback

When you POST a batch request you can include the access token at the root of the query and all requests in the batch will fall back to that access token.

POST /
    batch=[...]&access_token=...

But you can explicitly set an access token for each request in the relative_url.

POST /
    batch=[
    {"method":"GET",
        "relative_url":"me?access_token=<user-token-a>"},
    {"method":"GET",
        "relative_url":"me?access_token=<user-token-b>"}
    ]

And if you have "app secret proof" enabled for your app (and you should), you'll need to specify that in each relative_url as well.

POST /
    batch=[
    {"method":"GET",
        "relative_url":"me?
            access_token=<user-token>&
            appsecret_proof=<proof>"},
    { . . . }
    ]

Or just provide the appsecret_proof in the root of the query as a fallback:

POST /
    batch=[...]
    &access_token=<token>
    &appsecret_proof=<proof>
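In case you haven't generated a proof before: it is an HMAC-SHA256 hash of the access token, keyed with your app secret. A minimal Python sketch follows; the token and secret are placeholders, and it's worth double-checking the details against Facebook's own documentation.

```python
import hashlib
import hmac


def appsecret_proof(access_token, app_secret):
    """Return the hex-encoded HMAC-SHA256 of the token, keyed by the app secret."""
    return hmac.new(
        app_secret.encode("utf-8"),    # key: the app secret
        access_token.encode("utf-8"),  # message: the access token
        hashlib.sha256,
    ).hexdigest()


proof = appsecret_proof("placeholder-token", "placeholder-secret")
print(proof)
```

Since the proof is derived from the token, each distinct access token in a batch needs its own proof.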

Named requests

Sometimes it's necessary for one request to refer to another request within a batch of requests. For example, if we have a request that returns data on a user, we can name the request the-user using the name feature.

{
    "method": "GET",
    "relative_url": "<user-id-a>",
    "name": "the-user"
}

In another request, we can reference the results from that request (for example the user's name) using a JSONPath expression.

{
    "method": "POST",
    "relative_url": "<user-id-b>/feed",
    "body": "message=I+am+friends+with+{result=the-user:$.name}"
}

Let's break down the syntax for referencing & embedding the results from another query.

{result=<the-query-name>:<JSONPath-expression>}

You have <the-query-name> which just refers to whatever name you put in the name part of the JSON-encoded query. The syntax for <JSONPath-expression> is a little less obvious, but the best way to understand it is with an example or two.

Say you hit the /me?fields=id,name endpoint. Graph will return the following JSON:

{
  "id": "1337", 
  "name": "Sammy Kaye Powers"
}

The JSONPath reference to the user's name would simply be $.name.

As another example, say we hit this endpoint to get the photos this user is tagged in: /me/photos?fields=id,name,source. Graph will return the following JSON in response:

{
  "data": [
    {
      "id": "10", 
      "name": "Photo 10", 
      "source": "https://foo.jpg"
    },
    {
      "id": "11", 
      "name": "Some other photo", 
      "source": "https://bar.jpg"
    },
    {
      "id": "12", 
      "source": "https://baz.jpg"
    }
  ]
}

In order to reference the photo source for #11, you'd use $.data.1.source. We use 1 to refer to the array value in position 2 since $.data is a numeric array and a numeric array index starts at 0.

We can also refer to all the values in the array at once. For example all the photo sources in the array could be referenced with $.data.*.source which returns a concatenated version of the values as a CSV formatted string. This is handy when you want to run a multiple-ID-read-request query (described above) that references another query. For example:

{
    "method": "GET",
    "relative_url": "?ids={result=my-photos:$.data.*.id}"
}
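To make the path syntax more concrete, here is a toy resolver in Python that handles only the simple dotted forms used in this article. It is not how Graph evaluates JSONPath and it supports just a fraction of the full syntax; the resolve function is purely illustrative.

```python
def resolve(path, document):
    """Resolve a dotted path such as "$.data.1.source" or "$.data.*.id".

    Multiple matches are joined with commas, mirroring the CSV-style
    string described above.
    """
    steps = path.split(".")
    if steps[0] != "$":
        raise ValueError("path must start with $")
    current = [document]
    for step in steps[1:]:
        matches = []
        for value in current:
            if step == "*":
                # Wildcard: take every element (or every dict value).
                matches.extend(value.values() if isinstance(value, dict) else value)
            elif isinstance(value, list):
                matches.append(value[int(step)])  # zero-based array index
            elif step in value:
                matches.append(value[step])       # skip objects missing the key
        current = matches
    return ",".join(str(v) for v in current)


photos = {
    "data": [
        {"id": "10", "name": "Photo 10", "source": "https://foo.jpg"},
        {"id": "11", "name": "Some other photo", "source": "https://bar.jpg"},
        {"id": "12", "source": "https://baz.jpg"},
    ]
}

print(resolve("$.data.1.source", photos))
print(resolve("$.data.*.id", photos))
```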

Query dependencies

The queries from a batch request are not executed in any particular order. This can be problematic when a query depends on another query's results or if you don't want to execute a query if another query fails. We use the depends_on feature to make sure the queries are executed in the proper order.

POST /
    batch=[
    {"method":"GET",
        "name":"do-me-first",
        "relative_url":"<node-id>"},
    {"method":"POST",
        "relative_url":"me/feed",
        "body":"message=Some+reference+to+{result=do-me-first:$.name}",
        "depends_on": "do-me-first"}
    ]

The above example is actually redundant since Graph will automatically detect request dependencies based on the embedded references and execute them in order. Smart programmer nerds we have working at Facebook! :)

Graph will not execute requests that depend on other requests that have failed.

Custom Headers & eTags

You can send custom headers using the following syntax:

POST /
    batch=[
    {"method":"GET",
        "headers": [
            "User-Agent: Silly Willy Browser 2.0",
            "X-foo-header: Bar value"
        ],
        "relative_url":"<node-id>"},
    { . . . }
    ]

Since eTags are supported in batch requests (as well as standard requests to Graph) you can send the proper If-None-Match headers with the value of the ETag header that was returned in the last response.

POST /
    batch=[
    {"method":"GET",
        "headers": ["If-None-Match: \"some-eTag-example\""],
        "relative_url":"<node-id>"},
    { . . . }
    ]

Batch requests in the Facebook PHP SDK

As of 4.0 of the Facebook PHP SDK, batch requests are unsupported. But when 4.1 is finally released, it will contain full support of batched requests.

End

As you can see, there are lots of ways to efficiently retrieve lots of data from Graph. And if you're using FQL, hopefully one or several of these methods will help you convert your app before FQL D-Day: August 7th 2016. Good luck!

If you found this guide helpful, say "Hi" on Twitter! I'd love to hear from you. :)