Facebook api public posts
Is it possible to use Facebook’s Graph API to get public posted data and images? Essentially rather than scraping it.
Tl;DR - NO :-) See end of article for what I do.
I work in Open Source Human Rights investigations, we want to safely archive images so that they can’t disappear from facebook
https://www.facebook.com/khitthitnews/posts/1637247636712576 is a good example. Which has 1 image in it. It is public facing (you don’t need to be logged in to view it).
Other social media platforms make it easy to retrieve public data eg Twitter API, Telegram, TikTok, VK
Facebook Graph API
Graph API Get Started describes how to
- Register as a Facebook Developer
- Create a new Facebook App
- Use the Graph Explorer
token is valid for 90 minutes
notice the app needs permission to access email
address.
me Node
A Node is a thing.. eg Post, User
# me (gives name and id by default)
# me?fields=id,name,email
{
"id": "10166951349825331",
"name": "Dave Mateer",
"email": "davemateer@gmail.com"
}
If you click on the id in the explorer you’ll get
# 10166951349825331
{
"name": "Dave Mateer",
"id": "10166951349825331"
}
Post Node
you’ll need user_posts permissions
On the User Node
, looking at all Posts
for that User.
# 10166951349825331?fields=id,name,email,posts
{
"id": "10166951349825331",
"name": "Dave Mateer",
"email": "davemateer@gmail.com",
"posts": {
"data": [
{
"created_time": "2022-07-11T10:25:46+0000",
"message": "Hi Everyone - this is a test public video for a project I'm working on!",
"id": "10166951349825331_10166561853820331"
},
{
"created_time": "2022-06-03T09:16:34+0000",
"message": "Resting tree. Going up the hill last night for Jubilee beacon lighting.",
"id": "10166951349825331_10166448187020331"
},
Clicking on one of the Posts Nodes
# 10166951349825331_10166448187020331?fields=id, created_time, message, full_picture
{
"id": "10166951349825331_10166448187020331",
"created_time": "2022-06-03T09:16:34+0000",
"message": "Resting tree. Going up the hill last night for Jubilee beacon lighting.",
"full_picture": "https://scontent.ffab1-2.fna.fbcdn.net/v/t39.30808-6/283266029_10166448185560331_1411838295880156780_n.jpg?stp=dst-jpg_p180x540&_nc_cat=104&ccb=1-7&_nc_sid=8024bb&_nc_ohc=ddn5e7zlZT8AX94BWAp&_nc_ht=scontent.ffab1-2.fna&edm=ADqbNqUEAAAA&oh=00_AfC69LxOJAlLuGhenIjzq02FA0x1TwJ7DIJEBOApWOf82w&oe=6371570B"
}
Photos Edge
An Edge is a collection? or a connection between 2 nodes.
# all photos from User Node.
# 10166951349825331/photos
{
"data": [
{
"created_time": "2013-04-21T09:55:31+0000",
"name": "Ellie and I with the 7 sisters in the background. Sussex.",
"id": "10152770795170331"
},
{
"created_time": "2010-02-07T21:09:46+0000",
"name": "Victoria Park :-)",
"id": "476048985330"
}, ...
Photo Node
# all different sizes of photos
# 10152770795170331?fields=name,created_time,images
{
"name": "Ellie and I with the 7 sisters in the background. Sussex.",
"created_time": "2013-04-21T09:55:31+0000",
"images": [
{
"height": 639,
"source": "https://scontent.ffab1-2.fna.fbcdn.net/v/t1.18169-9/532862_10152770795170331_944541926_n.jpg?_nc_cat=106&ccb=1-7&_nc_sid=dd7718&_nc_ohc=R6GcykLnruYAX9CwLbp&_nc_ht=scontent.ffab1-2.fna&edm=AMAeTUEEAAAA&oh=00_AfClXYDQpZyQ_dvVtL-pbK_5vOUWG1nKXXEOZfwDd3sgLA&oe=63933675",
"width": 852
},...
{
"height": 225,
"source": "https://scontent.ffab1-2.fna.fbcdn.net/v/t1.18169-9/532862_10152770795170331_944541926_n.jpg?stp=dst-jpg_p75x225&_nc_cat=106&ccb=1-7&_nc_sid=dd7718&_nc_ohc=R6GcykLnruYAX9CwLbp&_nc_ht=scontent.ffab1-2.fna&edm=AMAeTUEEAAAA&oh=00_AfA7kYNJzjq0cvNSES01-2b55DgCkCUCQTkr_uokG_tIIA&oe=63933675",
"width": 300
}
],
"id": "10152770795170331"
}
Can I query a public post in the same way?
https://mobile.facebook.com/khitthitnews/posts/pfbid0PTvT6iAccWqatvbDQNuqpFwL5WKzHuLK4QjP97Fwut637CV3XXQU53z1s2bJMAKwl link to a post
https://www.facebook.com/photo/?fbid=1329142910787472 link to a photo
I know the ID of khitthitnews is 1646726009098072
# 1646726009098072
{
"error": {
"message": "(#10) This endpoint requires the 'pages_read_engagement' permission or the 'Page Public Content Access' feature. Refer to https://developers.facebook.com/docs/apps/review/login-permissions#manage-pages and https://developers.facebook.com/docs/apps/review/feature#reference-PAGES_ACCESS for details.",
"type": "OAuthException",
"code": 10,
"fbtrace_id": "AaU8dElApqiYPX1kSrfhpyi"
}
}
page_public_content_access
The Page Public Content Access feature allows your app access to the Pages Search API and to read public data for Pages
But the allowed usage is for competitive benchmark analysis
server-to-server apps
server-to-server apps is what I would like to use, but probably not enough access.
Workarounds
apify good background on legality, issues with datacentre scraping. I couldn’t get their crawler working on FB though.
scapers doing good
https://apify.com/web-scraping#benefit-humanity