Parsing a raw WordPress post with blocks

Gutenberg blocks are great. I love the authoring experience using WordPress with blocks.

The trick was figuring out how to use that authoring experience in a headless context with a Next.js site. We need to get the blocks out of WordPress and translate them into React components.

Raw WordPress post

The first step is to make sure you can grab the raw post content from the REST API. Posts are usually publicly available through the WordPress API, but once you add the query parameter context=edit, the request requires authorization. See my article Setup OAuth with WordPress.com to use as headless CMS for info.
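As a rough sketch of that request, assuming a WordPress.com site and an OAuth access token sent as a Bearer token: the helper names buildRawPostRequest and fetchRawPost, and the site, postId, and accessToken parameters, are made up for illustration. The base URL follows the public-api.wordpress.com pattern used later in this post.

```javascript
// Hypothetical helper: builds the URL and headers for an authorized request.
// context=edit asks the API to include the raw, unrendered post content.
function buildRawPostRequest(site, postId, accessToken) {
  const url = new URL(
    `https://public-api.wordpress.com/wp/v2/sites/${site}/posts/${postId}`
  );
  url.searchParams.set("context", "edit");
  return {
    url: url.toString(),
    // Assumption: the token is sent as a standard OAuth2 Bearer token.
    headers: { Authorization: `Bearer ${accessToken}` },
  };
}

// Hypothetical helper: fetches one post and returns its raw block markup.
async function fetchRawPost(site, postId, accessToken) {
  const { url, headers } = buildRawPostRequest(site, postId, accessToken);
  const response = await fetch(url, { headers });
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  const post = await response.json();
  // content.raw is only present when the request used context=edit.
  return post.content.raw;
}
```

Splitting the URL building from the fetch call keeps the network part small and makes the request easy to inspect before sending it.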

Once you're able to make a request with the context=edit query parameter, the content property gains an additional property: raw. The raw data is the custom WordPress block markup, which uses HTML comments to store additional data for each block. It will look something like:

'<!-- wp:paragraph -->\n' +
'<p>This is the first in a series of posts that will cover my recent conversion from using markdown files for my blog posts to using WordPress.com as a headless CMS. The reasoning for using WordPress.com is to get the benefit of the Gutenberg editor without needing to host my own WordPress site just for content editing.</p>\n' +
'<!-- /wp:paragraph -->\n' +
'\n' +
'<!-- wp:heading -->\n' +
'<h2 class="wp-block-heading">Setting up an <em>application</em></h2>\n' +
'<!-- /wp:heading -->\n'

It's not hard to understand: just some HTML wrapped in custom comment tags that mean something to Gutenberg. You don't need to know the details, because we're going to parse it into JSON objects that are easier to work with.
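To make the comment format a little more concrete, here is a small illustration that lists the block names found in the opening comments. It is only a sketch for reading the markup, not a replacement for a real parser. The function name listBlockNames and the sample string are invented for this example. The sample also shows how an opening comment can carry JSON attributes, such as a heading level.

```javascript
// Illustration only: list the block names found in opening block comments.
// Closing comments start with "/wp:" and are skipped by the pattern.
function listBlockNames(raw) {
  const opener = /<!--\s+wp:([a-z][a-z0-9_/-]*)/g;
  const names = [];
  for (const match of raw.matchAll(opener)) {
    const name = match[1];
    // Names without a namespace belong to core, e.g. "paragraph" -> "core/paragraph".
    names.push(name.includes("/") ? name : `core/${name}`);
  }
  return names;
}

// Made-up sample markup in the same shape as the raw content above.
const sampleRaw =
  "<!-- wp:paragraph -->\n<p>Hello</p>\n<!-- /wp:paragraph -->\n\n" +
  '<!-- wp:heading {"level":3} -->\n' +
  '<h3 class="wp-block-heading">Title</h3>\n' +
  "<!-- /wp:heading -->\n";

console.log(listBlockNames(sampleRaw)); // [ 'core/paragraph', 'core/heading' ]
```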

Parse post into JSON blocks object

Now that we've made our fetch request and have the raw post content, we need to install the block parser:

npm i @wordpress/block-serialization-default-parser

Now we can use the parser like this:

import { parse } from "@wordpress/block-serialization-default-parser";
const blocks = parse(content.raw);

That will parse the comment-style block markup into an array of individual block objects, shown here as JSON:

[
{
"blockName": "core/paragraph",
"attrs": {},
"innerBlocks": [],
"innerHTML": "\n<p>Postman is a great tool for working with REST APIs. It allows us to test out endpoints without having to setup a bunch of infrastructure.</p>\n",
"innerContent": [
"\n<p>Postman is a great tool for working with REST APIs. It allows us to test out endpoints without having to setup a bunch of infrastructure.</p>\n"
]
},
{
"blockName": null,
"attrs": {},
"innerBlocks": [],
"innerHTML": "\n\n",
"innerContent": [
"\n\n"
]
},
{
"blockName": "core/paragraph",
"attrs": {},
"innerBlocks": [],
"innerHTML": "\n<p>In order to unlock access to all the WordPress.com endpoints and data we will need to generate an access token to send with our requests.</p>\n",
"innerContent": [
"\n<p>In order to unlock access to all the WordPress.com endpoints and data we will need to generate an access token to send with our requests.</p>\n"
]
},
{
"blockName": null,
"attrs": {},
"innerBlocks": [],
"innerHTML": "\n\n",
"innerContent": [
"\n\n"
]
},
{
"blockName": "core/paragraph",
"attrs": {},
"innerBlocks": [],
"innerHTML": "\n<p>Set up a GET request to an endpoint. Let's use the <em>/posts</em> endpoint for now:</p>\n",
"innerContent": [
"\n<p>Set up a GET request to an endpoint. Let's use the <em>/posts</em> endpoint for now:</p>\n"
]
},
{
"blockName": null,
"attrs": {},
"innerBlocks": [],
"innerHTML": "\n\n",
"innerContent": [
"\n\n"
]
},
{
"blockName": "core/code",
"attrs": {},
"innerBlocks": [],
"innerHTML": "\n<pre class=\"wp-block-code\"><code>https://public-api.wordpress.com/wp/v2/sites/jeremyrichardson.home.blog/posts</code></pre>\n",
"innerContent": [
"\n<pre class=\"wp-block-code\"><code>https://public-api.wordpress.com/wp/v2/sites/jeremyrichardson.home.blog/posts</code></pre>\n"
]
}
]

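The parser also creates a number of blocks with a blockName of null. As far as I can tell, these hold whatever sits between the block comments, which in this post is just the blank lines separating blocks (note the innerHTML of "\n\n"). So far I have just filtered those out.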
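A minimal sketch of that filtering step follows. It is slightly more cautious than dropping every null block: it removes a null-named block only when it contains nothing but whitespace, so any real content that lives outside a block comment would survive. The function name dropWhitespaceBlocks and the sample array are made up for this example.

```javascript
// Remove null-named blocks that contain only whitespace.
// Named blocks, and null-named blocks with real content, are kept.
function dropWhitespaceBlocks(blocks) {
  return blocks.filter(
    (block) => block.blockName !== null || block.innerHTML.trim() !== ""
  );
}

// Made-up sample in the same shape as the parser output above.
const sampleBlocks = [
  {
    blockName: "core/paragraph",
    attrs: {},
    innerBlocks: [],
    innerHTML: "\n<p>Hello</p>\n",
    innerContent: ["\n<p>Hello</p>\n"],
  },
  {
    blockName: null,
    attrs: {},
    innerBlocks: [],
    innerHTML: "\n\n",
    innerContent: ["\n\n"],
  },
];

const cleanedBlocks = dropWhitespaceBlocks(sampleBlocks);
console.log(cleanedBlocks.map((block) => block.blockName)); // [ 'core/paragraph' ]
```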

Now we have our blocks as plain JavaScript objects, which opens up a huge range of possibilities.

Next we'll deal with how to turn those block objects into React components.