The front-end of Headless CMS

I wrote a blog post on what Headless CMS is all about on my agency site back in December, partly to help explain this technology to our clients and partly to help crystallise my thoughts on the subject.

I’m currently embarking on an interesting project to build a set of tools for creating front-end websites in PHP based on Headless CMS technology. My aim is to open source this front-end code, though we need to prove it on a few projects first.

I thought I’d blog about my experience and findings here. The following is a short, technically focussed intro to all this.

Why is Headless good for developers?

Multiple content sources

Most websites these days are not made up of content from just one CMS; it’s quite common for a site to require content from multiple sources. That may be a CRM, an internal business system, social media feeds, other sites, even the weather in your garden: the list goes on.

The central idea of Headless CMS is that you use the CMS just for managing structured content. You then build the front-end site in whatever tools you wish, theoretically freeing up your dev team to work as they wish, faster and more efficiently. Content is pulled into the website over an API.

If your site is built up of content from many different sources, the Headless CMS paradigm works pretty well. The main page content is pulled in from a Headless CMS API, so it’s just as easy to pull in content from other locations. Some people call this a decoupled front-end site.
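
As a concrete illustration, here’s a minimal sketch of what pulling a page into a PHP front-end over an API might look like. The endpoint URL and slug are assumptions for illustration (WordPress, for example, exposes content under /wp-json/wp/v2/):

<?php
// Fetch a page from the Headless CMS over its REST API (URL is illustrative).
$response = file_get_contents('https://cms.example.com/wp-json/wp/v2/pages?slug=about');
$pages = json_decode($response, true);

if (!empty($pages)) {
    $page = $pages[0];
    // Pass the content to whatever templating you like; echoed directly here for brevity.
    echo '<h1>' . $page['title']['rendered'] . '</h1>';
    echo $page['content']['rendered'];
}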

Flexible front-end

It’s easy to be constrained by the chosen CMS on a project, having to write HTML or CSS in specific ways to fit in with how the CMS does things. There are few CMSs that really allow your front-end to be completely independent.

The Headless CMS approach frees you up to build front-end templates exactly as you’d like. More importantly, it’s easier to share front-end code between projects that may use different CMSs or content sources, but the same decoupled front-end approach.

CMS independence

With Headless you are no longer so dependent on the CMS. In the good old days if you changed CMS, you usually needed to rebuild the site from scratch. This isn’t very efficient and it always seemed to me there should be a better way.

If your front-end is separated from your CMS you have the flexibility to change the CMS platform and keep pretty much all your front-end site code. You just need to update where the content is read in from. This is a good example of separation of concerns, a well-established and beneficial programming principle.
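
To make that separation concrete, here’s a minimal sketch (the class and method names are my own, illustrative inventions, using PHP 8 syntax) of how templates can depend on a simple content interface rather than on any one CMS:

<?php
// Templates only ever see this simple value object, never raw CMS data.
class Page
{
    public function __construct(
        public string $title,
        public string $content
    ) {}
}

// The front-end depends on this interface, not on a specific CMS.
interface ContentRepository
{
    public function getPage(string $slug): ?Page;
}

// One implementation per content source; swap it out if the CMS changes.
class WordPressRepository implements ContentRepository
{
    public function __construct(private string $apiUrl) {}

    public function getPage(string $slug): ?Page
    {
        $url = $this->apiUrl . '/wp/v2/pages?slug=' . urlencode($slug);
        $data = json_decode(file_get_contents($url), true);
        if (empty($data)) {
            return null;
        }
        return new Page($data[0]['title']['rendered'], $data[0]['content']['rendered']);
    }
}

If the CMS platform changes, a new ContentRepository implementation is all that should need writing; the templates stay exactly as they are.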

Performance and security

Websites built in large CMSs can be slow and cumbersome. This is usually due to the large amount of code they run, or the complexity that is understandably present to ensure a CMS meets the varied needs of its many users.

A front-end site built with Headless is smaller, has less code and therefore fewer attack vectors. You can build only what you need and ensure you build that well.

People often complain about CMS security. In truth, large projects like WordPress and Drupal have fantastic security teams who respond quickly to any issues. However, the main issue I’ve found is with third-party plugins or modules, which rarely have the same focus on security and code quality. Avoiding a dependency on these can be a good thing for both performance and security, though it comes at a cost, since you need to build your own functionality.

Developer experience

One key criterion for me is to make front-end development easier for my team. CMSs like Drupal have become massively complex, and it’s difficult to find good front-end developers who are experts in core skills like HTML, CSS and accessibility and who also know the ins and outs of a complex CMS such as Drupal.

Ideally I don’t want a CMS to hold my team back. I want them to be able to build a front-end we can just implement on a website quickly and easily.

Why PHP?

I’m very experienced in PHP, since I’ve been coding in it for two decades. I’ve worked on loads of PHP projects, mentored team members in the language, attended a ton of PHP conferences, and seen both the good and bad sides of the language.

PHP remains one of the most popular languages on the web. It’s pragmatic, runs almost everywhere and is easy to use, and with the great strides made in both the PHP language and its community over the past decade it’s a great language to work with.

Hey, what about JavaScript?

The world of Headless CMS seems to be dominated by JavaScript Single Page Apps (SPAs). There’s even an acronym for it: the JAMstack (JavaScript, APIs, Markup). You could be forgiven for thinking you have to write a Headless CMS front-end in JavaScript!

In my experience you need a very experienced set of JavaScript developers to write successful and maintainable SPAs. The world of JavaScript tools changes quickly, and it can be difficult to keep up with the pace of change. JavaScript templating tools also increase complexity, often mixing HTML and CSS directly into JavaScript, breaking that old separation of concerns principle that has served me so well.

I don’t want my front-end team to have to be JavaScript experts in order to build a website.

Serve content as HTML

Jeremy Keith’s excellent (and free) online book Resilient Web Design states his simple approach to web design:

  1. Identify core functionality.
  2. Make that functionality available using the simplest possible technology.
  3. Enhance!

If you’re serving content to users the simplest available technology is HTML, not a Single Page App. Serving static content with JavaScript simply doesn’t feel like a good idea to me.

While there are solutions to get around this in the JS world (e.g. NuxtJS), they aren’t often used well and the complexity JS frameworks bring isn’t always a good thing.

In my opinion using PHP to serve HTML to users is a more resilient solution. JavaScript can then be used to enhance websites.

The Front-end

I’m focussing on the front-end since it’s what we build for our clients. There are enough Headless CMSs out there, and all you really need is a half-decent API to grab content and display it on your site.

Challenges

Whether you use PHP, JavaScript or Python to build a front-end site with content stored in a Headless CMS, there are challenges. Here are a few.

Complex content fields

We use tools like Advanced Custom Fields in WordPress to create more flexible and structured content for pages. However, this creates a complex, nested content object in the content API, something that requires some cleaning up to make it easy to use in templates.
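
A small sketch of that clean-up step, mapping a nested field structure from the API into a flat array for templates (the field names here are made-up examples, not real fields from any particular site):

<?php
// Flatten a nested ACF-style structure from the content API into simple
// template variables. The field names are illustrative only.
function flattenHeroBlock(array $apiData): array
{
    return [
        'heading'  => $apiData['acf']['hero']['heading'] ?? '',
        'intro'    => $apiData['acf']['hero']['intro_text'] ?? '',
        'imageUrl' => $apiData['acf']['hero']['image']['sizes']['large'] ?? null,
    ];
}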

Search

Out-of-the-box search in a CMS is often pretty shoddy. These days with tools such as ElasticSearch the expectation is for fast, relevant search results.

After reading Simon Willison’s excellent 24Ways article on building fast autocomplete search in SQLite and Python, I thought: why not use SQLite? It has good support in PHP and also means there is no dependency on a separate database server. For larger projects I’d look to use ElasticSearch, but that can come in the future.
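
As a rough sketch of how that could look in PHP, assuming your SQLite build includes the FTS5 full-text search extension (the table and field names are mine, for illustration):

<?php
// Create a full-text index and search it using SQLite's FTS5 extension.
$db = new SQLite3('search.db');
$db->exec('CREATE VIRTUAL TABLE IF NOT EXISTS articles USING fts5(title, body, url UNINDEXED)');

// Index content pulled from the CMS API (one row per article).
$insert = $db->prepare('INSERT INTO articles (title, body, url) VALUES (:title, :body, :url)');
$insert->bindValue(':title', 'Hello world');
$insert->bindValue(':body', 'Example article body text');
$insert->bindValue(':url', '/hello-world');
$insert->execute();

// Search, ordered by FTS5's built-in relevance ranking.
$query = $db->prepare('SELECT title, url FROM articles WHERE articles MATCH :term ORDER BY rank');
$query->bindValue(':term', 'example');
$results = $query->execute();
while ($row = $results->fetchArray(SQLITE3_ASSOC)) {
    echo $row['title'] . ' - ' . $row['url'] . PHP_EOL;
}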

Listing and pagination

Not all content APIs are built the same. WordPress, for example, decided to omit pagination when listing things like blog posts for a website. Not entirely helpful! I saw a JS-built Headless CMS website which loaded the entire set of news articles over the API in order to build pagination, which is really not the best approach.

SQLite again to the rescue! My approach is to use SQLite to sync data from content APIs to help build listing and filtering pages.
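
Here’s a minimal sketch of the read side of that approach, assuming articles have already been synced from the content API into an SQLite table (the table and column names are illustrative):

<?php
// Build a paginated listing from content synced into SQLite.
$db = new PDO('sqlite:content.db');

$page    = max(1, (int) ($_GET['page'] ?? 1));
$perPage = 10;
$offset  = ($page - 1) * $perPage;

$total      = (int) $db->query('SELECT COUNT(*) FROM articles')->fetchColumn();
$totalPages = (int) ceil($total / $perPage);

$stmt = $db->prepare('SELECT title, url, published_at FROM articles ORDER BY published_at DESC LIMIT :limit OFFSET :offset');
$stmt->bindValue(':limit', $perPage, PDO::PARAM_INT);
$stmt->bindValue(':offset', $offset, PDO::PARAM_INT);
$stmt->execute();
$articles = $stmt->fetchAll(PDO::FETCH_ASSOC);

// $articles and $totalPages can now be handed to a listing template.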

Content preview

Another feature often missing from sites built with Headless CMS is content preview, which is very useful for clients to review what a page will look like before it’s published.

The obvious solution here is to build a small WordPress plugin to display a preview page via the front-end site, using some form of authentication to help ensure the preview function is kept private.
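
One way the front-end side of that could work is with a signed preview link, for example an HMAC token shared between the plugin and the front-end. This is a sketch under those assumptions (the secret, URL format and preview endpoint are all illustrative), not a finished design:

<?php
// Handle a preview link such as /preview?id=123&token=...
$secret = (string) getenv('PREVIEW_SECRET');
$id     = (int) ($_GET['id'] ?? 0);
$token  = $_GET['token'] ?? '';

// The CMS plugin would generate the same HMAC when it builds the preview link.
$expected = hash_hmac('sha256', (string) $id, $secret);

if (!hash_equals($expected, $token)) {
    http_response_code(403);
    exit('Preview link is invalid.');
}

// Fetch the draft content over the API (endpoint is illustrative) and render it
// through the same templates as the live site.
$draft = json_decode(file_get_contents('https://cms.example.com/wp-json/myplugin/v1/preview/' . $id), true);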

Conclusion

These are a few of the topics I’m looking at in this project. I’d be interested in anyone else’s experience of this, or comments on building performant front-end sites with Headless CMS!

Goodbye 2018, hello 2019

Not done an end of year post for a while; in fact, I’ve not managed to post to my personal blog at all in 2018, which was rather remiss of me!

2018 was a busy year. I’m proud of the great team I’ve built at Studio 24, and we got our new website out (finally), though work has had its ups and downs. We had a quiet summer, and the increased team size has made certain things challenging (though it also gives us more opportunity). The web dev world we work in has thrown up a few challenges too, from the poorly communicated and complex introduction of Gutenberg in WordPress 5.0 and old JavaScript projects not working in modern browsers (I’m looking at you, Angular 1), to the rise of JavaScript frameworks, often at the expense of web standards and basic website performance. It feels like we’ve been battling against JavaScript a fair bit this year.

We’re currently in an exciting position, with a few new projects due to start in January and some really interesting R&D work I’ve been doing recently which has sparked my interest in web development again.

Family life has been good. The boys, Bill and Daniel, are enjoying school, and I can’t quite believe it’s Bill’s last year at primary school before he moves up to secondary school this September! We had some good UK-based holidays this year (Topsham, Charmouth, Lewes), a fab family Christmas where we saw all our immediate family (and the kids made lots of noise with their cousins!), and I travelled to Berlin for the first time with friends (lovely city). Kate has been somewhat ridiculously busy these past six months between the Cherry Hinton Festival and the Cambridge Film Festival. The Cherry Hinton Festival is planned to be a little smaller this year, to give Kate and the other organisers a bit of a much-needed break.

Things I’d like to do more of in 2019…

  • Travel in Europe more, first trip planned to Belgium around Feb-time.
  • Have more date nights with Kate, so much of our time is dominated by looking after kids, work and charity work!
  • Attend inspiring web conferences, my first is New Adventures.
  • Blog more and focus on my current R&D area, which is Headless CMS, web standards and web performance.
  • Develop my personal blog, cross-post my posts from the Studio 24 site.
  • Do some talks, I have one planned for March on Headless CMS.
  • Resist Brexit as much as is possible (and support my 4 European staff through whatever this process brings).
  • Be open to new opportunities, meet new people, be friendly and supportive.

OpenTech 2017

I made my first trip to OpenTech yesterday, hosted at University College London. I didn’t really know what to expect; I’d spotted the conference on my Twitter feed and understood it to be a day full of discussions on open data, technology and how they contribute to society.

I was impressed. It was a busy and passionate conference, full of people who work with tech trying to make a difference to society, making it more open and fair, against a challenging and often unhelpful world.

My day started with Hadley Beeman, a member of the W3C Technical Architecture Group. Hadley’s talk was on “Standards for Private Browsing,” in which she explained how user expectations of how private browsing works differ from how browsers actually do it. Some US research stated the most popular reason to use private browsing mode is to hide embarrassing searches, yet only Safari hides recent searches. Not helpful for users.

The concept of private browsing needs standardisation, not only to help set users’ expectations about how their data is stored, but also to help people build technology and be confident about how private mode will work. With the rise of Web Payments this is only going to become a larger issue. Hadley said more user research is needed to help in this area.

Rachel Coldicutt followed on with a passionate, excellent talk about Doteveryone, the think tank that is “fighting for a fairer internet.” Rachel gave a good overview of how Doteveryone is trying to improve digital understanding for everyone by focussing on education, standards for responsible technology, and stimulating new value models.

She talked about the rise in power of the big four “GAFA” (Google, Apple, Facebook, Amazon) and how these companies wield much unaccountable power on the internet today. With a government, if you disagree with how things are run, you can revolt; not so with Facebook. She revealed 7 developers are responsible for the Facebook timeline algorithm (just 7!), a technology that is becoming bigger news given the influence it’s seen to have had on recent political decisions. She also raised an interesting idea around a “fair trade” mark on the internet and how that could work.

The next session was by Anna Powell-Smith, who talked about an offshore property ownership project she worked on for Private Eye. She pulled data sources together to build a map of properties in England and Wales owned by offshore companies. Offshore ownership is problematic because it’s used for tax avoidance by those with often dubious means of making money. Anna told an interesting story of how she matched FOI-requested data up with the INSPIRE dataset (important, but restricted, geo-spatial data on properties), a process that seemed pretty convoluted and difficult but was successful. The Private Eye report was discussed in parliament and it looks like the government are starting to make some positive moves towards making this data more available.

However, Ordnance Survey are legally obliged to make money out of their data, so they are not willing to make it completely open. The critical component Anna used in her research, matching the INSPIRE ID to title IDs, is no longer available without spending £3 per property, which makes it cost-prohibitive.

The government has put this requirement on Ordnance Survey to sell their data rather than make it open. Anna made a call for any economists to help make the case for why this data should be free and how it would have a positive economic impact in the UK. If anyone can help, contact Anna at https://anna.ps/

The next speaker was ill, so John Sheridan helped out with an impromptu talk on his work at the National Archives. This was fascinating, touching on the different challenges between physical and digital archives, how context is important in archived data, how copying is a core part of digital archiving (“there is no long term storage solution for digital”), how this also requires validating the data you have stored is still the same (they use hashing methods to help with this), and how you need to understand the data you store so you can also provide a means to view it. The general message was data encoded in open formats is easier to archive, and to make available in the future.

John also touched on the UK Web Archive project, run by the British Library, who have a digital archive of around 3 petabytes, most of which is not published online, largely for copyright reasons. While the US-based Internet Archive has a policy to publish first and take down content on request, as UK public institutions the British Library and National Archives have a lower appetite for the risk of potential legal action, and therefore only publish when they have permission to do so.

I chatted to John in the bar after the event and he explained that the National Archives takes responsibility for archiving all government digital content, taking snapshots every 3 months or so. The Web Archive project deals with UK websites. I asked him where a past project we worked on would be archived, the Armada Tapestries site for the House of Lords. Apparently this is taken care of by Parliament itself in the Parliamentary Archive. Lots of people archiving things!

After lunch I joined the Post Fact / Future News panel which turned out to be a real highlight of the day.

James, Wendy, Becky and Gavin

The speakers were James Ball, Wendy Grossman and Gavin Starks, and the panel was hosted by Becky Hogge.

James started proceedings, talking eloquently and in detail about the difference between Fake news (an outright lie, not so common in the UK) and Post-truth bullshit (manipulation of an almost-truth), which is basically where we find ourselves today. James talked at speed and with confidence, and painted a fascinating, dark picture of how news is being manipulated for political ends at present, and how a good narrative can often trump a complicated truth that is difficult to explain to the general public.

James made a great point on how you “can’t use technology to solve cultural issues” and that “fake news is not an internet problem.” He highlighted that the problem is already in society, embodied in figures such as Boris Johnson who have a long history of manipulating the truth for a political agenda. He’s written a book on this topic, so go buy it: Post-Truth: How Bullshit Conquered the World!

He also noted we need to “think about the business of the internet.” The idea of business and value models cropped up a few times during the day: a lot of the issues we associate with the internet are exacerbated by how the web makes money, and alternative models need to be found to help improve the current state of affairs.

A very funny Wendy on what today’s nine year olds may think about future society

Wendy then moved on to future news. She talked about predictions she made in 1997 and how many of these hold some truth today. She went on to explore what younger generations will think about technology and society, and what future headlines are likely to be. Wendy’s talk was fabulous fun.

Gavin began his slot by reading out a written statement by Bill Thompson who was due to speak but was otherwise waylaid at the Venice Biennale! Gavin read out a short piece by Bill on the rotten state of the net at present. It made for a sobering interlude to the discussion.

Gavin then moved on to talk about the work he’s been involved in to make the internet more open: the Open Banking Standard, an anti-slavery corporate statement registry, and tracking the origin of products through the supply chain.

He talked about how we now need to up our game: the community thought the case for open data had been won, but that is not currently so.

Gavin is currently interested in creating impact@web-scale, trying to tackle solvable problems in the UK between policy and technology, bringing the public and private sectors together. He’s looking for people to help; you can sign up at http://www.dgen.net/ or find out more on his blog.

I’ve probably written too much already, but the rest of the afternoon was also enjoyable, peppered with public interest technology, Ada Lovelace Day (celebrating women in STEM), using climate change data to make a symphony, electrocution for fun and profit (and education!), using neural networks to help map happy places, what the Open Data Institute is up to, and a few beers in the union bar.

By the end of the day my head was full of ideas, problems and a better understanding of what people are doing in the area of open tech. I learnt a bunch of useful things that I can take away for my day-to-day work, and that will get me thinking about ways I can help make a difference and contribute to better, more open and responsible technology.

Finally, a shout out to Kevin Marks who as well as live tweeting most of OpenTech also wrote a whole bunch of interesting notes.

What I’m reading in 2017

It’s fair to say I read a lot. I love books and always have a few stacked up next to my bed for a quick (or long) read before I drift off to sleep. I also love books for work: although there is a huge number of resources on the web, published books distill expert knowledge, are peer-reviewed and are a great way to get a good overview of a particular subject.


Viewing images on the command line and the “No identify available” error

I’ve been testing a website that generates images on the fly, and in the past I’ve used the less command to view the file contents; this helped me see when PHP errors had unfortunately made their way into an image file.

However, sometimes when viewing a file I got the following error returned:

No identify available
Install ImageMagick or GraphicsMagick to browse images

I’m pretty sure I worked this one out a few years ago, but had obviously forgotten. Turns out you can’t view binary files via a command like less!

The right way to view an image file is with a command such as xxd. To view the top of a file (which usually indicates the file format), use a command such as:

xxd /path/to/file.jpg | head
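
For a healthy JPEG, the first line of output typically starts with the ff d8 magic bytes, with something like “JFIF” visible in the right-hand column; the exact bytes that follow depend on the encoder, so treat this as a rough illustration:

00000000: ffd8 ffe0 0010 4a46 4946 0001 0100 0001  ......JFIF......

If a PHP error has leaked into the file, you’ll see readable error text in that right-hand column instead.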

This command works just as well for text files, so it will still pick up any PHP errors that are inside the image file instead of the correct binary data.

Saving this one for later so I don’t forget again!