Data: we win or we learn

Share reading insights with your content creators
and they will reward you with better content

Preface

Haarlems Dagblad, one of our five regional news brands in the West of the Netherlands, has its origins in 1656. More than three centuries of independent journalism based on experience and intuition.

A very rich past, something to be proud of, but we can’t build a future solely on success and glory from the past. Times have changed and a radical change in direction was needed to reverse the annual decline in subscription numbers. Focus on print needed to turn into focus on digital publishing. If we wanted to grow and accelerate we needed to adapt to the changing media landscape and start fighting for a future as rich as our past.

On December 2, 2019, we completely changed our workflow and moved our writing staff (over 140 journalists) away from old habits and into a digital-first process. We kept writing good stories, but the channel on which we primarily publish changed: from paper first to digital first.

A daring move, and a process that required persistence and above all a lot of learning. Our compass turned out to be much stronger and more influential than we had suspected or dreamed of in advance: the reading data. We gained valuable insights into what we produce and how much we produce. And, more importantly, into what our subscribers read and what attracts potential subscribers.

We could write a book about it, with hundreds of statistics and graphics, but we prefer to tell our story in this summary disguised as a long read.

The main message? Share the data with the journalists and they will reward you with better content.

We never lose. We win or we learn.

How it started and how it is today

We have a brand strategy that focuses primarily on subscriptions. Our digital platforms, with a hard paywall on premium content (accessible only to subscribers), serve as a major recruiting channel.

Our main KPIs are:

READS Premium articles read by subscribers: reading time and opens

PAYWALLHITS Premium articles opened by non-subscribers

CONVERSIONS Number of subscriptions sold through own site / paywall
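
For readers who want to count these three KPIs themselves, here is a minimal sketch in Python. It is an illustration under stated assumptions, not a description of our actual pipeline: the `Event` record, its field names and the rule that a conversion is counted on a premium paywall hit are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Event:
    """One article open. All field names are assumptions for this sketch."""
    article_id: str
    premium: bool                 # article sits behind the paywall
    subscriber: bool              # visitor is a logged-in subscriber
    reading_seconds: float = 0.0  # time spent reading (subscribers only)
    converted: bool = False       # visitor bought a subscription at the paywall


def kpis(events):
    """Aggregate premium-article events into READS, PAYWALLHITS, CONVERSIONS."""
    totals = {"reads": 0, "reading_seconds": 0.0,
              "paywall_hits": 0, "conversions": 0}
    for e in events:
        if not e.premium:
            continue  # the KPIs only look at premium articles
        if e.subscriber:
            totals["reads"] += 1
            totals["reading_seconds"] += e.reading_seconds
        else:
            totals["paywall_hits"] += 1
            if e.converted:
                totals["conversions"] += 1
    return totals


# Made-up sample data, for illustration only.
sample = [
    Event("a", premium=True, subscriber=True, reading_seconds=90.0),
    Event("a", premium=True, subscriber=False),
    Event("a", premium=True, subscriber=False, converted=True),
    Event("b", premium=False, subscriber=True, reading_seconds=30.0),
]
print(kpis(sample))
```

Keeping opens and reading time side by side reflects the definition of READS above, which looks at both.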

We know where we started, and we also know where we stand today, one year later. And although we are storytellers by origin, images sometimes appeal more to the imagination than text.

How did we get there?

We had to change because the challenge before us was difficult but clear: how are we going to compensate for the irreversible decline in print subscribers by realizing digital growth? So we changed and one year later this is our result:

First let’s go back in time to the end of 2019. Major change is always the sum of dozens of small changes, but the three crucial steps are:

WORKFLOW Workflow change

DATA implementation: big data is small data

EDITORIAL ROOM also takes responsibility for B2C KPIs

Block the escalator: workflow from print all the way to digital

Working digitally is not new, not even at our editorial office. We have had paywall websites since 2013 and all editors have been putting their own articles online for over ten years. But the focus in making those articles was always on the paper version of the newspaper as the final product. This was reflected in print-ready headlines and intros and a lack of enriching content (images, graphs, embeds…). Most significantly, you saw it reflected in the moment that our content appeared online: at the end of the day, after the paper newspaper was complete. At the weekend our online production almost dried up.

To break down that way of working, we radically changed the work process. To get a focus on digital, we blocked the route to print for the writing staff.

When you want people to take the stairs (it’s a lot healthier!) you can either try to motivate them to ignore the escalator or you can block the escalator entirely.

We blocked the escalator. The ‘digital staircase’ remained as the only possible option.

Nobody was assigned a ‘text package’ on a templated page anymore. Publishing and writing can only be done in the web CMS. Now we have 24/7 online deadlines, broadcast schedules, publication moments, an online publication strategy and data, a lot of data. At the end of the day, a team of print editors compiles our five titles (12 editions).

This led to much more distributed production and, as an unexpected side effect, to higher production. We started making more instead of less.

Give the editors – small – data

Big data is also hip and popular in journalism but if you want to become big in data, you need to start small. Better said, start with small data. In our case this means that we try to make the data as small as possible: tailor-made and relevant. So all editors have access to the real-time data of their articles, we installed dashboards in all editorial offices and we send morning reports, weekly reports and monthly reviews.

Each editorial team of 10-20 editors receives its own data and reader insights. This way you not only make performances visible and tangible – each editor recognizes his own production – but every team can improve by monitoring their own performance.

Editorial meetings no longer start with discussing the (latest) newspaper, but with a conversation about the online reading insights. What stands out? What could we have done better? And what did we do very well? Nothing accelerates change more than a sense of success. The feeling that you as an editor are in control of that success yourself is a powerful sensation after centuries of working on intuition.

By looking at the data we recognize – beyond any doubt – the moment each team started working “digital first”.

We had to tackle a problem along the way

Our journey of learning and improving started to get really interesting when we tried to get a grip on what content exactly we are producing. What topics and themes did we write about? And how were those topics read and appreciated by our subscribers?

We soon ran into a problem. The manual labeling of our content turned out to be very unreliable. We had carefully designed our own grid of main categories and subcategories, but authors either forgot to submit tags with their articles or were very inconsistent in assigning labels. What was labeled as politics one time could just as easily end up in the box “local entrepreneurship” with another editor.

An attempt to make the tagging more consistent with a fixed team of specialized “taggers” also failed. The ‘easy’ stories were quickly tagged, but the stories that required a lot of reading to find the right categories often remained unlabeled.

The solution was called Zeticon: a tag robot that fully automatically classifies all articles published online according to the international IPTC standard. The automatic labels turned out to be considerably more reliable than what we had managed manually. And above all, they were consistent, which enormously accelerated the making of substantive analyses.

Problem solved. Time to dive into the insights.

Data-informed, not data-driven

Data and insights into reading behavior help our journalists; they don’t force them. It may seem like a nuance, but it is a big difference: we work data-informed, not data-driven.

We learned that jumping to conclusions based on the data is never a smart thing to do. Author A cannot be compared one-to-one with author B. They do not have the same portfolio or the same tasks, and sometimes not even the same number of working days.

Besides that, we never ignore a specific theme or subject purely based on the data. There can be many reasons why a story is not read as well as expected or hoped for. Did it get a place on the homepage? In the newsletter? Was it pushed with an alert? But also, was the headline good enough? Was the intro well written? Is the topic interesting but the chosen approach not the right one?

Data helps to ask those questions. The journalist will then find the answers himself. He starts making different headlines, chooses a different angle to a subject that did not catch on and thinks about the distribution.

We zoom in on two domains in our data:
CONTENT LEVEL

AUTHOR LEVEL

Zoom in on content: from big to small

Zeticon divides our (premium) content into 17 main categories. The first question we wanted answered was how those categories performed on our top three KPIs: Reads, Paywall Hits and Conversions. How much do we produce, and how do those premium items score on average?

Topics in the top-left box are overproduced: we make a lot of them, but they don’t perform in their current form and should be made better. Topics in the bottom-right box are underproduced: they do perform, and we should make more of them.
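
The logic of the two boxes can be sketched in a few lines of Python. This is a hedged illustration, not the tool behind our charts: the function name, the choice of the median as the dividing line on both axes and the sample numbers are all assumptions.

```python
from statistics import median


def classify(categories):
    """Label each category by production volume versus average score.

    categories: dict mapping a category name to a tuple
    (articles_produced, average_score_per_article).
    The median cut-offs are an assumption made for this sketch.
    """
    production_cut = median(p for p, _ in categories.values())
    score_cut = median(s for _, s in categories.values())
    labels = {}
    for name, (produced, score) in categories.items():
        high_production = produced > production_cut
        high_score = score > score_cut
        if high_production and not high_score:
            labels[name] = "overproduced"    # top-left box
        elif high_score and not high_production:
            labels[name] = "underproduced"   # bottom-right box
        else:
            labels[name] = "balanced"
    return labels


# Made-up numbers for four anonymous categories.
sample = {"A": (400, 10.0), "B": (50, 40.0), "C": (300, 35.0), "D": (60, 8.0)}
print(classify(sample))
```

The same function can be run once per KPI, since a category may sit in a different box for reads than for conversions.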

The scatter plots and charts above are very insightful, but still no more than a clue. It’s easy to see the challenges in ‘sports’ and ‘art and culture’. Should we produce less of this kind of content? And how do we recognize the best performing articles in these categories?

And what should we do with the high-level insights on “accidents and disasters”, a category that is mostly published for free on our sites and rarely as premium? Doesn’t that deserve a reassessment of the journalistic direction?

You will only find the answers when you zoom in further.

An example: we zoom in on level 2 and 3 on ‘art and culture’

This data already gives the editorial chief and the journalist who owns this portfolio a better insight and guidance. We see which ‘cultural’ topics are more popular and which are not.

But we are not there yet. We need to zoom in deeper. Which articles in a “mediocre performing category” did well? And shouldn’t we make more of those? Working with reader data is like unwrapping a matryoshka doll. Under each layer you will find a deeper insight.

To stick with “art and culture”: unwrapping the matryoshka layers has indeed led to an adjustment. The capacity for ‘art and culture’ has been reduced: we produce less, but aim for higher quality in what remains. The ambition is to achieve a better average result per story by focusing on the kinds of stories that score well. That way the overall score of the “art and culture” category can still improve while we make fewer stories.

It is possible, and we believe in it. Because taking aim is better than a scattershot approach.

Another zoom in: accidents and disasters 

Stories in this category are at the top of every list: most reads per article, most paywall hits per article, best conversion per article. However, out of habit we almost always offer these stories for free and not behind the paywall. The data insights on this have also led to changes in the editorial room. We are actively testing the publication of stories in this category as premium stories, and the promotion of very successful stories (many opens) from free to premium.

Author-level insights: a happy journalist

The most exciting category in terms of data and the most exciting stage of the journalistic journey that we took last year, is the author data. Before you get to that level, you must have gained the trust of employees that data is being used to improve the journalism we make. That the data is not an instrument to rank journalists, let alone to judge them.

It is a constant process of: making – learning – improving

Learning is only possible in a safe environment in which journalists dare to experiment. In which they dare to leave the beaten track without the data being an obstacle or threat. What if an attempt fails? Have we lost? No, we learned something! What if it turned out to be a successful article? Then we won. We win or we learn, it’s that simple.

From the start, an important condition in the switch to full digital focus has been the happiness of our employees. You can force people into a new process (remember: block the escalator), but you only achieve real change when people believe it is the right way forward.

‘I actually have my profession back’ is a phrase often heard in our editorial rooms. ‘As an author, all I have to do is make good journalistic stories. And if a story looks good to me, it often shows in the data.’

We have not exchanged our intuition – from 1656 – but we learned how to validate and strengthen it.

The step we will be taking in 2021 is to sit down with individual authors and look at their specific data together. How do they perform within their own editorial team? Is it according to their expectations? Where is there room for improvement? What are they writing about in each category? The data will help them to become (even) more successful. Those who are successful feel more appreciated and enjoy their work more.

The graphs below, belonging to one of the editorial teams (remember: make the data small and relevant), are an example of how we look at it. The authors’ names are anonymized. Beware: the different KPIs give different insights. Here too, looking at the data results in new questions. The answers won’t appear until you look a few ‘matryoshka layers’ deeper.

Zoom in on individual author P

Let’s zoom in on one author from this editorial team: author P who contributed the most premium stories (381) in this team, to find interesting insights. In terms of reads (articles read by subscribers) this journalist is almost exactly at the average level of the team. In terms of paywall hits, he is below average, but because he produces so much he is one of the team’s top sellers.

What can we do to help this colleague improve? First of all, by zooming in on which themes he writes about. How can the nearly 400 premium stories be divided into the Zeticon categories?

What is striking now is that two themes are dominant in the production: ‘politics’ and ‘economy, business & finance’. And ‘human interest’ and ‘environment’ also have a decent production.

To keep this example simple, we focus on Conversions and leave out Reads and Paywall Hits, which we of course normally include. In terms of conversions, this journalist is most successful when he writes about entrepreneurship: he then needs only five stories to get a subscription, while he needs twice as many in the category politics. Writing about ‘environment’ is, at least on the KPI conversion, less successful.
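
The “stories needed per subscription” figure is a simple ratio, sketched below in Python. The function name, the category names and the counts are hypothetical and only serve to show the arithmetic; they are not this author’s real numbers.

```python
def stories_per_conversion(production):
    """Return how many premium stories are needed per subscription sold.

    production: dict mapping a category name to a tuple
    (stories_written, conversions). A category without conversions
    gets None, because the ratio is undefined there.
    """
    result = {}
    for category, (stories, conversions) in production.items():
        result[category] = stories / conversions if conversions else None
    return result


# Made-up counts for three anonymous categories.
sample = {
    "category_x": (50, 10),   # 5 stories per subscription
    "category_y": (100, 10),  # 10 stories per subscription
    "category_z": (20, 0),    # no conversions yet
}
print(stories_per_conversion(sample))
```

A lower number means fewer stories are needed per subscription, so a value of 5 corresponds to the “five stories to get a subscription” case above.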

Zooming in purely on that, it is interesting to look at his environmental production with the author. Which 20 stories did he write, which subject (subcategory) were they about? Which stories can we produce differently or ignore completely with these insights?

The answers will come, and the journalist will find them himself. Because everyone wants to be read.

To be continued…

GerBen van ’t Hek, deputy editor-in-chief
Bas Schnater, media data specialist