Rebuilding Commons Search

This fall we launched KCWorks, our brand-new, modern, really awesome document repository.

But did you know we also launched a new search application, written from the ground up for Knowledge Commons? Well, we did! Search is not flashy. It’s something that’s meant to fade into the background and be taken for granted. If you think about it at all, it’s probably because something has gone wrong. So if you haven’t noticed, no big deal—mission accomplished. And if you don’t even remember the old search, so much the better.

In this post, though, I want to make you notice it. Partly this is because I spent a good portion of the last year working on it, along with Bonnie Russell, who did our user interface design; Grant Eben, who did the frontend implementation; and Ian Scott, who did the Works integration. But mostly it is because Search is a case study in how we’ve changed our approach to developing for the Commons and a first look at what the Commons might look like in the future.

The old search

A screenshot of the old Commons search page. Search results are in a large column on the left, and search options are on the right.

In case you don’t remember the old search, this is what it used to look like. Aesthetically it’s not great, but that wasn’t the primary issue. The primary issue was that it was slow. Like painfully, unusably slow. Searches could take up to 30 seconds, leaving users to wonder if anything was happening at all. Only slightly worse, the results returned by search were inconsistent and often not what the user was looking for. Refining a search was difficult, and the long wait for each update made it more painful still. As a result, people generally avoided searching at all, which made content on the Commons much harder to discover.

Under the hood, the old search was based on the widely used ElasticPress plugin, with customizations for the Commons developed in-house. There is nothing wrong with this plugin. It’s a solid plugin that works great for thousands of WordPress sites. But it wasn’t a great fit for a site as complex as Knowledge Commons, with thousands of user sites, custom content in profiles and group forums, and the CORE repository. As user content expanded, and our site diverged more and more from the average WordPress site, search performed worse and worse.

The nail in the coffin for legacy search was the impending release of Knowledge Commons Works, which would have no way of getting data into ElasticPress. So we knew we needed something new, and I decided the best approach was to build our own system, entirely from scratch. This would allow us to tailor the search application to our exact needs, hopefully making it much more performant and sustainable.

A new architecture

Diagram of the search architecture. Nested boxes depict different components and are grouped according to developer responsibility and function. Arrows show communication between components, with components communicating through function calls or network requests.

This diagram depicts both the functional components of the new search application and the implementation responsibilities of the development team. The main component of the search application is the CC-Client plugin, shaded red. This includes the frontend part of the application (the part the user interacts with), shaded in orange.

Shaded in yellow is the “provisioning” functionality. This is what puts data into search. The Works application, for instance, provisions data into the search backend from repository deposits and collections. The CC-Client plugin provisions data from blog posts, forum posts, and profiles. In both cases, provisioning can be “bulk,” meaning that we’re starting from scratch and putting everything into the search index, or “incremental,” meaning that we’re updating the search index with new or updated content.
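To make the difference between bulk and incremental provisioning concrete, here is a minimal sketch in Go. It is not our actual code: the `Document` and `Index` types and their method names are hypothetical, and an in-memory map stands in for the real search index.

```go
package main

import "fmt"

// Document is a hypothetical, simplified record sent to the search index.
type Document struct {
	ID    string
	Title string
}

// Index is an in-memory stand-in for the real search index (an assumption
// made for illustration only).
type Index struct {
	docs map[string]Document
}

// NewIndex returns an empty index.
func NewIndex() *Index {
	return &Index{docs: make(map[string]Document)}
}

// BulkProvision discards whatever is in the index and rebuilds it from the
// complete set of documents.
func (ix *Index) BulkProvision(all []Document) {
	ix.docs = make(map[string]Document, len(all))
	for _, d := range all {
		ix.docs[d.ID] = d
	}
}

// IncrementalProvision adds new documents and overwrites existing ones with
// the same ID, leaving everything else in the index untouched.
func (ix *Index) IncrementalProvision(changed []Document) {
	for _, d := range changed {
		ix.docs[d.ID] = d
	}
}

// Len reports how many documents are currently indexed.
func (ix *Index) Len() int { return len(ix.docs) }

func main() {
	ix := NewIndex()

	// Starting from scratch: everything goes in at once.
	ix.BulkProvision([]Document{
		{ID: "post-1", Title: "Hello"},
		{ID: "post-2", Title: "World"},
	})

	// Later: one post was edited and one new post was published.
	ix.IncrementalProvision([]Document{
		{ID: "post-2", Title: "World, revised"},
		{ID: "post-3", Title: "A new post"},
	})

	fmt.Println(ix.Len(), ix.docs["post-2"].Title)
}
```

The key design point is that incremental provisioning is keyed on a stable document ID, so re-sending an edited item replaces the old copy instead of creating a duplicate.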

Finally, there are various coordinating and administrative components that basically tie it all together.

Moving to the right, there is the “CC Search (backend)” component. This is a completely separate service from WordPress, written in the Go language. It accepts provisioned data from Works and WordPress and provides search results back in response to queries. This service is designed to allow multiple, distributed applications to all provision to a single location and provide search results back out to multiple services. It is written with performance and scalability in mind.
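For readers who like to see the shape of such a service, here is a rough sketch of a Go program that accepts provisioned documents over HTTP and answers search queries. Everything specific in it is an assumption made for illustration: the `/documents` and `/search` paths, the `Document` fields, and the in-memory `Store` are hypothetical and are not the real CC Search API, which stores its data in OpenSearch.

```go
package main

import (
	"encoding/json"
	"fmt"
	"log"
	"net/http"
	"net/http/httptest"
	"strings"
	"sync"
)

// Document is a hypothetical provisioned record; the field names are assumptions.
type Document struct {
	ID     string `json:"id"`
	Source string `json:"source"` // e.g. "works" or "wordpress"
	Title  string `json:"title"`
}

// Store is an in-memory stand-in for the search index.
type Store struct {
	mu   sync.RWMutex
	docs map[string]Document
}

func NewStore() *Store { return &Store{docs: make(map[string]Document)} }

// Put adds a document or replaces the one with the same ID.
func (s *Store) Put(d Document) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.docs[d.ID] = d
}

// Search returns documents whose title contains q, ignoring case.
// An empty query matches every document. Result order is not guaranteed.
func (s *Store) Search(q string) []Document {
	s.mu.RLock()
	defer s.mu.RUnlock()
	q = strings.ToLower(q)
	hits := []Document{}
	for _, d := range s.docs {
		if strings.Contains(strings.ToLower(d.Title), q) {
			hits = append(hits, d)
		}
	}
	return hits
}

// handleProvision accepts one JSON document per POST request.
func (s *Store) handleProvision(w http.ResponseWriter, r *http.Request) {
	if r.Method != http.MethodPost {
		http.Error(w, "POST only", http.StatusMethodNotAllowed)
		return
	}
	var d Document
	if err := json.NewDecoder(r.Body).Decode(&d); err != nil || d.ID == "" {
		http.Error(w, "invalid document", http.StatusBadRequest)
		return
	}
	s.Put(d)
	w.WriteHeader(http.StatusNoContent)
}

// handleSearch answers GET /search?q=... with a JSON array of matches.
func (s *Store) handleSearch(w http.ResponseWriter, r *http.Request) {
	w.Header().Set("Content-Type", "application/json")
	if err := json.NewEncoder(w).Encode(s.Search(r.URL.Query().Get("q"))); err != nil {
		log.Println("encode:", err)
	}
}

// newMux wires both endpoints onto one handler.
func newMux(s *Store) http.Handler {
	mux := http.NewServeMux()
	mux.HandleFunc("/documents", s.handleProvision)
	mux.HandleFunc("/search", s.handleSearch)
	return mux
}

func main() {
	// A throwaway local test server keeps the demo self-contained.
	srv := httptest.NewServer(newMux(NewStore()))
	defer srv.Close()

	body := strings.NewReader(`{"id":"works-1","source":"works","title":"Rebuilding Search"}`)
	resp, err := http.Post(srv.URL+"/documents", "application/json", body)
	if err != nil {
		log.Fatal(err)
	}
	resp.Body.Close()

	resp, err = http.Get(srv.URL + "/search?q=search")
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()
	var hits []Document
	if err := json.NewDecoder(resp.Body).Decode(&hits); err != nil {
		log.Fatal(err)
	}
	fmt.Println(len(hits), "hit(s)")
}
```

Because both provisioning and searching go through plain HTTP, any application that can send a request can feed the index or query it, which is the property that lets several independent applications share one search service.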

Taking one step further, there is Amazon’s OpenSearch service—the same service that ultimately backed the old search. However, we redesigned the structure of its index specifically for the Commons, for instance by designing a single document type and single index for all user content. This is probably the biggest reason for improved search performance.
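One way to picture the single-document-type idea is the Go sketch below, where very different kinds of content are mapped into one shared shape before they are indexed. The field names, the `BlogPost` and `Deposit` source types, and the helper functions are all hypothetical; the actual index schema is not described in this post.

```go
package main

import "fmt"

// SearchDocument is a hypothetical common shape for every kind of content.
type SearchDocument struct {
	ID          string
	ContentType string // "post", "deposit", and so on
	Title       string
	Body        string
	URL         string
}

// BlogPost and Deposit are made-up source records with different fields.
type BlogPost struct {
	PostID    int
	Headline  string
	HTML      string
	Permalink string
}

type Deposit struct {
	RecordID string
	Title    string
	Abstract string
	Link     string
}

// FromBlogPost maps a blog post into the common shape.
func FromBlogPost(p BlogPost) SearchDocument {
	return SearchDocument{
		ID:          fmt.Sprintf("post-%d", p.PostID),
		ContentType: "post",
		Title:       p.Headline,
		Body:        p.HTML,
		URL:         p.Permalink,
	}
}

// FromDeposit maps a repository deposit into the same shape.
func FromDeposit(d Deposit) SearchDocument {
	return SearchDocument{
		ID:          "deposit-" + d.RecordID,
		ContentType: "deposit",
		Title:       d.Title,
		Body:        d.Abstract,
		URL:         d.Link,
	}
}

// FilterByType narrows one shared index to a single kind of content.
func FilterByType(index []SearchDocument, contentType string) []SearchDocument {
	out := []SearchDocument{}
	for _, d := range index {
		if d.ContentType == contentType {
			out = append(out, d)
		}
	}
	return out
}

func main() {
	// Everything lives in one index, whatever its origin.
	index := []SearchDocument{
		FromBlogPost(BlogPost{PostID: 7, Headline: "Rebuilding Commons Search", HTML: "...", Permalink: "https://example.org/post"}),
		FromDeposit(Deposit{RecordID: "abc", Title: "A deposited paper", Abstract: "...", Link: "https://example.org/record"}),
	}
	for _, d := range FilterByType(index, "deposit") {
		fmt.Println(d.ID, d.Title)
	}
}
```

With one shape and one index, a single query can cover every kind of content, and filtering by type becomes a simple field comparison.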

A redesigned frontend

Three screenshots: The initial search design wireframe by Bonnie, a revision by Grant, and the final implementation by Grant.

This image shows the progression of the Search frontend design from Bonnie’s initial wireframe in January 2024 to Grant’s revision in April, to the implementation released in October. Our main goals in redesigning the Search frontend were to simplify the interface so that users could easily find what they’re looking for and to improve accessibility with keyboard navigation and controls that follow accessibility guidelines. Bonnie and Grant are both very focused on accessible design, and the final implementation is clean and easy to use.

Grant also implemented the interface so that it would inherit styles from whatever WordPress theme is activated. This is absolutely the best approach, but it does mean that his design won’t reach its full potential until we’re able to implement our long-planned theme rebuild.

The new search frontend is implemented as a WordPress block, and so can be integrated into the page using the block editor. On the frontend it uses React to refresh results without reloading the page, making for a much smoother user experience. And it is fast, so refining your search is far less daunting than it used to be.

A new approach to Commons development

As I discussed in my previous post about Our quest toward modern application architecture, this year we moved from doing development in the cloud on long-running instances cloned from our production website to local development in Docker. The previous development process, among other shortcomings, meant that new Commons functionality was “tightly-coupled” to the other functions and configurations of the Commons. This meant that features generally worked only in the specific context of the Commons, and that when one part of the Commons changed, other parts of the Commons could be affected in unanticipated, often bad, ways. Picture a complex spiderweb of components, all tied together with many, many connections. Jiggle one part of the web, and who knows what will happen somewhere else.

By contrast, the search application was developed entirely independently of the Commons site. We built the plugin in a fresh WordPress environment. I built and tested the backend Go service in yet another isolated environment. We designed APIs (Application Programming Interfaces) to serve as points of connection between components, to minimize coupling between them. All this means that the structure of the Commons could change dramatically and that would be perfectly ok from the perspective of Search. And conversely, we could revise Search without affecting the Commons. Search is therefore robust not only to small changes but also to large ones, so we can make large changes, something we’re very excited to do!

Finally, Search was designed with a decentralized Commons in mind. Along with Works, this is our first big step toward a Commons that exists not as one big application running on a single site, but as a true network of applications (some of them not even being run by us!) communicating through protocols such as ActivityPub. The “CC” prefix on the search plugin and the Go service stands for “CommonsConnect”, our label for the glue that we envision holding together a future decentralized Commons. Search and Works are our first big steps in that direction.