
Aromatic - Using RDKit in a desktop application


Introduction

RDKit is one of the leading open source chemoinformatics libraries. Initially written in C++, it is primarily used today via its Python bindings, which expose a rich and versatile Application Programming Interface (API).

However, it is less well known that RDKit can also be used in browser environments via the RDKit.js Node package, which exposes WebAssembly (WASM) bindings of the C++ code. The JavaScript package does not expose the whole API surface provided by RDKit, but it can be used for a wide variety of common chemoinformatics tasks such as 2D structure generation and molecular descriptor generation.
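To give a feel for those two tasks, here is a minimal sketch in TypeScript. It assumes the `get_mol`, `get_svg`, `get_descriptors` and `delete` methods that RDKit.js exposes, declared here as small local interfaces so the snippet stands on its own. The `summarizeSmiles` helper and the `MolSummary` type are hypothetical names used only for illustration, not code from the app.

```typescript
// Minimal local typings for the parts of RDKit.js this sketch touches.
// The real package ships fuller TypeScript declarations.
interface MolLike {
  get_svg(): string;         // 2D depiction as an SVG string
  get_descriptors(): string; // JSON-encoded map of descriptor values
  delete(): void;            // frees the molecule on the WASM side
}

interface RDKitLike {
  get_mol(input: string): MolLike | null;
}

// Hypothetical result type for this sketch.
interface MolSummary {
  svg: string;
  descriptors: Record<string, number>;
}

// Hypothetical helper: returns null when the input cannot be parsed.
function summarizeSmiles(rdkit: RDKitLike, smiles: string): MolSummary | null {
  const mol = rdkit.get_mol(smiles);
  if (mol === null) return null;
  try {
    return {
      svg: mol.get_svg(),
      descriptors: JSON.parse(mol.get_descriptors()) as Record<string, number>,
    };
  } finally {
    // WASM objects are not garbage collected, so release them explicitly.
    mol.delete();
  }
}
```

In a browser or webview, `rdkit` would be the module object resolved by `initRDKitModule()`, so a call could look like `summarizeSmiles(rdkit, "c1ccccc1")`.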

I have been playing with Rust on the side, so I thought one good way of getting my hands dirty was to create a Tauri application which uses RDKit.js. Tauri is a Rust-based framework for developing desktop (and more recently mobile) applications.

A Tauri desktop application runs a web front end inside a native webview (a lightweight, sandboxed browser component akin to a mini browser) supplied by the host operating system, making RDKit’s web-oriented RDKit.js bindings a natural fit for the platform. Because of this, the front end of a Tauri app can use any of the popular front end frameworks and libraries such as React. Funnily enough, I ended up writing very little Rust, as most of the work happened to be on the React/client side.

I decided to go with React as the user interface (UI) wrapper around RDKit.js, as that is what I am well versed in. The architecture follows a layered pattern:

App flow

The data flow works bidirectionally:

On the React side I used the following:

  1. Simple state management via the React context API. There was no need for adding a bespoke state management library.
  2. React Router for basic Single Page Application (SPA) navigation.
  3. ShadCN and Tailwind for the UI.

On the Rust side, I only implemented persistent config management, which I expose to the JavaScript side via Tauri commands. The config is just a simple JavaScript Object Notation (JSON) file that can be updated and consumed via React context in the settings page in the app. This was the simplest solution for a single-user local app where multi-user synchronization and multithreading support were not major concerns.
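The front end half of that arrangement can be sketched roughly as follows. This is not the app's actual code: the `AppConfig` fields and the `read_config` and `write_config` command names are hypothetical. The `Invoke` type is shaped like Tauri's `invoke` function and is passed in as a parameter so the snippet runs without Tauri installed.

```typescript
// Hypothetical config shape; the app's real settings are not listed here.
interface AppConfig {
  theme: "light" | "dark";
  recentFilesLimit: number;
}

const DEFAULT_CONFIG: AppConfig = { theme: "light", recentFilesLimit: 10 };

// Shaped like Tauri's `invoke`; injected so this sketch has no dependencies.
type Invoke = <T>(cmd: string, args?: Record<string, unknown>) => Promise<T>;

// Turn the JSON text into a config, falling back to defaults when the
// text is invalid or some keys are missing.
function parseConfig(raw: string): AppConfig {
  try {
    const parsed: unknown = JSON.parse(raw);
    if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
      return { ...DEFAULT_CONFIG };
    }
    return { ...DEFAULT_CONFIG, ...(parsed as Partial<AppConfig>) };
  } catch {
    return { ...DEFAULT_CONFIG };
  }
}

// "read_config" is a hypothetical command assumed to return the file contents.
async function loadConfig(invoke: Invoke): Promise<AppConfig> {
  const raw = await invoke<string>("read_config");
  return parseConfig(raw);
}

// "write_config" is a hypothetical command assumed to persist the given text.
async function saveConfig(invoke: Invoke, config: AppConfig): Promise<void> {
  await invoke<void>("write_config", { contents: JSON.stringify(config, null, 2) });
}
```

A React context provider could call `loadConfig` once on mount and `saveConfig` whenever the settings page changes a value. Merging over defaults keeps older config files usable when new keys are added.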

On the RDKit.js side, I added a postinstall Node script that copies the relevant WASM and JavaScript files from RDKit in node_modules to our scripts folder. This script can be run whenever there is an update to the RDKit.js package. Luckily, the RDKit package also exposes a TypeScript declaration file for the types, which can be quite handy in grokking the exposed RDKit API.
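A copy step like that could look roughly like the sketch below. The file names and the `node_modules/@rdkit/rdkit/dist` location are assumptions about how the package is laid out, and the `copyRdkitAssets` helper is a hypothetical name, so check both against your own install rather than treating this as the script in the repository.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Assumed asset names; verify them against the installed package.
const RDKIT_FILES = ["RDKit_minimal.js", "RDKit_minimal.wasm"];

// Copy each listed file from srcDir to destDir and return the new paths.
// Fails loudly if an expected file is missing, so a package update that
// renames the assets is noticed at install time.
function copyRdkitAssets(
  srcDir: string,
  destDir: string,
  files: string[] = RDKIT_FILES,
): string[] {
  fs.mkdirSync(destDir, { recursive: true });
  const copied: string[] = [];
  for (const file of files) {
    const from = path.join(srcDir, file);
    if (!fs.existsSync(from)) {
      throw new Error(`Missing RDKit asset: ${from}`);
    }
    const to = path.join(destDir, file);
    fs.copyFileSync(from, to);
    copied.push(to);
  }
  return copied;
}

// Example call for a postinstall entry point (destination is hypothetical):
// copyRdkitAssets(
//   path.join("node_modules", "@rdkit", "rdkit", "dist"),
//   path.join("public", "scripts"),
// );
```

Wiring this into the `postinstall` entry of `package.json` means a plain install refreshes the copied files.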

The complete source code, packaged releases, and automation scripts live in the GitHub repository msomierick/aromatic. The repository’s README and contributing guide document the project setup, build workflows, and collaboration process in depth.

Functionalities provided

The application is a modest chemoinformatics toolkit which provides the following functionality:

Artificial Intelligence (AI) augmented development

I developed the app via augmented coding using GitHub Copilot. The model I primarily used was Claude Sonnet 4. Here are a few interesting observations and learnings I can share:

Domain knowledge and context management is important

To get the best output from coding agents, one needs to manage the context very well. It is very easy for the agents to go off track and get lost in the context maze. One way of ensuring you are giving the best context is to have a good knowledge of the domain you are working in, as this helps to validate the agent's direction and output.

This blog post has a good write-up on how context fails and how to fix it.

Enforce simplicity

Simplicity is one of those terms whose meaning depends on the context in which it is used. In the context of developing a software system, one acceptable definition is simplicity of interface combined with simplicity of implementation. Simplicity of interface means the interface is small and clear, solving the problem at hand and nothing more, while simplicity of implementation means the implementation is done in the simplest way possible.

Coding agents tend to struggle with this. They are overeager to implement as many features as they can (breaking simplicity of interface), and they are biased towards industrial-strength solutions that cater for every obscure edge case (breaking simplicity of implementation).

One way of enforcing simplicity that I experimented with was following the Worse is Better approach. I did this by providing custom Copilot instructions that outline the approach. This had mixed results and, funnily enough, at times the agent kept me in check by cautioning that some of the features I was trying to add, or the implementation method I was suggesting, were breaking the simplicity principle.

Take advantage of multimodal agents

I primarily use text to interact with the model, but over time the agents are getting better at multimodal inputs (images, videos).

I gave the Generative Pre-trained Transformer (GPT) 5 model an image of a page and asked it to review the page and see whether the user experience could be improved. It was able to deduce from the image what feature the page was providing and offered some interesting suggestions that I then asked it to implement.

Code evolution

To ensure the code does not descend into chaos, having a good architectural understanding of the code is important. This helps ensure the codebase improves as it evolves, i.e., that the net benefit of new code added is positive (as I have come to learn, it sometimes is not).

One practice that I have found useful is maintaining a running Architectural Decision Record (ADR) log so major decisions stay documented. For example, I evaluated Tauri’s Store plugin for configuration persistence, but the lightweight JSON file I already had was simpler, dependency-free, and sufficient for the local single-user requirements. That is exactly the sort of decision that belongs in that record.

Installing the desktop application

Grab the latest release artifacts for Windows, macOS, and Linux from the aromatic releases page. Because the installers are not code-signed, each operating system will prompt you before launch:

Windows:

macOS:

Linux:

These warnings occur solely because the binaries are unsigned; the application runs entirely offline and does not transmit user data.

Cloning and running locally

If you want to inspect or extend the app, clone the repo and follow the workflow documented in the README and CONTRIBUTING.md:

git clone https://github.com/msomierick/aromatic.git
cd aromatic
yarn install     # installs dependencies and syncs the RDKit WASM bundle
yarn tauri dev   # launches the desktop shell in development mode

You will need Node.js 20 or newer, Yarn, and the Rust toolchain plus the platform prerequisites listed in the Tauri documentation. For web-only tweaks you can run yarn dev.

Additional functionalities

I also experimented with adding Open Babel’s obabel command-line binary via Tauri’s sidecar support, but ran into challenges that would have required an inordinate time investment I was not willing to make. The challenges included the complexity of building static C++ Open Babel binaries (CMake is no fun!) for each host platform and managing the inter-process communication between the Open Babel and the Tauri Rust backend processes.

Integrating Open Babel, or any other open source chemoinformatics toolkit with a command-line interface, would indeed transform the app into a versatile toolkit with a wide range of functionality.


